Ant Daniel wrote:

On 01/12/05, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:

No doubt most of that is taken up by the "previously recorded" data.
That could probably be reduced to just the series # and epsiode # if it
wasn't for the "forget old" feature.  The other thing that a good
clean-up would do is check the relationships so that there aren't
orphans in large tables like the markup table.  Another nice thing to do
would be to have an ability to remove old frontend data from the system.

Once again us poor non-US not considered ;)

On the bright side, I don't think this idea would be considered for inclusion--even for US users.

In the UK our listing data doesn't appear to give a decent series # /
epsiode #, I know myth's supposed to generate something when not
provided, but I haven't worked out if that works. (I guess with almost
2yrs data I should go and have a look).

However when first reading your post (again we don't tend to use # to
symbolise number), I read it as series hash, epsiode hash. Now that's
a good idea.

Why not hash the episode description rather than store the full length
text, wouldn't that make a significant impact on previously recorded?

OK, so why shouldn't this be considered?

My database:
Total number of records (SELECT COUNT(*) FROM ...):
   oldrecorded: 3787
   recorded: 189
   recordedmarkup: 1746031 (Yes, that is 1.75M records.)

SELECT MIN(starttime) FROM oldrecorded;  ->  2004-05-03 03:00:00
SELECT NOW();  -> 2005-12-02 13:07:4

So, my database is approximately 19 months old. In more than a year and a half, I've got 3787 records in oldrecorded.

SELECT MAX(LENGTH(description)) FROM oldrecorded;  ->  251

(and this is with DataDirect, which typically has much longer, more detailed descriptions than other grabbers).

So, let's assume that every description is this same length. That means I have 950537 bytes of description after 1 1/2 years.

Let's add in title and subtitle:

SELECT MAX(LENGTH(title)) FROM recorded;  ->  39
SELECT MAX(LENGTH(subtitle)) FROM recorded;  ->  58

Once again, assume all titles/subtitles are the as long as the longest, and we have an additional 147693 and 219646 bytes. Adding these up, we have 1317876 bytes. That's 1.26MiB. OK, so what about the actual storage requirements (length + 1 for the two VARCHAR types and length + 2 for the TEXT type). If we don't assume that's taken care of by the fact that most values are smaller than the max current value, we need to add an additional 15148 bytes, So, we're up to 1.27MiB...

Looking at the actual files:
# ls -al oldrecorded.*
-rw-rw----  1 mysql mysql 1004848 Dec  2 12:59 oldrecorded.MYD
-rw-rw----  1 mysql mysql  462848 Dec  2 12:59 oldrecorded.MYI
-rw-rw----  1 mysql mysql    9040 Nov  2 16:46 oldrecorded.frm

So, including other fields and the index, we have 1.41MiB (and notice that we've overestimated the table size by assuming the max length). Now, I could go down the road of comparing sizes when using the length of episode ID (12) or the length of a hash, but it all comes down to--even if we save 1.4MiB, have we accomplished anything?

To answer this question, let's look at recordedmarkup. As mentioned above, I have 189 recordings which give me 1.75 million records in recordedmarkup. Since recordedmarkup has a relatively simple structure (only one VARCHAR), we can figure the row size using column type storage requirements:

Field          Type                 Storage Required
-----          ----                 ----------------
chanid         int(10) unsigned      4
starttime      datetime              8
mark           bigint(20)            8
offset         varchar(32)           l+1 (we'll call it 11)
type           int(11)               4
-------        ----                 ----------------
Total                                35

and we get about 100 of these records per minute of an MPEG-2 recording.

From inside the directory containing the mythconverg binary files:
# du -sh
192M    .

# ls -al recordedmarkup.*
-rw-rw----  1 mysql mysql 106883020 Dec  2 13:45 recordedmarkup.MYD
-rw-rw----  1 mysql mysql  69468160 Dec  2 13:45 recordedmarkup.MYI
-rw-rw----  1 mysql mysql      8668 Jul 19  2004 recordedmarkup.frm

So, as Gregorio said, most of the storage (168MiB of 192MiB) required is in the recordedmarkup table. And, there are two ways for you to minimize the size of your recordedmarkup table:
   a) delete your recordings, or
   b) transcode all your recordings to NUV's with MPEG-4

Note that recordedmarkup will be small for users with framegrabbers because they use RTJPEG or MPEG-4 instead of MPEG-2, so they don't have to work to get a small database. However, IMHO, it's not worth the effort of transcoding even for MPEG-2 users. I have one transcoded recording out of 189 total and my 192MiB database is the equivalent of less than 10 min of video even with the terribly low bitrate (2200kbps average) I use.

Mike
_______________________________________________
mythtv-users mailing list
[email protected]
http://mythtv.org/cgi-bin/mailman/listinfo/mythtv-users

Reply via email to