Re: [mythtv-users] How big is your database?

Michael T. Dean Fri, 02 Dec 2005 11:07:11 -0800

Ant Daniel wrote:

On 01/12/05, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:

No doubt most of that is taken up by the "previously recorded" data.
That could probably be reduced to just the series # and epsiode # if it
wasn't for the "forget old" feature.  The other thing that a good
clean-up would do is check the relationships so that there aren't
orphans in large tables like the markup table.  Another nice thing to do
would be to have an ability to remove old frontend data from the system.

Once again us poor non-US not considered ;)

On the bright side, I don't think this idea would be considered forinclusion--even for US users.

In the UK our listing data doesn't appear to give a decent series # /
epsiode #, I know myth's supposed to generate something when not
provided, but I haven't worked out if that works. (I guess with almost
2yrs data I should go and have a look).

However when first reading your post (again we don't tend to use # to
symbolise number), I read it as series hash, epsiode hash. Now that's
a good idea.

Why not hash the episode description rather than store the full length
text, wouldn't that make a significant impact on previously recorded?

OK, so why shouldn't this be considered?

My database:
Total number of records (SELECT COUNT(*) FROM ...):
   oldrecorded: 3787
   recorded: 189
   recordedmarkup: 1746031 (Yes, that is 1.75M records.)

SELECT MIN(starttime) FROM oldrecorded;  ->  2004-05-03 03:00:00
SELECT NOW();  -> 2005-12-02 13:07:4

So, my database is approximately 19 months old. In more than a year anda half, I've got 3787 records in oldrecorded.


SELECT MAX(LENGTH(description)) FROM oldrecorded;  ->  251

(and this is with DataDirect, which typically has much longer, moredetailed descriptions than other grabbers).

So, let's assume that every description is this same length. That meansI have 950537 bytes of description after 1 1/2 years.


Let's add in title and subtitle:

SELECT MAX(LENGTH(title)) FROM recorded;  ->  39
SELECT MAX(LENGTH(subtitle)) FROM recorded;  ->  58

Once again, assume all titles/subtitles are the as long as the longest,and we have an additional 147693 and 219646 bytes. Adding these up, wehave 1317876 bytes. That's 1.26MiB. OK, so what about the actualstorage requirements (length + 1 for the two VARCHAR types and length +2 for the TEXT type). If we don't assume that's taken care of by thefact that most values are smaller than the max current value, we need toadd an additional 15148 bytes, So, we're up to 1.27MiB...


Looking at the actual files:
# ls -al oldrecorded.*
-rw-rw----  1 mysql mysql 1004848 Dec  2 12:59 oldrecorded.MYD
-rw-rw----  1 mysql mysql  462848 Dec  2 12:59 oldrecorded.MYI
-rw-rw----  1 mysql mysql    9040 Nov  2 16:46 oldrecorded.frm

So, including other fields and the index, we have 1.41MiB (and noticethat we've overestimated the table size by assuming the max length).Now, I could go down the road of comparing sizes when using the lengthof episode ID (12) or the length of a hash, but it all comes downto--even if we save 1.4MiB, have we accomplished anything?

To answer this question, let's look at recordedmarkup. As mentionedabove, I have 189 recordings which give me 1.75 million records inrecordedmarkup. Since recordedmarkup has a relatively simple structure(only one VARCHAR), we can figure the row size using column type storagerequirements:


Field          Type                 Storage Required
-----          ----                 ----------------
chanid         int(10) unsigned      4
starttime      datetime              8
mark           bigint(20)            8
offset         varchar(32)           l+1 (we'll call it 11)
type           int(11)               4
-------        ----                 ----------------
Total                                35

and we get about 100 of these records per minute of an MPEG-2 recording.

From inside the directory containing the mythconverg binary files:
# du -sh
192M    .

# ls -al recordedmarkup.*
-rw-rw----  1 mysql mysql 106883020 Dec  2 13:45 recordedmarkup.MYD
-rw-rw----  1 mysql mysql  69468160 Dec  2 13:45 recordedmarkup.MYI
-rw-rw----  1 mysql mysql      8668 Jul 19  2004 recordedmarkup.frm

So, as Gregorio said, most of the storage (168MiB of 192MiB) required isin the recordedmarkup table. And, there are two ways for you tominimize the size of your recordedmarkup table:

   a) delete your recordings, or
   b) transcode all your recordings to NUV's with MPEG-4

Note that recordedmarkup will be small for users with framegrabbersbecause they use RTJPEG or MPEG-4 instead of MPEG-2, so they don't haveto work to get a small database. However, IMHO, it's not worth theeffort of transcoding even for MPEG-2 users. I have one transcodedrecording out of 189 total and my 192MiB database is the equivalent ofless than 10 min of video even with the terribly low bitrate (2200kbpsaverage) I use.


Mike
_______________________________________________
mythtv-users mailing list
[email protected]
http://mythtv.org/cgi-bin/mailman/listinfo/mythtv-users

Re: [mythtv-users] How big is your database?

Reply via email to