* Smells like some huge LONGTEXTs were INSERTed, then DELETEd. Perhaps just a single one of nearly 500M.
* Yes, there is an impact on full table scans -- it has to step over the empty spots. Or maybe not -- one big cow chip of 500MB would be easy to leap over. * OPTIMIZE TABLE is the primary way to recover the space. It _may_ be that space on the _end_ is automatically recovered. If so, you might see the .MYD shrink even when OPTIMIZE is not run. * LONGTEXT is almost never useful. Do you really think there are thingies that big? Consider changing it to MEDIUMTEXT -- that would truncate any biggies to 16MB. * Smells like a key-value (EAV) schema design. Such is destined to fail when trying to scale. Yeah, you are probably stuck with Drupal. Here are my comments and recommendations on EAV: http://mysql.rjweb.org/doc.php/eav * Please try to find a way in your Email client to display STATUS without losing the spacing. * When you switched to InnoDB, I hope you had innodb_file_per_table turned on. That way, you can actually recoup the space when doing ALTER. Otherwise, you will be stuck with a bloated ibdata1 file that you cannot easily shrink. * In InnoDB, the LONGTEXT will usually be stored separately, thereby making a full table scan relatively efficient. > -----Original Message----- > From: Johan De Meersman [mailto:vegiv...@tuxera.be] > Sent: Friday, February 15, 2013 4:21 AM > To: mysql. > Subject: MyISAM table size vs actual data, and performance > > > > Hey list, > > I've got another peculiar thing going on :-) Let me give you a quick > summary of the situation first: we host a number of Drupal sites, each > site and it's db on separate VMs for reasons that are not important to > this scenario. MySQL is 5.0.51a-24+lenny4-log (Debian); I don't have > the exact Drupal version here but it's likely to be a 5.x branch. > > The easy thing to say would of course be "upgrade your versions", but > that's not an option right now. I don't really care if that means I > have no actual *fix* for the problem - I know how to work around it. > I'm just looking for a cause, ideally maybe even a specific known bug. > Strangely enough, I'm seeing this on three distinct installs; but > others with the same versions and setup (but different sites) seem to > not exhibit the issue. > > So, what I'm seeing is this: Drupal's "variable" table keeps growing, > but there does not seem to be more data. I understand how record > allocation and free space in datafiles works, but this is well beyond > the normal behaviour. > > > http://www.tuxera.be/filestore/heciexohhohj/df-year.png > > As you can see here (the lime green line of /data), growth occurs > gradually (and the issue happened in september, as well), until it > seems to reach a certain point. At some point, however, performance on > that table (notably select * - it's a drupal thing) pretty much > instantly plummets, and the query takes around half a minute to run - > whereas now, after reclaiming the free space, it takes 0.03 seconds. > > I don't have the exact numbers as I wasn't on-site yesterday evening, > but since the disk is 5GB, the reclaimed space yesterday must have been > around 850MB - for a table that is now 30MB. No records were deleted > from the table, the workaround is as simple as "OPTIMIZE TABLE > variable" - simply rebuild the table. The logs make no mention of a > crashed table, so it's very unlikely that this is a borked index. Even > if it were, I wouldn't expect a scan of 30MB in 1202 rows to take half > a minute, on a table that is accessed so often that it's relevant > blocks are bound to be in the filesystem cache. > > The table's structure is fairly simple, too: > > > > CREATE TABLE `variable` ( > `name` varchar(128) NOT NULL DEFAULT '', `value` longtext NOT NULL, > PRIMARY KEY (`name`) > ) ENGINE=MyISAM DEFAULT CHARSET=utf8 > > > > > I currently have another system that's also growing that table, here's > a bit of session: > > > <blockquote> > mysql> show table status like 'variable'; > +----------+--------+---------+------------+------+----------------+--- > ----------+-----------------+--------------+-----------+--------------- > -+---------------------+---------------------+---------------------+--- > --------------+----------+----------------+---------+ > | Name | Engine | Version | Row_format | Rows | Avg_row_length | > | Data_length | Max_data_length | Index_length | Data_free | > | Auto_increment | Create_time | Update_time | Check_time | Collation | > | Checksum | Create_options | Comment | > +----------+--------+---------+------------+------+----------------+--- > ----------+-----------------+--------------+-----------+--------------- > -+---------------------+---------------------+---------------------+--- > --------------+----------+----------------+---------+ > | variable | MyISAM | 10 | Dynamic | 1188 | 795 | 493277732 | > | 281474976710655 | 41984 | 492332716 | NULL | 2011-12-13 16:18:53 | > | 2013-02-15 12:35:18 | 2012-10-17 15:45:11 | utf8_general_ci | NULL | > | > | | > +----------+--------+---------+------------+------+----------------+--- > ----------+-----------------+--------------+-----------+--------------- > -+---------------------+---------------------+---------------------+--- > --------------+----------+----------------+---------+ > > 12:36:55|root@xendbprod1-1:/data/mysql 0 # mysqlcheck --verbose DBNAME > variable # Connecting to localhost... > DBBAME.variable OK > # Disconnecting from localhost... > > 12:37:07|root@xendbprod1-1:/data/mysql 0 # mysqlcheck --verbose -g > DBNAME variable # Connecting to localhost... > DBNAME.variable OK > # Disconnecting from localhost... > > mysql> show table status where name like "variable"; > +----------+--------+---------+------------+------+----------------+--- > ----------+-----------------+--------------+-----------+--------------- > -+---------------------+---------------------+---------------------+--- > --------------+----------+----------------+---------+ > | Name | Engine | Version | Row_format | Rows | Avg_row_length | > | Data_length | Max_data_length | Index_length | Data_free | > | Auto_increment | Create_time | Update_time | Check_time | Collation | > | Checksum | Create_options | Comment | > +----------+--------+---------+------------+------+----------------+--- > ----------+-----------------+--------------+-----------+--------------- > -+---------------------+---------------------+---------------------+--- > --------------+----------+----------------+---------+ > | variable | MyISAM | 10 | Dynamic | 1188 | 497 | 493277732 | > | 281474976710655 | 41984 | 492686616 | NULL | 2011-12-13 16:18:53 | > | 2013-02-15 12:37:35 | 2013-02-15 12:37:07 | utf8_general_ci | NULL | > | > | | > +----------+--------+---------+------------+------+----------------+--- > ----------+-----------------+--------------+-----------+--------------- > -+---------------------+---------------------+---------------------+--- > --------------+----------+----------------+---------+ > 1 row in set (0.00 sec) > > mysql> optimize table variable; > +-----------------+----------+----------+----------+ > | Table | Op | Msg_type | Msg_text | > +-----------------+----------+----------+----------+ > | DBNAME.variable | optimize | status | OK | > +-----------------+----------+----------+----------+ > 1 row in set (2.37 sec) > > mysql> show table status where name like "variable"; > +----------+--------+---------+------------+------+----------------+--- > ----------+-----------------+--------------+-----------+--------------- > -+---------------------+---------------------+---------------------+--- > --------------+----------+----------------+---------+ > | Name | Engine | Version | Row_format | Rows | Avg_row_length | > | Data_length | Max_data_length | Index_length | Data_free | > | Auto_increment | Create_time | Update_time | Check_time | Collation | > | Checksum | Create_options | Comment | > +----------+--------+---------+------------+------+----------------+--- > ----------+-----------------+--------------+-----------+--------------- > -+---------------------+---------------------+---------------------+--- > --------------+----------+----------------+---------+ > | variable | MyISAM | 10 | Dynamic | 1188 | 497 | 590940 | > | 281474976710655 | 39936 | 0 | NULL | 2011-12-13 16:18:53 | 2013-02-15 > | 12:39:30 | 2013-02-15 12:39:30 | utf8_general_ci | NULL | | | > +----------+--------+---------+------------+------+----------------+--- > ----------+-----------------+--------------+-----------+--------------- > -+---------------------+---------------------+---------------------+--- > --------------+----------+----------------+---------+ > 1 row in set (0.00 sec) > > </blockquote> > > > As you can see, the table is almost 500MB. Running mysqlcheck shows > that it is fine and not in need of a format upgrade. Optimizing the > table, however, results in it being reduced to half an MB... At any > point in this cycle, however, no excessive amount of data is inserted > in the table. Insert, updates and deletes are fairly common, but not to > the point where there would have been a couple of hundred MB of data in > the file. Had I left this one for some more time, I'm sure that it > would eventually also have reached the tipping point where the side > suddenly becomes unusably slow. > > The performance degradation isn't perfectly traceable to a single point > in time; the slowlog does show that query being slow on occasion; > however it seems that it is intermittent until it reaches a point of no > return, when the queries get slow enough that a cascade of pending > connections happens until we run out of free handles and the site just > stops responding. > > > Now, in my understanding, the size of the file, while unusual, really > shouldn't have much bearing on the execution time of a full table scan, > should it? I mean, even a full tablescan is simply going to scan the > row index and only read the blocks that are actually in use? Hell, even > IF every single row would somehow have ended up in a different block > (which the allocation alghorithm should have prevented), that's still > only slightly over a thousand blocks, or half a meg. > > Thus, my question to the smart people on this list is threefold: > > 1) Has anyone seen this behaviour before, and maybe know of a specific > bug? > 2) What might cause the file to grow this large? > 3) How could the size of the file impact the speed of a full tablescan > if only a few blocks in the file are actually in use? > > Am I missing something obvious, here? > > I have now updated those tables to InnoDB, time will tell if the issue > reasserts itself. In the mean time, though I would very much like to > understand what's going on. > > > Thank you for your thoughts, > /johan > > > -- > > What's tiny and yellow and very, very dangerous? > A canary with the root password.