Nathan Schmidt wrote:
Dormando,
The most intense performance hits for us have come from losing storage
nodes and the resulting rebalance operations putting a lot of
read/write load on the tables. Worst case we've seen was almost 20
hours of thrashing. Not really user-affecting but it did peg a lot of
ops charts.
We have huge storage hosts (bad idea, I know). So losing an entire host,
while infrequent, can cause thrashing of up to a week. Lowering the
replicate workers prevents it from actually hurting anything, but it
does run full tilt for a while. We have some ideas floating around to
make this less painful / adjustable. Probably won't do them for a while
though :\
We're far from optimal - using UTF-8 collations/charsets which seems
to unconditionally add a great deal of overhead to index sizes. Our
initial bring-up was on a clean debian box so we went with MyISAM,
which does have its problems - index corruption, O(n) repairs, ugh.
Functional but non-optimal. Good point about the optimize table
routine, we probably haven't done that for a while.
Oh man :)
Switch to InnoDB as soon as possible! The access patterns were designed
for InnoDB, and you're very likely to get lock thrashing under MyISAM.
MogileFS also counts on the database being fairly reliable.
Get a box with a decent RAID card with a battery-backed write cache, get
more RAM, and switch to InnoDB :) It should be way faster even if the
data size jumps a bit during the switch. OPTIMIZE TABLE still applies.
We do a lot of list_keys operations when our caches are cold because
we're using mogile as our primary datastore for wiki page revisions -
we didn't really anticipate the rest of our infrastructure handling
this kind of data size so well so mogile's been left alone while the
rest of our system has become much more efficient and
well-distributed. We're on all-commodity hardware, and those boxes max
out at 8GB each, which puts a ceiling on where we can go without
moving to some substantially more expensive systems. Our file.MYI and
file_on.MYI are 8.5GB together, which means MySQL is doing some
interesting tricks to keep things running smoothly. That said, we're
getting ready to bite the bullet and bring in somewhat more
substantial boxes for our tracker dbs, oh well.
I think 16-32GB is still commodity (see Ask's response)...
I don't really believe you lose out once the index is larger than RAM.
Especially with MyISAM files, you only lose out when your key_reads
divided by key_read_requests rises noticeably above zero. The bulk of
your data is unlikely to be active, so the entire index never needs to
be cached in RAM. Under InnoDB you give both the index _and_ data
caches a large sum of memory, and both are LRUs.
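That key_reads / key_read_requests ratio comes from the Key_reads and Key_read_requests counters in SHOW GLOBAL STATUS; a tiny helper (the counter values below are made up for illustration) sketches the arithmetic:

```python
def key_cache_miss_ratio(key_reads: int, key_read_requests: int) -> float:
    """Fraction of index-block reads that missed the MyISAM key cache
    and had to go to disk: Key_reads / Key_read_requests."""
    if key_read_requests == 0:
        return 0.0  # no index reads yet, nothing to measure
    return key_reads / key_read_requests

# Example counters as reported by SHOW GLOBAL STATUS (made-up numbers):
ratio = key_cache_miss_ratio(key_reads=12_000, key_read_requests=6_000_000)
print(f"miss ratio: {ratio:.4f}")  # ~0.002, i.e. effectively "nothing"
```

As long as that number stays near zero, the hot part of the index is resident even though the whole index file is much bigger than RAM.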
So while my MogileFS DBs sometimes have 50GB+ datafiles, 8-16GB of RAM
is plenty. The writeback cache is a gigantic win due to all of the
scratch writes MogileFS does. Also switch to the deadline elevator for
the I/O scheduler... The rest is read cache, and you're caching hot
paths :)
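For reference, on Linux the elevator can be switched per-device at runtime via sysfs (the device name sda is an assumption, and which schedulers are available varies by kernel):

```shell
# See which elevators the kernel offers for this disk (active one is bracketed):
cat /sys/block/sda/queue/scheduler

# Switch the device holding the MySQL datadir to deadline (as root):
echo deadline > /sys/block/sda/queue/scheduler
```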
-Dormando