Hi, [Ankur]: How can make sure this happens? -- show processlist is how we spot it... literally it takes hours on our set up so easy to find.
So we ended up with 2 DBs - DB1 we insert to, prepare and do batch processing - DB2 serving the read only web app Periodically we dump the DB1, point the read-only web app at DB1, load the dump into DB2, and then DB2 becomes the write DB... and an endless cycle. We can afford data latency in our system though. Why did we end up with this? Because of locking on writes that kill reads as you say... basically you can't insert when a read is happening on myisam as it locks the whole table. InnoDB has row level locking to get around this but in our experience (at the time we had 130million records) it just didn't work either. We spent €10,000 for the supposed "european expert" on mysql from their professional services and were unfortunately very disappointed. Seems such large tables are just problematic with mysql. We are now very much looking into Lucene technologies for search and Hadoop for reporting and datamining type operations. SOLR does a lot of what our DB does for us. So with myisam... here is what we learnt: Only very latest mysql versions (beta still I think) support more than 4G memory for indexes (you really really need the index in memory, and where possible the FK for joins in the index too). Mysql has differing join strategies between innoDB and myisam, so be aware. An undocumented feature of myisam is you can create memory buffers for single indexes: In the my.cnf: taxon_concept_cache.key_buffer_size=3990M -- for some reason you have to drop a little under 4G then in the DB run: cache index taxon_concept in taxon_concept_cache; load index into cache taxon_concept; This allows for making sure an index gets into memory for sure. And here is our main cnf changes: table_cache = 512 sort_buffer_size = 128M read_buffer_size = 64M read_rnd_buffer_size = 300M thread_cache_size = 8 query_cache_type = 0 query_cache_size = 0 thread_concurrency = 8 # this is used for repair by sorting and is faster than keycache by far myisam_max_sort_file_size = 200G myisam_max_extra_sort_file_size = 200G # this is used for repair by sorting and is limited to 4G myisam_sort_buffer_size = 3990M # maximum value for this is 4g for now but see section on specific keys for large indexes key_buffer_size=3990M tmp_table_size=512M max_heap_table_size=512M -- for repair by sorting, here are the required values myisam_max_sort_file_size = 200G myisam_max_extra_sort_file_size = 200G myisam_sort_buffer_size = 3990M I guess just archive this mail off for if you ever do try myisam for something... these params are the result of a lot of trial and error and days of waiting! Our DB has about 44G of index which we can keep in memory but are hitting limits - hence on the hadoop train. Cheers Tim On Tue, Apr 28, 2009 at 2:49 PM, Wang Zhong <wangzhong....@gmail.com> wrote: > Hi, > > That's really cool. It seems that Hadoop could work with SQL DBs like > Mysql with bearable time. I thought when inserting data to Mysql, the > expense of communication was always a big problem. You got a method to > reduce the expense. > > Using Distribution Databases like HBase is another good choice. It is > based on Hadoop and HDFS, so there is no problem with communication > expense. > > > > On Tue, Apr 28, 2009 at 6:46 PM, Ankur Goel <ankur.g...@corp.aol.com> wrote: >> >> hello hadoop users, >> Recently I had a chance to lead a team building a log-processing system that >> uses Hadoop and MySQL. The system's goal was to process the incoming >> information as quickly as possible (real time or near real time), and make >> it available for querying in MySQL. I thought it would be good to share the >> experience and the challenges with the community. Couldn't think of a better >> place than these mailing lists as I am not much of a blogger :-) >> >> The information flow in the system looks something like >> >> [Apache-Servers] -> [Hadoop] -> [MySQL-shards] -> [Query-Tools] >> >> Transferring from Apache-Servers to Hadoop was quite easy as we just had to >> organize the data in timely buckets (directories). Once that was running >> smooth we had to make sure that map-reduce jobs are fired at regular >> intervals and they pick up the right data. The jobs would then >> process/aggregate the date and dump the info into MySQL shards from the >> reducers [we have our own DB partioning set up]. This is where we hit major >> bottlenecks [any surprises? :-)] >> >> The table engine used was InnoDB as there was a need for fast replication >> and writes but only moderate reads (should eventually support high read >> rates). The data would take up quite a while to load completely far away >> from being near-real time. And so our optimization journey begin. >> >> 1. We tried to optimize/tune InnoDB parameters like increasing the buffer >> pool size to 75 % of available RAM. This helped but only till the time DBs >> were lightly loaded i.e. innoDB had sufficient buffer pool to host the data >> and indexes. >> >> 2. We also realized that InnoDB has considerable locking overhead because of >> which write concurrency is really bad when you have a large number of >> concurrent threads doing writes. The default thread concurrency for us was >> set to no_of_cpu * 2 = 8 which is what the official documentation advises as >> the optimal limit. So we limited the number of reduce tasks and consequently >> the number of concurrent writes and boy the performance improved 4x. We were >> almost there :-) >> >> 3. Next thing we tried is the standard DB optimzation techniques like >> de-normalizing the schema and dropping constraints. This gave only a minor >> performance improvement, nothing earth shattering. Note that we were already >> caching connections in reducers to each MySQL shard and partionining logic >> was embedded into reducers. >> >> 4. Falling still short of our performance objectives, we finally we decided >> to get rid of JDBC writes from reducers and work on an alternative that uses >> MySQLs LOAD utility. >> - The processing would partition the data into MySQL shard specific files >> resident in HDFS. >> - A script would then spawn processes via ssh on different physical machines >> to download this data. >> - Each spawned process just downloads the data for the shard it should >> upload to. >> - All the processes then start uploading data in parallel into their >> respective MySQL shards using LOAD DATA infile. >> >> This proved to be the fastest approach, even in the wake of increasing data >> loads. The enitre processing/loading would complete in less than 6 min. The >> system has been holding up quite well so far, even though we've had to limit >> the number of days for which we keep the data or else the MySQLs get >> overwhelmed. >> >> Hope this is helpful to people. >> >> Regards >> -Ankur >> > > > > -- > Wang Zhong >