Yes, the rows are in primary key order; however, each row contains a specific integer primary key. I'm not inserting NULLs into a table whose primary key is auto-increment, so I don't see why concurrent inserts would fight over similar spots (although I'm admittedly not a MySQL hotshot, so my assumption rests on a *hunch* only).
I'm not sure (yet) whether a single-threaded operation would run into an I/O bottleneck. I haven't run mysqlimport with --use-threads=1 yet (I will if I have the time), but when I ran it with --use-threads=4, the import (of a ~500 MB dump) took more time than running four different processes (I split my tab-delimited dumps with split into four even pieces and imported those in four different sessions).

Anyway, it seems that doing a simple import (from a dump that isn't tab-delimited but contains complete or extended inserts) takes the same amount of time as running mysqlimport with --use-threads=4. As it turns out, splitting my tab-delimited dump is too complex to handle gracefully, because my data contains newline characters all over the place, so I've dropped the whole mysqlimport idea for now. (I'll try the method of migrating an InnoDB database to an NDBCluster described here[1] instead.)

If I have the time, I'll write up a bug report or a documentation enhancement request for this.

Thanks for the input!

Regards,
Kohányi Róbert

[1]: http://johanandersson.blogspot.se/2012/04/mysql-cluster-how-to-load-it-with-data.html

On Wed, Jul 25, 2012 at 6:49 PM, Rick James <rja...@yahoo-inc.com> wrote:
> I'm skeptical that use-threads can ever be very effective.
>
> What order are the rows in? They are probably in PRIMARY KEY order, which
> means that the INSERTing threads will be fighting over similar spots in the
> table.
>
> Is it I/O bound when it is single-threaded? If so, then there can't be any
> improvement with use-threads.
>
> etc.
>
> Suggest you file a bug with bugs.mysql.com. If nothing else, the
> documentation should say more than it does.
>
>> -----Original Message-----
>> From: Róbert Kohányi [mailto:kohanyi.rob...@gmail.com]
>> Sent: Tuesday, July 24, 2012 10:52 AM
>> To: mysql@lists.mysql.com
>> Subject: mysqlimport --use-threads / mysqladmin processlist
>>
>> I'm in the middle of migrating an InnoDB database to an NDBCluster. I
>> use mysqldump to first create two dumps: the first one contains only
>> the database schema, the second one contains only tab-delimited data
>> (via mysqldump --tab). I edit my InnoDB schema here and there
>> (ENGINE=InnoDB to ENGINE=NDB, etc.), import it, and after this I import
>> the InnoDB data *as is* using mysqlimport.
>>
>> I use it like this:
>>
>> mysqlimport --local --use-threads=4 db dir/*.txt
>>
>> (dir of course contains the tab-delimited data I dumped before.)
>>
>> The import starts, and I check its progress via mysqladmin, like this:
>>
>> mysqladmin --sleep=1 processlist
>>
>> This is what I see: http://pastebin.com/raw.php?i=M23fWVjc
>>
>> Only a single process seems to be loading my data. Is this what I
>> *should* see, or, in my case using 4 threads, should I see four
>> processes? I'm not asking which one will be faster; I'm just simply
>> confused because I don't know what to expect. If I start four different
>> mysqlimport processes, each one importing different files, then I can
>> see four different processes in the mysql processlist.
>>
>> If it matters, here is my server version (I use the official
>> binaries).
>> Server version: 5.5.25a-ndb-7.2.7-gpl MySQL Cluster Community Server
>> (GPL)
>>
>> Regards,
>> Kohányi Róbert
>>
>> --
>> MySQL General Mailing List
>> For list archives: http://lists.mysql.com/mysql
>> To unsubscribe: http://lists.mysql.com/mysql
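On the point in the reply that splitting a tab-delimited dump breaks down when the data contains newlines: below is a minimal Python sketch of a record-aware split. It assumes the files were written with the mysqldump --tab defaults (fields escaped by a backslash, lines terminated by a line feed), in which case a newline inside a value should appear as a backslash followed by a real newline. The names iter_records and split_records are hypothetical, and the sketch has not been run against the dump discussed in this thread.

```python
def iter_records(lines):
    # Yield complete records from an iterable of physical lines.
    # Assumption: default mysqldump --tab escaping, so a newline inside a
    # value is written as a backslash followed by a real line feed.
    buf = []
    for line in lines:
        buf.append(line)
        body = line[:-1] if line.endswith("\n") else line
        trailing = len(body) - len(body.rstrip("\\"))
        # An odd number of trailing backslashes means the line feed is
        # escaped data, so the record continues on the next physical line.
        if trailing % 2 == 0:
            yield "".join(buf)
            buf = []
    if buf:
        # Input ended in the middle of an escaped newline; emit what is left.
        yield "".join(buf)


def split_records(lines, parts):
    # Distribute whole records round-robin into `parts` lists.
    chunks = [[] for _ in range(parts)]
    for i, record in enumerate(iter_records(lines)):
        chunks[i % parts].append(record)
    return chunks
```

When reading real files, opening them with newline="\n" keeps Python from translating carriage returns that are part of the data. Round-robin distribution is chosen here only for simplicity: it interleaves primary keys across the pieces rather than producing four contiguous ranges the way split does.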