Well, it's not live indexing - I know that slows things down a lot. :) This is the first time I've used the MySQL Python connector, and I based my script on an example [1]. I think the bottleneck may be calling commit() after each edition record: my hard disk is writing almost continuously. I also use a separate INSERT statement for each contributor, publisher, identifier, etc. Dynamically building INSERT statements that combine all of these into a few batches would probably beat the many separate database calls.
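
Something like this is what I have in mind - just a rough sketch, not my actual script; the connection details, table/column names and the read_dump() parser are placeholders:

    import mysql.connector

    cnx = mysql.connector.connect(user='ol', password='secret', database='openlibrary')
    cur = cnx.cursor()

    BATCH_SIZE = 1000
    edition_rows = []      # (ol_key, title) tuples collected from the dump
    contributor_rows = []  # (edition_key, name) tuples

    def flush():
        # executemany() folds the rows into multi-row INSERTs, and one
        # commit() covers the whole batch instead of one per record.
        cur.executemany("INSERT INTO editions (ol_key, title) VALUES (%s, %s)",
                        edition_rows)
        cur.executemany("INSERT INTO contributors (edition_key, name) VALUES (%s, %s)",
                        contributor_rows)
        cnx.commit()
        del edition_rows[:], contributor_rows[:]

    # read_dump() stands in for whatever parses the dump file into (key, record) pairs
    for key, record in read_dump('ol_dump_editions.txt'):
        edition_rows.append((key, record.get('title')))
        for name in record.get('contributors', []):
            contributor_rows.append((key, name))
        if len(edition_rows) >= BATCH_SIZE:
            flush()

    flush()  # write the final partial batch
    cur.close()
    cnx.close()

That way each batch of 1000 editions costs a couple of statements and a single commit instead of thousands of each.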
I'll look into the MySQL dump format.

Ben

On 2 September 2013 00:10, Tom Morris <[email protected]> wrote:
> On Sun, Sep 1, 2013 at 5:42 PM, Ben Companjen <[email protected]> wrote:
>>
>> I created a Python script that reads a dump file and puts the edition
>> records in a MySQL database.
>>
>> It works (when you manually create the tables), but it's very slow:
>> 10000 records in about an hour, which means all editions will take
>> about 10 days of continuous operation.
>>
>> Does anybody have a faster way? Is there some script for this in the
>> repository?
>
> Are you creating the SQL by hand? You probably want to be emulating the
> format used by the MySQL dump utility. That'll make sure it gets loaded
> in quickly.
>
> In particular, I suspect it's probably doing live indexing. What you want
> to do is load all the data and then index at the end rather than indexing
> as you go.
>
> Tom
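
PS: if I understand the indexing suggestion correctly, deferring it could look roughly like this with the same connector - assuming a MyISAM table called `editions` (a placeholder name; InnoDB ignores DISABLE KEYS, so there you'd drop and re-create the secondary indexes instead):

    import mysql.connector

    cnx = mysql.connector.connect(user='ol', password='secret', database='openlibrary')
    cur = cnx.cursor()

    cur.execute("ALTER TABLE editions DISABLE KEYS")   # stop index maintenance during the load
    # ... bulk-load the rows here, e.g. with batched executemany() calls
    # or LOAD DATA LOCAL INFILE pointing at a tab-separated file ...
    cur.execute("ALTER TABLE editions ENABLE KEYS")    # rebuild the indexes once, at the end

    cnx.commit()
    cur.close()
    cnx.close()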
