Well, it is not live indexing - I know that slows things down a lot. :)
This is the first time I've used the MySQL Python connector, and I
based my script on an example [1].
I think the bottleneck may be calling commit() after each edition
record; my hard disk is writing almost continuously. I also issue a
separate insert statement for each contributor, publisher, identifier,
etc. Dynamically building insert statements that cover all of these at
once should beat making that many separate database calls.
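Something like the rough sketch below is what I have in mind (the
connection settings, table layout and column names are all made up,
just to show the shape): collect rows while parsing, send them with
executemany(), and commit once per batch instead of once per record.

    import mysql.connector

    # Placeholder connection settings and schema.
    conn = mysql.connector.connect(user='ol', password='secret',
                                   database='openlibrary')
    cur = conn.cursor()

    BATCH_SIZE = 1000
    edition_rows = []    # (ol_key, title) tuples collected while parsing
    publisher_rows = []  # (edition_key, publisher_name) tuples

    def flush():
        # One multi-row statement per table, one commit for the whole batch.
        if edition_rows:
            cur.executemany(
                "INSERT INTO editions (ol_key, title) VALUES (%s, %s)",
                edition_rows)
        if publisher_rows:
            cur.executemany(
                "INSERT INTO publishers (edition_key, name) VALUES (%s, %s)",
                publisher_rows)
        conn.commit()
        del edition_rows[:]
        del publisher_rows[:]

    # Inside the dump-reading loop:
    #     edition_rows.append((key, title))
    #     publisher_rows.extend((key, p) for p in publishers)
    #     if len(edition_rows) >= BATCH_SIZE:
    #         flush()
    # and one final flush() after the loop.

As far as I can tell, the connector turns executemany() on an INSERT
into a single multi-row INSERT, so that should also cut down the number
of round trips - but I haven't measured it yet.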

I'll look into the MySQL dump format.
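If I go that route, I suppose the choice is between writing one big
file of extended (multi-row) INSERTs like mysqldump does, or skipping
SQL text altogether and using LOAD DATA LOCAL INFILE on a tab-separated
file, with the index work pushed to the end. A rough sketch of the
second variant (parse_editions() stands in for my own dump reader, and
the table and connection details are again made up):

    import csv
    import mysql.connector

    # Write the parsed editions to a tab-separated file first.
    # (Real data would need tabs/newlines in the values escaped.)
    with open('editions.tsv', 'w') as f:
        writer = csv.writer(f, delimiter='\t')
        for key, title in parse_editions('ol_dump_editions.txt'):
            writer.writerow([key, title])

    # Then hand the whole file to MySQL in one go. LOCAL infile support
    # may need to be enabled on both the connector and the server.
    conn = mysql.connector.connect(user='ol', password='secret',
                                   database='openlibrary',
                                   allow_local_infile=True)
    cur = conn.cursor()
    cur.execute("ALTER TABLE editions DISABLE KEYS")   # no index updates during the load
    cur.execute("LOAD DATA LOCAL INFILE 'editions.tsv' INTO TABLE editions "
                "FIELDS TERMINATED BY '\\t'")
    cur.execute("ALTER TABLE editions ENABLE KEYS")    # rebuild the indexes once
    conn.commit()

From what I read, DISABLE KEYS only really helps with MyISAM tables;
for InnoDB it's more or less a no-op, so there the equivalent would be
dropping and recreating the secondary indexes around the load.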

Ben

On 2 September 2013 00:10, Tom Morris <[email protected]> wrote:
> On Sun, Sep 1, 2013 at 5:42 PM, Ben Companjen <[email protected]>
> wrote:
>>
>>
>> I created a Python script that reads a dump file and puts the edition
>> records in a MySQL database.
>>
>> It works (when you manually create the tables), but it's very slow:
>> 10000 records in about an hour, which means all editions will take
>> about 10 days of continuous operation.
>>
>> Does anybody have a faster way? Is there some script for this in the
>> repository?
>
>
> Are you creating the SQL by hand?  You probably want to be emulating the
> format used by the MySQL dump utility.  That'll make sure it gets loaded in
> quickly.
>
> In particular, I suspect it's probably doing live indexing.  What you want
> to do is load all the data and then index at the end rather than indexing as
> you go.
>
> Tom
>