Hi all, for those of you who want to run the DBpedia extraction locally, here are some tips on how to import a Wikipedia dump:
1. Configure your MySQL server! That's very important. If you have enough RAM, use it. Find my MySQL config below.

2. Use two hard disks: one for the MySQL data, one for the Wikipedia dumps.

3. Use a standalone machine. The Wikipedia import puts a lot of load on the hard disk and CPU. I used to use one of our application servers, which already had some load on it, and the import took weeks.

4. Defragment your hard disks. That can save some time.

5. Configure the DBpedia import script. If you're not running a server OS, remove the "-server" flag in the java mwdumper call. (OK, this is not a performance tip, just a note for getting it working.)

On my workstation (Intel quad-core 2.66 GHz, 8 GB RAM, Vista 64-bit, two 10k RPM hard disks), the Wikipedia import took around a very decent 6 hours.

Cheers,
Georgi

MySQL config:

key_buffer = 1024M
max_allowed_packet = 32M
table_cache = 256
sort_buffer_size = 512M
net_buffer_length = 8M
read_buffer_size = 64M
read_rnd_buffer_size = 64M
myisam_sort_buffer_size = 512M

--
Georgi Kobilarov
Freie Universität Berlin
www.georgikobilarov.com
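For tip 1, the variables listed in the config would normally go in the [mysqld] section of your MySQL option file. A sketch of what that fragment could look like (the section name and file location are assumptions on my part; the values are the ones from the post):

```ini
# Sketch of a my.cnf / my.ini fragment with the settings above.
# File location varies by platform (e.g. /etc/my.cnf on Linux);
# restart mysqld after changing these.
[mysqld]
key_buffer              = 1024M
max_allowed_packet      = 32M
table_cache             = 256
sort_buffer_size        = 512M
net_buffer_length       = 8M
read_buffer_size        = 64M
read_rnd_buffer_size    = 64M
myisam_sort_buffer_size = 512M
```

The large key_buffer and myisam_sort_buffer_size are what make the MyISAM index builds during the import fast, which is why having plenty of RAM matters here.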
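For tip 5, if you want to see what the import script is doing (or run mwdumper by hand), the invocation looks roughly like this. The dump file name, database name, and user below are placeholders, not values from the post, and the --format schema version depends on your MediaWiki installation:

```shell
# Rough sketch of a manual mwdumper import; dump file, db name,
# and user are placeholders. Drop the -server flag if your JVM
# doesn't support it (tip 5 above).
java -server -jar mwdumper.jar --format=sql:1.5 \
    enwiki-latest-pages-articles.xml.bz2 \
  | mysql -u wikiuser -p wikidb
```

mwdumper streams the XML dump and converts it to SQL INSERT statements on the fly, so piping it straight into mysql avoids writing a huge intermediate SQL file to disk, which is also why putting the dump and the MySQL data on separate disks (tip 2) helps.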