If you load enough triples into a triple store, the loading process will eventually slow down. There are two ways to handle this: (1) improve the performance of the triple store, or (2) load fewer triples.
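As a rough illustration of option (2), here is a minimal Python sketch of a pre-load filter that drops triples whose predicate only restates something else in the dump. It is not how infovore does it (that is a Map/Reduce job). It assumes the input is N-Triples-style text with one whitespace-separated triple per line, and that ns:type.object.type is spelled with the full IRI shown. The names REDUNDANT_PREDICATES, keep_triple and filter_triples are made up for this example.

```python
# Predicates whose triples restate facts that are already in the dump.
# Assumption: ns:type.object.type appears as this full IRI; add the reverse
# predicates you want to drop after checking them against your own dump.
REDUNDANT_PREDICATES = frozenset({
    "<http://rdf.freebase.com/ns/type.object.type>",
})


def keep_triple(line, redundant=REDUNDANT_PREDICATES):
    """Return True if this N-Triples line should be passed on to the loader."""
    parts = line.split(None, 2)  # subject, predicate, rest of the line
    if len(parts) < 3:
        return False  # blank or malformed line
    if parts[0].startswith("#"):
        return False  # comment line
    return parts[1] not in redundant


def filter_triples(lines, redundant=REDUNDANT_PREDICATES):
    """Yield the input lines unchanged, minus the redundant ones."""
    for line in lines:
        if keep_triple(line, redundant):
            yield line
```

To use it, iterate over filter_triples(open_dump_file) and write the surviving lines to the chunk files you hand to the bulk loader. It streams line by line, so memory use stays flat no matter how big the dump is.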

Freebase contains a lot of repetitive information. For instance, around 10% of the facts in Freebase are "a" statements, but Freebase also restates these using the ns:type.object.type predicate and also states the reverse predicates. If you remove the duplicate facts, you can get rid of 20% of the facts just like that.

My infovore framework uses Map/Reduce to make a purified extract of Freebase, and it runs quickly and reliably enough that I can run it against the Freebase dump every week:

http://basekb.com/

Recently I loaded a practically complete copy of Freebase into Virtuoso on a machine with 32GB of RAM:

https://groups.google.com/forum/#!topic/infovore-basekb/m7FL5nqVDbI

Load time was around four hours.

On Wed, Mar 5, 2014 at 3:56 PM, Gopala Krishna Koduri <gopala.kod...@gmail.com> wrote:
> Hi,
>
> I'm in the process of importing the freebase into a local instance of
> virtuoso 7. I've split the freebase dump into chunks of 10 million triples
> each.
>
> I'm running two instances of bulk loaders on a quadcore machine with 48 GB
> of memory (set 4500000 buffers and 3300000 dirty buffers in virtuoso.ini).
> As the loading progressed, it got slower and slower. I tried halting the
> process, creating a checkpoint and resuming it again. But it did not seem to
> help (even restarting the virtuoso instance did not). The remaining few
> million triples are taking forever to load.
>
> Did I miss any performance tuning that can improve the process? Or is this
> the normal behaviour?
>
> thanks,
>
> --
> Koduri Gopala Krishna,
> Music Technology Group, UPF - Barcelona, Spain.
>
> Portfolio - http://tidbits.co.in
> తెలుగువారికి సాంకేతిక సహాయం - http://techsetu.com
>
> _______________________________________________
> Virtuoso-users mailing list
> Virtuoso-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/virtuoso-users

--
Paul Houle
Expert on Freebase, DBpedia, Hadoop and RDF
(607) 539 6254
paul.houle on Skype
ontolo...@gmail.com