If you load enough triples into a triple store, the loading process
will eventually slow down.  There are two ways to handle this: (1)
improve the performance of the triple store, or (2) load fewer
triples.

Freebase contains a lot of repetitive information.  For instance,
around 10% of the facts in Freebase are "a" (rdf:type) statements,  but
Freebase also restates these using the ns:type.object.type predicate
and states the reverse predicates as well.  If you remove the duplicate
facts,  you can get rid of about 20% of the facts just like that.
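
To illustrate the kind of filtering described above, here is a minimal
sketch in Python.  It is not the infovore code.  It assumes the dump is
tab-separated with prefixed names (subject, predicate, object, "."),
that ns:type.type.instance is the reverse predicate in question, and
that every triple using a listed predicate really does have an "a"
counterpart.  The function name drop_redundant and the sample triples
are made up for the example.

```python
# Predicates assumed to restate "a" (rdf:type) facts. Only the first one is
# named in the text above; the second is an assumption about the reverse
# predicate. Adjust this set for the dump you actually have.
REDUNDANT_PREDICATES = frozenset({
    "ns:type.object.type",
    "ns:type.type.instance",  # assumed reverse predicate
})


def drop_redundant(lines, redundant=REDUNDANT_PREDICATES):
    """Yield only the lines whose predicate is not in `redundant`.

    Each line is assumed to be tab-separated: subject, predicate, object, '.'.
    Lines with fewer than three fields are passed through unchanged.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 3 and fields[1] in redundant:
            continue  # skip a fact that restates an "a" statement
        yield line


# Hypothetical sample data in the assumed layout.
sample = [
    "ns:m.0abc\ta\tns:people.person\t.\n",
    "ns:m.0abc\tns:type.object.type\tns:people.person\t.\n",
    "ns:people.person\tns:type.type.instance\tns:m.0abc\t.\n",
    "ns:m.0abc\tns:type.object.name\t\"Example\"@en\t.\n",
]
kept = list(drop_redundant(sample))
```

Because the filter looks at one line at a time, it can run as a
streaming pass over each chunk of the dump before the chunk is handed to
the bulk loader.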

My infovore framework uses Map/Reduce to make a purified extract of
Freebase,  and it runs quickly and reliably enough that I can run it
against the Freebase dump every week:

http://basekb.com/

Recently I loaded a practically complete copy of Freebase into
Virtuoso on a machine with 32GB of RAM:

https://groups.google.com/forum/#!topic/infovore-basekb/m7FL5nqVDbI

Load time was around four hours.

On Wed, Mar 5, 2014 at 3:56 PM, Gopala Krishna Koduri
<gopala.kod...@gmail.com> wrote:
> Hi,
>
> I'm in the process of importing the freebase into a local instance of
> virtuoso 7. I've split the freebase dump into chunks of 10 million triples
> each.
>
> I'm running two instances of bulk loaders on a quadcore machine with 48 GB
> of memory (set 4500000 buffers and 3300000 dirty buffers in virtuoso.ini).
> As the loading progressed, it got slower and slower. I tried halting the
> process, creating a checkpoint and resuming it again. But it did not seem to
> help (even restarting the virtuoso instance did not). The remaining few
> million triples are taking forever to load.
>
> Did I miss any performance tuning that can improve the process? Or is this
> the normal behaviour?
>
> thanks,
>
> --
> Koduri Gopala Krishna,
> Music Technology Group, UPF - Barcelona, Spain.
>
> Portfolio - http://tidbits.co.in
> తెలుగువారికి సాంకేతిక సహాయం - http://techsetu.com
>
>
> _______________________________________________
> Virtuoso-users mailing list
> Virtuoso-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/virtuoso-users
>



-- 
Paul Houle
Expert on Freebase, DBpedia, Hadoop and RDF
(607) 539 6254    paul.houle on Skype   ontolo...@gmail.com
