Hi Rob,

I might be wrong here, or there might be some things that could be done
better, but this is my experience with loading datasets into Virtuoso.

My load took close to a day. In case it is a useful indication, my total
Virtuoso db size came to 12GB; I did not load some of the datasets you
loaded, and I only loaded the en set. From my own estimates, I figure that
a full load of multiple languages on my machine is going to take a *long*
time.

I saw the same thing: load times increased almost exponentially the more
datasets I loaded. The configuration settings (your first link) definitely
improved the loading performance for me. My load would have gone on
for a couple of days without those configuration settings.

By the way, I loaded mine onto a Core Duo (1st-gen Intel) MacBook Pro
running at 2.16GHz with 2GB of RAM, not too different from your
specs. The hard disk on your server should be faster, so you'll probably
get faster queries. Virtuoso runs very well for what I'm using it for (an
art project that requires simple SPARQL queries at intervals of 30 seconds
or so).

However, on top of running the Virtuoso server I'm also running more
CPU-intensive Java applets that send my cooling fans into a frenzy. I
haven't had a chance to monitor CPU load yet, but even with the extra load
the queries perform satisfactorily. I'm sending the queries via POST from
my Java app to the SPARQL endpoint.

What kind of queries are you running? Make sure you also include the
default-graph-uri=http://dbpedia.org argument; otherwise Virtuoso will
search all datasets in its databases for the query.
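
For illustration, here is a minimal sketch of how a POST with the
default-graph-uri argument might look from Java. It is not my actual app
code. The endpoint URL (http://localhost:8890/sparql), the class and method
names, and the sample query are all assumptions made for the example;
adjust them to your own setup. It needs Java 10 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class SparqlPostSketch {

    // Assumed endpoint of a local Virtuoso install; change to match your server.
    static final String ENDPOINT = "http://localhost:8890/sparql";

    // Build the form-encoded POST body, restricting the query to one graph.
    static String buildFormBody(String defaultGraphUri, String query) {
        return "default-graph-uri="
                + URLEncoder.encode(defaultGraphUri, StandardCharsets.UTF_8)
                + "&query="
                + URLEncoder.encode(query, StandardCharsets.UTF_8);
    }

    // Send the body to the endpoint and return the raw response text.
    static String post(String endpoint, String body) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(endpoint).toURL().openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
        conn.setRequestProperty("Accept", "application/sparql-results+xml");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body.getBytes(StandardCharsets.UTF_8));
        }
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample query; any simple SELECT would do.
        String query = "SELECT ?label WHERE { <http://dbpedia.org/resource/Berlin> "
                + "<http://www.w3.org/2000/01/rdf-schema#label> ?label } LIMIT 5";
        String body = buildFormBody("http://dbpedia.org", query);
        System.out.println(body);
        // Only contact the server when an endpoint is passed on the command line.
        if (args.length > 0) {
            System.out.println(post(args[0], body));
        }
    }
}
```

Running it with no arguments only prints the encoded body; passing an
endpoint URL as the first argument sends the request.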


Hope this helps,
Andrew


On Apr 18, 2008, at 2:21 PM, robl wrote:

> Hi,
>
> I'm trying to create a local mirror of DBpedia using Openlink Virtuoso
> open source edition and it seems to be taking a number of days to load
> the data.  I've used the DBpedia load scripts previously submitted on
> the list by Hugh Williams (using ttlp_mt).  I've so far managed to load:
>
> articlecategories_en.nt
> infobox_en.nt
> redirect_en.nt
> articles_label_en.nt
> infoboxproperties_en.nt
> shortabstract_en.nt
> disambiguation_en.nt
> longabstract_en.nt
> wikipage_en.nt
> externallinks_en.nt
> pagelinks_en.nt
>
> The load time seems to increase massively as files are loaded. I've used
> the virtuoso configuration settings from
> http://www4.wiwiss.fu-berlin.de/benchmarks-200801/#rdfstores, and I'll
> be adding the extra recommended indexes from
> http://www.openlinksw.com/dataspace/[EMAIL PROTECTED]/weblog/[EMAIL PROTECTED]'s%20BLOG%20%5B127%5D/1298.
>
> This results in a 15Gb virtuoso db file which seems to take an
> inordinately long time to query.  I was wondering if there are any
> recommended techniques for loading a full dataset, e.g. are there any
> methods for partitioning the data to improve load and query times?
>
> Could anyone else let me know how long their load times are for a full
> core dataset (or even just the en version) ?
>
> My server is a 2Gb, AMD Athlon BE-2400 (dual core) - Is this 'good'
> enough, should I invest in more memory ?
>
> Any suggestions would be really appreciated,
>
> Thanks,
>
> Rob
>
> -------------------------------------------------------------------------
> This SF.net email is sponsored by the 2008 JavaOne(SM) Conference
> Don't miss this year's exciting event. There's still time to save  
> $100.
> Use priority code J8TL2D2.
> http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone
> _______________________________________________
> Dbpedia-discussion mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion

