Hi Rob, I might be wrong here or there might be some things that can be done better, but this is my experience with loading datasets in Virtuoso.

My load took close to a day. In case it is a useful data point, my total Virtuoso db size came to 12GB; I skipped some of the datasets you loaded, and I only loaded the en set. By my own estimates, a full load of multiple languages on my machine is going to take a *long* time. I saw the same effect you describe: load times grew almost exponentially as I loaded more datasets.

The configuration settings (your first link) definitely improved loading performance for me. Without them my load would have gone on for a couple of days.

By the way, I loaded mine onto a CoreDuo (1st-gen Intel) MacBook Pro running at 2.16GHz with 2GB of RAM, so not too different from your specs. The hard disk on your server should be faster, and you'll probably get faster queries.

Virtuoso runs very well for what I'm using it for (an art project that issues a simple SPARQL query every 30 seconds or so). On top of the Virtuoso server I'm also running more CPU-intensive Java applets that send my cooling fans into a frenzy. I haven't had a chance to monitor CPU load yet, but even with the extra load the queries perform satisfactorily. I'm calling the queries via POST from my Java app to the sparql endpoint.

What kind of queries are you running? Make sure you also include the default-graph-uri=http://dbpedia.org argument; otherwise Virtuoso will search every dataset in its database for the query.

Hope this helps,
Andrew

On Apr 18, 2008, at 2:21 PM, robl wrote:

> Hi,
>
> I'm trying to create a local mirror of DBpedia using Openlink Virtuoso
> open source edition and it seems to be taking a number of days to load
> the data. I've used the DBpedia load scripts previously submitted on
> the list by Hugh Williams (using ttlp_mt).
> I've so far managed to load:
>
> articlecategories_en.nt
> infobox_en.nt
> redirect_en.nt
> articles_label_en.nt
> infoboxproperties_en.nt
> shortabstract_en.nt
> disambiguation_en.nt
> longabstract_en.nt
> wikipage_en.nt
> externallinks_en.nt
> pagelinks_en.nt
>
> The load time seems to increase massively as files are loaded. I've used
> the virtuoso configuration settings from
> http://www4.wiwiss.fu-berlin.de/benchmarks-200801/#rdfstores, and I'll
> be adding the extra recommended indexes from
> http://www.openlinksw.com/dataspace/[EMAIL PROTECTED]/weblog/[EMAIL PROTECTED]'s%20BLOG%20%5B127%5D/1298
>
> This results in a 15Gb virtuoso db file which seems to take an
> inordinately long time to query. I was wondering if there are any
> recommended techniques for loading a full dataset e.g. are there any
> methods for partitioning of data to improve load and query times?
>
> Could anyone else let me know how long their load times are for a full
> core dataset (or even just the en version)?
>
> My server is a 2Gb, AMD Athlon BE-2400 (dual core) - Is this 'good'
> enough, should I invest in more memory?
>
> Any suggestions would be really appreciated,
>
> Thanks,
>
> Rob
>
> -------------------------------------------------------------------------
> This SF.net email is sponsored by the 2008 JavaOne(SM) Conference
> Don't miss this year's exciting event. There's still time to save $100.
> Use priority code J8TL2D2.
> http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone
> _______________________________________________
> Dbpedia-discussion mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
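
For reference, here is a minimal sketch of the kind of POST call mentioned in the reply: a Java client sending a SPARQL query to the endpoint with the default-graph-uri argument set to http://dbpedia.org. It is not the code from the original app. The endpoint address (localhost:8890), the class and method names, and the example query are all assumptions made for illustration, and the sketch needs Java 10 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class SparqlPost {
    // Assumed endpoint address; adjust host and port to your own Virtuoso setup.
    static final String ENDPOINT = "http://localhost:8890/sparql";
    // Graph named in the thread; restricts the query to the DBpedia data.
    static final String DEFAULT_GRAPH = "http://dbpedia.org";

    /** Builds the form-encoded POST body with default-graph-uri and query. */
    static String buildBody(String graphUri, String query) {
        return "default-graph-uri=" + URLEncoder.encode(graphUri, StandardCharsets.UTF_8)
             + "&query=" + URLEncoder.encode(query, StandardCharsets.UTF_8);
    }

    /** Sends the body as an HTTP POST and returns the raw response text. */
    static String post(String endpoint, String body) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(endpoint).openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
        conn.setRequestProperty("Accept", "application/sparql-results+xml");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body.getBytes(StandardCharsets.UTF_8));
        }
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        // Illustrative query only; any simple SELECT works the same way.
        String query = "SELECT ?label WHERE { <http://dbpedia.org/resource/Berlin> "
                     + "<http://www.w3.org/2000/01/rdf-schema#label> ?label } LIMIT 5";
        System.out.println(post(ENDPOINT, buildBody(DEFAULT_GRAPH, query)));
    }
}
```

Keeping the body construction in its own method makes it easy to check that the graph URI is actually being sent before looking elsewhere for the cause of slow queries.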
