Peter Ansell wrote:
> 2009/7/13 Georgi Kobilarov <[email protected]>:
>
>> Hi John,
>>
>> it is true that the pagelinks dataset has not been loaded into the
>> dbpedia.org/sparql instance. We did load it once, but most Linked Data
>> and SPARQL clients struggled with the very large result sets. For
>> example, DESCRIBE http://dbpedia.org/resource/United_States would result
>> in more than 300k triples due to the pagelinks (in particular because of
>> inbound links)
>>
>> So we decided to only provide that dataset for download, but don't load
>> it. I think it would be possible to provide a different rdf instance
>> containing the pagelinks as well, but for the moment you'd have to load
>> that dataset into your own repository.
>>
>
> In any very highly linked dataset you will find that DESCRIBE
> behaviour implemented to retrieve both subject and object triples will
> pretty much fail because of this issue. It is a shame that the
> inter-article links in Wikipedia are so diverse that they aren't
> accessible using Linked Data anymore... ;-) (Seriously though, it is
> actually a shame that there is a barrier to accessing this information
> without downloading the dataset yourself, and even then you have to
> explain to others that they need to do it themselves also...)
>
> If only there was a way to tell the DESCRIBE query to get subject
> triples and any referenced blank nodes but ignore triples which would
> only be included because the URI was the object, but sadly DESCRIBE
> queries are specified to be vendor implemented so anything could
> happen in the future with one's DESCRIBE query I guess? Do the triples
> where the URI is the subject get retrieved before the more trivial
> triples where the URI is the object?
>
> Depending on the limits that are set you might also be unable to
> actually retrieve all of the triples in cases like this even if you
> did set up another endpoint just for pagelinks. From memory, the public
> dbpedia endpoint is set up so that a client can't even use LIMIT and
> OFFSET to get 300 thousand triples?
>
> Cheers,
>
> Peter
>
> ------------------------------------------------------------------------------
> Enter the BlackBerry Developer Challenge
> This is your chance to win up to $100,000 in prizes! For a limited time,
> vendors submitting new applications to BlackBerry App World(TM) will have
> the opportunity to enter the BlackBerry Developer Challenge. See full prize
> details at: http://p.sf.net/sfu/Challenge
> _______________________________________________
> Dbpedia-discussion mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
>
>

Peter,

The solution is to put the data set in a separate graph, so that people can explicitly access it and mesh it with other data sources. Once it is loaded into the DBpedia.org instance, the /fct interface will expose data from the <http://dbpedia.org> group (which is itself now made up of one graph IRI per loaded data set) and from the new pagelinks graph, which will have the IRI <http://dbpedia.org/pagelinks#>.

-- 

Regards,

Kingsley Idehen       Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO
OpenLink Software     Web: http://www.openlinksw.com
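
P.S. To make the separate-graph idea concrete, here is a minimal sketch of how a client might ask only for the triples where a resource is the subject, restricted to the pagelinks graph and fetched in pages. It assumes the pagelinks data has actually been loaded under <http://dbpedia.org/pagelinks#> as proposed above. The helper names (outbound_links_query, request_url) and the default page size are hypothetical, and any server-side result caps mentioned earlier in the thread would still apply.

```python
from urllib.parse import urlencode

# Graph IRI proposed in this message; whether it is loaded is an assumption.
PAGELINKS_GRAPH = "http://dbpedia.org/pagelinks#"
# Public endpoint named earlier in the thread.
ENDPOINT = "http://dbpedia.org/sparql"


def outbound_links_query(resource_iri, limit=1000, offset=0):
    """Build a SELECT for pagelinks triples whose subject is resource_iri.

    Only subject-side triples are requested, so inbound links (the bulk of
    the 300k DESCRIBE result discussed above) are left out.
    """
    # Hypothetical guard: refuse characters that cannot appear inside <...>.
    if any(ch in resource_iri for ch in '<>" {}|\\^`'):
        raise ValueError("not a plain IRI: %r" % resource_iri)
    return (
        "SELECT ?p ?o\n"
        "WHERE {\n"
        f"  GRAPH <{PAGELINKS_GRAPH}> {{\n"
        f"    <{resource_iri}> ?p ?o .\n"
        "  }\n"
        "}\n"
        "ORDER BY ?p ?o\n"  # stable ordering so LIMIT/OFFSET pages do not overlap
        f"LIMIT {int(limit)}\n"
        f"OFFSET {int(offset)}"
    )


def request_url(query):
    """Encode the query as a SPARQL Protocol GET request URL (not sent here)."""
    return ENDPOINT + "?" + urlencode({"query": query})


if __name__ == "__main__":
    q = outbound_links_query("http://dbpedia.org/resource/United_States")
    print(q)
    print(request_url(q))
```

Paging through a large result would mean calling outbound_links_query again with offset increased by limit until a page comes back short; whether the endpoint permits that many rows in total is a separate question.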
