Stephen Betts wrote:
> Hi Sebastian,
>
> Thank you very much for the detailed response – that makes a lot of
> sense.
>
> So it sounds like we could probably use the abstract_live if there is
> no abstract in the DBpedia record, and use the proper abstract when
> it becomes available.
>
> Do you think the bugs that you mention should stop us using DBpedia
> Live at present, or are they things that we can work around at present
> (as I suggest above) and they will improve in future? Basically I’m
> asking whether the bugs are serious enough for you to advise holding
> off on using DBpedia Live at present.
>
> Of course, if we should hold off for the minute, some idea of when
> those bugs would be resolved sufficiently for us to use DBpedia Live
> would be really helpful, but I know that may be difficult to say at
> present.

Yes, it is quite difficult to say. I will get back to you next week,
when I can better answer your question.
Basically, once we fix the issue that some things are not deleted
properly, which results in e.g. double abstracts, we would have a more
or less stable version.
Also, the current framework uses a basic mapping for the
dbpedia.org/ontology namespace, which has some bugs of its own. We are
currently testing both, and once they are resolved there will be a
stable version.
As I said, next week I can give you an estimate.

Regards,
Sebastian
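The workaround discussed above (prefer the static abstract, fall back to abstract_live, and tolerate double values) could be sketched roughly as follows. This is only an illustration and not part of the DBpedia Live framework: the helper names build_query and choose_abstract are hypothetical, the rows are assumed to be (property IRI, text) pairs already fetched from the endpoint, and keeping the longest value when there are duplicates is an arbitrary choice made for this sketch.

```python
from typing import Iterable, Optional, Tuple

# Property IRIs exactly as they appear in the queries quoted in this thread.
ABSTRACT = "http://dbpedia.org/property/abstract"
ABSTRACT_LIVE = "http://dbpedia.org/property/abstract_live"


def build_query(resource: str) -> str:
    """Return a SPARQL query that fetches both abstract properties at once.

    Hypothetical helper: `resource` is a local name such as "Ferrari".
    The language filter is the one used in the thread (English or untagged).
    """
    return (
        "SELECT ?p ?abstract WHERE {\n"
        f"  <http://dbpedia.org/resource/{resource}> ?p ?abstract .\n"
        f"  FILTER ( ?p = <{ABSTRACT}> || ?p = <{ABSTRACT_LIVE}> )\n"
        "  FILTER ( langMatches(lang(?abstract), 'en') || "
        "!langMatches(lang(?abstract), '*') )\n"
        "}"
    )


def choose_abstract(rows: Iterable[Tuple[str, str]]) -> Optional[str]:
    """Pick one abstract from (property IRI, text) result rows.

    The static abstract wins if present; otherwise abstract_live is used.
    If a property has several values (the "double abstracts" bug), the
    longest one is kept, which is an assumption of this sketch.
    """
    by_prop = {ABSTRACT: [], ABSTRACT_LIVE: []}
    for prop, text in rows:
        if prop in by_prop:
            by_prop[prop].append(text)
    for prop in (ABSTRACT, ABSTRACT_LIVE):
        if by_prop[prop]:
            return max(by_prop[prop], key=len)
    return None
```

Fetching both properties in one query keeps the fallback decision on the client side, so nothing needs to change once the static abstracts are reloaded.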
>
> Thanks again. Yours,
>
> Stephen.
>
> On 2/3/10 18:19, "Sebastian Hellmann"
> <[email protected]> wrote:
>
> Hello,
> The issue is hard to explain, as there are still quite a few bugs on
> dbpedia-live, which we only found after letting it run for a while.
> We optimized the speed and will soon reload DBpedia 3.4 and also load
> all changes since September, which will fix the missing or double
> abstracts.
> Here is how it should work:
> Every page has static abstracts (the ones you know). These abstracts
> are the same as in 3.4 and will remain static.
> abstract_live is the abstract extracted for the last revision.
> The main reason why there are two is that two different extractors
> are responsible:
> - A slow one, which produces abstracts with better quality and
>   produced the abstracts for 3.4.
> - A fast one (factor 10-100), which produces abstracts with more or
>   less acceptable quality and produces comment_live and abstract_live.
> Once we improve the speed of the "better" AbstractExtractor, they will
> be merged again, but this could take quite some time, as it is a
> complicated and expensive process (it involves parsing Wiki syntax,
> extending a MediaWiki and synchronizing template definitions...).
>
> Perhaps it will even stay like that, and we will only refresh the
> static abstract information with each available Wikipedia dump.
> So there would always be a live English (for now) version and a static
> one with better quality and for all languages.
> I think it might not be the worst solution.
>
> Regards,
> Sebastian
>
>
> Stephen Betts wrote:
> > We (in the BBC Search team) are still seeing the problem below.
> >
> > To recap, Dave wrote:
> >
> > I'm having difficulty querying dbpedia using the sparql interface on
> >
> > http://dbpedia-live.openlinksw.com/sparql
> >
> > I'm requesting the label, abstract and a list of redirects, and
> > approx. 20% of the time I'm just getting the redirects.
> >
> >
> > For example using a simple SPARQL query like
> >
> > SELECT ?abstract WHERE {
> >   <http://dbpedia.org/resource/Ferrari>
> >   <http://dbpedia.org/property/abstract> ?abstract .
> >   FILTER ( langMatches( lang(?abstract), 'en') || !langMatches(lang(?abstract),'*') )
> > }
> >
> > on dbpedia.org/sparql gives the abstract that you’d expect; running it
> > on dbpedia-live.openlinksw.com/sparql doesn’t find it. However if you
> > use abstract_live rather than abstract then it does find something –
> > although surprisingly it finds two different abstracts!
> >
> >
> > Just to be explicit, running this query
> >
> > SELECT ?abstract WHERE {
> >   <http://dbpedia.org/resource/Ferrari>
> >   <http://dbpedia.org/property/abstract_live> ?abstract .
> >   FILTER ( langMatches( lang(?abstract), 'en') || !langMatches(lang(?abstract),'*') )
> > }
> >
> > on dbpedia-live.openlinksw.com/sparql does give a result, in fact it
> > gives two!
> >
> > So really I have three main questions (although any information would
> > be useful):
> >
> >    1. What is the http://dbpedia.org/property/abstract_live property?
> >    2. Can we safely use it instead of
> >       http://dbpedia.org/property/abstract?
> >    3. Why does it have more than one value?
> >
> >
> > We are seeing this on around 20% of the resources that we are
> > interested in, and it is preventing us from using the Live DBpedia at
> > present.
> >
> > Thanks very much,
> >
> > Stephen.
> >
> >
> > On 15/1/10 12:06, "[email protected]"
> > <[email protected]> wrote:
> >
> >
> > -----------------------------
> >
> > Message: 2
> > Date: Fri, 15 Jan 2010 11:05:54 +0000
> > From: David Spacey <[email protected]>
> > Subject: [Dbpedia-discussion] Inconsistent results from sparql queries
> > To: [email protected]
> > Message-ID: <[email protected]>
> > Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed
> >
> > Hi,
> >
> > I'm having difficulty querying dbpedia using the sparql interface on
> >
> > http://dbpedia-live.openlinksw.com/sparql
> >
> > I'm requesting the label, abstract and a list of redirects, and
> > approx. 20% of the time I'm just getting the redirects. With an
> > internal instance, running an older snapshot of dbpedia, I'm seeing
> > the same problem for around 1% of queries. In every example I've
> > checked, the dbpedia id is still current.
> >
> > So, here are some examples.
> >
> > Failing on dbpedia-live only
> > Adolf_Hitler
> > Chicago
> > Dora_Bryan
> > Emma_Thompson
> > Ferrari
> >
> > Failing on snapshot only
> > Austriamicrosystems
> > Valery_Gergiev
> > Lady_Rachel_Billington
> >
> > Failing on both
> > G20
> > Darcy_Edwards
> >
> > This is the query I'm using, where '$lod_id' is one of the strings
> > listed above.
> >
> > SELECT ?label, ?abstract, ?redirect, COUNT(?wikilink)
> > WHERE {
> >   {
> >     <http://dbpedia.org/resource/$lod_id> <http://dbpedia.org/property/abstract> ?abstract .
> >     FILTER ( langMatches( lang(?abstract), 'en') || ! langMatches(lang(?abstract),'*') ) .
> >     <http://dbpedia.org/resource/$lod_id> <http://www.w3.org/2000/01/rdf-schema#label> ?label .
> >     FILTER ( langMatches( lang(?label), 'en') || ! langMatches(lang(?label),'*') ) .
> >   }
> >   UNION
> >   {
> >     OPTIONAL {
> >       ?redirect <http://dbpedia.org/property/redirect> <http://dbpedia.org/resource/$lod_id> .
> >       OPTIONAL { ?wikilink <http://dbpedia.org/property/wikilink> ?redirect }
> >     }
> >   }
> > }
> >
> > Can anyone suggest why this might fail for valid entries?
> >
> > TIA
> >
> > Dave Spacey
> >
> >
> > http://www.bbc.co.uk/
> > This e-mail (and any attachments) is confidential and may contain
> > personal views which are not the views of the BBC unless
> > specifically stated.
> > If you have received it in error, please delete it from your system.
> > Do not use, copy or disclose the information in any way nor act in
> > reliance on it and notify the sender immediately.
> > Please note that the BBC monitors e-mails sent or received.
> > Further communication will signify your consent to this.
> >
> >
> > ------------------------------
> >
> > ------------------------------------------------------------------------------
> > Throughout its 18-year history, RSA Conference consistently
> > attracts the world's best and brightest in the field, creating
> > opportunities for Conference attendees to learn about information
> > security's most important issues through interactions with peers,
> > luminaries and emerging and established companies.
> > http://p.sf.net/sfu/rsaconf-dev2dev
> >
> > ------------------------------
> >
> > _______________________________________________
> > Dbpedia-discussion mailing list
> > [email protected]
> > https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
> >
> >
> > End of Dbpedia-discussion Digest, Vol 35, Issue 11
> > **************************************************
> >
> >
> > http://www.bbc.co.uk
> > This e-mail (and any attachments) is confidential and may contain
> > personal views which are not the views of the BBC unless specifically
> > stated.
> > If you have received it in error, please delete it from your system.
> > Do not use, copy or disclose the information in any way nor act in
> > reliance on it and notify the sender immediately.
> > Please note that the BBC monitors e-mails sent or received.
> > Further communication will signify your consent to this.
> > ------------------------------------------------------------------------
> >
> > ------------------------------------------------------------------------------
> > Download Intel® Parallel Studio Eval
> > Try the new software tools for yourself. Speed compiling, find bugs
> > proactively, and fine-tune applications for parallel performance.
> > See why Intel Parallel Studio got high marks during beta.
> > http://p.sf.net/sfu/intel-sw-dev
> > ------------------------------------------------------------------------
>
> --
> Dipl. Inf. Sebastian Hellmann
> Department of Computer Science, University of Leipzig
> Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
> Research Group: http://aksw.org

--
Dipl. Inf. Sebastian Hellmann
Department of Computer Science, University of Leipzig
Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
Research Group: http://aksw.org
