Re: [Dbpedia-discussion] what areas of knowledge has the best quality and coverage in dbpedia?

2013-09-30 Thread Paul A. Houle
I'd say that things are represented well in DBpedia when they are objects that have well-defined properties. For instance, if I show up at the courthouse with a birth certificate that documents my date and place of birth and my parents, that proves that I'm a particular Person.

Re: [Dbpedia-discussion] ANN: DBpedia 3.9 released, including wider infobox coverage, additional type statements, and new YAGO and Wikidata links

2013-09-23 Thread Paul A. Houle
One of the goals of the infovore project is to develop something that targets this latency problem. https://github.com/paulhoule/infovore/wiki I’ve talked with a number of organizations that use DBpedia and Freebase data and almost all of them have either no solution or an incomplete

Re: [Dbpedia-discussion] Extractor for a specific class of entities

2013-09-16 Thread Paul A. Houle
Any chance we can get a well-defined interface that could be used to run ‘rdfslice’ as an Infovore application but still let people run it independently of Hadoop? I think that would help with the “number of hours” problems.

Re: [Dbpedia-discussion] [Wikidata-l] Best practices for large RDF dumps, was: Re: Wikidata RDF export available

2013-08-12 Thread Paul A. Houle
My feelings are strong towards one-line-per-fact. Large RDF data sets have validity problems, and the difficulty of convincing publishers that this matters indicates that this situation will continue. I’ve thought a bit about the problem of the “streaming converter from Turtle to

Re: [Dbpedia-discussion] Strategies to download subsets of DBPedia

2013-07-16 Thread Paul A. Houle
too. On Tuesday, July 16, 2013 4:26 AM, Dan Gravell wrote: Thanks Paul. The end goal of this data is import into AWS SimpleDB and CloudSearch

Re: [Dbpedia-discussion] Strategies to download subsets of DBPedia

2013-07-15 Thread Paul A. Houle
I can report my progress on this front. I’ve got a system in place that moves Freebase dumps, recompresses them and stores them in the AMZN cloud. I can suck in DBpedia data the same way. I’m hadoopifying my Infovore tools so I can do my preprocessing, parallel super eyeball and be able to

Re: [Dbpedia-discussion] Airpedia resource

2013-06-10 Thread Paul A. Houle
I am a fan of the SPARQL result set format whenever people want to express tuples of nodes: http://www.w3.org/TR/sparql11-results-csv-tsv/ I think it’s more standard than Turtle, and it is as efficient as you’ll get unless you want a binary format. This file can be processed with simple
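A minimal sketch of the kind of simple processing the TSV variant of that format allows, assuming a header row of `?`-prefixed variable names followed by one tab-separated row per solution. The function name `read_sparql_tsv` is hypothetical and does not come from the original message; terms are left in their serialized form (IRIs in angle brackets, literals quoted) rather than decoded.

```python
def read_sparql_tsv(lines):
    """Yield one dict per result row from SPARQL 1.1 TSV results.

    `lines` is any iterable of text lines, for example an open file.
    Values are kept as the raw serialized terms; an unbound variable
    shows up as an empty string.
    """
    rows = iter(lines)
    header = next(rows).rstrip("\r\n").split("\t")
    # Header cells look like "?s"; drop the leading question mark.
    variables = [name.lstrip("?") for name in header]
    for line in rows:
        cells = line.rstrip("\r\n").split("\t")
        yield dict(zip(variables, cells))
```

Because tabs and newlines inside literals are escaped in the TSV serialization, splitting each line on tab characters is enough for this sketch; a real consumer would still need to unescape literals and handle datatypes and language tags.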

[Dbpedia-discussion] RDF Validator puts Freebase and DBpedia Live to the test

2013-04-09 Thread Paul A. Houle
PRESS RELEASE Paul Houle, Ontology2 founder, stated that we updated Infovore to accept data from DBpedia, and ran a head-to-head test, in terms of RDF validity, between Freebase and DBpedia Live. Unlike most scientific results, he said, these results are repeatable, because you can

[Dbpedia-discussion] Live down?

2013-04-04 Thread Paul A. Houle
I’ve been wanting to update my copy of DBpedia Live so I can publish some results on this month’s version, but I’ve noticed that live.dbpedia.org has been down since yesterday. Can we get it back up?

[Dbpedia-discussion] No page links in Live?

2012-07-01 Thread Paul A. Houle
I just did $ bzcat ~/dbpedia_2012_05_31.nt.bz2 | grep 'wikiPageWikiLink' | wc and got back zero lines. Is it deliberate that ?s <http://dbpedia.org/ontology/wikiPageWikiLink> ?o . triples are missing from Live? If that's so, that's very disappointing. :wikiPageWikiLink is one of the most
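The same check can be sketched in Python for anyone without the shell tools at hand. This is an illustration only, assuming the dump is a bzip2-compressed N-Triples file with one triple per line; the function name `count_predicate_lines` is hypothetical. Unlike the `grep` above, it compares the predicate position exactly, so it does not count lines that merely mention the string elsewhere.

```python
import bz2


def count_predicate_lines(path, predicate_iri):
    """Count triples in a .nt.bz2 file whose predicate is predicate_iri."""
    needle = "<" + predicate_iri + ">"
    count = 0
    # Stream the file so a multi-gigabyte dump never has to fit in memory.
    with bz2.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            # Subject and predicate contain no whitespace in N-Triples,
            # so the second whitespace-separated token is the predicate.
            parts = line.split(None, 2)
            if len(parts) >= 2 and parts[1] == needle:
                count += 1
    return count
```

A call such as `count_predicate_lines("dbpedia_2012_05_31.nt.bz2", "http://dbpedia.org/ontology/wikiPageWikiLink")` would then return the number of page-link triples in a local copy of the dump.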

[Dbpedia-discussion] Bad Turtle, No Cookie

2012-06-29 Thread Paul A. Houle
I've been trying to process DBpedia Live with a pipeline that uses Jena and I've found 8765 triples that Jena won't parse from http://live.dbpedia.org/dumps/dbpedia_2012_05_31.nt.bz2 The rejected triples can be found here: http://basekb.com/files/DBpediaRejected.nt.bz2 Several sorts of

Re: [Dbpedia-discussion] Get the number of users who have edited a particular article

2012-06-21 Thread Paul A. Houle
On 6/21/2012 12:48 AM, Somesh Jain wrote: Hi people, I have to choose important Wiki articles from a bunch of them. So, I was thinking of doing that by the number of users who have edited that page. Is it possible to get that information using DBpedia Data dumps with history

[Dbpedia-discussion] :BaseKB EA 2 and basekb-tools now available

2012-05-29 Thread Paul A. Houle
is an important milestone for both Freebase and the Semantic Web, says Ontology2 founder Paul Houle, :BaseKB opens Freebase to users of SPARQL and other RDF standards. The superior quality of Freebase data solves data quality problems that have, so far, frustrated Linked Data applications. Ontology2

Re: [Dbpedia-discussion] Like DBpedia? You'll love :BaseKB

2012-04-13 Thread Paul A. Houle
On 4/13/2012 6:35 AM, baran_H wrote: a.) Do you see local installations only as a temporary solution until public SPARQL endpoints get more powerful and cheaper in the future? I think the pendulum will swing to and away from the cloud and I think there's a place for everything.

[Dbpedia-discussion] Like DBpedia? You'll love :BaseKB

2012-04-11 Thread Paul A. Houle
We've cracked the code of the Freebase quad dump and produced what we believe is the first correct conversion of Freebase into industry-standard RDF. http://basekb.com/ By installing :BaseKB into any market-leading triple store, you can query Freebase with the powerful SPARQL

[Dbpedia-discussion] Dbpedia for dummies?

2012-02-14 Thread Paul A. Houle
Not to insult anybody, but it's a constant theme on this site http://answers.semanticweb.com/questions/14432/querying-dbpedia-timeout-exception?page=1#14438 that beginners find it challenging to get a DBpedia instance up and running. This isn't really a flaw in DBpedia, but DBpedia comes

[Dbpedia-discussion] Ontology2 Releases RDF Dump of Nearly 1,000,000 Free Images

2012-01-25 Thread Paul A. Houle
and build applications based on the dump. The 2011-01-23 beta release has been tested on Virtuoso OpenLink 6.1.4. Ontology2 founder Paul Houle says the RDF dump will be qualified against other leading triple stores before it gets out of beta. We want to work with vendors and users to produce a product