Davide Palmisano wrote:
>
>
> Yes, that's my fear. In fact, some months ago I was doing a massive
> ingestion from DBpedia and my business logic didn't provide this kind of
> custom logic. The result was that a lot of DBpedia resources were not
> ingested.
>
That's why, when I do generic database work (with either DBpedia or
Freebase), I like to download the whole thing. If you're doing a little
bit at a time, integrity problems have a way of going unnoticed. For
instance, when I loaded all of DBpedia into a non-RDF system that was
case insensitive for titles, it became clear that Wikipedia has 10,000 or
so entry pairs like this:
http://en.wikipedia.org/wiki/Direct_instruction
http://en.wikipedia.org/wiki/Direct_Instruction
Here you've got two similar but distinct items whose titles differ
only by case. In general you want a human-friendly search engine or
search completion facility to be case insensitive, so that "alan alda"
turns up
http://en.wikipedia.org/wiki/Alan_Alda
but you also need something that can correctly handle labels/URLs
that vary only by case without breaking. Had I used a standard RDF
stack for my work with DBpedia, I would have blasted right by many
integrity issues that the system I built forced me to confront.
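
To make the case issue concrete, here is a rough Python sketch of the kind
of check I mean. It is not the system I actually built; the function names
(build_index, case_collisions, search) and the three-title sample list are
made up for illustration, and it assumes titles are Wikipedia-style with
underscores in place of spaces.

```python
from collections import defaultdict


def build_index(titles):
    """Map each case-folded title to the set of exact titles that share it."""
    index = defaultdict(set)
    for title in titles:
        index[title.casefold()].add(title)
    return index


def case_collisions(index):
    """Keep only folded keys that more than one exact title maps to."""
    return {key: sorted(exact) for key, exact in index.items() if len(exact) > 1}


def search(index, query):
    """Case-insensitive lookup that still returns every distinct exact title."""
    # Assumption: a user types spaces where the stored titles use underscores.
    key = query.casefold().replace(" ", "_")
    return sorted(index.get(key, ()))


# Hypothetical sample; a real run would read the full title dump.
titles = ["Direct_instruction", "Direct_Instruction", "Alan_Alda"]
index = build_index(titles)

print(case_collisions(index))
print(search(index, "alan alda"))
```

The point of keeping a set per folded key is that search stays case
insensitive while the exact titles are never merged, so pairs that differ
only by case show up as collisions instead of silently overwriting each
other.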
_______________________________________________
Dbpedia-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion