Thanks for the feedback so far! It seems we will go for the approach of minting language-specific URIs for DBpedia resources and interlinking them with owl:sameAs.
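To make the idea a bit more concrete, here is a rough Python sketch using plain tuples rather than an RDF library. It mints a language-specific URI, links it to the main namespace with owl:sameAs, and keeps the source graph with every statement so that conflicting values stay traceable. The graph URIs, the `exampleProperty` predicate, the placeholder values and all helper names are assumptions made up for illustration; none of this is existing DBpedia code.

```python
# Illustrative sketch only: graph URIs, predicate, values and helper names are assumptions.
from collections import defaultdict

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def resource_uri(title, lang="en"):
    """Mint a resource URI: English stays in the main namespace,
    other languages get a language subdomain such as de.dbpedia.org."""
    host = "dbpedia.org" if lang == "en" else f"{lang}.dbpedia.org"
    return f"http://{host}/resource/{title.replace(' ', '_')}"


def canonical(uri, quads):
    """Follow one owl:sameAs hop to the main-namespace URI, if there is one."""
    for s, p, o, _ in quads:
        if p == OWL_SAME_AS and s == uri:
            return o
    return uri


def conflicts(quads):
    """Group values by (canonical subject, predicate), remembering the graph
    each value came from, and keep only groups whose graphs disagree.
    Assumes at most one value per graph for a given subject and predicate."""
    seen = defaultdict(dict)
    for s, p, o, g in quads:
        if p == OWL_SAME_AS:
            continue
        seen[(canonical(s, quads), p)][g] = o
    return {k: v for k, v in seen.items() if len(set(v.values())) > 1}


# Hypothetical graph names and predicate, one per source Wikipedia.
EN_GRAPH = "http://dbpedia.org/graph/en"
DE_GRAPH = "http://dbpedia.org/graph/de"
PROP = "http://dbpedia.org/ontology/exampleProperty"

EN = resource_uri("Free University of Berlin")
DE = resource_uri("Freie Universität Berlin", "de")

# Each statement is a quad: (subject, predicate, object, graph).
QUADS = [
    (DE, OWL_SAME_AS, EN, DE_GRAPH),
    (EN, PROP, "value-from-en", EN_GRAPH),  # placeholder value
    (DE, PROP, "value-from-de", DE_GRAPH),  # placeholder value
]

for (subject, predicate), by_graph in conflicts(QUADS).items():
    print(subject, predicate, by_graph)
```

Because the graph travels with each statement, merging the two URIs via owl:sameAs does not lose the information about which Wikipedia a value came from, and a trust policy can later decide which graph to prefer.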

> 2) I'm not aware how many URIs are from the Portuguese Wikipedia that
> have not a language link to the English Wikipedia, but the issue here
> is that it just seems wrong to use the English pages to look for facts
> for an Italian actor, a Romanian film or a Portuguese film -- I expect
> more detail on their respective languages than in the English
> documents.

Yes, and that's something that colleagues of mine in Berlin are evaluating at the moment. But it's not enough to use the language-specific URI for the data extracted from that language; we will also need to express the provenance of information within a suitable model (Named Graphs). If it-dbpedia:X owl:sameAs dbpedia:X, then all statements about it-dbpedia:X apply to dbpedia:X as well, and statements from different languages may conflict. With a Named Graphs approach, the statements can be associated with their origin, and conflicts can be resolved (later) with suitable trust mechanisms.

> And with interlinked URIs and even properties (another good
> question -- you should also consider
> http://dbpedia.org/property/birthPlace owl:sameAs
> http://pt.dbpedia.org/property/localNascimento), then it would become
> much easier to make cross-language information extraction with DBpedia.

From my point of view, we're not going to interlink properties this way. That's a task for the DBpedia Ontology. The birthplace predicate for all languages will be http://dbpedia.org/ontology/birthplace, which will have labels in different languages. I described my idea of an interface to collaboratively maintain the ontology and infobox-to-ontology mappings a week ago on the mailing list, and that will include mappings of infoboxes/templates in different languages to the central ontology.

So can I count on you for the Portuguese mappings, Nuno?
;)

Cheers,
Georgi

--
Georgi Kobilarov
Freie Universität Berlin
www.georgikobilarov.com

> -----Original Message-----
> From: [email protected] [mailto:[email protected]]
> Sent: Saturday, February 14, 2009 12:06 AM
> To: Georgi Kobilarov
> Cc: Neubert Joachim; [email protected]; dbpedia-
> [email protected]
> Subject: RE: [Dbpedia-discussion] Core Datasets in other languages
>
> 1) I totally agree with the proposal. Different namespaces for each
> language interlinked with owl:sameAs sound the right thing to do now.
>
> 2) I'm not aware how many URIs are from the Portuguese Wikipedia that
> have not a language link to the English Wikipedia, but the issue here
> is that it just seems wrong to use the English pages to look for facts
> for an Italian actor, a Romanian film or a Portuguese film -- I expect
> more detail on their respective languages than in the English
> documents. And with interlinked URIs and even properties (another good
> question -- you should also consider
> http://dbpedia.org/property/birthPlace owl:sameAs
> http://pt.dbpedia.org/property/localNascimento), then it would become
> much easier to make cross-language information extraction with DBpedia.
>
> Quoting Georgi Kobilarov <[email protected]>:
>
> > Hello Joachim,
> >
> >> Are there some ideas out there how foreign-language-only entries could
> >> be added to the DBpedia dataset? (and how we could get rid of
> >> foreign-language entries which are connected to the English version
> >> afterwards and therefore are sameAs the first class (English) DBpedia
> >> entries?)
> >
> > I agree that there's a demand for URIs for foreign-language-only
> > entries. One approach *could* be to create DBpedia namespaces for every
> > language, such as http://de.dbpedia.org for German,
> > http://fr.dbpedia.org for French etc., and interlink URIs across
> > languages with owl:sameAs links.
> > So
> > http://de.dbpedia.org/resource/Freie_Universität_Berlin owl:sameAs
> > http://dbpedia.org/resource/Free_University_of_Berlin
> >
> > But this would mean minting a lot of new URIs, and I'm not completely
> > convinced of that approach. But I don't really see another solution for
> > the problem of URIs for foreign-language-only entries. We can't add the
> > foreign entities to the main dbpedia namespace, because that would mess
> > things up completely, and there certainly are a lot of articles with the
> > same name but referring to different concepts across languages.
> >
> > Note that we can represent data extracted from infoboxes from other
> > languages by using e.g. Named Graphs, so this is only about those
> > entities which are *not* in the English Wikipedia as well.
> >
> > So I'd like to ask the community:
> >
> > 1) Do you like the above approach?
> > 2) How deeply do you need URIs for non-English concepts (those which
> > can't be added to the English Wikipedia)
> >
> > Cheers,
> > Georgi
> >
> > --
> > Georgi Kobilarov
> > Freie Universität Berlin
> > www.georgikobilarov.com

_______________________________________________
Dbpedia-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
