I was thinking it could slowly evolve into a sort of DBpediaResourceFactory
class at the core of the workflow that knows everything about transforming
Wikipedia page URLs into DBpedia resource URIs/IRIs (including
language-specific knowledge, redirects, etc.).

But, yes, sure. Your solution sounds simple and efficient. :)
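
For illustration, a minimal Python sketch of the look-up-table script idea
(quoted below) might look like this. The name rewrite_line, the in-memory
dict of redirects, and the choice to leave triples with literal objects
untouched are all assumptions of this sketch, not part of the extraction
framework.

```python
import re

# Matches an N-Triples line whose subject, predicate and object are all IRIs.
IRI_TRIPLE = re.compile(r'^<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\.\s*$')


def rewrite_line(line, redirects):
    """Replace redirected subject/object IRIs in one N-Triples line.

    `redirects` is a hypothetical {source IRI: target IRI} dict, e.g. built
    from redirects.nt. Lines that are not plain IRI triples (for example,
    triples with literal objects) are returned unchanged.
    """
    line = line.rstrip("\n")
    match = IRI_TRIPLE.match(line)
    if match is None:
        return line
    subject, predicate, obj = match.groups()
    # Look up subject and object; keep the original IRI when no redirect exists.
    subject = redirects.get(subject, subject)
    obj = redirects.get(obj, obj)
    return f"<{subject}> <{predicate}> <{obj}> ."
```

With a redirect from John_Augustus_Roebling to John_A._Roebling in the
dict, the Brooklyn_Bridge architect triple would end up pointing at
John_A._Roebling.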

Keep in mind that redirects.nt may need some treatment to compute the
transitive closure (A redirects_to B redirects_to C -> A redirects_to C).
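
A rough Python sketch of that closure step, assuming the redirects have
already been loaded into a dict (the name close_redirects and the way
cycles are cut off are assumptions of this sketch):

```python
def close_redirects(redirects):
    """Map every redirect source to the end of its redirect chain.

    `redirects` is a {source: target} dict with one entry per redirect.
    """
    closed = {}
    for source, target in redirects.items():
        seen = {source}
        # Follow the chain until it ends or revisits a URI (a redirect loop).
        while target in redirects and target not in seen:
            seen.add(target)
            target = redirects[target]
        closed[source] = target
    return closed


# A redirects_to B redirects_to C -> A redirects_to C
print(close_redirects({"A": "B", "B": "C"}))  # {'A': 'C', 'B': 'C'}
```

The `seen` set is only there so that a redirect loop cannot hang the
script; what to do with looping redirects is left open here.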

Cheers,
Pablo

On Fri, Apr 15, 2011 at 3:45 PM, Dimitris Kontokostas <[email protected]> wrote:

> A new extractor would be too expensive.
> I think a script can do the job just fine.
>
> It will use redirects.nt as a look-up table and replace all
> occurrences in the extraction dumps.
>
> cheers,
> Dimitris
>
>
> On Fri, Apr 15, 2011 at 4:10 PM, Pablo Mendes <[email protected]> wrote:
>
>>> I like the second approach ... if we could use a unique URI to denote the
>>> same entity, we are better off.
>>
>>
>> Yep. The disadvantage is that it is intrusive (requires access to DBpedia
>> extraction). Luckily, DBpedia is an open source project to which any of us
>> can contribute. Better yet, you can adapt similar code from DBpedia
>> Spotlight into a DBpedia extractor and contribute it to the project. It
>> should be in: org.dbpedia.spotlight.util.SurrogatesUtil.scala
>> (http://dbp-spotlight.svn.sourceforge.net/viewvc/dbp-spotlight/trunk/core/src/main/scala/)
>>
>> I will make sure to bug the leader of the next release to include it. :)
>>
>> Cheers,
>> Pablo
>>
>> On Fri, Apr 15, 2011 at 2:43 PM, Lushan Han <[email protected]> wrote:
>>
>>> I like the second approach -- resolving the problem at extraction
>>> time. Inference over large amounts of data is still difficult. If we
>>> could use a unique URI to denote the same entity, we are better off.
>>>
>>> Thank you all for the immediate response,
>>> Lushan Han
>>>
>>>
>>> On Thu, Apr 14, 2011 at 4:37 AM, Pablo Mendes <[email protected]> wrote:
>>> > Maybe what Dimitris says is that this query would indeed be answered if:
>>> > - redirects were treated as sameAs and inference was used (works for this
>>> >   but not all cases)
>>> > - the framework used redirects to do identity resolution at extraction time
>>> >
>>> > Also, I should point out that you can probably sort this problem out with a
>>> > simple Silk link spec.
>>> >
>>> > Cheers
>>> > Pablo
>>> >
>>> > On Apr 13, 2011 3:12 PM, "Lushan Han" <[email protected]> wrote:
>>> >> Hi Dimitris,
>>> >>
>>> >> I am afraid that you did not completely see my point. It is not simply
>>> >> a redirection problem.
>>> >> For example, if I want to make a SPARQL query -- what is the birth
>>> >> date of the architect who designed the Brooklyn Bridge?
>>> >>
>>> >> PREFIX dbo: <http://dbpedia.org/ontology/>
>>> >> PREFIX : <http://dbpedia.org/resource/>
>>> >>
>>> >> SELECT ?person ?date WHERE {
>>> >> :Brooklyn_Bridge dbo:architect ?person .
>>> >> ?person dbo:birthDate ?date .
>>> >> }
>>> >>
>>> >> It should be able to return the correct answer. However, there is no
>>> >> result. The problem is caused by the redirection.
>>> >>
>>> >> I am curious that even the Wikipedia article doesn't use the
>>> >> redirection. Why does the corresponding DBpedia article use it?
>>> >>
>>> >>
>>> >> Best regards,
>>> >> Lushan Han
>>> >>
>>> >> On Wed, Apr 13, 2011 at 5:23 AM, Dimitris Kontokostas <[email protected]> wrote:
>>> >>> Hi,
>>> >>>
>>> >>> The Wikipedia article about John_Augustus_Roebling (1) redirects to
>>> >>> John_A._Roebling (2);
>>> >>> that is why you cannot find any information for (1).
>>> >>>
>>> >>> The Brooklyn Bridge article links to the redirect article.
>>> >>>
>>> >>> Although this is not a bug, it could be resolved in the extraction
>>> >>> framework by replacing all redirects with the proper articles.
>>> >>> A shell script could do the job. Any ideas / comments?
>>> >>>
>>> >>> Cheers,
>>> >>> Dimitris
>>> >>>
>>> >>> On Tue, Apr 12, 2011 at 11:22 PM, Lushan Han <[email protected]> wrote:
>>> >>>>
>>> >>>> Hi,
>>> >>>>
>>> >>>> It surprised me that a DBpedia URI is not consistent with its
>>> >>>> corresponding Wikipedia URI. This is
>>> >>>> http://en.wikipedia.org/wiki/John_Augustus_Roebling. Its corresponding
>>> >>>> URI in DBpedia is http://dbpedia.org/page/John_A._Roebling. I think we
>>> >>>> need to resolve this issue because I found that it breaks links in the
>>> >>>> data. For example, from http://dbpedia.org/page/Brooklyn_Bridge, you can
>>> >>>> see that its dbpedia-owl:architect is dbpedia:John_Augustus_Roebling.
>>> >>>> However, when I queried the rdf:type of dbpedia:John_Augustus_Roebling
>>> >>>> using the SPARQL endpoint, it gave me no result. The reason is that there
>>> >>>> is no dbpedia:John_Augustus_Roebling but instead dbpedia:John_A._Roebling.
>>> >>>>
>>> >>>> I don't know how many other such URIs exist.
>>> >>>>
>>> >>>> Best regards,
>>> >>>> Lushan Han
>>> >>>>
>>>
>>
>
>
> --
> Kontokostas Dimitris
>
_______________________________________________
Dbpedia-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion