The DBpediaResourceFactory seems better, but the redirects can be very big (the
English one alone is ~750MB). I don't know how the framework can handle such
big data.
You're right about the transitive issue, I hadn't thought of it :)
You just found a bug in a new script I am creating :)
Anyway, this can be worked out (somehow, I guess).
Cheers,
Dimitris
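
A minimal sketch of the script idea discussed in this thread, assuming redirects.nt holds one plain N-Triples line per redirect with IRI terms only, and that the whole map fits in memory (which may not hold for a ~750MB file). All function names here are hypothetical and are not part of the DBpedia extraction framework. It computes the transitive closure of the redirects (A -> B -> C becomes A -> C) and then rewrites the object of each triple in an extraction dump:

```python
import re

# Matches a simple N-Triples line whose three terms are all IRIs: <s> <p> <o> .
TRIPLE = re.compile(r'^<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\.\s*$')


def load_redirects(lines):
    """Build a source -> target map from redirect triples (predicate is ignored)."""
    redirects = {}
    for line in lines:
        match = TRIPLE.match(line.strip())
        if match:
            source, _predicate, target = match.groups()
            redirects[source] = target
    return redirects


def resolve(uri, redirects):
    """Follow redirects from uri to the final target, stopping on cycles."""
    seen = {uri}
    while uri in redirects:
        uri = redirects[uri]
        if uri in seen:  # cycle detected: stop instead of looping forever
            break
        seen.add(uri)
    return uri


def close_redirects(redirects):
    """Return a map where every source points directly at its final target."""
    return {source: resolve(source, redirects) for source in redirects}


def rewrite_line(line, closed):
    """Replace the object of an IRI-object triple if it is a redirect source."""
    stripped = line.rstrip("\n")
    match = TRIPLE.match(stripped.strip())
    if not match:
        return stripped  # literals, comments and blank lines are left untouched
    subject, predicate, obj = match.groups()
    return "<%s> <%s> <%s> ." % (subject, predicate, closed.get(obj, obj))
```

For the example from this thread, a triple pointing at John_Augustus_Roebling would be rewritten to point at John_A._Roebling once the closed map contains that redirect. Only objects are rewritten here; whether subjects should also be rewritten is left open.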
On Fri, Apr 15, 2011 at 5:57 PM, Pablo Mendes <[email protected]> wrote:
>
> I was thinking it could slowly evolve to a sort of DBpediaResourceFactory
> class at the core of the workflow who knew everything about transforming
> Wikipedia Page URLs into DBpedia Resource URIs/IRIs (including
> language-specific knowledge, redirects, etc.)
>
> But, yes, sure. Your solution sounds simple and efficient. :)
>
> Keep in mind that redirects.nt may need some treatment to compute the
> transitive closure (A redirects_to B redirects_to C -> A redirects_to C).
>
> Cheers,
> Pablo
>
> On Fri, Apr 15, 2011 at 3:45 PM, Dimitris Kontokostas
> <[email protected]> wrote:
>
>> A new extractor will be too expensive
>> i think a script can do the job just fine
>>
>> it will have the redirects.nt as a look-up table and replace all
>> occurrences in the extraction dumps
>>
>> cheers,
>> Dimitris
>>
>>
>> On Fri, Apr 15, 2011 at 4:10 PM, Pablo Mendes <[email protected]> wrote:
>>
>>>> I like the second approach ... if we could use a unique URI to denote
>>>> the same entity, we are better off.
>>>
>>>
>>> Yep. The disadvantage is that it is intrusive (requires access to DBpedia
>>> extraction). Luckily, DBpedia is an open source project to which any of us
>>> can contribute. Better yet, you can adapt similar code from DBpedia
>>> Spotlight into a DBpedia extractor and contribute it to the project. It
>>> should be in: org.dbpedia.spotlight.util.SurrogatesUtil.scala
>>> (
>>> http://dbp-spotlight.svn.sourceforge.net/viewvc/dbp-spotlight/trunk/core/src/main/scala/
>>> )
>>>
>>> I will make sure to bug the leader of the next release to include it. :)
>>>
>>> Cheers,
>>> Pablo
>>>
>>> On Fri, Apr 15, 2011 at 2:43 PM, Lushan Han <[email protected]> wrote:
>>>
>>>> I like the second approach -- resolving the problem at extraction
>>>> time. Inference over large amounts of data is still difficult. If we
>>>> could use a unique URI to denote the same entity, we would be better off.
>>>>
>>>> Thank you all for the immediate response,
>>>> Lushan Han
>>>>
>>>>
>>>> On Thu, Apr 14, 2011 at 4:37 AM, Pablo Mendes <[email protected]> wrote:
>>>> > Maybe what Dimitris says is that this query would indeed be answered if:
>>>> > - redirects were treated as sameAs and inference was used (works for this
>>>> >   but not all cases)
>>>> > - the framework used redirects to do identity resolution at extraction time
>>>> >
>>>> > Also, I should point out that you can probably sort this problem out with a
>>>> > simple Silk link spec.
>>>> >
>>>> > Cheers
>>>> > Pablo
>>>> >
>>>> > On Apr 13, 2011 3:12 PM, "Lushan Han" <[email protected]> wrote:
>>>> >> Hi Dimitris,
>>>> >>
>>>> >> I am afraid that you did not completely see my point. It is not simply
>>>> >> a redirection problem.
>>>> >> For example, if I want to make a SPARQL query -- what is the birth
>>>> >> date of the architect who designed the Brooklyn Bridge?
>>>> >>
>>>> >> PREFIX dbo: <http://dbpedia.org/ontology/>
>>>> >> PREFIX : <http://dbpedia.org/resource/>
>>>> >>
>>>> >> SELECT ?person ?date WHERE {
>>>> >>   :Brooklyn_Bridge dbo:architect ?person .
>>>> >>   ?person dbo:birthDate ?date .
>>>> >> }
>>>> >>
>>>> >> It should be able to return the correct answer. However, there is no
>>>> >> result. The problem is caused by the redirection.
>>>> >>
>>>> >> I am curious that even the Wikipedia article doesn't use the
>>>> >> redirection. Why does the corresponding DBpedia article use it?
>>>> >>
>>>> >>
>>>> >> Best regards,
>>>> >> Lushan Han
>>>> >>
>>>> >> On Wed, Apr 13, 2011 at 5:23 AM, Dimitris Kontokostas <[email protected]> wrote:
>>>> >>> Hi,
>>>> >>>
>>>> >>> The Wikipedia article about John_Augustus_Roebling (1) redirects to
>>>> >>> John_A._Roebling (2);
>>>> >>> that is why you cannot find any information for (1).
>>>> >>>
>>>> >>> The Brooklyn Bridge article links to the redirecting article.
>>>> >>>
>>>> >>> Although this is not a bug, it could be resolved in the extraction
>>>> >>> framework by replacing all redirections with the proper articles.
>>>> >>> A shell script could do the job, any ideas / comments?
>>>> >>>
>>>> >>> Cheers,
>>>> >>> Dimitris
>>>> >>>
>>>> >>> On Tue, Apr 12, 2011 at 11:22 PM, Lushan Han <[email protected]> wrote:
>>>> >>>>
>>>> >>>> Hi,
>>>> >>>>
>>>> >>>> It surprised me that a DBpedia URI is not consistent with its
>>>> >>>> corresponding Wikipedia URI. This is
>>>> >>>> http://en.wikipedia.org/wiki/John_Augustus_Roebling. Its corresponding
>>>> >>>> URI in DBpedia is http://dbpedia.org/page/John_A._Roebling. I think we
>>>> >>>> need to resolve this issue because I found that it breaks links between
>>>> >>>> data. For example, from http://dbpedia.org/page/Brooklyn_Bridge, you can
>>>> >>>> see that its dbpedia-owl:architect is dbpedia:John_Augustus_Roebling.
>>>> >>>> However, when I queried the rdf:type of dbpedia:John_Augustus_Roebling
>>>> >>>> using the SPARQL endpoint, it gave me no result. The reason is that there
>>>> >>>> is no dbpedia:John_Augustus_Roebling but instead dbpedia:John_A._Roebling.
>>>> >>>>
>>>> >>>> I don't know how many other such URIs exist.
>>>> >>>>
>>>> >>>> Best regards,
>>>> >>>> Lushan Han
>>>> >>>>
>>>>
>>>
>>
>>
>> --
>> Kontokostas Dimitris
>>
>
>
--
Kontokostas Dimitris
_______________________________________________
Dbpedia-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion