Hi everyone,
I'm trying to find a way to calculate the distance between two resources
within DBpedia. I thought I could chain a series of joins in SPARQL,
like this:
SELECT ?1 ?2 ?3 WHERE {
{<http://dbpedia.org/resource/Category:Historical_board_games> ?p1 ?1.}
UNION
{?1 ?p2 <http://dbpedia.org/resource/Category:Historical_board_games>.}
{?1 ?p3 ?2.}
UNION
{?2 ?p4 ?1.}
{?3 ?p5 ?2.}
UNION
{?2 ?p6 ?3.}
{<http://dbpedia.org/resource/Economics> ?p7 ?3.}
UNION
{?3 ?p8 <http://dbpedia.org/resource/Economics>.}
}
until I found at least one path between the resources. As I expected,
though, this kind of approach is far too heavy to compute: from
3 steps or more it results in a timeout on the server executing
the SPARQL query.
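
To make the "add one step at a time" idea concrete, here is a minimal sketch that generates the same kind of query for any number of hops, so that lengths 1, 2, 3, ... can be tried in turn. The function names and the `?n`/`?p` variable naming are assumptions made for illustration; the sketch only builds query strings and does nothing to avoid the timeouts described above.

```python
def build_path_query(start_iri, end_iri, hops, limit=1):
    """Build a SPARQL query for an undirected path of exactly `hops` edges."""
    if hops < 1:
        raise ValueError("hops must be at least 1")
    # Node sequence: start, ?n1 ... ?n(hops-1), end.
    nodes = [f"<{start_iri}>"]
    nodes += [f"?n{i}" for i in range(1, hops)]
    nodes.append(f"<{end_iri}>")
    patterns = []
    for i in range(hops):
        a, b = nodes[i], nodes[i + 1]
        # Each edge may be traversed in either direction, as in the query above.
        patterns.append(
            f"  {{ {a} ?p{i}f {b} . }} UNION {{ {b} ?p{i}b {a} . }}"
        )
    body = "\n".join(patterns)
    return f"SELECT * WHERE {{\n{body}\n}} LIMIT {limit}"


def queries_by_increasing_length(start_iri, end_iri, max_hops):
    """Yield (hops, query) pairs, shortest path length first."""
    for hops in range(1, max_hops + 1):
        yield hops, build_path_query(start_iri, end_iri, hops)
```

The first length whose query returns a row is the distance. Each extra hop adds another join over both edge directions, which is where the cost grows.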
So I was reading this article:
Discovering Unknown Connections – the DBpedia Relationship Finder
which describes two algorithms. The first is a clustering algorithm
that divides the whole set of RDF triples into clusters of connected
sub-graphs and assigns to every resource in a cluster a distance value
from a single random (central) resource. The second does more or less
what I'm trying to do: it computes the paths that connect two nodes
(also calculating the minimum and the maximum distance as, respectively,
the absolute value of the difference and the sum of their distances to
the central resource), but it contains this instruction:
"formulate SQL query for obtaining at most (n m) connections between O1
and O2 of length d without objects and properties in the ignore list;"
so it is similar to my original idea. I would really have liked to
see how it is implemented efficiently, so I downloaded the code,
but I was unable to run it because of a database problem (it requires a
statement table and I don't know where it comes from), and I still can't
find where this step is implemented in the source code.
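
For reference, here is a rough sketch of what a query like the one in the quoted instruction could look like. It is not the Relationship Finder's actual code: the `statements(subject, predicate, object)` table, the `edges` view and the function names are all assumptions made for illustration, and SQLite is used only so the example is self-contained.

```python
import sqlite3


def make_demo_db():
    """Create a tiny in-memory triple table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE statements (subject TEXT, predicate TEXT, object TEXT)"
    )
    conn.executemany("INSERT INTO statements VALUES (?, ?, ?)", [
        ("A", "linksTo", "B"),
        ("C", "linksTo", "B"),
        ("C", "subject", "D"),
        ("A", "type", "Thing"),
        ("D", "type", "Thing"),
    ])
    # View exposing every triple in both directions (undirected edges).
    conn.execute("""
        CREATE VIEW edges AS
        SELECT subject AS a, predicate AS p, object AS b FROM statements
        UNION ALL
        SELECT object AS a, predicate AS p, subject AS b FROM statements
    """)
    return conn


def find_connections(conn, o1, o2, d, limit, ignore=()):
    """Return up to `limit` paths of exactly d edges from o1 to o2."""
    if d < 1:
        raise ValueError("d must be at least 1")
    where = ["e0.a = ?"]
    params = [o1]
    for i in range(1, d):
        # Chain the hops: each edge starts where the previous one ended.
        where.append(f"e{i}.a = e{i - 1}.b")
    where.append(f"e{d - 1}.b = ?")
    params.append(o2)
    if ignore:
        marks = ", ".join("?" for _ in ignore)
        for i in range(d):  # no ignored properties on any edge
            where.append(f"e{i}.p NOT IN ({marks})")
            params.extend(ignore)
        for i in range(d - 1):  # no ignored objects as intermediate nodes
            where.append(f"e{i}.b NOT IN ({marks})")
            params.extend(ignore)
    cols = ", ".join(f"e{i}.a, e{i}.p" for i in range(d)) + f", e{d - 1}.b"
    tables = ", ".join(f"edges AS e{i}" for i in range(d))
    condition = " AND ".join(where)
    sql = f"SELECT {cols} FROM {tables} WHERE {condition} LIMIT ?"
    params.append(limit)
    return conn.execute(sql, params).fetchall()
```

With the demo data, `find_connections(make_demo_db(), "A", "D", 2, 10)` finds the path through `Thing`, while passing `ignore=("type", "Thing")` rules it out and leaves only the length-3 path through `B` and `C`. This is still a d-way self-join, so it has the same growth problem as the SPARQL version.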
By the way, has anyone tried to solve this problem? Are there any
suggestions?
Another possibility I thought of is the following: starting from the
clustered RDF triples described in the article, construct two trees with
the two resources whose distance we want to find as roots. Each child is
a connected resource whose distance from the central resource of the
cluster is less than that of its parent (for the first level, the root).
At the n-th level (where n is the distance of the root resource from the
central resource) we will certainly find the central resource. Then, for
each resource in the first tree, we look it up in the second one and
save every match in a list, together with the sum of its distances
from the roots of the two trees. Once every resource has been checked,
the resource in the list with the smallest sum is the one that
minimizes the distance between the two nodes. The worst case is
that the trees have no resource in common other than the central
resource.
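
The two-tree idea above can be sketched as follows, on a small in-memory graph. The adjacency-dict representation and all function names are assumptions made for illustration; the distances to the central resource are recomputed here with a breadth-first search instead of being read from the clustering step in the article.

```python
from collections import deque


def bfs_distances(graph, source):
    """Hop distance from `source` to every reachable node (undirected graph)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def descend(graph, center_dist, root):
    """Depth below `root` of every node reached by moving closer to the center."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in graph[node]:
            # Keep only children that are strictly nearer to the center.
            if center_dist[child] < center_dist[node] and child not in depth:
                depth[child] = depth[node] + 1
                queue.append(child)
    return depth


def tree_distance(graph, center, r1, r2):
    """Return (meeting node, summed depth) for roots r1 and r2.

    Assumes both roots are in the same cluster as `center`.
    """
    center_dist = bfs_distances(graph, center)
    d1 = descend(graph, center_dist, r1)
    d2 = descend(graph, center_dist, r2)
    common = d1.keys() & d2.keys()  # always contains the center
    meet = min(common, key=lambda n: (d1[n] + d2[n], n))
    return meet, d1[meet] + d2[meet]
```

One caveat with this reading of the idea: the value returned is the length of a real path, but not necessarily the shortest one. Two directly linked resources at the same distance from the center never appear in each other's trees, so for them the sketch reports at least 2 instead of 1.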
Maybe this works, but it's a bit elaborate and probably expensive to
compute... maybe there's a simpler way. Any suggestions?
Thank you,
Piero Molino
_______________________________________________
Dbpedia-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion