On 08/03/12 15:03, Chris Dollin wrote:
Mena said:
I want to apply the OntTools.findShortestPath function on Yago.
I am using the following code to load the model:
Model model = TDBFactory.createModel(FullYagoDirectory);
The findShortestPath function is taking too much time to return a result.
I wonder if it is possible to load the model into main memory to make it
faster or if there is any other way to make findShortestPath much faster.
Model model = ModelFactory.createDefaultModel().add(
TDBFactory.createModel(FullYagoDirectory) );
Of course you may then run out of memory if the model is big.
Chris
("Default" models are in-memory models.)
IIRC YAGO(2) is a bit big. The core is something like 30 million
triples and the full dataset about 80 million triples, I think.
Bit big for memory unless you have a big server.
Do you need "shortest path" or is just connectivity of entities acceptable?
ARQ now has DISTINCT for paths and executes it (more) efficiently:
{ :x DISTINCT(path) ?y }
in the ARQ language.
(more to come here ... "soon")
If you do want "shortest path", you may need to simplify the problem.
Jena's OntTools shortest path is quite general - can you work with, say,
the path being a fixed property?
If so, maybe extract all the occurrences of that property and make a
subgraph, hopefully smaller.
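
To illustrate the fixed-property idea above, here is a minimal sketch in plain Java. It does not use the Jena API: the Triple class, the class name PropertyPathSketch and both method names are hypothetical stand-ins, and the triples are assumed to be available as a simple list. It keeps only the edges for one property and then runs a breadth-first search over that smaller graph.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

final class PropertyPathSketch {

    /** Hypothetical stand-in for an RDF statement, reduced to plain strings. */
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    /** Keep only the edges that use the given property, as an adjacency map. */
    static Map<String, List<String>> subgraphFor(String property, List<Triple> triples) {
        Map<String, List<String>> adjacency = new HashMap<>();
        for (Triple t : triples) {
            if (t.predicate.equals(property)) {
                adjacency.computeIfAbsent(t.subject, k -> new ArrayList<>()).add(t.object);
            }
        }
        return adjacency;
    }

    /** Breadth-first search; returns the nodes from start to goal, or an empty list. */
    static List<String> shortestPath(Map<String, List<String>> adjacency,
                                     String start, String goal) {
        Map<String, String> cameFrom = new HashMap<>(); // node -> predecessor
        Deque<String> queue = new ArrayDeque<>();
        cameFrom.put(start, null); // the start node has no predecessor
        queue.add(start);
        while (!queue.isEmpty()) {
            String node = queue.remove();
            if (node.equals(goal)) {
                // Walk the predecessor links back to the start.
                LinkedList<String> path = new LinkedList<>();
                for (String n = goal; n != null; n = cameFrom.get(n)) {
                    path.addFirst(n);
                }
                return path;
            }
            for (String next : adjacency.getOrDefault(node, List.of())) {
                if (!cameFrom.containsKey(next)) { // visit each node once
                    cameFrom.put(next, node);
                    queue.add(next);
                }
            }
        }
        return List.of(); // goal not reachable through this property
    }
}
```

The adjacency map only holds one property's edges, so it may fit in memory even when the full dataset does not. Whether that holds for a given YAGO property would need checking.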
You may need to look at a graph algorithm like the Floyd-Warshall
algorithm [*] which is space-consuming and O(N^3) in time. Being able
to reduce to something smaller helps with the space consumption.
(OntTools.findShortestPath is a simple breadth-first search).
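
For reference, a minimal sketch of Floyd-Warshall in plain Java, assuming the nodes have already been numbered 0..n-1 and every edge has length 1. The class and method names are hypothetical and nothing here comes from Jena. It shows where the cost goes: an n x n matrix for space and three nested loops over n for time.

```java
import java.util.Arrays;

final class FloydWarshallSketch {

    /** Marks "no path"; halved so that INF + INF cannot overflow an int. */
    static final int INF = Integer.MAX_VALUE / 2;

    /**
     * Each entry of edges is {from, to}, a directed edge of length 1.
     * Returns an n x n matrix of hop counts, with INF where there is no path.
     */
    static int[][] allPairsHops(int n, int[][] edges) {
        int[][] dist = new int[n][n]; // O(n^2) space
        for (int[] row : dist) {
            Arrays.fill(row, INF);
        }
        for (int i = 0; i < n; i++) {
            dist[i][i] = 0;
        }
        for (int[] e : edges) {
            dist[e[0]][e[1]] = Math.min(dist[e[0]][e[1]], 1);
        }
        // O(n^3) time: allow each node k in turn as an intermediate stop.
        for (int k = 0; k < n; k++) {
            for (int i = 0; i < n; i++) {
                for (int j = 0; j < n; j++) {
                    int viaK = dist[i][k] + dist[k][j];
                    if (viaK < dist[i][j]) {
                        dist[i][j] = viaK;
                    }
                }
            }
        }
        return dist;
    }
}
```

The matrix has one entry per pair of nodes, so this is only practical once the graph has been cut down to a fairly small number of nodes.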
Andy
[*] http://en.wikipedia.org/wiki/Floyd%E2%80%93Warshall_algorithm