[
https://issues.apache.org/jira/browse/SOLR-8888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15210206#comment-15210206
]
Joel Bernstein commented on SOLR-8888:
--------------------------------------
I've been digging into the implementation and it looks like Streaming provides
some real advantages.
The biggest advantage comes from the ability to sort the entire result set by
Node Id and to do this in parallel across the cluster. This means that once the
Nodes arrive at the worker they can simply be written to memory-mapped files for
the bookkeeping. The bookkeeping files need to be sorted by Node Id and will
most likely need offset information to support binary searching and skipping
during intersections. I looked at using MapDB for the bookkeeping, and if the
data weren't already arriving sorted, that would have been the approach to use.
But even as fast as MapDB is, managing the BTrees still adds overhead that we
don't need.
So, to get the maximum speed when reading and writing the bookkeeping files,
I'm planning on just using memory-mapped files with offsets. This is going to
take more time to develop but will pay off on large traversals.
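To make the idea concrete, here is a rough sketch of what a sorted bookkeeping
file with an offset table might look like. It is not the planned Solr code. The
class name SortedNodeFile, the file layout (a count, then an offset table, then
length-prefixed UTF-8 ids), and the use of String Node Ids in natural String
order are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical layout: [int count][int offset per id][int length + UTF-8 bytes per id]
final class SortedNodeFile {
    private final MappedByteBuffer buf;
    private final int count;

    private SortedNodeFile(MappedByteBuffer buf) {
        this.buf = buf;
        this.count = buf.getInt(0);
    }

    /** Writes ids that are ALREADY sorted (String natural order) to a memory-mapped file. */
    public static SortedNodeFile write(Path file, List<String> sortedIds) {
        int n = sortedIds.size();
        byte[][] encoded = new byte[n][];
        long size = Integer.BYTES * (1L + n); // count + offset table
        for (int i = 0; i < n; i++) {
            encoded[i] = sortedIds.get(i).getBytes(StandardCharsets.UTF_8);
            size += Integer.BYTES + encoded[i].length;
        }
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ,
                StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, size);
            buf.putInt(0, n);
            int pos = Integer.BYTES * (1 + n); // first record starts after the offset table
            for (int i = 0; i < n; i++) {
                buf.putInt(Integer.BYTES * (1 + i), pos); // offset table entry for id i
                buf.position(pos);
                buf.putInt(encoded[i].length);
                buf.put(encoded[i]);
                pos += Integer.BYTES + encoded[i].length;
            }
            return new SortedNodeFile(buf); // the mapping stays valid after the channel closes
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the i-th id by following its entry in the offset table. */
    private String idAt(int i) {
        int pos = buf.getInt(Integer.BYTES * (1 + i));
        int len = buf.getInt(pos);
        byte[] bytes = new byte[len];
        for (int j = 0; j < len; j++) {
            bytes[j] = buf.get(pos + Integer.BYTES + j);
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }

    /** Binary search over [from, count); returns the index, or -(insertionPoint) - 1 if absent. */
    public int find(String id, int from) {
        int lo = from, hi = count - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int cmp = idAt(mid).compareTo(id);
            if (cmp < 0) lo = mid + 1;
            else if (cmp > 0) hi = mid - 1;
            else return mid;
        }
        return -(lo + 1);
    }

    public boolean contains(String id) {
        return find(id, 0) >= 0;
    }

    /** Intersects with another sorted id list, never searching behind the last position. */
    public List<String> intersect(List<String> sortedIds) {
        List<String> out = new ArrayList<>();
        int from = 0;
        for (String id : sortedIds) {
            int idx = find(id, from);
            if (idx >= 0) {
                out.add(id);
                from = idx + 1;
            } else {
                from = -idx - 1; // skip every stored id smaller than this one
            }
            if (from >= count) break;
        }
        return out;
    }
}
```

Because both sides are sorted, each lookup in intersect only searches the part
of the file after the previous hit, which is the skipping behaviour described
above. A real implementation would also have to handle files larger than a
single mapping (2 GB) and decide on the actual id encoding.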
> Add shortestPath Streaming Expression
> -------------------------------------
>
> Key: SOLR-8888
> URL: https://issues.apache.org/jira/browse/SOLR-8888
> Project: Solr
> Issue Type: Improvement
> Reporter: Joel Bernstein
>
> This ticket is to implement a distributed shortest path graph traversal as a
> Streaming Expression.
> Possible expression syntax:
> {code}
> shortestPath(collection,
> from="colA:node1",
> to="colB:node2",
> fq="limiting query",
> maxDepth="10")
> {code}
> This would start from colA:node1 and traverse from colA to colB iteratively
> until it finds colB:node2. The shortestPath function would emit Tuples
> representing the shortest path.
> The optional fq could be used to apply a filter on the traversal.
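For readers unfamiliar with the traversal described in the ticket, the
iterative search can be sketched as a plain breadth-first search. This is an
in-memory stand-in, not the distributed Streaming Expression: the class
ShortestPathSketch, the adjacency map, and the method signature are
hypothetical, and the fq filter is left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

final class ShortestPathSketch {

    /** Returns the node ids on a shortest path from 'from' to 'to', or an empty list. */
    static List<String> shortestPath(Map<String, List<String>> edges,
                                     String from, String to, int maxDepth) {
        if (from.equals(to)) {
            return Collections.singletonList(from);
        }
        // parent doubles as the visited set; the start node has no parent
        Map<String, String> parent = new HashMap<>();
        parent.put(from, null);
        List<String> frontier = new ArrayList<>();
        frontier.add(from);

        // Each iteration expands the frontier by one hop, up to maxDepth hops
        for (int depth = 1; depth <= maxDepth && !frontier.isEmpty(); depth++) {
            List<String> next = new ArrayList<>();
            for (String node : frontier) {
                for (String neighbor : edges.getOrDefault(node, Collections.<String>emptyList())) {
                    if (parent.containsKey(neighbor)) {
                        continue; // already reached by a path that is no longer
                    }
                    parent.put(neighbor, node);
                    if (neighbor.equals(to)) {
                        return buildPath(parent, to);
                    }
                    next.add(neighbor);
                }
            }
            frontier = next;
        }
        return Collections.emptyList(); // not reachable within maxDepth
    }

    /** Walks the parent links back from the target to the start node. */
    private static List<String> buildPath(Map<String, String> parent, String to) {
        LinkedList<String> path = new LinkedList<>();
        for (String n = to; n != null; n = parent.get(n)) {
            path.addFirst(n);
        }
        return path;
    }
}
```

In the distributed case each frontier expansion would presumably be a search
against the collection rather than a map lookup, with the visited set kept in
the bookkeeping files discussed in the comment above.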
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)