[
https://issues.apache.org/jira/browse/SOLR-8888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214407#comment-15214407
]
Joel Bernstein edited comment on SOLR-8888 at 3/28/16 5:14 PM:
---------------------------------------------------------------
First patch, which implements a breadth-first search using a threaded nested
loop join. Each join in the traversal is split into batches that are executed
in threads within the worker node. This approach spreads the join across all
replicas. The bottleneck in this scenario will be the network, as potentially
dozens of search nodes will be returning nodes in parallel to the same worker
to satisfy the join. This bottleneck can be greatly reduced by compression
because the edges are returned sorted by the toField, which will cause a large
amount of repeated data to be streamed in the same compression block. SOLR-8910
has been opened to add LZ4 compression to the /export handler.
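
To illustrate why sorting by the toField should help compression, here is a rough sketch in Python. It is not Solr code: the edge data is synthetic, the helper names (`make_edges`, `compressed_size`) are hypothetical, and zlib from the Python standard library stands in for LZ4, so the absolute sizes say nothing about what the /export handler would achieve.

```python
import random
import zlib

def make_edges(num_to_nodes=200, edges_per_node=25, seed=0):
    """Build synthetic (toField, fromField) pairs in random order (assumed data)."""
    rng = random.Random(seed)
    to_ids = [f"{rng.getrandbits(64):016x}" for _ in range(num_to_nodes)]
    edges = [(to_id, f"{rng.getrandbits(64):016x}")
             for to_id in to_ids
             for _ in range(edges_per_node)]
    rng.shuffle(edges)
    return edges

def compressed_size(edges):
    """Serialize edges one per line and return the zlib-compressed byte count."""
    payload = "".join(f"{to_id},{from_id}\n" for to_id, from_id in edges)
    return len(zlib.compress(payload.encode("ascii")))

edges = make_edges()
shuffled_size = compressed_size(edges)
# Sorting by toField places identical toField values on adjacent lines.
sorted_size = compressed_size(sorted(edges, key=lambda edge: edge[0]))

print("shuffled:", shuffled_size, "bytes")
print("sorted by toField:", sorted_size, "bytes")
```

With sorted input, each repeated toField value sits right next to its previous occurrence, which is the kind of short-range repetition block compressors exploit.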

In my last comment I mentioned using sorted memory mapped files for the
bookkeeping. In this patch all bookkeeping is done in memory using HashMaps.
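
The approach described above can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the code in the patch: `REPLICAS` is a hypothetical in-memory stand-in for the search nodes, `join_batch` and `shortest_path` are made-up names, and a dict plays the role of the HashMap bookkeeping.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for replicas: each holds a slice of (from, to) edges.
REPLICAS = [
    [("a", "b"), ("a", "c")],
    [("b", "d"), ("c", "d")],
    [("d", "e"), ("e", "f")],
]

def join_batch(batch):
    """Join one batch of frontier nodes against the edges of every replica."""
    wanted = set(batch)
    return [(src, dst)
            for replica in REPLICAS
            for src, dst in replica
            if src in wanted]

def shortest_path(start, target, max_depth=10, batch_size=2, threads=4):
    parents = {start: None}  # bookkeeping: node -> node it was first reached from
    frontier = [start]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        for _ in range(max_depth):
            if target in parents or not frontier:
                break
            # Split the current level into batches and join them in threads.
            batches = [frontier[i:i + batch_size]
                       for i in range(0, len(frontier), batch_size)]
            next_frontier = []
            for edges in pool.map(join_batch, batches):
                for src, dst in edges:
                    if dst not in parents:  # first visit is the shortest in BFS
                        parents[dst] = src
                        next_frontier.append(dst)
            frontier = next_frontier
    if target not in parents:
        return None
    # Walk the parent links back from the target to rebuild the path.
    path, node = [], target
    while node is not None:
        path.append(node)
        node = parents[node]
    return path[::-1]

print(shortest_path("a", "f"))
```

Results are merged on the calling thread, so the bookkeeping dict needs no locking in this sketch; a real worker might make a different choice.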
> Add shortestPath Streaming Expression
> -------------------------------------
>
> Key: SOLR-8888
> URL: https://issues.apache.org/jira/browse/SOLR-8888
> Project: Solr
> Issue Type: Improvement
> Reporter: Joel Bernstein
> Attachments: SOLR-8888.patch
>
>
> This ticket is to implement a distributed shortest path graph traversal as a
> Streaming Expression.
> Possible expression syntax:
> {code}
> shortestPath(collection,
>              from="colA:node1",
>              to="colB:node2",
>              fq="limiting query",
>              maxDepth="10")
> {code}
> This would start from colA:node1 and traverse from colA to colB iteratively
> until it finds colB:node2. The shortestPath function would emit Tuples
> representing the shortest path.
> The optional fq could be used to apply a filter on the traversal.