[
https://issues.apache.org/jira/browse/SOLR-8188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15001376#comment-15001376
]
ASF subversion and git services commented on SOLR-8188:
-------------------------------------------------------
Commit 1713950 from [email protected] in branch 'dev/trunk'
[ https://svn.apache.org/r1713950 ]
SOLR-8188: Adds Hash and OuterHash Joins to the Streaming API and Streaming
Expressions
> Add hash style joins to the Streaming API and Streaming Expressions
> -------------------------------------------------------------------
>
> Key: SOLR-8188
> URL: https://issues.apache.org/jira/browse/SOLR-8188
> Project: Solr
> Issue Type: Improvement
> Components: SolrJ
> Reporter: Dennis Gove
> Assignee: Dennis Gove
> Priority: Minor
> Attachments: SOLR-8188.patch, SOLR-8188.patch
>
>
> Add HashJoinStream and OuterHashJoinStream to the Streaming API to allow for
> optimized joining between sub-streams.
> HashJoinStream is similar to an InnerJoinStream except that it does not
> require its input streams to be in any particular order. When open() is
> called it reads all tuples from the stream being hashed (hashStream). Each
> call to read() then returns the next tuple from the stream not being hashed
> (fullStream) that has at least one matching tuple in hashStream, merged with
> that matching tuple. If a tuple from fullStream matches more than one tuple
> in hashStream, each subsequent call to read() returns the merge with the
> next matching tuple. The order of the resulting stream is the order of the
> fullStream.
> OuterHashJoinStream behaves like HashJoinStream except that, as with
> LeftOuterJoinStream, a tuple from fullStream is returned even if it has no
> matching tuple in hashStream. All other behavior is identical.
> In expression form
> {code}
> hashJoin(
>   search(collection1, q=*:*, fl="fieldA, fieldB, fieldC", ...),
>   hashed=search(collection2, q=*:*, fl="fieldA, fieldB, fieldE", ...),
>   on="fieldA, fieldB"
> )
> {code}
> {code}
> outerHashJoin(
>   search(collection1, q=*:*, fl="fieldA, fieldB, fieldC", ...),
>   hashed=search(collection2, q=*:*, fl="fieldA, fieldB, fieldE", ...),
>   on="fieldA, fieldB"
> )
> {code}
> As you can see, the hashStream is a named parameter (hashed=), which makes it
> very clear which stream should be hashed.
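
The join semantics described in the issue above can be illustrated with a
small standalone sketch. This is not the HashJoinStream code from the commit
and it does not use the Solr Streaming API. Tuples are modelled as plain
java.util.Map instances, and the class name HashJoinSketch, the helpers
tuple() and key(), and the "outer" flag are all hypothetical names chosen for
illustration. How the real implementation resolves a non-join field that
appears on both sides is not stated in the issue; here the hashed tuple's
value wins, which is purely an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a list-based stand-in for the stream behavior described
// in SOLR-8188. All names here are hypothetical, not Solr API.
public class HashJoinSketch {

    // Convenience helper: builds a tuple from alternating field names and values.
    static Map<String, Object> tuple(Object... fieldsAndValues) {
        Map<String, Object> t = new LinkedHashMap<>();
        for (int i = 0; i + 1 < fieldsAndValues.length; i += 2) {
            t.put((String) fieldsAndValues[i], fieldsAndValues[i + 1]);
        }
        return t;
    }

    // The join key is the list of values of the "on" fields, in order.
    static List<Object> key(Map<String, Object> tuple, List<String> on) {
        List<Object> k = new ArrayList<>();
        for (String field : on) {
            k.add(tuple.get(field));
        }
        return k;
    }

    static List<Map<String, Object>> join(List<Map<String, Object>> fullStream,
                                          List<Map<String, Object>> hashStream,
                                          List<String> on,
                                          boolean outer) {
        // Corresponds to open(): the hashed side is read completely up front.
        Map<List<Object>, List<Map<String, Object>>> hashed = new HashMap<>();
        for (Map<String, Object> t : hashStream) {
            hashed.computeIfAbsent(key(t, on), k -> new ArrayList<>()).add(t);
        }

        // Corresponds to repeated read() calls: walk fullStream in its own
        // order and emit one merged tuple per matching hashed tuple.
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> full : fullStream) {
            List<Map<String, Object>> matches = hashed.get(key(full, on));
            if (matches == null) {
                if (outer) {
                    // Outer variant: keep the unmatched fullStream tuple as-is.
                    out.add(new LinkedHashMap<>(full));
                }
                continue;
            }
            for (Map<String, Object> match : matches) {
                Map<String, Object> merged = new LinkedHashMap<>(full);
                merged.putAll(match); // assumption: hashed side wins on conflicts
                out.add(merged);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> full = Arrays.asList(
                tuple("fieldA", 1, "fieldB", "x", "fieldC", "c1"),
                tuple("fieldA", 2, "fieldB", "y", "fieldC", "c2"));
        List<Map<String, Object>> toHash = Arrays.asList(
                tuple("fieldA", 1, "fieldB", "x", "fieldE", "e1"),
                tuple("fieldA", 1, "fieldB", "x", "fieldE", "e2"));
        List<String> on = Arrays.asList("fieldA", "fieldB");

        System.out.println("hashJoin:      " + join(full, toHash, on, false));
        System.out.println("outerHashJoin: " + join(full, toHash, on, true));
    }
}
```

In this sketch the first fullStream tuple matches two hashed tuples, so it
appears twice in the result, once per match. The second fullStream tuple has
no match and shows up only when outer is true. Neither input needs to be
sorted, at the cost of holding the hashed side in memory, which is why it
helps that the expression syntax names the hashed stream explicitly.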