[jira] [Commented] (SOLR-9337) Add fetch Streaming Expression

2017-04-03 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15953934#comment-15953934
 ] 

Joel Bernstein commented on SOLR-9337:
--

Another name for this could have been nestedLoopJoin. But for the functionality 
I was looking to support at the time, I didn't need to support all the different 
join conditions (one-to-one, one-to-many, many-to-one, many-to-many, etc.). So 
rather than call it a join, I called it fetch. The full nestedLoopJoin 
capability will be available in the future and it will be called by that name. 
Eventually, possibly in Solr 7.0, fetch will be superseded by nestedLoopJoin.
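
To illustrate why the join conditions matter, here is a rough in-memory sketch in Python. It is not Solr code: nested_loop_join, fetch_style, and the dict-based tuples are hypothetical, and the fetch-style behaviour (one output tuple per input tuple, later matches overwriting earlier ones, unmatched tuples passed through) is an assumption made for the example, not something stated in this issue.

```python
def nested_loop_join(left, right, left_key, right_key):
    """Inner nested loop join: emits one merged tuple per matching pair."""
    return [
        {**lt, **rt}
        for lt in left
        for rt in right
        if lt[left_key] == rt[right_key]
    ]


def fetch_style(left, right, left_key, right_key, fl):
    """Fetch-style lookup (assumed semantics): one output tuple per input tuple."""
    fields_by_key = {}
    for rt in right:
        # Assumption: a later match for the same key overwrites an earlier one.
        fields_by_key[rt[right_key]] = {f: rt[f] for f in fl if f in rt}
    # Assumption: tuples without a match pass through unchanged.
    return [{**lt, **fields_by_key.get(lt[left_key], {})} for lt in left]


# One-to-many data: key 1 matches two right-hand documents, key 2 matches none.
left = [{"a": 1}, {"a": 2}]
right = [{"j": 1, "m": "x"}, {"j": 1, "m": "y"}]

joined = nested_loop_join(left, right, "a", "j")
fetched = fetch_style(left, right, "a", "j", fl=["j", "m"])
```

With this data the join emits two tuples for a=1 and none for a=2, while the fetch-style lookup emits exactly one tuple per input. That difference is the kind of cardinality handling a full nestedLoopJoin would need to define.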

> Add fetch Streaming Expression
> --
>
> Key: SOLR-9337
> URL: https://issues.apache.org/jira/browse/SOLR-9337
> Project: Solr
> Issue Type: New Feature
> Security Level: Public (Default Security Level. Issues are Public)
> Reporter: Joel Bernstein
> Fix For: 6.3
>
> Attachments: SOLR-9337.patch, SOLR-9337.patch
>
>
> The fetch() Streaming Expression wraps another expression and fetches 
> additional fields for documents in batches. The fetch() expression will 
> stream out the Tuples after the data has been fetched. Fields can be fetched 
> from any SolrCloud collection. 
> Sample syntax:
> {code}
> daemon(
>   update(collectionC, batchSize="100",
>     fetch(collectionB,
>       topic(checkpoints, collectionA, q="*:*", fl="a,b,c", rows="50"),
>       fl="j,m,z",
>       on="a=j")))
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9337) Add fetch Streaming Expression

2017-04-03 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15953637#comment-15953637
 ] 

David Smiley commented on SOLR-9337:


bq. How does a fetch differ from an innerJoin?

IMO that points to an unfortunate choice of name for this streaming 
expression.  I had the same question.  "join" should have been in this 
streaming expression's name, be it "join" exactly or some variation; I think 
developers expect to see that term if it works like the DB joins that we all know.




[jira] [Commented] (SOLR-9337) Add fetch Streaming Expression

2016-10-10 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564154#comment-15564154
 ] 

ASF subversion and git services commented on SOLR-9337:
---

Commit ccc10fd5932fa5d830c3ecda86e85b4845bca863 in lucene-solr's branch 
refs/heads/branch_6x from [~joel.bernstein]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=ccc10fd ]

SOLR-9337: Update CHANGES.txt





[jira] [Commented] (SOLR-9337) Add fetch Streaming Expression

2016-10-10 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564146#comment-15564146
 ] 

ASF subversion and git services commented on SOLR-9337:
---

Commit d69412bc676189600aed8b4cff2aad819526a5e2 in lucene-solr's branch 
refs/heads/master from [~joel.bernstein]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=d69412b ]

SOLR-9337: Update CHANGES.txt





[jira] [Commented] (SOLR-9337) Add fetch Streaming Expression

2016-10-10 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563220#comment-15563220
 ] 

ASF subversion and git services commented on SOLR-9337:
---

Commit 5836f4032fac975707c85e260d509ecd06c7f7e1 in lucene-solr's branch 
refs/heads/branch_6x from [~joel.bernstein]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=5836f40 ]

SOLR-9337: Add fetch Streaming Expression





[jira] [Commented] (SOLR-9337) Add fetch Streaming Expression

2016-10-10 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563054#comment-15563054
 ] 

ASF subversion and git services commented on SOLR-9337:
---

Commit ee3f9e1e058ac4205140b909a85d43fdd715ddb7 in lucene-solr's branch 
refs/heads/master from [~joel.bernstein]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=ee3f9e1 ]

SOLR-9337: Add fetch Streaming Expression





[jira] [Commented] (SOLR-9337) Add fetch Streaming Expression

2016-09-22 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15513940#comment-15513940
 ] 

Joel Bernstein commented on SOLR-9337:
--

fetch works like this:

1) Read N tuples into memory.
2) Use a query to fetch fields for the tuples read in step 1.
3) Stream the tuples out.
4) Repeat steps 1-3 until the underlying stream reaches EOF.

This is essentially a nested loop join against the index.

It is mainly used when one side of the join is very small and you want to join 
it against the entire index. 
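
The batch loop described in steps 1-4 can be sketched roughly as follows. This is a minimal in-memory model in Python, not the Solr implementation: fetch_in_batches, lookup, left_key, and the dict-based tuples are hypothetical names, and lookup stands in for the query in step 2 (it is assumed to return a mapping from join key to the fetched fields).

```python
from itertools import islice


def fetch_in_batches(source, lookup, left_key, batch_size=50):
    """Yield tuples from `source`, enriched with fields found by `lookup`."""
    it = iter(source)
    while True:
        # Step 1: read up to batch_size tuples into memory.
        batch = list(islice(it, batch_size))
        if not batch:
            return  # Step 4: the underlying stream is exhausted.
        # Step 2: a single lookup per batch (hypothetical stand-in for the query).
        keys = [t[left_key] for t in batch if left_key in t]
        fetched = lookup(keys)
        # Step 3: stream the tuples out; unmatched tuples pass through unchanged.
        for t in batch:
            yield {**t, **fetched.get(t.get(left_key), {})}


# Toy "index" keyed by the join field, plus a lookup that records each call.
INDEX = {0: {"m": "zero"}, 1: {"m": "one"}, 3: {"m": "three"}}
calls = []


def lookup(keys):
    calls.append(list(keys))
    return {k: INDEX[k] for k in keys if k in INDEX}


result = list(fetch_in_batches(({"a": i} for i in range(5)), lookup, "a", batch_size=2))
```

In this sketch, five input tuples with a batch size of 2 produce three lookups rather than five. That is the point of batching: the cost is one query per batch instead of one per tuple.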

One main use case I have in mind is running a graph query, fetching text fields 
for the node set that is returned, and then running the classifier on the node 
set. This would combine graph queries and AI models to provide very intelligent 
recommendations.



[jira] [Commented] (SOLR-9337) Add fetch Streaming Expression

2016-09-22 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15513889#comment-15513889
 ] 

Dennis Gove commented on SOLR-9337:
---

How does a fetch differ from an innerJoin? I guess it could if it read in a 
tuple from the source and then looked up its specific fields, but I dunno how 
performant that'd be.
