[ 
https://issues.apache.org/jira/browse/SOLR-7535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15076415#comment-15076415
 ] 

Joel Bernstein edited comment on SOLR-7535 at 1/2/16 1:36 AM:
--------------------------------------------------------------

Yes, currently partitioning is only done as part of the search(). So any 
workflow that requires re-partitioning will have to be done in multiple steps. 
That's why this ticket is so important: the UpdateStream allows for write-backs.

In the example above, the first join would need to be wrapped in an 
UpdateStream and sent to a temp index. The temp index would be used for the 
next steps.
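
As a rough sketch of that two-step flow, the Java below only assembles the 
expression strings; it does not talk to Solr. The collection names (people, 
pets, owners, tempJoin), the field names, and the batchSize value are all 
hypothetical, and the update(...) expression syntax is an assumption, since 
the expression form is what this ticket is still defining.

```java
public class RepartitionSketch {

    // search() is where partitioning happens: partitionKeys decides how
    // tuples are split across workers.
    static String search(String collection, String fields, String key) {
        return "search(" + collection
            + ", q=\"*:*\""
            + ", fl=\"" + fields + "\""
            + ", sort=\"" + key + " asc\""
            + ", partitionKeys=\"" + key + "\")";
    }

    // Join two streams that are sorted and partitioned on the same key.
    static String innerJoin(String left, String right, String key) {
        return "innerJoin(" + left + ", " + right + ", on=\"" + key + "\")";
    }

    // ASSUMPTION: hypothetical update(...) syntax that wraps a stream and
    // writes its tuples to the destination collection.
    static String update(String destination, int batchSize, String inner) {
        return "update(" + destination + ", batchSize=" + batchSize + ", " + inner + ")";
    }

    public static void main(String[] args) {
        // Step 1: the first join, partitioned on personId, is written back
        // to a temp index (hypothetical collection "tempJoin").
        String step1 = update("tempJoin", 500,
            innerJoin(
                search("people", "personId,name", "personId"),
                search("pets", "personId,petName,ownerId", "personId"),
                "personId"));

        // Step 2: searching the temp index re-partitions the joined data
        // on a different key (ownerId) for the next join.
        String step2 = innerJoin(
            search("tempJoin", "personId,name,petName,ownerId", "ownerId"),
            search("owners", "ownerId,ownerName", "ownerId"),
            "ownerId");

        System.out.println(step1);
        System.out.println(step2);
    }
}
```

The point of the sketch is that the only place the partition key changes is 
the search() over the temp index in step 2, which is why the write-back is 
needed between the two joins.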

In the future we can look at faster ways to re-partition. One example would be 
to have the workers repartition to local disk. Then the second step could read 
from the worker nodes rather than searching. This still involves multiple steps, 
but it would be much faster.


> Add UpdateStream to Streaming API and Streaming Expression
> ----------------------------------------------------------
>
>                 Key: SOLR-7535
>                 URL: https://issues.apache.org/jira/browse/SOLR-7535
>             Project: Solr
>          Issue Type: New Feature
>          Components: clients - java, SolrJ
>            Reporter: Joel Bernstein
>            Priority: Minor
>         Attachments: SOLR-7535.patch, SOLR-7535.patch
>
>
> The ticket adds an UpdateStream implementation to the Streaming API and 
> streaming expressions. The UpdateStream will wrap a TupleStream and send the 
> Tuples it reads to a SolrCloud collection to be indexed.
> This will allow users to pull data from different SolrCloud collections, 
> merge and transform the streams, and send the transformed data to another 
> SolrCloud collection.


