[jira] [Commented] (SOLR-6526) Solr Streaming API

Yonik Seeley (JIRA) Wed, 17 Sep 2014 07:04:53 -0700

    [ https://issues.apache.org/jira/browse/SOLR-6526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137281#comment-14137281 ]


Yonik Seeley commented on SOLR-6526:
------------------------------------

Related to "export", I don't think we should be adding more interfaces where we 
have existing ones that serve just fine.

If we can't already, we should be able to stream large sets of documents on 
both the server side (solr) and on the client side (solrj).  This ability 
shouldn't be tied to a different export handler.
It really feels like /export should just be an optimization, not a different 
interface (perhaps with an execution hint if one wanted to force it to go one 
way or the other).

Then this issue could add any additional streaming functionality needed over a 
normal document list response.  IIRC, there is already some streaming 
functionality in SolrJ, but I'm not sure what else may be needed.

> Solr Streaming API
> ------------------
>
>                 Key: SOLR-6526
>                 URL: https://issues.apache.org/jira/browse/SOLR-6526
>             Project: Solr
>          Issue Type: New Feature
>          Components: clients - java
>            Reporter: Joel Bernstein
>             Fix For: 5.0
>
>         Attachments: SOLR-6526.patch
>
>
> It would be great if there was a SolrJ library that could connect to Solr's 
> /export handler (SOLR-5244) and perform streaming operations on the sorted 
> result sets.
> This ticket defines the base interfaces and implementations for the Streaming 
> API. The base API contains three classes:
> *SolrStream*: This represents a stream from a single Solr instance. It speaks 
> directly to the /export handler and provides methods to read() Tuples and 
> close() the stream.
> *CloudSolrStream*: This represents a stream from a SolrCloud collection. It 
> speaks with Zk to discover the Solr instances in the collection and then 
> creates SolrStreams to make the requests. The results from the underlying 
> streams are merged inline to produce a single sorted stream of tuples.
> *Tuple*: The data structure returned by the read() method of the SolrStream 
> API. It is nested to support grouping and Cartesian product set operations.
> Once these base classes are implemented, they pave the way for building 
> *Decorator* streams that perform operations on the sorted Tuple sets. For 
> example, a CollapseStream could be created:
> {code}
> CollapseStream collapseStream = new CollapseStream(new CloudSolrStream(zkUrl, queryRequest));
> Tuple tuple = null;
> while((tuple = collapseStream.read()) != null) {
>     // process each collapsed tuple here
> }
> {code}
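
The merge-then-decorate idea in the description above can be illustrated without Solr at all. The following is only a rough sketch under stated assumptions, not the API from the attached patch: SketchStream, ListStream, MergeSketch and CollapseSketch are hypothetical names, plain Integers stand in for Tuples, and "collapse" is assumed here to mean dropping consecutive items that compare equal. MergeSketch shows how several already-sorted streams might be merged inline (as CloudSolrStream is described as doing), and CollapseSketch shows a decorator wrapping another stream.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical stand-in for the proposed contract: read() returns null at end of stream.
interface SketchStream<T> {
    T read();
    void close();
}

// A stream over an in-memory, already-sorted list (stands in for one Solr instance).
final class ListStream<T> implements SketchStream<T> {
    private final Iterator<T> it;
    ListStream(List<T> items) { this.it = items.iterator(); }
    public T read() { return it.hasNext() ? it.next() : null; }
    public void close() { }
}

// Merges several sorted streams into one sorted stream using a min-heap of stream heads.
final class MergeSketch<T> implements SketchStream<T> {
    private static final class Head<E> {
        final E item;
        final SketchStream<E> source;
        Head(E item, SketchStream<E> source) { this.item = item; this.source = source; }
    }

    private final List<SketchStream<T>> sources;
    private final PriorityQueue<Head<T>> heap;

    MergeSketch(List<SketchStream<T>> sources, Comparator<T> order) {
        this.sources = sources;
        this.heap = new PriorityQueue<>((a, b) -> order.compare(a.item, b.item));
        for (SketchStream<T> s : sources) {
            refill(s);
        }
    }

    // Pull the next item from a source and put it on the heap, if the source has one.
    private void refill(SketchStream<T> source) {
        T next = source.read();
        if (next != null) {
            heap.add(new Head<>(next, source));
        }
    }

    public T read() {
        Head<T> head = heap.poll();
        if (head == null) {
            return null; // all sources exhausted
        }
        refill(head.source);
        return head.item;
    }

    public void close() {
        for (SketchStream<T> s : sources) {
            s.close();
        }
    }
}

// Decorator: passes through the first item of each run of equal items (assumed semantics).
final class CollapseSketch<T> implements SketchStream<T> {
    private final SketchStream<T> inner;
    private final Comparator<T> order;
    private T last;

    CollapseSketch(SketchStream<T> inner, Comparator<T> order) {
        this.inner = inner;
        this.order = order;
    }

    public T read() {
        T next;
        while ((next = inner.read()) != null) {
            if (last == null || order.compare(last, next) != 0) {
                last = next;
                return next;
            }
        }
        return null;
    }

    public void close() { inner.close(); }
}

final class StreamSketchDemo {
    public static void main(String[] args) {
        List<SketchStream<Integer>> shards = new ArrayList<>();
        shards.add(new ListStream<>(Arrays.asList(1, 4, 7)));
        shards.add(new ListStream<>(Arrays.asList(2, 4, 9)));
        Comparator<Integer> order = Comparator.naturalOrder();
        SketchStream<Integer> stream = new CollapseSketch<>(new MergeSketch<>(shards, order), order);
        Integer item;
        while ((item = stream.read()) != null) {
            System.out.println(item); // prints 1, 2, 4, 7, 9 on separate lines
        }
        stream.close();
    }
}
```

Because the decorator only needs read() and close() from whatever it wraps, the same wrapper would work over a single-instance stream or a merged one, which is the property the description relies on.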



