[
https://issues.apache.org/jira/browse/SOLR-6526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137481#comment-14137481
]
Joel Bernstein commented on SOLR-6526:
--------------------------------------
Also, it's important to keep in mind that existing Solr clients will simply
run out of memory if they pull millions of records. They were built for a
specific use case that involves bringing back pages of results.
The export handler was built to export and sort millions of results. So there
is a basic mismatch between how the existing clients operate and how the
/export handler operates. The use cases are fundamentally different. If you
want to return pages of results, you just use Solr's regular search.
The Streaming API, though, could apply to both normal Solr searches and
/export'ed result sets, so it makes sense to bring them in line.
> Solr Streaming API
> ------------------
>
> Key: SOLR-6526
> URL: https://issues.apache.org/jira/browse/SOLR-6526
> Project: Solr
> Issue Type: New Feature
> Components: clients - java
> Reporter: Joel Bernstein
> Fix For: 5.0
>
> Attachments: SOLR-6526.patch
>
>
> It would be great if there was a SolrJ library that could connect to Solr's
> /export handler (SOLR-5244) and perform streaming operations on the sorted
> result sets.
> This ticket defines the base interfaces and implementations for the Streaming
> API. The base API contains three classes:
> *SolrStream*: This represents a stream from a single Solr instance. It speaks
> directly to the /export handler and provides methods to read() Tuples and
> close() the stream.
> *CloudSolrStream*: This represents a stream from a SolrCloud collection. It
> speaks with ZooKeeper to discover the Solr instances in the collection and
> then creates SolrStreams to make the requests. The results from the
> underlying streams are merged inline to produce a single sorted stream of
> tuples.
> *Tuple*: The data structure returned by the read() method of the SolrStream
> API. It is nested to support grouping and Cartesian product set operations.
> Once these base classes are implemented, they pave the way for building
> *Decorator* streams that perform operations on the sorted Tuple sets. For
> example, a CollapseStream could be created:
> {code}
> CollapseStream collapseStream =
>     new CollapseStream(new CloudSolrStream(zkUrl, queryRequest));
> Tuple tuple = null;
> while ((tuple = collapseStream.read()) != null) {
>     // process each collapsed tuple here
> }
> {code}
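To make the merge-then-decorate idea in the description concrete, here is a
minimal, self-contained sketch. It does not use the classes from the attached
patch: TupleStream, ListStream, MergeStream, CollapseSketch and the
single-field Tuple are all hypothetical stand-ins, and the in-memory lists
take the place of sorted results from individual Solr instances. It only
illustrates how already-sorted streams might be merged while holding one
pending tuple per stream, and how a decorator can collapse adjacent duplicates
on top of that.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

final class StreamingSketch {

    /** Hypothetical tuple carrying only the sort key. */
    static final class Tuple {
        final String key;
        Tuple(String key) { this.key = key; }
    }

    /** Hypothetical stand-in for the read()/close() contract described above. */
    interface TupleStream {
        Tuple read();   // returns null once the stream is exhausted
        void close();
    }

    /** In-memory stream standing in for one instance's sorted results. */
    static final class ListStream implements TupleStream {
        private final Iterator<Tuple> it;
        ListStream(String... sortedKeys) {
            List<Tuple> tuples = new ArrayList<Tuple>();
            for (String k : sortedKeys) tuples.add(new Tuple(k));
            it = tuples.iterator();
        }
        public Tuple read() { return it.hasNext() ? it.next() : null; }
        public void close() { }
    }

    /** Merges already-sorted streams, buffering one pending tuple per stream. */
    static final class MergeStream implements TupleStream {
        private static final class Head {
            final Tuple tuple;
            final TupleStream source;
            Head(Tuple tuple, TupleStream source) { this.tuple = tuple; this.source = source; }
        }
        private final List<TupleStream> sources;
        private final PriorityQueue<Head> heads = new PriorityQueue<Head>(
            (a, b) -> a.tuple.key.compareTo(b.tuple.key));

        MergeStream(List<TupleStream> sources) {
            this.sources = sources;
            for (TupleStream s : sources) refill(s);
        }
        private void refill(TupleStream s) {
            Tuple t = s.read();
            if (t != null) heads.add(new Head(t, s));
        }
        public Tuple read() {
            Head h = heads.poll();
            if (h == null) return null;
            refill(h.source);   // pull the next tuple from the stream just consumed
            return h.tuple;
        }
        public void close() { for (TupleStream s : sources) s.close(); }
    }

    /** Decorator: emits only the first tuple of each run of equal keys. */
    static final class CollapseSketch implements TupleStream {
        private final TupleStream inner;
        private String lastKey;
        CollapseSketch(TupleStream inner) { this.inner = inner; }
        public Tuple read() {
            Tuple t;
            while ((t = inner.read()) != null) {
                if (!t.key.equals(lastKey)) { lastKey = t.key; return t; }
            }
            return null;
        }
        public void close() { inner.close(); }
    }

    /** Collects keys into a list; acceptable only because the demo is tiny. */
    static List<String> drain(TupleStream stream) {
        List<String> keys = new ArrayList<String>();
        try {
            Tuple t;
            while ((t = stream.read()) != null) keys.add(t.key);
        } finally {
            stream.close();
        }
        return keys;
    }

    static TupleStream merged() {
        return new MergeStream(Arrays.<TupleStream>asList(
            new ListStream("a", "c", "d"), new ListStream("b", "c", "e")));
    }

    public static void main(String[] args) {
        System.out.println(drain(new CollapseSketch(merged())));
    }
}
```

The merge never holds more than one tuple per underlying stream, which is the
property that lets a client walk millions of sorted records without buffering
them. The collapse step relies on the input being sorted, since it only
compares each key with the previous one.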