[
https://issues.apache.org/jira/browse/SOLR-6526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137422#comment-14137422
]
Yonik Seeley commented on SOLR-6526:
------------------------------------
bq. The /export handler doesn't mesh with the full range of Solr features.
I know... and in hindsight I don't think it should have gone through for those
reasons. There's no reason it *couldn't* mesh with the full range of Solr
features.
Let's not compound that problem by adding more use-case-specific APIs when
they aren't needed.
This streaming feature should work with any document list, not just "export".
If you look at /export (i.e. the xsort response format), it's already very
close to the standard doc list format.
I think it should just be changed so that it matches.
It's currently this:
{code}
{"numFound":32, "docs":[{"id":"VA902B","popularity":6},
{code}
But should be this:
{code}
{
"response":{"numFound":32,"start":0,"docs":[
{
"id":"VA902B",
"popularity":6},
{code}
Then all of the existing Solr client code and libraries out there can already
read it.
And given that /export is so restrictive (only works with certain sorts, fields
with docvalues, etc) it should not be the default request handler target for
this streaming.
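To illustrate how small the gap between the two formats is, here is a rough sketch in plain Java. Maps stand in for parsed JSON, and the class and method names (XsortToDocList, toStandard) are hypothetical, not Solr or SolrJ APIs. Setting "start" to 0 is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Solr or SolrJ.
class XsortToDocList {

    // Wraps {"numFound":N,"docs":[...]} as
    // {"response":{"numFound":N,"start":0,"docs":[...]}}.
    static Map<String, Object> toStandard(Map<String, Object> xsort) {
        Map<String, Object> response = new LinkedHashMap<>();
        response.put("numFound", xsort.get("numFound"));
        response.put("start", 0); // assumption: an export starts at offset 0
        response.put("docs", xsort.get("docs"));

        Map<String, Object> top = new LinkedHashMap<>();
        top.put("response", response);
        return top;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", "VA902B");
        doc.put("popularity", 6);

        List<Map<String, Object>> docs = new ArrayList<>();
        docs.add(doc);

        Map<String, Object> xsort = new LinkedHashMap<>();
        xsort.put("numFound", 32);
        xsort.put("docs", docs);

        System.out.println(toStandard(xsort));
    }
}
```

The only differences are the "response" wrapper and the "start" field; the docs themselves pass through untouched.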
> Solr Streaming API
> ------------------
>
> Key: SOLR-6526
> URL: https://issues.apache.org/jira/browse/SOLR-6526
> Project: Solr
> Issue Type: New Feature
> Components: clients - java
> Reporter: Joel Bernstein
> Fix For: 5.0
>
> Attachments: SOLR-6526.patch
>
>
> It would be great if there was a SolrJ library that could connect to Solr's
> /export handler (SOLR-5244) and perform streaming operations on the sorted
> result sets.
> This ticket defines the base interfaces and implementations for the Streaming
> API. The base API contains three classes:
> *SolrStream*: This represents a stream from a single Solr instance. It speaks
> directly to the /export handler and provides methods to read() Tuples and
> close() the stream.
> *CloudSolrStream*: This represents a stream from a SolrCloud collection. It
> speaks with Zk to discover the Solr instances in the collection and then
> creates SolrStreams to make the requests. The results from the underlying
> streams are merged inline to produce a single sorted stream of tuples.
> *Tuple*: The data structure returned by the read() method of the SolrStream
> API. It is nested to support grouping and Cartesian product set operations.
> Once these base classes are implemented, they pave the way for building
> *Decorator* streams that perform operations on the sorted Tuple sets. For
> example, a CollapseStream could be created:
> {code}
> CollapseStream collapseStream = new CollapseStream(new CloudSolrStream(zkUrl,
> queryRequest));
> Tuple tuple = null;
> while((tuple = collapseStream.read()) != null) {
>   // process each collapsed tuple here
> }
> {code}
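The inline merge described for CloudSolrStream above can be sketched as a k-way merge over already sorted streams. This is only an illustration under stated assumptions: SketchStream, ListStream, MergeStream and MergeDemo are hypothetical stand-ins, not the classes in the attached patch; tuples are plain Maps; in-memory lists stand in for per-instance Solr streams; and read() is assumed to return null at end of stream, as in the loop above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical stream contract: read() returns null at end of stream.
interface SketchStream {
    Map<String, Object> read();
    void close();
}

// In-memory stream over an already sorted list; stands in for one Solr instance.
class ListStream implements SketchStream {
    private final Iterator<Map<String, Object>> it;
    ListStream(List<Map<String, Object>> sorted) { this.it = sorted.iterator(); }
    public Map<String, Object> read() { return it.hasNext() ? it.next() : null; }
    public void close() { }
}

// Merges several sorted streams into one sorted stream of tuples.
class MergeStream implements SketchStream {
    // The current head tuple of one underlying stream.
    private static final class Head {
        final Map<String, Object> tuple;
        final SketchStream source;
        Head(Map<String, Object> tuple, SketchStream source) {
            this.tuple = tuple;
            this.source = source;
        }
    }

    private final List<SketchStream> streams;
    private final PriorityQueue<Head> heads;

    MergeStream(List<SketchStream> streams, Comparator<Map<String, Object>> order) {
        this.streams = streams;
        Comparator<Head> byTuple = (a, b) -> order.compare(a.tuple, b.tuple);
        this.heads = new PriorityQueue<>(byTuple);
        // Prime the queue with the first tuple of every non-empty stream.
        for (SketchStream s : streams) {
            Map<String, Object> t = s.read();
            if (t != null) heads.add(new Head(t, s));
        }
    }

    public Map<String, Object> read() {
        Head h = heads.poll();
        if (h == null) return null; // every underlying stream is exhausted
        // Refill from the stream that supplied the tuple being returned.
        Map<String, Object> next = h.source.read();
        if (next != null) heads.add(new Head(next, h.source));
        return h.tuple;
    }

    public void close() {
        for (SketchStream s : streams) s.close();
    }
}

class MergeDemo {
    static Map<String, Object> tuple(String id, int popularity) {
        Map<String, Object> t = new HashMap<>();
        t.put("id", id);
        t.put("popularity", popularity);
        return t;
    }

    // Merges two streams sorted by ascending popularity and collects that field.
    static List<Integer> mergedPopularities() {
        SketchStream a = new ListStream(Arrays.asList(tuple("a1", 1), tuple("a2", 5)));
        SketchStream b = new ListStream(Arrays.asList(tuple("b1", 3), tuple("b2", 6)));
        Comparator<Map<String, Object>> byPopularity =
            Comparator.comparingInt(t -> (Integer) t.get("popularity"));
        SketchStream merged = new MergeStream(Arrays.asList(a, b), byPopularity);

        List<Integer> out = new ArrayList<>();
        Map<String, Object> t;
        while ((t = merged.read()) != null) {
            out.add((Integer) t.get("popularity"));
        }
        merged.close();
        return out;
    }

    public static void main(String[] args) {
        System.out.println(mergedPopularities());
    }
}
```

Only one tuple per underlying stream is held in memory at a time, so the merge stays streaming regardless of result set size. A decorator such as the CollapseStream mentioned above could wrap a merged stream in the same way.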
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)