[jira] [Commented] (SOLR-6526) Solr Streaming API

Yonik Seeley (JIRA) Wed, 17 Sep 2014 08:47:51 -0700

    [ https://issues.apache.org/jira/browse/SOLR-6526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137422#comment-14137422 ]


Yonik Seeley commented on SOLR-6526:
------------------------------------

bq. The /export handler doesn't mesh with the full range of Solr features.

I know... and in hindsight I don't think it should have gone through for those 
reasons.  There's no reason it *couldn't* mesh with the full range of Solr 
features.

Let's not compound that problem by adding more use-case-specific APIs when 
they aren't needed.
This streaming feature should work with any document list, not just "export".

If you look at /export (i.e. the xsort response format), its output is very 
close to the standard doc list format.
I think it should simply be changed to match.

It's currently this:
{code}
{"numFound":32, "docs":[{"id":"VA902B","popularity":6},
{code}
But should be this:
{code}
{
  "response":{"numFound":32,"start":0,"docs":[
      {
        "id":"VA902B",
        "popularity":6},
{code}

Then all of the existing Solr client code and libraries out there can already 
read it.

And given that /export is so restrictive (it only works with certain sorts, 
fields with docValues, etc.), it should not be the default request handler 
target for this streaming.

> Solr Streaming API
> ------------------
>
>                 Key: SOLR-6526
>                 URL: https://issues.apache.org/jira/browse/SOLR-6526
>             Project: Solr
>          Issue Type: New Feature
>          Components: clients - java
>            Reporter: Joel Bernstein
>             Fix For: 5.0
>
>         Attachments: SOLR-6526.patch
>
>
> It would be great if there was a SolrJ library that could connect to Solr's 
> /export handler (SOLR-5244) and perform streaming operations on the sorted 
> result sets.
> This ticket defines the base interfaces and implementations for the Streaming 
> API. The base API contains three classes:
> *SolrStream*: This represents a stream from a single Solr instance. It speaks 
> directly to the /export handler and provides methods to read() Tuples and 
> close() the stream
> *CloudSolrStream*: This represents a stream from a SolrCloud collection. It 
> speaks with Zk to discover the Solr instances in the collection and then 
> creates SolrStreams to make the requests. The results from the underlying 
> streams are merged inline to produce a single sorted stream of tuples.
> *Tuple*: The data structure returned by the read() method of the SolrStream 
> API. It is nested to support grouping and Cartesian product set operations.
> Once these base classes are implemented it paves the way for building 
> *Decorator* streams that perform operations on the sorted Tuple sets. For 
> example a CollapseStream could be created:
> {code}
> CollapseStream collapseStream =
>     new CollapseStream(new CloudSolrStream(zkUrl, queryRequest));
> Tuple tuple = null;
> while((tuple = collapseStream.read()) != null) {
>     // consume each collapsed tuple here
> }
> {code}
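
To make the decorator idea in the description above concrete, here is a rough,
self-contained sketch. It is not the attached patch and does not use the real
SolrStream, CloudSolrStream or Tuple classes: TupleStream, ListStream,
MergeSketch and CollapseSketch are hypothetical stand-ins, tuples are plain
maps, and apart from VA902B the sample documents are made up. It assumes each
source is already sorted on the collapse field, merges two such sources
inline, and then emits the first tuple of each run of equal values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Iterator;
import java.util.Map;
import java.util.Objects;

class StreamSketch {
    // Hypothetical stand-in for the proposed stream contract:
    // read() returns the next tuple, or null once the stream is exhausted.
    interface TupleStream {
        Map<String, Object> read();
        void close();
    }

    // In-memory source standing in for a per-instance stream.
    static class ListStream implements TupleStream {
        private final Iterator<Map<String, Object>> it;
        ListStream(List<Map<String, Object>> tuples) { this.it = tuples.iterator(); }
        public Map<String, Object> read() { return it.hasNext() ? it.next() : null; }
        public void close() { }
    }

    // Merges two already-sorted streams into one sorted stream (ties favor left).
    static class MergeSketch implements TupleStream {
        private final TupleStream left, right;
        private final Comparator<Map<String, Object>> order;
        private Map<String, Object> nextLeft, nextRight;

        MergeSketch(TupleStream left, TupleStream right,
                    Comparator<Map<String, Object>> order) {
            this.left = left; this.right = right; this.order = order;
            this.nextLeft = left.read();   // buffer one tuple per side
            this.nextRight = right.read();
        }
        public Map<String, Object> read() {
            if (nextLeft == null && nextRight == null) return null;
            boolean takeLeft = nextRight == null
                || (nextLeft != null && order.compare(nextLeft, nextRight) <= 0);
            Map<String, Object> out;
            if (takeLeft) { out = nextLeft; nextLeft = left.read(); }
            else { out = nextRight; nextRight = right.read(); }
            return out;
        }
        public void close() { left.close(); right.close(); }
    }

    // Decorator: keeps only the first tuple of each run of equal `field` values.
    // Only correct when the wrapped stream is sorted (or grouped) on `field`.
    static class CollapseSketch implements TupleStream {
        private final TupleStream inner;
        private final String field;
        private Object lastKey;
        private boolean started = false;

        CollapseSketch(TupleStream inner, String field) {
            this.inner = inner; this.field = field;
        }
        public Map<String, Object> read() {
            Map<String, Object> t;
            while ((t = inner.read()) != null) {
                Object key = t.get(field);
                if (!started || !Objects.equals(key, lastKey)) {
                    started = true;
                    lastKey = key;
                    return t;
                }
            }
            return null;
        }
        public void close() { inner.close(); }
    }

    static Map<String, Object> tuple(String id, int popularity) {
        Map<String, Object> t = new LinkedHashMap<>();
        t.put("id", id);
        t.put("popularity", popularity);
        return t;
    }

    // Merge two "shards" sorted by popularity, collapse on it, collect the ids.
    static List<String> collapsedIds() {
        TupleStream shard1 = new ListStream(Arrays.asList(tuple("VA902B", 6), tuple("B2", 7)));
        TupleStream shard2 = new ListStream(Arrays.asList(tuple("A1", 6), tuple("D4", 9)));
        Comparator<Map<String, Object>> byPopularity =
            Comparator.comparing(t -> (Integer) t.get("popularity"));
        TupleStream stream =
            new CollapseSketch(new MergeSketch(shard1, shard2, byPopularity), "popularity");
        List<String> ids = new ArrayList<>();
        Map<String, Object> t;
        while ((t = stream.read()) != null) ids.add((String) t.get("id"));
        stream.close();
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(collapsedIds()); // prints [VA902B, B2, D4]
    }
}
```

Because each decorator only depends on read() and close(), the same wrappers
would work over any sorted document list, not just one produced by /export.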



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

