[
https://issues.apache.org/jira/browse/SOLR-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14096078#comment-14096078
]
Joel Bernstein commented on SOLR-5244:
--------------------------------------
Yes I think we should do what you suggest. I may not have time to implement
this before Solr 4.10 though. I don't think we need to hold this up because
the client interface will remain stable and we can simply slide in the new
SearchHandler in a later release.
Also, moving forward there'll be different types of export functionality, and
a specialized SearchHandler will be needed to sort out the different options.
> Exporting Full Sorted Result Sets
> ---------------------------------
>
> Key: SOLR-5244
> URL: https://issues.apache.org/jira/browse/SOLR-5244
> Project: Solr
> Issue Type: New Feature
> Components: search
> Affects Versions: 5.0
> Reporter: Joel Bernstein
> Assignee: Joel Bernstein
> Priority: Minor
> Fix For: 5.0, 4.10
>
> Attachments: 0001-SOLR_5244.patch, SOLR-5244.patch, SOLR-5244.patch,
> SOLR-5244.patch, SOLR-5244.patch, SOLR-5244.patch, SOLR-5244.patch
>
>
> This ticket allows Solr to export full sorted result sets. A new export
> request handler has been created that sets up the default writer type
> (SortingResponseWriter) and the required rank query (ExportQParserPlugin).
> The syntax is:
> {code}
> /solr/collection1/export?q=*:*&fl=a,b,c&sort=a desc,b desc
> {code}
> This capability will open up Solr for a whole range of uses that were
> typically done using aggregation engines like Hadoop. For example:
> *Large Distributed Joins*
> A client outside of Solr calls two different Solr collections and returns the
> results sorted by a join key. The client iterates through both streams and
> performs a merge join.
> *Fully Distributed Field Collapsing/Grouping*
> A client outside of Solr makes individual calls to all the servers in a
> single collection and returns results sorted by the collapse key. The client
> merge joins the sorted lists on the collapse key to perform the field
> collapse.
> *High Cardinality Distributed Aggregation*
> A client outside of Solr makes individual calls to all the servers in a
> single collection and sorts on a high cardinality field. The client then
> merge joins the sorted lists to perform the high cardinality aggregation.
> *Large Scale Time Series Rollups*
> A client outside Solr makes individual calls to all servers in a collection
> and sorts on time dimensions. The client merge joins the sorted result sets
> and rolls up the time dimensions as it iterates through the data.
> In these scenarios Solr is being used as a distributed sorting engine.
> Developers can write clients that take advantage of this sorting capability
> in any way they wish.
> *Session Analysis and Aggregation*
> A client outside Solr makes individual calls to all servers in a collection
> and sorts on the sessionID. The client merge joins the sorted results and
> aggregates sessions as it iterates through the results.
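
The client-side merge join that the use cases above rely on could look roughly like the following. This is only a sketch under stated assumptions: the two inputs are already sorted ascending on the join key (as they would be if both exports used the same sort), documents are plain dicts, and the field names and sample data are made up for illustration.

```python
from itertools import groupby
from operator import itemgetter


def merge_join(left, right, key):
    """Inner merge join of two iterables of dicts sorted ascending by `key`.

    Yields (left_doc, right_doc) for every pair with equal keys. Only the
    current group of equal-key documents from the right side is held in memory.
    """
    get_key = itemgetter(key)
    left_groups = groupby(left, key=get_key)
    right_groups = groupby(right, key=get_key)
    left_item = next(left_groups, None)
    right_item = next(right_groups, None)
    while left_item is not None and right_item is not None:
        left_key, left_docs = left_item
        right_key, right_docs = right_item
        if left_key < right_key:
            left_item = next(left_groups, None)    # left side is behind
        elif left_key > right_key:
            right_item = next(right_groups, None)  # right side is behind
        else:
            # Keys match: pair every left doc with every right doc in the group.
            right_docs = list(right_docs)
            for left_doc in left_docs:
                for right_doc in right_docs:
                    yield left_doc, right_doc
            left_item = next(left_groups, None)
            right_item = next(right_groups, None)


# Hypothetical sample data standing in for two exported, sorted result sets.
orders = [
    {"join_key": 1, "item": "a"},
    {"join_key": 2, "item": "b"},
    {"join_key": 2, "item": "c"},
    {"join_key": 4, "item": "d"},
]
users = [
    {"join_key": 2, "name": "x"},
    {"join_key": 3, "name": "y"},
    {"join_key": 4, "name": "z"},
]

pairs = [(o["item"], u["name"]) for o, u in merge_join(orders, users, "join_key")]
print(pairs)  # [('b', 'x'), ('c', 'x'), ('d', 'z')]
```

Because both inputs are consumed as iterators, the same function should work on streamed responses as long as each stream arrives in sorted order. For descending sorts the two comparisons would need to be flipped.
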