[ 
https://issues.apache.org/jira/browse/SOLR-16235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-16235:
----------------------------------
    Description: 
The latest versions of Streaming Expressions support the drill function, which 
is designed for high cardinality aggregations 
(https://solr.apache.org/guide/8_6/stream-source-reference.html#drill). Drill 
allows users to push down a Streaming Expression into the export handler itself 
and emit aggregated tuples over the wire. Because drill still takes advantage 
of the sort order of the export handler, it supports unlimited cardinality. Due 
to the massive performance improvements in the export handler in Solr 9, drill 
is almost as fast as facets. Drill is not a replacement for facet mode, though, 
which is still faster in low to medium cardinality situations.
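
As a rough illustration of the push-down described above, the following Python 
sketch assembles a drill expression as a string. The expression shape follows 
the drill example in the linked reference guide. The helper name 
build_drill_expr and the collection, query and field values are hypothetical, 
and the sketch only builds the text without contacting Solr.

```python
def build_drill_expr(collection: str, query: str, field: str) -> str:
    """Build a drill expression that counts documents per value of `field`.

    Hypothetical helper: the names are illustrative, and the layout follows
    the drill example in the Solr reference guide.
    """
    # The inner rollup() is what gets pushed down to the export handler.
    # input() stands for the sorted export stream, so the sort field has to
    # match the rollup's "over" field.
    inner = f'rollup(input(), over="{field}", count(*))'
    # Each shard emits its own aggregated tuples; merging them across shards
    # would be done by an enclosing expression, which is not shown here.
    return (
        f'drill({collection}, q="{query}", fl="{field}", '
        f'sort="{field} asc", {inner})'
    )


expr = build_drill_expr("articles", "abstract:water", "author")
```

Because the export handler already returns documents sorted by the grouping 
field, the pushed-down rollup only ever holds one group at a time, which is why 
cardinality is not a limiting factor.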

This improvement should be rather easy to implement but there are some 
questions about the design. One thought I had was to add a new mode called 
*drill* to the existing *map_reduce* and *facet* modes.  This would preserve 
the existing map_reduce execution plan. The other approach is simply to always 
use drill in *map_reduce* mode aggregations.
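
To make the first option concrete, here is a hedged Python sketch of how a 
client might select an aggregation mode for a SQL request. aggregationMode with 
the values map_reduce and facet is the existing SQL request parameter. The 
drill value is only the proposal in this issue and does not exist yet. The 
helper name, the collection name and the sample statement are hypothetical.

```python
from urllib.parse import urlencode

# map_reduce and facet are the existing modes. "drill" is only the value
# proposed in this issue and is an assumption here.
EXISTING_MODES = {"map_reduce", "facet"}
PROPOSED_MODES = {"drill"}


def sql_request_params(stmt: str, mode: str = "map_reduce") -> dict:
    """Return request parameters for a SQL statement and aggregation mode."""
    if mode not in EXISTING_MODES | PROPOSED_MODES:
        raise ValueError(f"unknown aggregation mode: {mode}")
    return {"stmt": stmt, "aggregationMode": mode}


# Hypothetical high cardinality GROUP BY over a collection named "logs".
params = sql_request_params(
    "SELECT user_id, count(*) FROM logs GROUP BY user_id", mode="drill"
)
query_string = urlencode(params)
```

Under the second option no new value would be needed: requests in map_reduce 
mode would simply be planned with drill.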

  was:
The latest versions of Streaming Expressions support the drill function which 
is designed for high cardinality aggregations 
(https://solr.apache.org/guide/8_6/stream-source-reference.html#drill). Drill 
allows users to push down a Streaming Expression into the export handler itself 
and emit aggregated tuples over the wire. Because drill still takes advantage 
of the sort order of the export handler it supports unlimited cardinality.

This improvement should be rather easy to implement but there are some 
questions about the design. One thought I had was to add a new mode called 
*drill* to the existing *map_reduce* and *facet* modes.  This would preserve 
the existing map_reduce execution plan. The other approach is simply to always 
use drill in *map_reduce* mode aggregations.


> Allow Solr SQL to use the drill Streaming Expression for high cardinality 
> aggregations
> --------------------------------------------------------------------------------------
>
>                 Key: SOLR-16235
>                 URL: https://issues.apache.org/jira/browse/SOLR-16235
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: Parallel SQL
>            Reporter: Joel Bernstein
>            Assignee: Joel Bernstein
>            Priority: Major
>              Labels: RobustSQL
>
> The latest versions of Streaming Expressions support the drill function, which 
> is designed for high cardinality aggregations 
> (https://solr.apache.org/guide/8_6/stream-source-reference.html#drill). Drill 
> allows users to push down a Streaming Expression into the export handler 
> itself and emit aggregated tuples over the wire. Because drill still takes 
> advantage of the sort order of the export handler, it supports unlimited 
> cardinality. Due to the massive performance improvements in the export handler 
> in Solr 9, drill is almost as fast as facets. Drill is not a replacement for 
> facet mode, though, which is still faster in low to medium cardinality 
> situations.
> This improvement should be rather easy to implement but there are some 
> questions about the design. One thought I had was to add a new mode called 
> *drill* to the existing *map_reduce* and *facet* modes.  This would preserve 
> the existing map_reduce execution plan. The other approach is simply to 
> always use drill in *map_reduce* mode aggregations.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

