[ 
https://issues.apache.org/jira/browse/DRILL-230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Phillips updated DRILL-230:
----------------------------------

    Attachment: DRILL-230.patch

> Build a sampling range partitioner
> ----------------------------------
>
>                 Key: DRILL-230
>                 URL: https://issues.apache.org/jira/browse/DRILL-230
>             Project: Apache Drill
>          Issue Type: New Feature
>            Reporter: Jacques Nadeau
>            Assignee: Steven Phillips
>         Attachments: DRILL-230.patch
>
>
> Create a new operator that caches a number of record batches and then 
> coordinates across the cluster on the distribution of partitioning keys to 
> try to determine a reasonable set of range partitions.  The outgoing stream 
> should include a partition key that is equal to the width of the receiving 
> fragment.
> - A histogram or similar summary should be held in the distributed cache.
> - We need to work out how long to wait before the partitioning estimate is
> good enough.
> - We need to update the partitioning sender so that it can drop the
> partitioning column rather than sending it onward.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
