amansinha100 opened a new pull request #1715: DRILL-7117: Support creation of equi-depth histogram for selected dat…
URL: https://github.com/apache/drill/pull/1715

…a types.

- This PR adds support for creating equi-depth histograms on the following data types: INT, BIGINT, FLOAT4, FLOAT8, DATE, TIME, TIMESTAMP and BOOLEAN. No selectivity calculations have been modified yet (that will be done in a later PR).
- The histogram is built using the t-digest approximation algorithm and its associated data structure. Please see details in [DRILL-7117](https://issues.apache.org/jira/browse/DRILL-7117) and the parent JIRA [DRILL-6992](https://issues.apache.org/jira/browse/DRILL-6992), which contains a link to the design document.
- The same ANALYZE command used for NDV etc. will also gather histograms; no new syntax has been added.

For testing, I ran manual tests using both skewed and uniform distributions and with different data types. Please see [DRILL-7117](https://issues.apache.org/jira/browse/DRILL-7117) for the results. No unit tests have been added yet, because the underlying t-digest makes the bucket boundaries vary slightly. Making this repeatable and unit-testable needs more thought, and I will do it in a follow-up PR.
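For readers unfamiliar with the term, the following is a minimal sketch of what "equi-depth" means: bucket boundaries are chosen so that each bucket holds roughly the same number of rows. It is not the code in this PR. It sorts the data exactly rather than using t-digest, and the function name `equi_depth_boundaries` and the sample data are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence


def equi_depth_boundaries(values: Sequence[float], num_buckets: int) -> List[float]:
    """Return num_buckets + 1 boundaries so that each bucket holds
    roughly the same number of values.

    Exact, sort-based illustration only; the PR instead estimates
    quantiles approximately with t-digest.
    """
    if num_buckets < 1:
        raise ValueError("num_buckets must be at least 1")
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    last = len(ordered) - 1
    # Boundary i is the element at rank floor(i * last / num_buckets),
    # i.e. approximately the i/num_buckets quantile of the data.
    return [ordered[(i * last) // num_buckets] for i in range(num_buckets + 1)]


# Uniform data: boundaries are evenly spaced.
uniform = equi_depth_boundaries(range(1, 101), 4)

# Skewed data (hypothetical): most values are 1, so several
# boundaries collapse onto the same value.
skewed = equi_depth_boundaries([1] * 6 + [2, 3, 50, 100], 5)
```

With uniform input the boundaries come out evenly spaced (`[1, 25, 50, 75, 100]` here), whereas with skewed input the buckets become narrow where the data is dense and wide where it is sparse. An approximate quantile structure such as t-digest may return slightly different boundaries than this exact version, which is the repeatability concern mentioned above.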
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services
