[
https://issues.apache.org/jira/browse/IMPALA-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Norbert Luksa resolved IMPALA-8755.
-----------------------------------
Target Version: Impala 4.0
Resolution: Implemented
> Implement Z-ordering for Impala
> -------------------------------
>
> Key: IMPALA-8755
> URL: https://issues.apache.org/jira/browse/IMPALA-8755
> Project: IMPALA
> Issue Type: New Feature
> Reporter: Zoltán Borók-Nagy
> Assignee: Norbert Luksa
> Priority: Major
>
> Implement Z-ordering for Impala: [https://en.wikipedia.org/wiki/Z-order_curve]
> A Z-order curve defines an ordering on multi-dimensional data. Data sorted
> that way can be filtered efficiently using min/max statistics on any of the
> columns participating in the ordering.
> Impala currently only supports lexicographic ordering via the SORT BY clause.
> This strongly favors the first column: given the "SORT BY A, B, C" clause,
> A will be totally ordered (hence filtering on A will be very efficient), but
> the values of B and C will be scattered throughout the data set (hence
> filtering on B or C will barely help).
> We could add a new clause to Impala, e.g. a "ZSORT BY" clause, that writes
> the data in Z-order.
> "ZSORT BY A, B, C" would cluster the rows in a way that filtering on A, B, or
> C would be equally efficient.
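
To illustrate the description above, here is a minimal Python sketch, not Impala's
actual implementation. It builds a Z-order (Morton) key by interleaving the bits of
two small non-negative integer columns, then counts how many fixed-size row groups
a min/max filter would still have to read under each ordering. The names z_value
and groups_to_scan, the 16x16 grid of (A, B) values, and the group size of 16 rows
are all assumptions made for the example.

```python
from itertools import product


def z_value(coords, bits=4):
    """Interleave the low `bits` bits of each coordinate into one Morton key."""
    key = 0
    n = len(coords)
    for bit in range(bits):
        for dim, value in enumerate(coords):
            # Bit `bit` of dimension `dim` lands at position bit * n + dim.
            key |= ((value >> bit) & 1) << (bit * n + dim)
    return key


def groups_to_scan(rows, column, wanted, group_size=16):
    """Count row groups whose [min, max] range on `column` contains `wanted`."""
    count = 0
    for start in range(0, len(rows), group_size):
        values = [row[column] for row in rows[start:start + group_size]]
        if min(values) <= wanted <= max(values):
            count += 1
    return count


# Hypothetical data set: every (A, B) pair with A and B in 0..15.
rows = list(product(range(16), repeat=2))
lexicographic = sorted(rows)            # like SORT BY A, B
z_ordered = sorted(rows, key=z_value)   # like a Z-order sort on A, B

for name, ordering in (("SORT BY A, B", lexicographic),
                       ("Z-order on A, B", z_ordered)):
    # Row groups that cannot be skipped for A = 5 and for B = 5.
    print(name, [groups_to_scan(ordering, col, 5) for col in (0, 1)])
```

On this toy data the lexicographic order reads 1 of 16 row groups for a filter on A
but all 16 for a filter on B, while the Z-order reads 4 of 16 for either column.
A real implementation would also have to handle signed, floating-point, and string
columns, which this sketch does not attempt.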