[
https://issues.apache.org/jira/browse/IMPALA-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17104254#comment-17104254
]
Zoltán Borók-Nagy commented on IMPALA-8755:
-------------------------------------------
Hi [~jbapple], I think the feature is complete. It just needs one more commit
to enable it by default and to move some tests from CustomClusterTestSuite
to ImpalaTestSuite. Do you have an ETA for that, [~norbertluksa]?
> Implement Z-ordering for Impala
> -------------------------------
>
> Key: IMPALA-8755
> URL: https://issues.apache.org/jira/browse/IMPALA-8755
> Project: IMPALA
> Issue Type: New Feature
> Reporter: Zoltán Borók-Nagy
> Assignee: Norbert Luksa
> Priority: Major
>
> Implement Z-ordering for Impala: [https://en.wikipedia.org/wiki/Z-order_curve]
> A Z-order curve defines an ordering on multi-dimensional data. Data sorted
> that way can be filtered efficiently using min/max statistics on any of the
> columns participating in the ordering.
> Impala currently supports only lexicographic ordering via the SORT BY clause.
> This strongly favors the first column: given the clause "SORT BY A, B, C",
> A will be totally ordered (hence filtering on A will be very efficient), but
> the values of B and C will be scattered throughout the data set (hence
> filtering on B or C will barely do any good).
> We could add a new clause to Impala, e.g. "ZSORT BY", that writes the data
> in Z-order.
> "ZSORT BY A, B, C" would cluster the rows in a way that filtering on A, B, or
> C would be equally efficient.
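
To illustrate the difference described above, here is a minimal Python sketch.
It is not Impala's implementation: the function names (interleave_bits,
chunks_to_scan), the 16x16 grid of (A, B) values, and the 16-row chunks standing
in for row groups with min/max statistics are all assumptions made for the
example. It computes a Z-order (Morton) key by interleaving the bits of the
column values, then counts how many chunks a min/max filter would still have to
read under each ordering.

```python
from itertools import product


def interleave_bits(coords, bits=4):
    """Return a Z-order (Morton) key by interleaving the low `bits` bits
    of each non-negative integer in `coords`. Within each group of
    interleaved bits, the first coordinate takes the most significant one."""
    z = 0
    n = len(coords)
    for bit in range(bits):
        for dim, value in enumerate(coords):
            z |= ((value >> bit) & 1) << (bit * n + (n - 1 - dim))
    return z


def chunks_to_scan(rows, column, value, chunk_size=16):
    """Count the fixed-size chunks whose [min, max] range on `column`
    contains `value`, i.e. chunks that min/max statistics cannot skip."""
    count = 0
    for start in range(0, len(rows), chunk_size):
        vals = [row[column] for row in rows[start:start + chunk_size]]
        if min(vals) <= value <= max(vals):
            count += 1
    return count


# Hypothetical data set: every (A, B) pair with A and B in 0..15.
rows = list(product(range(16), repeat=2))

lexicographic = sorted(rows)                  # like "SORT BY A, B"
z_ordered = sorted(rows, key=interleave_bits)  # like "ZSORT BY A, B"

for name, ordering in (("lexicographic", lexicographic), ("z-order", z_ordered)):
    on_a = chunks_to_scan(ordering, 0, 5)
    on_b = chunks_to_scan(ordering, 1, 5)
    print(f"{name}: A = 5 reads {on_a} of 16 chunks, B = 5 reads {on_b} of 16 chunks")
```

On this toy data the lexicographic ordering reads 1 chunk for a filter on A but
all 16 for a filter on B, while the Z-ordering reads 4 chunks for either column.
Real columns are not small dense integers, so an actual implementation also
needs a way to map arbitrary column values to comparable bit strings; that part
is out of scope for this sketch.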