[
https://issues.apache.org/jira/browse/IMPALA-5254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Thomas Tauber-Marshall reassigned IMPALA-5254:
----------------------------------------------
Assignee: (was: Thomas Tauber-Marshall)
> Take Kudu partitioning into account when deciding to repartition
> ----------------------------------------------------------------
>
> Key: IMPALA-5254
> URL: https://issues.apache.org/jira/browse/IMPALA-5254
> Project: IMPALA
> Issue Type: Improvement
> Components: Frontend
> Affects Versions: Impala 2.9.0
> Reporter: Thomas Tauber-Marshall
> Priority: Major
> Labels: kudu, performance
>
> A change that's about to go in (IMPALA-3742) adds partitioning of inserts
> into Kudu tables that matches the partitioning scheme of the table.
> As is, the patch will always repartition inserts that are sufficiently large,
> but we should improve this to not repartition if the input is already
> partitioned correctly.
> This is somewhat complicated because Kudu allows tables to have multi-level
> partitioning schemes that combine hash and range partitioning. We currently
> treat Kudu's partitioning decisions as a black box, both because our
> DataPartition functionality has no good way of representing them and because
> we don't want to have to guarantee that our hash function matches Kudu's.
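
The decision described above can be sketched roughly as follows. This is only an illustration in Python, not Impala's planner code: the names `InputPartition`, `should_repartition`, `est_rows`, and `min_rows` are hypothetical, and the row-count threshold stands in for whatever "sufficiently large" means in the actual patch. Because Kudu's partitioning is treated as a black box, the sketch only skips the repartition when the input is tagged as already partitioned by the same table's Kudu scheme; it never tries to compare hash functions or range bounds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InputPartition:
    """Hypothetical description of how the insert's input is partitioned."""
    kind: str                         # e.g. "KUDU", "HASH", "RANDOM"
    kudu_table: Optional[str] = None  # set only when kind == "KUDU"


def should_repartition(input_part: InputPartition,
                       target_table: str,
                       est_rows: int,
                       min_rows: int) -> bool:
    """Return True if the insert should add a repartitioning step.

    Assumption: inserts below min_rows are never repartitioned, mirroring
    the "sufficiently large" condition in the issue description.
    """
    if est_rows < min_rows:
        return False
    # Kudu's scheme is opaque, so the only safe "already partitioned
    # correctly" case is an exact match on the target table's own scheme.
    already_matches = (input_part.kind == "KUDU"
                       and input_part.kudu_table == target_table)
    return not already_matches
```

Under these assumptions, a large insert whose input is hash-partitioned by Impala would still be repartitioned, while one whose input already carries the target table's Kudu partitioning would not.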
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)