[
https://issues.apache.org/jira/browse/IMPALA-7141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16774450#comment-16774450
]
ASF subversion and git services commented on IMPALA-7141:
---------------------------------------------------------
Commit c50aa17da6854f52efee02df3f11a170170a938e in impala's branch
refs/heads/2.x from Todd Lipcon
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=c50aa17 ]
IMPALA-7141 (part 1): clean up handling of default/dummy partition
Currently, HdfsTable inconsistently uses the term "default partition"
to refer to two different concepts:
1) For unpartitioned tables, a single partition with ID 1 and no partition
keys is created and added to the partition map.
2) All tables have an additional partition added with partition ID -1
which acts as a sort of prototype for partition creation: when new
partitions are created during an INSERT operation, the file format
and other related options are copied out of this special partition.
This partition is inconsistently referred to as either the "default
partition" or the "dummy partition".
The handling of this second case (the partition with id -1) was somewhat
messy:
- the partition shows up in the partitionMap_ member, but does not show
up in the partitionIds_ member.
- almost all of the call sites that iterate through the partitions of an
HdfsTable instance ended up skipping over the dummy partition.
- several call sites called getPartitions().size() but then had to
adjust the result by subtracting one in order to actually count the
number of partitions in a table.
- similarly, test assertions had to assert that tables with 24
partitions had an expected partition map size of 25.
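The awkward pattern above can be illustrated with a minimal, hypothetical sketch; the class, field, and method names below are illustrative stand-ins, not the actual Impala sources:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the pre-cleanup pattern described above.
// Partition values are modeled as plain Strings for brevity.
public class DummyPartitionSketch {
  static final long DUMMY_PARTITION_ID = -1L;

  // Call sites wanting the real partition count had to adjust for the
  // dummy entry hidden in the map.
  static int realPartitionCount(Map<Long, String> partitionMap) {
    return partitionMap.size() - 1;
  }

  // Iteration likewise had to special-case the dummy id.
  static int visitRealPartitions(Map<Long, String> partitionMap) {
    int visited = 0;
    for (Long id : partitionMap.keySet()) {
      if (id == DUMMY_PARTITION_ID) continue;  // skip the dummy partition
      visited++;
    }
    return visited;
  }

  public static void main(String[] args) {
    Map<Long, String> partitionMap = new HashMap<>();
    partitionMap.put(DUMMY_PARTITION_ID, "prototype");
    partitionMap.put(1L, "p1");
    partitionMap.put(2L, "p2");
    System.out.println(realPartitionCount(partitionMap));   // 2
    System.out.println(visitRealPartitions(partitionMap));  // 2
  }
}
```

Every caller repeating this skip-and-subtract logic is what the patch removes.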
In order to address the above, this patch makes the following changes:
- getPartitions() and getPartitionMap() no longer include the dummy
partition. This removes a bunch of special case checks to skip over
the dummy partition or to adjust partition counts based on it.
- to clarify the purpose of this partition, references to it are renamed
to "prototype partition" instead of "default partition".
- when converting the HdfsTable to/from Thrift, the prototype partition
is included in its own field in the struct, instead of being stuffed
into the same map with the true partitions of the table. This reflects
the fact that this partition is special (e.g. it is missing fields like
'location' which are otherwise required for real partitions).
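The cleaned-up shape can be sketched roughly as follows; again the names are assumptions for illustration only, not the real HdfsTable API:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the post-cleanup shape: the prototype partition
// lives in its own field, so getPartitionMap() returns only real
// partitions and callers need no special cases.
public class PrototypePartitionSketch {
  private final Map<Long, String> partitionMap_ = new HashMap<>();
  private String prototypePartition_;  // template for newly created partitions

  void setPrototypePartition(String proto) { prototypePartition_ = proto; }

  void addPartition(long id, String partition) {
    partitionMap_.put(id, partition);
  }

  // No dummy entry to skip: size() and iteration work directly.
  Map<Long, String> getPartitionMap() { return partitionMap_; }

  // Partitions created during INSERT copy options from the prototype.
  String createPartitionFromPrototype() { return prototypePartition_; }

  public static void main(String[] args) {
    PrototypePartitionSketch table = new PrototypePartitionSketch();
    table.setPrototypePartition("parquet-defaults");
    table.addPartition(1L, "p1");
    table.addPartition(2L, "p2");
    System.out.println(table.getPartitionMap().size());  // 2: no dummy entry
  }
}
```

Keeping the prototype in a dedicated field mirrors the Thrift change: it is serialized in its own struct field rather than mixed into the partition map.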
This change should be entirely internal with no functional
differences. As such, the only testing changes are some fixes for
assertions on the Thrift serialized structures and other internals.
Change-Id: I15e91b50eb7c2a5e0bac8c33d603d6cd8cbaca2e
Reviewed-on: http://gerrit.cloudera.org:8080/10711
Tested-by: Impala Public Jenkins <[email protected]>
Reviewed-by: Todd Lipcon <[email protected]>
> Extract interfaces for partition pruning prior to fetching partitions
> ---------------------------------------------------------------------
>
> Key: IMPALA-7141
> URL: https://issues.apache.org/jira/browse/IMPALA-7141
> Project: IMPALA
> Issue Type: Sub-task
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Priority: Major
> Fix For: Impala 3.1.0
>
>
> In the LocalCatalog, we want to only fetch the partitions that are referenced
> by a query -- i.e. we must prune partitions based only on the partition names
> and not the entire partition objects. However, the PartitionPruner
> implementation currently expects to be able to fetch the full map of
> HdfsPartition objects from the table and work on them as is.
> This JIRA is to do some refactorings such that the PartitionPruner interacts
> with a slightly more restricted interface that only exposes the minimal
> interaction points with the table and the partition map. Once it has computed
> a list of remaining partitions, it can then instruct the table to fully load
> them to yield the resulting full Partition objects.
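The restricted interface described above might look something like the following sketch; all names and the prefix-based filter are hypothetical, chosen only to show the shape of prune-by-name-then-load:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of a restricted pruning interface: the pruner sees
// only partition names, and asks the table to load just the survivors.
public class PruneByNameSketch {
  interface PrunablePartitions {
    Set<String> getPartitionNames();                 // cheap: names only
    List<String> loadPartitions(Set<String> names);  // expensive: full objects
  }

  // Prune on names alone, then fetch only the remaining partitions.
  static List<String> prune(PrunablePartitions table, String prefix) {
    Set<String> keep = new TreeSet<>();
    for (String name : table.getPartitionNames()) {
      if (name.startsWith(prefix)) keep.add(name);
    }
    return table.loadPartitions(keep);
  }

  // Minimal in-memory implementation for demonstration.
  static class InMemoryTable implements PrunablePartitions {
    private final Set<String> names;
    InMemoryTable(Set<String> names) { this.names = names; }
    public Set<String> getPartitionNames() { return names; }
    public List<String> loadPartitions(Set<String> keep) {
      return new ArrayList<>(keep);  // stand-in for fetching full objects
    }
  }

  public static void main(String[] args) {
    InMemoryTable t = new InMemoryTable(
        new TreeSet<>(Arrays.asList("year=2017", "year=2018", "month=01")));
    System.out.println(prune(t, "year="));  // [year=2017, year=2018]
  }
}
```

The point is that only getPartitionNames() is needed during pruning, so a LocalCatalog table can defer the expensive loadPartitions() call until the pruned set is known.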
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]