[ https://issues.apache.org/jira/browse/HIVE-26335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
zhangdonglin updated HIVE-26335:
--------------------------------
Description:
Hi,
When partition A already exists and Hive.loadPartition is called to load
data into it again, the partition parameters stored in the PARTITION_PARAMS
table are not updated, even with hasFollowingStatsTask=false.
The cause is in Hive.loadPartition: when the old partition exists, newTPart
is set to oldPart:
{code:java}
Partition newTPart = oldPart != null ? oldPart : new Partition(tbl, partSpec, newPartPath);
...
if (oldPart == null) {
  // ...
} else {
  setStatsPropAndAlterPartition(hasFollowingStatsTask, tbl, newTPart);
}

private void setStatsPropAndAlterPartition(boolean hasFollowingStatsTask, Table tbl,
    Partition newTPart) throws MetaException, TException {
  EnvironmentContext environmentContext = null;
  if (hasFollowingStatsTask) {
    environmentContext = new EnvironmentContext();
    environmentContext.putToProperties(StatsSetupConst.DO_NOT_UPDATE_STATS, StatsSetupConst.TRUE);
  }
  LOG.debug("Altering existing partition " + newTPart.getSpec());
  getSychronizedMSC().alter_partition(tbl.getDbName(), tbl.getTableName(),
      newTPart.getTPartition(), environmentContext);
}
{code}
As a result, alter_partition sends the old partition's information to the
metastore, so the partition params are not updated.
I think we should recompute numFiles and totalSize for the new partition
before calling alter_partition.
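
A minimal sketch of the proposed recomputation, using only the JDK rather
than Hive or Hadoop classes. PartitionStatsSketch and refreshBasicStats are
hypothetical names, a local directory stands in for the partition location,
and counting every regular file directly under it is an assumption. An actual
fix would presumably operate on newTPart and the partition's file system
before alter_partition is called.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper, not part of Hive: refreshes basic file stats in a params map. */
class PartitionStatsSketch {

    // Parameter names taken from the report (numFiles, totalSize).
    static final String NUM_FILES = "numFiles";
    static final String TOTAL_SIZE = "totalSize";

    /**
     * Returns a copy of oldParams in which numFiles and totalSize are
     * recomputed from the regular files directly under partitionDir.
     * All other entries are carried over unchanged.
     */
    static Map<String, String> refreshBasicStats(Map<String, String> oldParams, Path partitionDir)
            throws IOException {
        long numFiles = 0;
        long totalSize = 0;
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(partitionDir)) {
            for (Path entry : entries) {
                if (Files.isRegularFile(entry)) {
                    numFiles++;
                    totalSize += Files.size(entry);
                }
            }
        }
        Map<String, String> refreshed = new HashMap<>(oldParams);
        refreshed.put(NUM_FILES, Long.toString(numFiles));
        refreshed.put(TOTAL_SIZE, Long.toString(totalSize));
        return refreshed;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("part_a");

        // First load: one 10-byte file, params recorded from it.
        Files.write(dir.resolve("000000_0"), new byte[10]);
        Map<String, String> params = refreshBasicStats(new HashMap<>(), dir);

        // Second load adds a 5-byte file; the recorded params are now stale.
        Files.write(dir.resolve("000001_0"), new byte[5]);
        Map<String, String> refreshed = refreshBasicStats(params, dir);

        // prints "stale:     1 files, 10 bytes"
        System.out.println("stale:     " + params.get(NUM_FILES) + " files, "
                + params.get(TOTAL_SIZE) + " bytes");
        // prints "refreshed: 2 files, 15 bytes"
        System.out.println("refreshed: " + refreshed.get(NUM_FILES) + " files, "
                + refreshed.get(TOTAL_SIZE) + " bytes");
    }
}
```

The stale map corresponds to what oldPart carries after the second load; the
refreshed map is what the report suggests sending to the metastore instead.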
was:
Hi,
When partition A already exists and Hive.loadPartition is called to load
data into it, the partition params in the PARTITION_PARAMS table are not
updated, even with hasFollowingStatsTask=false.
The cause: when the old partition exists, newTPart is set to oldPart:
{code:java}
Partition newTPart = oldPart != null ? oldPart : new Partition(tbl, partSpec, newPartPath);
{code}
As a result, alter_partition sends the old partition's information to the
metastore, so the partition params are not updated.
> Metadata of Partition params was not updated after calling Hive.loadPartition
> -----------------------------------------------------------------------------
>
> Key: HIVE-26335
> URL: https://issues.apache.org/jira/browse/HIVE-26335
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: All Versions
> Reporter: zhangdonglin
> Priority: Major