xiarixiaoyao commented on pull request #4405:
URL: https://github.com/apache/hudi/pull/4405#issuecomment-1004680987


   @nsivabalan 
   With the current logic this problem rarely occurs, because we decide 
whether a partition needs to be altered by comparing whether the partition 
paths are the same. It is uncommon for Hudi tables to change partition paths, 
although the path can be changed through the ALTER PARTITION syntax.
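
   As a rough illustration of the path comparison described above, here is a 
simplified sketch. It is not the actual Hudi sync code: the class 
`PartitionPathDiff`, the method `partitionsNeedingAlter`, and the use of plain 
maps from partition key to path are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of the "does this partition need alter?" check.
class PartitionPathDiff {
    /**
     * Returns the keys of partitions that are already registered in the
     * metastore but whose registered path differs from the path on storage.
     */
    static List<String> partitionsNeedingAlter(Map<String, String> registered,
                                               Map<String, String> onStorage) {
        List<String> changed = new ArrayList<>();
        for (Map.Entry<String, String> e : onStorage.entrySet()) {
            String registeredPath = registered.get(e.getKey());
            // A partition unknown to the metastore is an "add", not an "alter";
            // only a known partition with a different path qualifies.
            if (registeredPath != null && !registeredPath.equals(e.getValue())) {
                changed.add(e.getKey());
            }
        }
        return changed;
    }
}
```

   Under this model an alter is only triggered when a registered path changes, 
which is why the problem is hard to hit in normal use.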
   
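   It is easy to reproduce this problem in a unit test: 
   add the following code after line 146 in TestHiveSyncTool:
       // Rewrite the first partition value from "a-b-c" to "a/b/c" so its path changes,
       // then sync it as an updated partition to trigger the alter-partition path.
       String testP = Arrays.stream(
           hiveClient.scanTablePartitions(hiveSyncConfig.tableName)
               .get(0).getValues().get(0).split("-"))
           .collect(Collectors.joining("/"));
       hiveClient.updatePartitionsToTable(hiveSyncConfig.tableName, Arrays.asList(testP));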
   
   
   BTW, when we sync altered partitions, we should set "numFiles" and 
"totalSize" for the altered partition.
   
   Since hive.stats.autogather=true by default, Hive tries to calculate the 
partition stats ("numFiles" and "totalSize") by default:
   1) For an add-partition operation: when new partitions are synced to Hive, 
Hive calls updatePartitionStatsFast to update the stats for every new partition.
   2) For an alter-partition operation: the Hive metastore first finds the old 
partition that needs to be altered, then tries to update the partition stats by 
comparing the stats of the old partition with those of our altered partition. 
However, the old partition has stats while our altered partition has none (we 
have not specified them), so the error occurs.
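
   A minimal sketch of the suggestion above, assuming the partition parameters 
are available as a plain string map. The class `AlteredPartitionStats` and its 
methods are hypothetical helpers for illustration and not part of Hudi or Hive; 
only the two parameter keys come from the discussion.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: fill in the two basic stats on an altered partition's parameters.
class AlteredPartitionStats {
    static final String NUM_FILES = "numFiles";
    static final String TOTAL_SIZE = "totalSize";

    /** Returns a copy of the parameters with numFiles and totalSize set. */
    static Map<String, String> withBasicStats(Map<String, String> params,
                                              long numFiles, long totalSize) {
        Map<String, String> out = new HashMap<>(params);
        out.put(NUM_FILES, Long.toString(numFiles));
        out.put(TOTAL_SIZE, Long.toString(totalSize));
        return out;
    }

    /** True when both stats are present, so old and new partition stats can be compared. */
    static boolean hasBasicStats(Map<String, String> params) {
        return params.containsKey(NUM_FILES) && params.containsKey(TOTAL_SIZE);
    }
}
```

   How the file count and total size would be obtained for a Hudi partition is 
left open here; the point is only that the altered partition should carry both 
keys before it is sent to the metastore.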
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


