lamber-ken commented on issue #1105: [HUDI-405] Fix sync no hive partition at 
first time
URL: https://github.com/apache/incubator-hudi/pull/1105#issuecomment-566390544
 
 
   > Don't follow why the partitions are not visible after the commit? Can we 
first layout the root cause for that?
   
   ### Why the first sync finds no partitions
   
   On the first sync, the target table has no `lastCommitTimeSynced`, so 
HoodieHiveClient lists all partition paths via `FSUtils.getAllPartitionPaths`. 
If `HIVE_ASSUME_DATE_PARTITION_OPT_KEY` is set to `true`, FSUtils only matches 
paths of the form `basePath + /*/*/*`, but the actual partition path is 
`basePath + /yyyy-MM-dd`, so no partitions are found. 
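
The mismatch can be reproduced outside Hudi. The snippet below is only a sketch: it uses the JDK's `java.nio` glob matcher as a stand-in for the Hadoop glob behind `FSUtils`, and the class name, method name, and sample dates are made up for illustration. It shows that a three-level pattern matches `yyyy/MM/dd` but not a single-level `yyyy-MM-dd` folder.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

public class PartitionGlobDemo {
    // Stand-in for the "/*/*/*" suffix, expressed relative to the base path.
    // Each "*" matches exactly one directory level and never crosses "/".
    private static final PathMatcher THREE_LEVELS =
        FileSystems.getDefault().getPathMatcher("glob:*/*/*");

    // Hypothetical helper: true if the relative partition path is three levels deep.
    public static boolean matchesThreeLevelGlob(String relativePartitionPath) {
        return THREE_LEVELS.matches(Paths.get(relativePartitionPath));
    }

    public static void main(String[] args) {
        // Date partition laid out as yyyy/MM/dd: three levels, so it matches.
        System.out.println(matchesThreeLevelGlob("2019/12/17")); // true
        // Date partition laid out as yyyy-MM-dd: one level, so it is skipped.
        System.out.println(matchesThreeLevelGlob("2019-12-17")); // false
    }
}
```

With a `yyyy-MM-dd` layout every partition folder falls into the second case, which is consistent with the first sync registering no partitions.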
   
   
![image](https://user-images.githubusercontent.com/20113411/70967797-5cc72f00-20d2-11ea-8004-6d910879d1ac.png)
   
   ### Two ways to solve this problem
   1. Set `HIVE_ASSUME_DATE_PARTITION_OPT_KEY` to `false`. HoodieHiveClient 
will then list all partition folders; for details, see 
`FSUtils#getAllPartitionPaths`.
   
   2. If the user supplies a custom partition extractor, HiveSyncTool syncs no 
partitions on the first commit. In that case we can get the partition info 
from `HoodieTimeline`, as in the code I modified.
   
   
   
   

