lamber-ken edited a comment on issue #1105: [HUDI-405] Fix sync no hive 
partition at first time
URL: https://github.com/apache/incubator-hudi/pull/1105#issuecomment-566390544
 
 
   > Don't follow why the partitions are not visible after the commit? Can we 
first layout the root cause for that?
   
   ### Why no partitions are found on the first sync
   
   On the first sync, the target table has no `lastCommitTimeSynced` yet, so 
HoodieHiveClient gets all partition paths via `FSUtils.getAllPartitionPaths`. 
If `HIVE_ASSUME_DATE_PARTITION_OPT_KEY` is set to `true`, FSUtils only matches 
`basePath + /*/*/*`, but the actual partition path is `basePath + /yyyy-MM-dd`, 
which is a single directory level, so nothing matches. 
   
   
![image](https://user-images.githubusercontent.com/20113411/70967797-5cc72f00-20d2-11ea-8004-6d910879d1ac.png)
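
   To illustrate the mismatch, here is a minimal, standalone sketch. It uses 
the JDK's `java.nio` glob matcher as a stand-in for the listing FSUtils does, 
so it is not Hudi code; the class name, the `matches` helper and the sample 
date are made up for the example.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

public class PartitionGlobDemo {

    // Returns true if a partition path (relative to basePath) matches the glob.
    // In java.nio globs, '*' never crosses a directory separator.
    static boolean matches(String glob, String relativePartitionPath) {
        PathMatcher matcher =
            FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(Paths.get(relativePartitionPath));
    }

    public static void main(String[] args) {
        // Three-level layout (yyyy/MM/dd): found by the three-level pattern.
        System.out.println(matches("*/*/*", "2019/12/17")); // true

        // Single-level layout (yyyy-MM-dd): the three-level pattern misses it.
        System.out.println(matches("*/*/*", "2019-12-17")); // false
    }
}
```

   A pattern with three path segments can only match paths that are exactly 
three directories deep, so a `yyyy-MM-dd` folder directly under the base path 
is never returned.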
   
   ### Two ways to solve this problem
   1. Set `HIVE_ASSUME_DATE_PARTITION_OPT_KEY` to `false`. HoodieHiveClient 
will then pick up all partition folders; for details, see 
`FSUtils#getAllPartitionPaths`.
   
   2. If the user supplies a custom partition extractor, HiveSyncTool syncs no 
partitions on the first commit. In that case we can get the partition info 
from `HoodieTimeline`, as in the code I modified.
   
   
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
