stym06 opened a new issue #3890:
URL: https://github.com/apache/hudi/issues/3890


   **Describe the problem you faced**
   
   Hudi HiveSyncTool did not add all of the missing partitions to Hive; it 
added only the newest one.
   
   **Expected behavior**
   
   It should have added every partition that exists in storage but is missing 
from Hive.
   
   **Environment Description**
   
   * Hudi version : 0.9.0
   
   * Spark version : 2.4.4
   
   * Hive version : 3.2.1
   
   * Hadoop version : 3.1.1
   
   * Storage (HDFS/S3/GCS..) : Azure Blob Storage (wasb)
   
   * Running on Docker? (yes/no) : K8s
   
   
   **Additional context**
   
   I run this sync job daily via cron. For some reason it did not run on the 
27th and 28th. When I reran the job on the 29th, it added only the 
dt=2021-10-29 partition and skipped the partitions for the 27th and 28th. All 
earlier partitions had already been created.
   
   **Stacktrace**
   
   * Data and Partitions after day-29 run
   ```
   Found 12 items
   drwxr-xr-x   - root supergroup          0 2021-10-29 10:48 wasb://[email protected]/data/pipelines/hudi/kafka/telemetrics/dp.hmi.quectel.test.data.packet.v2/.hoodie
   drwxr-xr-x   - root supergroup          0 2021-10-22 10:11 wasb://[email protected]/data/pipelines/hudi/kafka/telemetrics/dp.hmi.quectel.test.data.packet.v2/dt=2021-10-19
   drwxr-xr-x   - root supergroup          0 2021-10-22 10:10 wasb://[email protected]/data/pipelines/hudi/kafka/telemetrics/dp.hmi.quectel.test.data.packet.v2/dt=2021-10-20
   drwxr-xr-x   - root supergroup          0 2021-10-22 10:10 wasb://[email protected]/data/pipelines/hudi/kafka/telemetrics/dp.hmi.quectel.test.data.packet.v2/dt=2021-10-21
   drwxr-xr-x   - root supergroup          0 2021-10-22 18:40 wasb://[email protected]/data/pipelines/hudi/kafka/telemetrics/dp.hmi.quectel.test.data.packet.v2/dt=2021-10-22
   drwxr-xr-x   - root supergroup          0 2021-10-23 18:40 wasb://[email protected]/data/pipelines/hudi/kafka/telemetrics/dp.hmi.quectel.test.data.packet.v2/dt=2021-10-23
   drwxr-xr-x   - root supergroup          0 2021-10-24 18:40 wasb://[email protected]/data/pipelines/hudi/kafka/telemetrics/dp.hmi.quectel.test.data.packet.v2/dt=2021-10-24
   drwxr-xr-x   - root supergroup          0 2021-10-25 18:40 wasb://[email protected]/data/pipelines/hudi/kafka/telemetrics/dp.hmi.quectel.test.data.packet.v2/dt=2021-10-25
   drwxr-xr-x   - root supergroup          0 2021-10-26 18:40 wasb://[email protected]/data/pipelines/hudi/kafka/telemetrics/dp.hmi.quectel.test.data.packet.v2/dt=2021-10-26
   drwxr-xr-x   - root supergroup          0 2021-10-27 18:40 wasb://[email protected]/data/pipelines/hudi/kafka/telemetrics/dp.hmi.quectel.test.data.packet.v2/dt=2021-10-27
   drwxr-xr-x   - root supergroup          0 2021-10-29 08:21 wasb://[email protected]/data/pipelines/hudi/kafka/telemetrics/dp.hmi.quectel.test.data.packet.v2/dt=2021-10-28
   drwxr-xr-x   - root supergroup          0 2021-10-29 10:48 wasb://[email protected]/data/pipelines/hudi/kafka/telemetrics/dp.hmi.quectel.test.data.packet.v2/dt=2021-10-29
   
   hive> show partitions dp_hmi_quectel_test_data_packet_v2;
   OK
   dt=2021-10-19
   dt=2021-10-20
   dt=2021-10-21
   dt=2021-10-22
   dt=2021-10-23
   dt=2021-10-24
   dt=2021-10-25
   dt=2021-10-26
   dt=2021-10-29
   ```
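
To make the gap explicit, the two listings above can be compared directly. This is only a minimal sketch: the partition names are copied by hand from the storage listing and the `show partitions` output, and the variable names are illustrative, not part of Hudi or Hive.

```python
# Partition directories copied from the storage listing above (.hoodie excluded).
storage_partitions = [f"dt=2021-10-{day}" for day in range(19, 30)]

# Partitions copied from the `show partitions` output above.
hive_partitions = [
    "dt=2021-10-19", "dt=2021-10-20", "dt=2021-10-21", "dt=2021-10-22",
    "dt=2021-10-23", "dt=2021-10-24", "dt=2021-10-25", "dt=2021-10-26",
    "dt=2021-10-29",
]

# Partitions present in storage but not registered in Hive.
missing = sorted(set(storage_partitions) - set(hive_partitions))
print(missing)  # ['dt=2021-10-27', 'dt=2021-10-28']
```

The difference is exactly the two days on which the cron job did not run.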
   
   * Day-29 Job logs
   
   ```
   2021-10-29 10:46:02,513 INFO  [main] hive.HiveSyncTool (HiveSyncTool.java:syncHoodieTable(190)) - Last commit time synced was found to be 20211026005933
   2021-10-29 10:46:02,513 INFO  [main] common.AbstractSyncHoodieClient (AbstractSyncHoodieClient.java:getPartitionsWrittenToSince(162)) - Last commit time synced is 20211026005933, Getting commits since then
   2021-10-29 10:46:03,070 INFO  [main] hive.HiveSyncTool (HiveSyncTool.java:syncHoodieTable(192)) - Storage partitions scan complete. Found 1
   2021-10-29 10:46:03,070 INFO  [main] metastore.HiveMetaStore (HiveMetaStore.java:logInfo(895)) - 0: get_partitions : tbl=hive.default.dp_hmi_quectel_imu_data_packet_v2
   2021-10-29 10:46:03,071 INFO  [main] HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(347)) - ugi=root  ip=unknown-ip-addr      cmd=get_partitions : tbl=hive.default.dp_hmi_quectel_imu_data_packet_v2
   2021-10-29 10:46:03,104 INFO  [main] hive.HiveSyncTool (HiveSyncTool.java:syncPartitions(333)) - New Partitions [dt=2021-10-29]
   2021-10-29 10:46:03,104 INFO  [main] ddl.HMSDDLExecutor (HMSDDLExecutor.java:addPartitionsToTable(181)) - Adding partitions 1 to table dp_hmi_quectel_imu_data_packet_v2
   ```
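
The log shows the sync scanning only for partitions written since the last synced commit time, `20211026005933`, and finding one. As a rough cross-check, the sketch below compares that instant with the directory modification times from the storage listing. It rests on two assumptions: that the instant is in `yyyyMMddHHmmss` format (which the logged value suggests), and that directory modification times approximate when a partition was last written and share a time zone with the instant. The function name `modified_since` is hypothetical and is not a Hudi API.

```python
from datetime import datetime

# Last synced commit time, copied from the HiveSyncTool log line.
LAST_SYNCED_INSTANT = "20211026005933"

# Directory modification times, copied from the storage listing.
# Assumption: they roughly track when each partition was last written.
PARTITION_MTIMES = {
    "dt=2021-10-19": "2021-10-22 10:11",
    "dt=2021-10-20": "2021-10-22 10:10",
    "dt=2021-10-21": "2021-10-22 10:10",
    "dt=2021-10-22": "2021-10-22 18:40",
    "dt=2021-10-23": "2021-10-23 18:40",
    "dt=2021-10-24": "2021-10-24 18:40",
    "dt=2021-10-25": "2021-10-25 18:40",
    "dt=2021-10-26": "2021-10-26 18:40",
    "dt=2021-10-27": "2021-10-27 18:40",
    "dt=2021-10-28": "2021-10-29 08:21",
    "dt=2021-10-29": "2021-10-29 10:48",
}


def modified_since(instant, mtimes):
    """Return partitions whose directory mtime is later than the instant."""
    cutoff = datetime.strptime(instant, "%Y%m%d%H%M%S")
    return sorted(
        partition
        for partition, mtime in mtimes.items()
        if datetime.strptime(mtime, "%Y-%m-%d %H:%M") > cutoff
    )


candidates = modified_since(LAST_SYNCED_INSTANT, PARTITION_MTIMES)
print(candidates)
# ['dt=2021-10-26', 'dt=2021-10-27', 'dt=2021-10-28', 'dt=2021-10-29']
```

Under those assumptions, four directories changed after the last synced instant, three of which fall on days after it, whereas the scan reported `Found 1`.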
   
   

