sivabalan narayanan created HUDI-1507:
-----------------------------------------

             Summary: Hive sync having issues w/ Clustering
                 Key: HUDI-1507
                 URL: https://issues.apache.org/jira/browse/HUDI-1507
             Project: Apache Hudi
          Issue Type: Bug
          Components: Storage Management
    Affects Versions: 0.7.0
            Reporter: sivabalan narayanan


I was trying out clustering with the test suite job and ran into Hive sync issues. The logs and stack trace are below.

21/01/05 16:45:05 WARN DagNode: Executing ClusteringNode node 5522853c-653b-4d92-acf4-d299c263a77f

21/01/05 16:45:05 WARN AbstractHoodieWriteClient: Scheduling clustering at instant time :20210105164505 clustering strategy org.apache.hudi.client.clustering.plan.strategy.SparkRecentDaysClusteringPlanStrategy, clustering sort cols : _row_key, target partitions for clustering :: 0, inline cluster max commit : 1

21/01/05 16:45:05 WARN HoodieTestSuiteWriter: Clustering instant :: 20210105164505

21/01/05 16:45:22 WARN DagScheduler: Executing node "second_hive_sync" :: {"queue_name":"adhoc","engine":"mr","name":"80325009-bb92-4df5-8c34-71bd75d001b8","config":"second_hive_sync"}

21/01/05 16:45:22 ERROR HiveSyncTool: Got runtime exception when hive syncing

org.apache.hudi.exception.HoodieIOException: unknown action in timeline replacecommit

 at org.apache.hudi.common.table.timeline.TimelineUtils.lambda$getAffectedPartitions$1(TimelineUtils.java:99)
 at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:267)
 at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
 at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
 at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
 at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
 at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
 at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
 at org.apache.hudi.common.table.timeline.TimelineUtils.getAffectedPartitions(TimelineUtils.java:102)
 at org.apache.hudi.common.table.timeline.TimelineUtils.getPartitionsWritten(TimelineUtils.java:50)
 at org.apache.hudi.sync.common.AbstractSyncHoodieClient.getPartitionsWrittenToSince(AbstractSyncHoodieClient.java:136)
 at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:145)
 at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:94)
 at org.apache.hudi.utilities.deltastreamer.DeltaSync.syncHive(DeltaSync.java:589)
 at org.apache.hudi.integ.testsuite.helpers.HiveServiceProvider.syncToLocalHiveIfNeeded(HiveServiceProvider.java:53)
 at org.apache.hudi.integ.testsuite.dag.nodes.HiveSyncNode.execute(HiveSyncNode.java:41)
 at org.apache.hudi.integ.testsuite.dag.scheduler.DagScheduler.executeNode(DagScheduler.java:139)
 at org.apache.hudi.integ.testsuite.dag.scheduler.DagScheduler.lambda$execute$0(DagScheduler.java:105)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)
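
The trace suggests that the partition lookup used by Hive sync walks the timeline and rejects any action it does not recognize, and that the replacecommit instant produced by clustering is one such action. The sketch below only illustrates that failure mode; it is not Hudi code. The class AffectedPartitionsSketch, the Instant type, the affectedPartitions method, and the set of known actions are all hypothetical names made up for this example.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class AffectedPartitionsSketch {

    // Hypothetical stand-in for a timeline instant: an action name plus the partitions it wrote.
    static final class Instant {
        final String action;
        final List<String> partitions;

        Instant(String action, String... partitions) {
            this.action = action;
            this.partitions = Arrays.asList(partitions);
        }
    }

    // Collects the partitions touched by every instant, failing on any action
    // that is not in knownActions (an assumed model of the behavior in the trace).
    static Set<String> affectedPartitions(List<Instant> timeline, Set<String> knownActions) {
        Set<String> result = new LinkedHashSet<>();
        for (Instant instant : timeline) {
            if (!knownActions.contains(instant.action)) {
                throw new IllegalStateException("unknown action in timeline " + instant.action);
            }
            result.addAll(instant.partitions);
        }
        return result;
    }

    public static void main(String[] args) {
        // Illustrative timeline: a regular commit followed by a clustering replacecommit.
        List<Instant> timeline = Arrays.asList(
                new Instant("commit", "2021/01/04"),
                new Instant("replacecommit", "2021/01/05"));

        // Without replacecommit in the known set, the lookup throws.
        Set<String> known = new LinkedHashSet<>(Arrays.asList("commit", "deltacommit"));
        try {
            affectedPartitions(timeline, known);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }

        // Once replacecommit is treated as a known action, the lookup succeeds.
        known.add("replacecommit");
        System.out.println(affectedPartitions(timeline, known));
    }
}
```

Under this assumed model, the sync path would need to treat replacecommit as a known action and read the partitions it wrote. Whether that matches the actual handling in TimelineUtils would need to be confirmed against the source.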



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
