sivabalan narayanan created HUDI-1507:
-----------------------------------------
Summary: Hive sync having issues w/ Clustering
Key: HUDI-1507
URL: https://issues.apache.org/jira/browse/HUDI-1507
Project: Apache Hudi
Issue Type: Bug
Components: Storage Management
Affects Versions: 0.7.0
Reporter: sivabalan narayanan
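I was trying out clustering with the test suite job and ran into a Hive sync failure. After a clustering instant is scheduled, the next Hive sync fails with "unknown action in timeline replacecommit":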
21/01/05 16:45:05 WARN DagNode: Executing ClusteringNode node 5522853c-653b-4d92-acf4-d299c263a77f
21/01/05 16:45:05 WARN AbstractHoodieWriteClient: Scheduling clustering at instant time :20210105164505 clustering strategy org.apache.hudi.client.clustering.plan.strategy.SparkRecentDaysClusteringPlanStrategy, clustering sort cols : _row_key, target partitions for clustering :: 0, inline cluster max commit : 1
21/01/05 16:45:05 WARN HoodieTestSuiteWriter: Clustering instant :: 20210105164505
21/01/05 16:45:22 WARN DagScheduler: Executing node "second_hive_sync" :: {"queue_name":"adhoc","engine":"mr","name":"80325009-bb92-4df5-8c34-71bd75d001b8","config":"second_hive_sync"}
21/01/05 16:45:22 ERROR HiveSyncTool: Got runtime exception when hive syncing
org.apache.hudi.exception.HoodieIOException: unknown action in timeline replacecommit
    at org.apache.hudi.common.table.timeline.TimelineUtils.lambda$getAffectedPartitions$1(TimelineUtils.java:99)
    at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:267)
    at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
    at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
    at org.apache.hudi.common.table.timeline.TimelineUtils.getAffectedPartitions(TimelineUtils.java:102)
    at org.apache.hudi.common.table.timeline.TimelineUtils.getPartitionsWritten(TimelineUtils.java:50)
    at org.apache.hudi.sync.common.AbstractSyncHoodieClient.getPartitionsWrittenToSince(AbstractSyncHoodieClient.java:136)
    at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:145)
    at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:94)
    at org.apache.hudi.utilities.deltastreamer.DeltaSync.syncHive(DeltaSync.java:589)
    at org.apache.hudi.integ.testsuite.helpers.HiveServiceProvider.syncToLocalHiveIfNeeded(HiveServiceProvider.java:53)
    at org.apache.hudi.integ.testsuite.dag.nodes.HiveSyncNode.execute(HiveSyncNode.java:41)
    at org.apache.hudi.integ.testsuite.dag.scheduler.DagScheduler.executeNode(DagScheduler.java:139)
    at org.apache.hudi.integ.testsuite.dag.scheduler.DagScheduler.lambda$execute$0(DagScheduler.java:105)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
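
The trace suggests that the partition lookup used by Hive sync walks the timeline and throws on any action type it does not recognize, and that the clustering run left a replacecommit instant on the timeline. Below is a minimal Java sketch of that failure mode. The class TimelineActionSketch, the Instant type, and the handleReplaceCommit flag are hypothetical and exist only for illustration; this is not Hudi's actual code, and not necessarily how the issue should be fixed.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class TimelineActionSketch {

    // Hypothetical stand-in for a timeline instant: an action name plus
    // the partitions that instant wrote to.
    static final class Instant {
        final String action;
        final List<String> partitions;

        Instant(String action, List<String> partitions) {
            this.action = action;
            this.partitions = partitions;
        }
    }

    // Collects the partitions touched by the given instants.
    // With handleReplaceCommit == false, a "replacecommit" instant is treated
    // as unknown and the lookup throws, similar to the reported error.
    // With handleReplaceCommit == true, it is handled like the other write actions.
    static Set<String> affectedPartitions(List<Instant> instants, boolean handleReplaceCommit) {
        Set<String> result = new TreeSet<>();
        for (Instant instant : instants) {
            switch (instant.action) {
                case "commit":
                case "deltacommit":
                    result.addAll(instant.partitions);
                    break;
                case "replacecommit":
                    if (!handleReplaceCommit) {
                        throw new IllegalStateException(
                            "unknown action in timeline " + instant.action);
                    }
                    result.addAll(instant.partitions);
                    break;
                default:
                    throw new IllegalStateException(
                        "unknown action in timeline " + instant.action);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up timeline: one regular commit followed by a clustering replacecommit.
        List<Instant> timeline = Arrays.asList(
            new Instant("commit", Arrays.asList("2021/01/04")),
            new Instant("replacecommit", Arrays.asList("2021/01/05")));

        try {
            affectedPartitions(timeline, false);
        } catch (IllegalStateException e) {
            System.out.println("Lookup failed: " + e.getMessage());
        }

        System.out.println("Partitions when handled: " + affectedPartitions(timeline, true));
    }
}
```

In this sketch the second call succeeds only because replacecommit is given its own case; whether the real lookup should treat replacecommit exactly like commit is an assumption that would need to be checked against the replace commit metadata.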