jarrodcodes commented on issue #5330:
URL: https://github.com/apache/hudi/issues/5330#issuecomment-1555951277

   Seeing this issue with Flink 1.15 and Hudi 0.12.3: 
   
   `java.lang.RuntimeException: Duplicate fileId 
00000002-fb1c-47ac-a203-397ffbbd9b91 from bucket 2 of partition dt=2022-11-21 
found during the BucketStreamWriteFunction index bootstrap.
        at 
org.apache.hudi.sink.bucket.BucketStreamWriteFunction.lambda$bootstrapIndexIfNeed$1(BucketStreamWriteFunction.java:162)
        at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(Unknown 
Source)
        at java.base/java.util.stream.ReferencePipeline$3$1.accept(Unknown 
Source)
        at 
java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(Unknown 
Source)
        at java.base/java.util.stream.AbstractPipeline.copyInto(Unknown Source)
        at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(Unknown 
Source)
        at 
java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(Unknown 
Source)
        at 
java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(Unknown
 Source)
        at java.base/java.util.stream.AbstractPipeline.evaluate(Unknown Source)
        at java.base/java.util.stream.ReferencePipeline.forEach(Unknown Source)
        at 
org.apache.hudi.sink.bucket.BucketStreamWriteFunction.bootstrapIndexIfNeed(BucketStreamWriteFunction.java:155)
        at 
org.apache.hudi.sink.bucket.BucketStreamWriteFunction.processElement(BucketStreamWriteFunction.java:111)
        at 
org.apache.flink.streaming.api.operators.ProcessOperator.processElement(ProcessOperator.java:66)
        at 
org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:233)
        at 
org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.processElement(AbstractStreamTaskNetworkInput.java:134)
        at 
org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.emitNext(AbstractStreamTaskNetworkInput.java:105)
        at 
org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65)
        at 
org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:519)
        at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:203)
        at 
org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:804)
        at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:753)
        at 
org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:948)
        at 
org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:927)
        at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:741)
        at org.apache.flink.runtime.taskmanager.Task.run(Task.java:563)
        at java.base/java.lang.Thread.run(Unknown Source)
   `
   Looking for any suggestions, thank you.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to