DavidZ1 commented on issue #8071:
URL: https://github.com/apache/hudi/issues/8071#issuecomment-1457281256

   I adjusted the `write.parquet.max.file.size` parameter to 3000, and the Flink job started to run normally, but it failed after several checkpoints. I checked the size of the written files, and the largest was 80 MB. The exception is still the same as before:
   ```
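   For reference, this is roughly how the option was set in the Flink SQL DDL (a minimal sketch only — the column list, path, and the other options shown are placeholders, not the actual job configuration):

   ```sql
   CREATE TABLE ods_icv_can_hudi_temp (
     id STRING,          -- placeholder schema
     payload STRING,
     `partition` STRING
   ) PARTITIONED BY (`partition`) WITH (
     'connector' = 'hudi',
     'path' = 'hdfs:///tmp/ods_icv_can_hudi_temp',  -- placeholder path
     'table.type' = 'COPY_ON_WRITE',
     'write.operation' = 'insert',                  -- append-write mode
     'write.parquet.max.file.size' = '3000'         -- the value I adjusted (unit: MB)
   );
   ```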
   2023-03-06 20:51:56.793 [Checkpoint Timer] INFO  
org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Checkpoint 3 of 
job 00000000000000000000000000000000 expired before completing.
   2023-03-06 20:51:56.795 [jobmanager-future-thread-5] INFO  
com.tencent.cloud.tstream.flink.OceanusCheckpointListener [] - Begin to post 
checkpoint failed event
   2023-03-06 20:51:56.851 [jobmanager-future-thread-5] INFO  
com.tencent.cloud.tstream.flink.OceanusCheckpointListener [] - Watchdog 
response: HttpResponseProxy{HTTP/1.1 200 OK [Content-Type: 
text/html;charset=UTF-8, Content-Length: 48, Server: Jetty(8.1.19.v20160209)] 
ResponseEntityProxy{[Content-Type: text/html;charset=UTF-8,Content-Length: 
48,Chunked: false]}}
   2023-03-06 20:51:57.128 [Checkpoint Timer] INFO  
org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Triggering 
checkpoint 4 (type=CHECKPOINT) @ 1678107116797 for job 
00000000000000000000000000000000.
   2023-03-06 20:51:57.128 [pool-18-thread-1] INFO  
org.apache.hudi.sink.StreamWriteOperatorCoordinator [] - Executor executes 
action [taking checkpoint 4] success!
   2023-03-06 20:52:42.977 [flink-akka.actor.default-dispatcher-24] INFO  
org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: IvcVhrCan 
Source From Kafka with no watermarks -> hoodie_append_write: 
ods_icv_can_hudi_temp -> Sink: dummy (17/24) (adf8b18b3e47d3c35adcb79b1a953bb4) 
switched from RUNNING to FAILED on cql-ncgz4z8e-582618-taskmanager-1-1  @xxxxx 
(dataPort=xxx).
   org.apache.hudi.exception.HoodieException: Timeout(601000ms) while waiting 
for instant initialize
        at org.apache.hudi.sink.utils.TimeWait.waitFor(TimeWait.java:57) ~[?:?]
        at 
org.apache.hudi.sink.common.AbstractStreamWriteFunction.instantToWrite(AbstractStreamWriteFunction.java:276)
 ~[?:?]
        at 
org.apache.hudi.sink.append.AppendWriteFunction.initWriterHelper(AppendWriteFunction.java:110)
 ~[?:?]
        at 
org.apache.hudi.sink.append.AppendWriteFunction.processElement(AppendWriteFunction.java:84)
 ~[?:?]
        at 
org.apache.flink.streaming.api.operators.ProcessOperator.processElement(ProcessOperator.java:66)
 ~[flink-dist_2.11-1.13.6.jar:1.13.6]
        at 
org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.pushToOperator(CopyingChainingOutput.java:71)
 ~[flink-dist_2.11-1.13.6.jar:1.13.6]
        at 
org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.collect(CopyingChainingOutput.java:46)
 ~[flink-dist_2.11-1.13.6.jar:1.13.6]
        at 
org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.collect(CopyingChainingOutput.java:26)
 ~[flink-dist_2.11-1.13.6.jar:1.13.6]
        at 
org.apache.flink.streaming.runtime.tasks.SourceOperatorStreamTask$AsyncDataOutputToOutput.emitRecord(SourceOperatorStreamTask.java:188)
 ~[flink-dist_2.11-1.13.6.jar:1.13.6]
        at 
org.apache.flink.streaming.api.operators.source.SourceOutputWithWatermarks.collect(SourceOutputWithWatermarks.java:110)
 ~[flink-dist_2.11-1.13.6.jar:1.13.6]
        at 
org.apache.flink.connector.kafka.source.reader.KafkaRecordEmitter.emitRecord(KafkaRecordEmitter.java:36)
 ~[?:?]
        at 
org.apache.flink.connector.kafka.source.reader.KafkaRecordEmitter.emitRecord(KafkaRecordEmitter.java:27)
 ~[?:?]
        at 
org.apache.flink.connector.base.source.reader.SourceReaderBase.pollNext(SourceReaderBase.java:128)
 ~[flink-table-blink_2.11-1.13.6.jar:1.13.6]
        at 
org.apache.flink.streaming.api.operators.SourceOperator.emitNext(SourceOperator.java:305)
 ~[flink-dist_2.11-1.13.6.jar:1.13.6]
        at 
org.apache.flink.streaming.runtime.io.StreamTaskSourceInput.emitNext(StreamTaskSourceInput.java:69)
 ~[flink-dist_2.11-1.13.6.jar:1.13.6]
        at 
org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:66)
 ~[flink-dist_2.11-1.13.6.jar:1.13.6]
        at 
org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:423)
 ~[flink-dist_2.11-1.13.6.jar:1.13.6]
        at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:204)
 ~[flink-dist_2.11-1.13.6.jar:1.13.6]
        at 
org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:684)
 ~[flink-dist_2.11-1.13.6.jar:1.13.6]
        at 
org.apache.flink.streaming.runtime.tasks.StreamTask.executeInvoke(StreamTask.java:639)
 ~[flink-dist_2.11-1.13.6.jar:1.13.6]
        at 
org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:650)
 ~[flink-dist_2.11-1.13.6.jar:1.13.6]
        at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:623) 
~[flink-dist_2.11-1.13.6.jar:1.13.6]
        at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:779) 
~[flink-dist_2.11-1.13.6.jar:1.13.6]
        at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566) 
~[flink-dist_2.11-1.13.6.jar:1.13.6]
        at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_332]
   2023-03-06 20:52:42.992 [flink-akka.actor.default-dispatcher-18] INFO  
org.apache.flink.runtime.resourcemanager.slotmanager.DeclarativeSlotManager [] 
- Received resource requirements from job 00000000000000000000000000000000: 
[ResourceRequirement{resourceProfile=ResourceProfile{UNKNOWN}, 
numberOfRequiredSlots=23}]
   2023-03-06 20:52:42.994 [flink-akka.actor.default-dispatcher-24] WARN  
org.apache.hudi.sink.StreamWriteOperatorCoordinator [] - Reset the event for 
task [16]
   org.apache.hudi.exception.HoodieException: Timeout(601000ms) while waiting 
for instant initialize
        at org.apache.hudi.sink.utils.TimeWait.waitFor(TimeWait.java:57) ~[?:?]
        at 
org.apache.hudi.sink.common.AbstractStreamWriteFunction.instantToWrite(AbstractStreamWriteFunction.java:276)
 ~[?:?]
        at 
org.apache.hudi.sink.append.AppendWriteFunction.initWriterHelper(AppendWriteFunction.java:110)
 ~[?:?]
        at 
org.apache.hudi.sink.append.AppendWriteFunction.processElement(AppendWriteFunction.java:84)
 ~[?:?]
        at 
org.apache.flink.streaming.api.operators.ProcessOperator.processElement(ProcessOperator.java:66)
 ~[flink-dist_2.11-1.13.6.jar:1.13.6]
        at 
org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.pushToOperator(CopyingChainingOutput.java:71)
 ~[flink-dist_2.11-1.13.6.jar:1.13.6]
        at 
org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.collect(CopyingChainingOutput.java:46)
 ~[flink-dist_2.11-1.13.6.jar:1.13.6]
        at 
org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.collect(CopyingChainingOutput.java:26)
 ~[flink-dist_2.11-1.13.6.jar:1.13.6]
        at 
org.apache.flink.streaming.runtime.tasks.SourceOperatorStreamTask$AsyncDataOutputToOutput.emitRecord(SourceOperatorStreamTask.java:188)
 ~[flink-dist_2.11-1.13.6.jar:1.13.6]
        at 
org.apache.flink.streaming.api.operators.source.SourceOutputWithWatermarks.collect(SourceOutputWithWatermarks.java:110)
 ~[flink-dist_2.11-1.13.6.jar:1.13.6]
        at 
org.apache.flink.connector.kafka.source.reader.KafkaRecordEmitter.emitRecord(KafkaRecordEmitter.java:36)
 ~[?:?]
        at 
org.apache.flink.connector.kafka.source.reader.KafkaRecordEmitter.emitRecord(KafkaRecordEmitter.java:27)
 ~[?:?]
        at 
org.apache.flink.connector.base.source.reader.SourceReaderBase.pollNext(SourceReaderBase.java:128)
 ~[flink-table-blink_2.11-1.13.6.jar:1.13.6]
        at 
org.apache.flink.streaming.api.operators.SourceOperator.emitNext(SourceOperator.java:305)
 ~[flink-dist_2.11-1.13.6.jar:1.13.6]
        at 
org.apache.flink.streaming.runtime.io.StreamTaskSourceInput.emitNext(StreamTaskSourceInput.java:69)
 ~[flink-dist_2.11-1.13.6.jar:1.13.6]
        at 
org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:66)
 ~[flink-dist_2.11-1.13.6.jar:1.13.6]
        at 
org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:423)
 ~[flink-dist_2.11-1.13.6.jar:1.13.6]
        at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:204)
 ~[flink-dist_2.11-1.13.6.jar:1.13.6]
        at 
org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:684)
 ~[flink-dist_2.11-1.13.6.jar:1.13.6]
        at 
org.apache.flink.streaming.runtime.tasks.StreamTask.executeInvoke(StreamTask.java:639)
 ~[flink-dist_2.11-1.13.6.jar:1.13.6]
        at 
org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:650)
 ~[flink-dist_2.11-1.13.6.jar:1.13.6]
        at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:623) 
~[flink-dist_2.11-1.13.6.jar:1.13.6]
        at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:779) 
~[flink-dist_2.11-1.13.6.jar:1.13.6]
        at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566) 
~[flink-dist_2.11-1.13.6.jar:1.13.6]
        at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_332]
   ```
   
![1678149169351](https://user-images.githubusercontent.com/30795397/223288091-d84c7701-9817-4ae9-bc02-3587f0bc36f3.png)
   
   
   Number of files in a single partition:
   
![1678149414548](https://user-images.githubusercontent.com/30795397/223288556-6d489cae-cc03-4f52-9d0d-a03487157046.png)
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.