[
https://issues.apache.org/jira/browse/HUDI-2772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan updated HUDI-2772:
--------------------------------------
Description:
Even after setting the right config to copy over deltastreamer checkpoint,
deltastreamer fails to read the checkpoint from previous commit metadata. But
if deltastreamer is restarted, the exception is not seen and picks up the
checkpoint.
Setup:
Deltastreamer in continuous mode.
And triggered a concurrent write from spark-datasource.
I inspected the last commit.completed instant(that was reported by
deltastreamer) made by spark writer and it looks ok to me.
{code:java}
grep "checkpoint"
/tmp/hudi-deltastreamer-gh-mw/.hoodie/20211116074129737.deltacommit
"deltastreamer.checkpoint.key" : "1637066483000" {code}
But after the below exception, if I restart deltastreamer, it just runs fine.
Very strange? I was able to reprod this 2 times out of 5.
here is the checkpoint from last delta commit by deltastreamer (which matches
the entry found by delta commit by spark writer above)
{code:java}
grep "checkpoint"
/tmp/hudi-deltastreamer-gh-mw/.hoodie/20211116074123384.deltacommit
"deltastreamer.checkpoint.key" : "1637066483000" {code}
I also check detlastreamer code and we do look at only completed instants and
the completed commit metadata. So, not sure why is this happening.
stacktrace:
{code:java}
21/11/16 07:41:31 ERROR HoodieDeltaStreamer: Shutting down delta-sync due to
exception
org.apache.hudi.utilities.exception.HoodieDeltaStreamerException: Unable to
find previous checkpoint. Please double check if this table was indeed built
via delta streamer. Last Commit
:Option{val=[20211116074129737__deltacommit__COMPLETED]}, Instants
:[[20211116072710523__deltacommit__COMPLETED],
[20211116072748906__deltacommit__COMPLETED],
[20211116072910768__deltacommit__COMPLETED],
[20211116074114874__deltacommit__COMPLETED],
[20211116074123384__deltacommit__COMPLETED],
[20211116074129737__deltacommit__COMPLETED]], CommitMetadata={
"partitionToWriteStats" : { },
"compacted" : false,
"extraMetadata" : { },
"operationType" : "UNKNOWN",
"fileIdAndRelativePaths" : { },
"totalRecordsDeleted" : 0,
"totalLogRecordsCompacted" : 0,
"totalLogFilesCompacted" : 0,
"totalCompactedRecordsUpdated" : 0,
"totalLogFilesSize" : 0,
"totalScanTime" : 0,
"totalCreateTime" : 0,
"totalUpsertTime" : 0,
"minAndMaxEventTime" : {
"Optional.empty" : {
"val" : null,
"present" : false
}
},
"writePartitionPaths" : [ ]
}
at
org.apache.hudi.utilities.deltastreamer.DeltaSync.readFromSource(DeltaSync.java:346)
at
org.apache.hudi.utilities.deltastreamer.DeltaSync.syncOnce(DeltaSync.java:281)
at
org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.lambda$startService$0(HoodieDeltaStreamer.java:634)
at
java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
21/11/16 07:41:31 WARN HoodieDeltaStreamer: Gracefully shutting down compactor
21/11/16 07:41:39 WARN SparkRDDWriteClient: Slept for 20 secs, proceeding
21/11/16 07:41:40 ERROR HoodieAsyncService: Monitor noticed one or more threads
failed. Requesting graceful shutdown of other threads
java.util.concurrent.ExecutionException:
org.apache.hudi.exception.HoodieException: Unable to find previous checkpoint.
Please double check if this table was indeed built via delta streamer. Last
Commit :Option{val=[20211116074129737__deltacommit__COMPLETED]}, Instants
:[[20211116072710523__deltacommit__COMPLETED],
[20211116072748906__deltacommit__COMPLETED],
[20211116072910768__deltacommit__COMPLETED],
[20211116074114874__deltacommit__COMPLETED],
[20211116074123384__deltacommit__COMPLETED],
[20211116074129737__deltacommit__COMPLETED]], CommitMetadata={{code}
was:
Even after setting the right config to copy over deltastreamer checkpoint,
deltastreamer fails to read the checkpoint from previous commit metadata
intermittantly.
Setup:
Deltastreamer in continuous mode.
And triggered a concurrent write from spark-datasource.
I inspected the last commit.completed instant(that was reported by
deltastreamer) made by spark writer and it looks ok to me.
{code:java}
grep "checkpoint"
/tmp/hudi-deltastreamer-gh-mw/.hoodie/20211116074129737.deltacommit
"deltastreamer.checkpoint.key" : "1637066483000" {code}
But after the below exception, if I restart deltastreamer, it just runs fine.
Very strange? I was able to reprod this 2 times out of 5.
here is the checkpoint from last delta commit by deltastreamer (which matches
the entry found by delta commit by spark writer above)
{code:java}
grep "checkpoint"
/tmp/hudi-deltastreamer-gh-mw/.hoodie/20211116074123384.deltacommit
"deltastreamer.checkpoint.key" : "1637066483000" {code}
I also check detlastreamer code and we do look at only completed instants and
the completed commit metadata. So, not sure why is this happening.
stacktrace:
{code:java}
21/11/16 07:41:31 ERROR HoodieDeltaStreamer: Shutting down delta-sync due to
exception
org.apache.hudi.utilities.exception.HoodieDeltaStreamerException: Unable to
find previous checkpoint. Please double check if this table was indeed built
via delta streamer. Last Commit
:Option{val=[20211116074129737__deltacommit__COMPLETED]}, Instants
:[[20211116072710523__deltacommit__COMPLETED],
[20211116072748906__deltacommit__COMPLETED],
[20211116072910768__deltacommit__COMPLETED],
[20211116074114874__deltacommit__COMPLETED],
[20211116074123384__deltacommit__COMPLETED],
[20211116074129737__deltacommit__COMPLETED]], CommitMetadata={
"partitionToWriteStats" : { },
"compacted" : false,
"extraMetadata" : { },
"operationType" : "UNKNOWN",
"fileIdAndRelativePaths" : { },
"totalRecordsDeleted" : 0,
"totalLogRecordsCompacted" : 0,
"totalLogFilesCompacted" : 0,
"totalCompactedRecordsUpdated" : 0,
"totalLogFilesSize" : 0,
"totalScanTime" : 0,
"totalCreateTime" : 0,
"totalUpsertTime" : 0,
"minAndMaxEventTime" : {
"Optional.empty" : {
"val" : null,
"present" : false
}
},
"writePartitionPaths" : [ ]
}
at
org.apache.hudi.utilities.deltastreamer.DeltaSync.readFromSource(DeltaSync.java:346)
at
org.apache.hudi.utilities.deltastreamer.DeltaSync.syncOnce(DeltaSync.java:281)
at
org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.lambda$startService$0(HoodieDeltaStreamer.java:634)
at
java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
21/11/16 07:41:31 WARN HoodieDeltaStreamer: Gracefully shutting down compactor
21/11/16 07:41:39 WARN SparkRDDWriteClient: Slept for 20 secs, proceeding
21/11/16 07:41:40 ERROR HoodieAsyncService: Monitor noticed one or more threads
failed. Requesting graceful shutdown of other threads
java.util.concurrent.ExecutionException:
org.apache.hudi.exception.HoodieException: Unable to find previous checkpoint.
Please double check if this table was indeed built via delta streamer. Last
Commit :Option{val=[20211116074129737__deltacommit__COMPLETED]}, Instants
:[[20211116072710523__deltacommit__COMPLETED],
[20211116072748906__deltacommit__COMPLETED],
[20211116072910768__deltacommit__COMPLETED],
[20211116074114874__deltacommit__COMPLETED],
[20211116074123384__deltacommit__COMPLETED],
[20211116074129737__deltacommit__COMPLETED]], CommitMetadata={{code}
> Deltastreamer fails to read checkpoint from previous commit metadata by spark
> writer on continuous mode
> -------------------------------------------------------------------------------------------------------
>
> Key: HUDI-2772
> URL: https://issues.apache.org/jira/browse/HUDI-2772
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: sivabalan narayanan
> Assignee: sivabalan narayanan
> Priority: Critical
>
> Even after setting the right config to copy over deltastreamer checkpoint,
> deltastreamer fails to read the checkpoint from previous commit metadata.
> But if deltastreamer is restarted, the exception is not seen and picks up the
> checkpoint.
> Setup:
> Deltastreamer in continuous mode.
> And triggered a concurrent write from spark-datasource.
>
> I inspected the last commit.completed instant(that was reported by
> deltastreamer) made by spark writer and it looks ok to me.
> {code:java}
> grep "checkpoint"
> /tmp/hudi-deltastreamer-gh-mw/.hoodie/20211116074129737.deltacommit
> "deltastreamer.checkpoint.key" : "1637066483000" {code}
> But after the below exception, if I restart deltastreamer, it just runs fine.
> Very strange? I was able to reprod this 2 times out of 5.
> here is the checkpoint from last delta commit by deltastreamer (which matches
> the entry found by delta commit by spark writer above)
> {code:java}
> grep "checkpoint"
> /tmp/hudi-deltastreamer-gh-mw/.hoodie/20211116074123384.deltacommit
> "deltastreamer.checkpoint.key" : "1637066483000" {code}
>
> I also check detlastreamer code and we do look at only completed instants and
> the completed commit metadata. So, not sure why is this happening.
> stacktrace:
> {code:java}
> 21/11/16 07:41:31 ERROR HoodieDeltaStreamer: Shutting down delta-sync due to
> exception
> org.apache.hudi.utilities.exception.HoodieDeltaStreamerException: Unable to
> find previous checkpoint. Please double check if this table was indeed built
> via delta streamer. Last Commit
> :Option{val=[20211116074129737__deltacommit__COMPLETED]}, Instants
> :[[20211116072710523__deltacommit__COMPLETED],
> [20211116072748906__deltacommit__COMPLETED],
> [20211116072910768__deltacommit__COMPLETED],
> [20211116074114874__deltacommit__COMPLETED],
> [20211116074123384__deltacommit__COMPLETED],
> [20211116074129737__deltacommit__COMPLETED]], CommitMetadata={
> "partitionToWriteStats" : { },
> "compacted" : false,
> "extraMetadata" : { },
> "operationType" : "UNKNOWN",
> "fileIdAndRelativePaths" : { },
> "totalRecordsDeleted" : 0,
> "totalLogRecordsCompacted" : 0,
> "totalLogFilesCompacted" : 0,
> "totalCompactedRecordsUpdated" : 0,
> "totalLogFilesSize" : 0,
> "totalScanTime" : 0,
> "totalCreateTime" : 0,
> "totalUpsertTime" : 0,
> "minAndMaxEventTime" : {
> "Optional.empty" : {
> "val" : null,
> "present" : false
> }
> },
> "writePartitionPaths" : [ ]
> }
> at
> org.apache.hudi.utilities.deltastreamer.DeltaSync.readFromSource(DeltaSync.java:346)
> at
> org.apache.hudi.utilities.deltastreamer.DeltaSync.syncOnce(DeltaSync.java:281)
> at
> org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.lambda$startService$0(HoodieDeltaStreamer.java:634)
> at
> java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> 21/11/16 07:41:31 WARN HoodieDeltaStreamer: Gracefully shutting down compactor
> 21/11/16 07:41:39 WARN SparkRDDWriteClient: Slept for 20 secs, proceeding
> 21/11/16 07:41:40 ERROR HoodieAsyncService: Monitor noticed one or more
> threads failed. Requesting graceful shutdown of other threads
> java.util.concurrent.ExecutionException:
> org.apache.hudi.exception.HoodieException: Unable to find previous
> checkpoint. Please double check if this table was indeed built via delta
> streamer. Last Commit
> :Option{val=[20211116074129737__deltacommit__COMPLETED]}, Instants
> :[[20211116072710523__deltacommit__COMPLETED],
> [20211116072748906__deltacommit__COMPLETED],
> [20211116072910768__deltacommit__COMPLETED],
> [20211116074114874__deltacommit__COMPLETED],
> [20211116074123384__deltacommit__COMPLETED],
> [20211116074129737__deltacommit__COMPLETED]], CommitMetadata={{code}
--
This message was sent by Atlassian Jira
(v8.20.1#820001)