[
https://issues.apache.org/jira/browse/FLINK-28817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated FLINK-28817:
-----------------------------------
Labels: pull-request-available (was: )
> NullPointerException in HybridSource when restoring from checkpoint
> -------------------------------------------------------------------
>
> Key: FLINK-28817
> URL: https://issues.apache.org/jira/browse/FLINK-28817
> Project: Flink
> Issue Type: Bug
> Components: Connectors / Common
> Affects Versions: 1.14.4, 1.15.1
> Reporter: Michael
> Priority: Major
> Labels: pull-request-available
> Attachments: Preconditions-checkNotNull-error.zip,
> bf-29-JM-err-analysis.log
>
>
> Scenario:
> # CheckpointCoordinator - Completed checkpoint 14 for job
> 00000000000000000000000000000000
> # HybridSource successfully completed processing a few SourceFactories, that
> reads from s3
> # HybridSourceSplitEnumerator.switchEnumerator failed with
> com.amazonaws.SdkClientException: Unable to execute HTTP request: Read timed
> out. This is intermittent error, it is usually fixed, when Flink recover from
> checkpoint & repeat the operation.
> # Flink starts recovering from checkpoint,
> # HybridSourceSplitEnumerator receives
> SourceReaderFinishedEvent\{sourceIndex=-1}
> # Processing this event cause
> 2022/08/08 08:39:34.862 ERROR o.a.f.r.s.c.SourceCoordinator - Uncaught
> exception in the SplitEnumerator for Source Source: hybrid-source while
> handling operator event SourceEventWrapper[SourceReaderFinishedEvent
> {sourceIndex=-1}
> ] from subtask 6. Triggering job failover.
> java.lang.NullPointerException: Source for index=0 is not available from
> sources: \{788=org.apache.flink.connector.file.src.SppFileSource@5a3803f3}
> at org.apache.flink.util.Preconditions.checkNotNull(Preconditions.java:104)
> at
> org.apache.flink.connector.base.source.hybridspp.SwitchedSources.sourceOf(SwitchedSources.java:36)
> at
> org.apache.flink.connector.base.source.hybridspp.HybridSourceSplitEnumerator.sendSwitchSourceEvent(HybridSourceSplitEnumerator.java:152)
> at
> org.apache.flink.connector.base.source.hybridspp.HybridSourceSplitEnumerator.handleSourceEvent(HybridSourceSplitEnumerator.java:226)
> ...
> I'm running my version of the Hybrid Sources with additional logging, so line
> numbers & some names could be different from Flink Github.
> My Observation: the problem is intermittent, sometimes it works ok, i.e.
> SourceReaderFinishedEvent comes with correct sourceIndex. As I see from my
> log, it happens if my SourceFactory.create() is executed BEFORE
> HybridSourceSplitEnumerator - handleSourceEvent
> SourceReaderFinishedEvent\{sourceIndex=-1}.
> If HybridSourceSplitEnumerator - handleSourceEvent is executed before my
> SourceFactory.create(), then sourceIndex=-1 in SourceReaderFinishedEvent
> Preconditions-checkNotNull-error log from JobMgr is attached
--
This message was sent by Atlassian Jira
(v8.20.10#820010)