[ https://issues.apache.org/jira/browse/BEAM-14187?focusedWorklogId=749631&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-749631 ]
ASF GitHub Bot logged work on BEAM-14187:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 29/Mar/22 21:01
            Start Date: 29/Mar/22 21:01
    Worklog Time Spent: 10m
      Work Description:
lukecwik commented on pull request #17201:
URL: https://github.com/apache/beam/pull/17201#issuecomment-1082369009

   R: @lukecwik

Issue Time Tracking
-------------------

    Worklog Id:     (was: 749631)
    Time Spent: 0.5h  (was: 20m)

> NullPointerException or IllegalStateException at IsmReaderImpl in Dataflow
> --------------------------------------------------------------------------
>
>                 Key: BEAM-14187
>                 URL: https://issues.apache.org/jira/browse/BEAM-14187
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-dataflow
>            Reporter: Minbo Bae
>            Priority: P2
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> h6. Problem
> Dataflow Java batch jobs with large side inputs intermittently throw
> {{NullPointerException}} or {{IllegalStateException}}.
> * [NullPointerException|https://gist.githubusercontent.com/baeminbo/459e283eadbc7752c9f23616b52d958a/raw/f0480b8eaff590fb3f3ae2ab98ddce7dd3b4a237/npe.png]
> happens at
> [IsmReaderImpl.overKeyComponents|https://github.com/apache/beam/blob/v2.37.0/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/IsmReaderImpl.java#L217].
> * [IllegalStateException|https://gist.githubusercontent.com/baeminbo/459e283eadbc7752c9f23616b52d958a/raw/f0480b8eaff590fb3f3ae2ab98ddce7dd3b4a237/IllegalStateException.png]
> happens at
> [IsmReaderImpl.initializeForKeyedRead|https://github.com/apache/beam/blob/v2.37.0/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/IsmReaderImpl.java#L500].
>
> (All error logs from the Dataflow job are
> [here|https://gist.githubusercontent.com/baeminbo/459e283eadbc7752c9f23616b52d958a/raw/f0480b8eaff590fb3f3ae2ab98ddce7dd3b4a237/downloaded-logs-20220327-171955.json].)
>
> h6. Hypothesis
> {{initializeForKeyedRead}} is not synchronized. Multiple threads can
> enter the method, initialize the index for the same shard, and update
> {{indexPerShard}} without synchronization. {{overKeyComponents}} also
> accesses {{indexPerShard}} without synchronization. Because
> {{indexPerShard}} is a plain {{HashMap}}, which is not thread-safe, these
> concurrent accesses can cause the {{NullPointerException}} and
> {{IllegalStateException}} above.
>
> h6. Suggestion
> I think changing the type of {{indexPerShard}} to a thread-safe map
> (e.g. {{ConcurrentHashMap}}) can fix this issue.
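
To illustrate the suggestion, here is a minimal sketch of a per-shard index cache backed by a ConcurrentHashMap. It is not the actual IsmReaderImpl code and not necessarily what pull request #17201 does: the class ShardIndexCache, the methods indexFor, loadIndex, loadCount and demo, and the List<String> index type are all hypothetical names chosen for the example. Only the map name indexPerShard is taken from the issue.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a per-shard index cache; not the real IsmReaderImpl. */
class ShardIndexCache {
  // A ConcurrentHashMap (instead of a plain HashMap) tolerates concurrent reads and writes.
  private final Map<Integer, List<String>> indexPerShard = new ConcurrentHashMap<>();

  // Counts how many times an index was actually built (for the demo only).
  private final AtomicInteger loads = new AtomicInteger();

  /** Returns the index for a shard, building it at most once even under contention. */
  List<String> indexFor(int shardId) {
    // computeIfAbsent on ConcurrentHashMap is atomic per key.
    return indexPerShard.computeIfAbsent(shardId, this::loadIndex);
  }

  // Hypothetical loader; a real reader would read the shard index from the file.
  private List<String> loadIndex(int shardId) {
    loads.incrementAndGet();
    return Collections.singletonList("index-for-shard-" + shardId);
  }

  int loadCount() {
    return loads.get();
  }

  /** Has several threads request every shard, then returns the number of index builds. */
  static int demo(int threads, int shards) {
    ShardIndexCache cache = new ShardIndexCache();
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    for (int t = 0; t < threads; t++) {
      pool.execute(() -> {
        for (int s = 0; s < shards; s++) {
          cache.indexFor(s);
        }
      });
    }
    pool.shutdown();
    try {
      if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
        throw new IllegalStateException("worker threads did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return cache.loadCount();
  }

  public static void main(String[] args) {
    System.out.println("index builds: " + demo(8, 16));
  }
}
```

Because computeIfAbsent applies the loader at most once per key, the number of index builds equals the number of shards regardless of how many threads race on the same shard. Whether the real fix should use computeIfAbsent, a synchronized block, or only a type change depends on how expensive and side-effecting the real index initialization is, which this sketch does not model.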