[ 
https://issues.apache.org/jira/browse/BEAM-5759?focusedWorklogId=154720&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154720
 ]

ASF GitHub Bot logged work on BEAM-5759:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 16/Oct/18 11:15
            Start Date: 16/Oct/18 11:15
    Worklog Time Spent: 10m 
      Work Description: andrew-flumaion opened a new pull request #6702: 
[BEAM-5759] Ensuring JmsIO checkpoint state is accessed and modified safely
URL: https://github.com/apache/beam/pull/6702
 
 
   As described in [BEAM-5759](https://jira.apache.org/jira/browse/BEAM-5759), 
a ConcurrentModificationException can be thrown when JmsIO source checkpoint 
finalization runs while further messages are being added for finalization. 
This happens because the state of JmsCheckpointMark is not thread-safe.
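
For illustration, the failure mode can be reproduced without any threads, because `ArrayList` iterators are fail-fast: any structural modification made while an iteration is in progress makes the next call to `next()` throw. The sketch below is not the JmsIO code; the class name `FailFastDemo` and the message strings are hypothetical, and the in-loop `add` merely stands in for a reader thread adding a message while finalization iterates.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

// Hypothetical demo class, not part of Beam.
class FailFastDemo {
  /** Returns true if adding to the list while iterating over it throws. */
  static boolean addDuringIterationThrows() {
    List<String> pending = new ArrayList<>();
    pending.add("m1");
    pending.add("m2");
    try {
      for (String message : pending) {
        // Stands in for another thread adding a message mid-finalization.
        pending.add("late-" + message);
      }
      return false;
    } catch (ConcurrentModificationException e) {
      // The iterator detected the modification on its next call to next().
      return true;
    }
  }
}
```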
   
   This change adds a unit test which demonstrates the issue by
   
   1. Adding several messages to a queue.
   1. Reading some of these messages with an UnboundedJmsReader (which adds 
messages for finalization).
   1. Calling finalizeCheckpoint on the reader's checkpoint mark in a separate 
thread.
   1. Reading the remainder of the messages in parallel with the above 
finalization.
   
   In order to reliably reproduce this issue, the test uses a JMS consumer 
which decorates received messages with a callback which sleeps for a short 
period, introducing some "lag" into the finalization process.
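
A rough sketch of that kind of delaying decorator is shown here. It is an assumption about the general shape only, not the test code from this pull request: `Ack` is a hypothetical stand-in for the acknowledgement side of a JMS message, and `SlowAck` and `delayMillis` are made-up names.

```java
// Hypothetical stand-in for the acknowledgement side of a JMS message.
interface Ack {
  void acknowledge();
}

// Decorator that sleeps before delegating, widening the window in which
// finalization and further reads can overlap.
class SlowAck implements Ack {
  private final Ack delegate;
  private final long delayMillis;

  SlowAck(Ack delegate, long delayMillis) {
    this.delegate = delegate;
    this.delayMillis = delayMillis;
  }

  @Override
  public void acknowledge() {
    try {
      Thread.sleep(delayMillis);
    } catch (InterruptedException e) {
      // Preserve the interrupt for the caller, then still acknowledge.
      Thread.currentThread().interrupt();
    }
    delegate.acknowledge();
  }
}
```

Wrapping each received message this way makes every acknowledgement during finalization take at least roughly `delayMillis`, so a concurrent read is much more likely to land in the middle of the iteration.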
   
   To fix the issue, JmsCheckpointMark has been updated with thread safety 
constructs. In order to avoid potentially long blocks for consumers during 
checkpoint finalization, a read/write lock is used and finalization takes a 
snapshot of state before processing in a lock-free manner, only locking to 
update state post-finalization.
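
The locking scheme described above could look roughly like the following. This is a simplified sketch under stated assumptions, not the actual JmsCheckpointMark change: `SafeCheckpointSketch`, `pendingCount` and the `acknowledge` callback are hypothetical names, and the sketch assumes that only one finalization runs at a time and that messages are only ever appended.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Consumer;

// Hypothetical, simplified checkpoint mark; not the Beam implementation.
class SafeCheckpointSketch<T> {
  private final List<T> pending = new ArrayList<>();
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

  /** Called by the reader for each message that awaits finalization. */
  void add(T message) {
    lock.writeLock().lock();
    try {
      pending.add(message);
    } finally {
      lock.writeLock().unlock();
    }
  }

  int pendingCount() {
    lock.readLock().lock();
    try {
      return pending.size();
    } finally {
      lock.readLock().unlock();
    }
  }

  /** Assumes a single finalizer at a time and append-only additions. */
  void finalizeCheckpoint(Consumer<T> acknowledge) {
    List<T> snapshot;
    lock.readLock().lock();
    try {
      snapshot = new ArrayList<>(pending); // copy under the read lock
    } finally {
      lock.readLock().unlock();
    }
    for (T message : snapshot) {
      acknowledge.accept(message); // no lock held while acknowledging
    }
    lock.writeLock().lock();
    try {
      // Drop only the acknowledged prefix; later additions stay pending.
      pending.subList(0, snapshot.size()).clear();
    } finally {
      lock.writeLock().unlock();
    }
  }
}
```

Because acknowledgement runs on a private copy, a slow acknowledgement never blocks `add`, and messages added during finalization simply remain pending for the next checkpoint.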
   
   @jbonofre 
   


Issue Time Tracking
-------------------

            Worklog Id:     (was: 154720)
            Time Spent: 10m
    Remaining Estimate: 0h

> ConcurrentModificationException on JmsIO checkpoint finalization
> ----------------------------------------------------------------
>
>                 Key: BEAM-5759
>                 URL: https://issues.apache.org/jira/browse/BEAM-5759
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-jms
>    Affects Versions: 2.8.0
>            Reporter: Andrew Fulton
>            Assignee: Andrew Fulton
>             Fix For: 2.9.0
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> When reading from a JmsIO source, a ConcurrentModificationException can be 
> thrown when checkpoint finalization occurs under heavy load.
> For example:
> jsonPayload: {
>   exception: "java.util.ConcurrentModificationException
>     at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:903)
>     at java.util.ArrayList$Itr.next(ArrayList.java:853)
>     at org.apache.beam.sdk.io.jms.JmsCheckpointMark.finalizeCheckpoint(JmsCheckpointMark.java:65)
>     at com.google.cloud.dataflow.worker.StreamingModeExecutionContext$1.run(StreamingModeExecutionContext.java:379)
>     at com.google.cloud.dataflow.worker.StreamingDataflowWorker$8.run(StreamingDataflowWorker.java:846)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> "
>   job: "2018-09-27_08_55_18-6454085774348718625"
>   logger: "com.google.cloud.dataflow.worker.StreamingDataflowWorker"
>   message: "Source checkpoint finalization failed:"
>   thread: "309"
>   work: "<nil>"
>   worker: "test-andrew-092715504-09270855-tkfp-harness-dnmb"
>  
> Looking at the JmsCheckpointMark code, it appears that access to the pending 
> message list is unprotected. If a thread calls finalizeCheckpoint while a 
> separate processing thread adds more messages to the checkpoint mark's list, 
> the iteration in finalizeCheckpoint throws a ConcurrentModificationException.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
