[ 
https://issues.apache.org/jira/browse/BEAM-8101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17211791#comment-17211791
 ] 

Michael Luckey commented on BEAM-8101:
--------------------------------------

[~bhulette]
The test case is pretty strict in its assumptions of call order. I.e. it 
assumes, that lifecycle methods are called 'in order', especially that no 
method on the DoFn is called after teardown was invoked. This is obviously not 
the case.

Adding tracing reveals the problem

{noformat}
testTeardownCalledAfterExceptionInFinishBundleStateful

(method id sys.nanos thread)

>> @Setup  1937713687 665587577337000 on Thread[direct-runner-worker,5,main]
>> @Setup  741115650 665587577327413 on Thread[direct-runner-worker,5,main]
<< @Setup  1937713687 665587577546140 on Thread[direct-runner-worker,5,main]
<< @Setup  741115650 665587578096879 on Thread[direct-runner-worker,5,main]
>> @StartBundle  1937713687 665587579926913 on 
>> Thread[direct-runner-worker,5,main]
>> @StartBundle  741115650 665587579926951 on 
>> Thread[direct-runner-worker,5,main]
<< @StartBundle  741115650 665587580100510 on 
Thread[direct-runner-worker,5,main]
<< @StartBundle  1937713687 665587580100510 on 
Thread[direct-runner-worker,5,main]
>> @ProcessElement  1937713687 665587581241706 on 
>> Thread[direct-runner-worker,5,main]
>> @ProcessElement  741115650 665587581241896 on 
>> Thread[direct-runner-worker,5,main]
<< @ProcessElement  741115650 665587581404196 on 
Thread[direct-runner-worker,5,main]
<< @ProcessElement  1937713687 665587581396842 on 
Thread[direct-runner-worker,5,main]
>> @FinishBundle  741115650 665587584333444 on 
>> Thread[direct-runner-worker,5,main]
>> @FinishBundle  1937713687 665587584333444 on 
>> Thread[direct-runner-worker,5,main]
>> @Teardown  1937713687 665587584758702 on Thread[direct-runner-worker,5,main]
>> @Teardown  741115650 665587584745740 on Thread[direct-runner-worker,5,main]
<< @Teardown  1937713687 665587584875369 on Thread[direct-runner-worker,5,main]
<< @Teardown  741115650 665587584923628 on Thread[direct-runner-worker,5,main]
>> @Setup  1677746187 665587586081857 on Thread[direct-runner-worker,5,main]
>> @Setup  1659593841 665587586081571 on Thread[direct-runner-worker,5,main]
<< @Setup  1677746187 665587586534188 on Thread[direct-runner-worker,5,main]
<< @Setup  1659593841 665587586534148 on Thread[direct-runner-worker,5,main]
>> @Teardown  1659593841 665587586744314 on Thread[direct-runner-worker,5,main] 
>>                    <<<<<<<<<<<< (1)
>> @StartBundle  1659593841 665587587048231 on 
>> Thread[direct-runner-worker,5,main]                <<<<<<<<<<<< (2)
>> @StartBundle  1677746187 665587587048217 on 
>> Thread[direct-runner-worker,5,main]
<< @Teardown  1659593841 665587587497000 on Thread[direct-runner-worker,5,main] 
                   <<<<<<<<<<<< (3)
<< @StartBundle  1677746187 665587587334248 on 
Thread[direct-runner-worker,5,main]                 <<<<<<<<<<<< (4)
call state retraction at 665587587530066 on 1659593841: TEARDOWN -> START_BUNDLE
>> @Teardown  1677746187 665587587639067 on Thread[direct-runner-worker,5,main] 
>>                    <<<<<<<<<<<< (1)
>> @ProcessElement  1677746187 665587587927864 on 
>> Thread[direct-runner-worker,5,main].        <<<<<<<<<<<< (2)
<< @Teardown  1677746187 665587588127396 on Thread[direct-runner-worker,5,main] 
                   <<<<<<<<<<<<< (3)
{noformat}

As clearly can be seen, while teardown is invoked (1,3), another lifecycle gets 
called concurrently (2). This leads to the Allstate teardown been overwritten 
(in that case to start bundle) which causes the test to fail.

Now, obviously it is problematic to call any method after teardown has been 
called. This most certainly will cause issues. On the other hand, it might be 
tolerable.

Here someone with deeper understanding of the beam model and DoFns lifecycle 
constraints should chime in and clarify, if the tests implementation needs to 
be relaxed - e.g. we might consider it sufficient if teardown is called and do 
not update callstate after - or whether the DoFn lifecycle constraints should 
be considered violated, i.e. something on the runners need to be fixed to 
prevent above call patterns.

> ParDoLifecycleTest is flaky
> ---------------------------
>
>                 Key: BEAM-8101
>                 URL: https://issues.apache.org/jira/browse/BEAM-8101
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-direct, runner-flink, runner-spark
>    Affects Versions: 2.15.0, 2.22.0
>            Reporter: Jan Lukavský
>            Assignee: Michael Luckey
>            Priority: P1
>              Labels: flake, flaky, flaky-test
>
> Temporary fail from Jenkins:
> {code}
> java.lang.AssertionError: Function should have been torn down after exception
> Expected: is <TEARDOWN>
>      but: was <START_BUNDLE>
>       at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
>       at 
> org.apache.beam.sdk.transforms.ParDoLifecycleTest.lambda$validate$0(ParDoLifecycleTest.java:266)
>       at 
> java.util.concurrent.ConcurrentHashMap$ValuesView.forEach(ConcurrentHashMap.java:4707)
>       at 
> org.apache.beam.sdk.transforms.ParDoLifecycleTest.validate(ParDoLifecycleTest.java:264)
>       at 
> org.apache.beam.sdk.transforms.ParDoLifecycleTest.testTeardownCalledAfterExceptionInProcessElementStateful(ParDoLifecycleTest.java:253)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>       at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>       at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>       at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>       at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>       at 
> org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:319)
>       at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>       at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:349)
>       at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>       at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>       at org.junit.runners.ParentRunner$3.run(ParentRunner.java:314)
>       at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>       at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:312)
>       at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>       at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:292)
>       at org.junit.runners.ParentRunner.run(ParentRunner.java:396)
>       at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:110)
>       at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
>       at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
>       at 
> org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:62)
>       at 
> org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
>       at sun.reflect.GeneratedMethodAccessor31.invoke(Unknown Source)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>       at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>       at 
> org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
>       at 
> org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
>       at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
>       at 
> org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:118)
>       at sun.reflect.GeneratedMethodAccessor30.invoke(Unknown Source)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>       at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>       at 
> org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:175)
>       at 
> org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:157)
>       at 
> org.gradle.internal.remote.internal.hub.MessageHub$Handler.run(MessageHub.java:404)
>       at 
> org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:63)
>       at 
> org.gradle.internal.concurrent.ManagedExecutorImpl$1.run(ManagedExecutorImpl.java:46)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>       at 
> org.gradle.internal.concurrent.ThreadFactoryImpl$ManagedThreadRunnable.run(ThreadFactoryImpl.java:55)
>       at java.lang.Thread.run(Thread.java:748)
> Standard Error
> Aug 26, 2019 2:47:16 PM org.apache.beam.runners.direct.ParDoEvaluatorFactory 
> createParDoEvaluator
> SEVERE: Exception encountered while cleaning up in ParDo evaluator 
> construction
> org.apache.beam.sdk.util.UserCodeException: java.lang.AssertionError: 
> Expected to have a processing method throw an exception
>       at 
> org.apache.beam.sdk.util.UserCodeException.wrap(UserCodeException.java:34)
>       at 
> org.apache.beam.sdk.transforms.ParDoLifecycleTest$ExceptionThrowingFn$DoFnInvoker.invokeTeardown(Unknown
>  Source)
>       at 
> org.apache.beam.runners.direct.DoFnLifecycleManager$TeardownRemovedFnListener.onRemoval(DoFnLifecycleManager.java:113)
>       at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.processPendingNotifications(LocalCache.java:1809)
>       at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.runUnlockedCleanup(LocalCache.java:3462)
>       at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.postWriteCleanup(LocalCache.java:3438)
>       at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.remove(LocalCache.java:3072)
>       at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.remove(LocalCache.java:4236)
>       at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalManualCache.invalidate(LocalCache.java:4899)
>       at 
> org.apache.beam.runners.direct.DoFnLifecycleManager.remove(DoFnLifecycleManager.java:66)
>       at 
> org.apache.beam.runners.direct.ParDoEvaluatorFactory.createParDoEvaluator(ParDoEvaluatorFactory.java:168)
>       at 
> org.apache.beam.runners.direct.ParDoEvaluatorFactory.createEvaluator(ParDoEvaluatorFactory.java:121)
>       at 
> org.apache.beam.runners.direct.ParDoEvaluatorFactory.forApplication(ParDoEvaluatorFactory.java:79)
>       at 
> org.apache.beam.runners.direct.TransformEvaluatorRegistry.forApplication(TransformEvaluatorRegistry.java:169)
>       at 
> org.apache.beam.runners.direct.DirectTransformExecutor.run(DirectTransformExecutor.java:117)
>       at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>       at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.AssertionError: Expected to have a processing method 
> throw an exception
>       at org.junit.Assert.fail(Assert.java:89)
>       at 
> org.apache.beam.sdk.transforms.ParDoLifecycleTest$ExceptionThrowingFn.after(ParDoLifecycleTest.java:398)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to