Abacn commented on PR #26267: URL: https://github.com/apache/beam/pull/26267#issuecomment-1507573236
Verified that the unit test fails on current master and succeeds with the fix applied.

Current master:

```
Apr 13, 2023 4:25:34 PM org.apache.beam.runners.dataflow.worker.WorkerCustomSources performSplitTyped
WARNING: Splitting source <unknown> into bundles of estimated size 67108864 bytes produced 200 bundles, which have total serialized size 116265 bytes. As this is too large for the Google Cloud Dataflow API, retrying splitting once with increased desiredBundleSizeBytes 1560482414 to reduce the number of splits.
Apr 13, 2023 4:25:34 PM org.apache.beam.runners.dataflow.worker.WorkerCustomSources performSplitTyped
INFO: Splitting with desiredBundleSizeBytes 1560482414 produced 200 bundles with total serialized size 116265 bytes
Apr 13, 2023 4:25:34 PM org.apache.beam.runners.dataflow.worker.WorkerCustomSources performSplitTyped
WARNING: Splitting source <unknown> into bundles of estimated size 1560482414 bytes produced 200 bundles. Rebundling into 100 bundles.
Apr 13, 2023 4:25:34 PM org.apache.beam.runners.dataflow.worker.WorkerCustomSources performSplitTyped
INFO: Splitting source <unknown> produced 100 bundles with total serialized response size 58815
Total size of the BoundedSource objects generated by split() operation is larger than the allowable limit. When splitting <unknown> into bundles of 1560482414 bytes it generated 200 BoundedSource objects with total serialized size of 58815 bytes which is larger than the limit 10000. For more information, please check the corresponding FAQ entry at https://cloud.google.com/dataflow/pipelines/troubleshooting-your-pipeline
java.lang.IllegalArgumentException: Total size of the BoundedSource objects generated by split() operation is larger than the allowable limit. When splitting <unknown> into bundles of 1560482414 bytes it generated 200 BoundedSource objects with total serialized size of 58815 bytes which is larger than the limit 10000. For more information, please check the corresponding FAQ entry at https://cloud.google.com/dataflow/pipelines/troubleshooting-your-pipeline
    at org.apache.beam.runners.dataflow.worker.WorkerCustomSources.performSplitTyped(WorkerCustomSources.java:286)
    at org.apache.beam.runners.dataflow.worker.WorkerCustomSources.performSplitWithApiLimit(WorkerCustomSources.java:201)
    at org.apache.beam.runners.dataflow.worker.WorkerCustomSourcesTest.testSplittingProducedResponseUnderLimit(WorkerCustomSourcesTest.java:242)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:258)
    at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
    at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
    at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
    at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
    at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
    at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
    at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:110)
    at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
    at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
    at org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:62)
    at org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:36)
    at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
    at org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:33)
    at org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:94)
    at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
    at org.gradle.api.internal.tasks.testing.worker.TestWorker$2.run(TestWorker.java:176)
    at org.gradle.api.internal.tasks.testing.worker.TestWorker.executeAndMaintainThreadName(TestWorker.java:129)
    at org.gradle.api.internal.tasks.testing.worker.TestWorker.execute(TestWorker.java:100)
    at org.gradle.api.internal.tasks.testing.worker.TestWorker.execute(TestWorker.java:60)
    at org.gradle.process.internal.worker.child.ActionExecutionWorker.execute(ActionExecutionWorker.java:56)
    at org.gradle.process.internal.worker.child.SystemApplicationClassLoaderWorker.call(SystemApplicationClassLoaderWorker.java:133)
    at org.gradle.process.internal.worker.child.SystemApplicationClassLoaderWorker.call(SystemApplicationClassLoaderWorker.java:71)
    at worker.org.gradle.process.internal.worker.GradleWorkerMain.run(GradleWorkerMain.java:69)
    at worker.org.gradle.process.internal.worker.GradleWorkerMain.main(GradleWorkerMain.java:74)
```

With the fix:

```
Apr 13, 2023 4:27:05 PM org.apache.beam.runners.dataflow.worker.WorkerCustomSources performSplitTyped
WARNING: Splitting source <unknown> into bundles of estimated size 67108864 bytes produced 200 bundles, which have total serialized size 116265 bytes. As this is too large for the Google Cloud Dataflow API, retrying splitting once with increased desiredBundleSizeBytes 1560482414 to reduce the number of splits.
Apr 13, 2023 4:27:05 PM org.apache.beam.runners.dataflow.worker.WorkerCustomSources performSplitTyped
INFO: Splitting with desiredBundleSizeBytes 1560482414 produced 200 bundles with total serialized size 116265 bytes
Apr 13, 2023 4:27:05 PM org.apache.beam.runners.dataflow.worker.WorkerCustomSources performSplitTyped
WARNING: Re-bundle source <unknown> into bundles of estimated size 13330 bytes produced 16 bundles.
Apr 13, 2023 4:27:06 PM org.apache.beam.runners.dataflow.worker.WorkerCustomSources performSplitTyped
WARNING: Re-bundle source <unknown> into bundles of estimated size 9814 bytes produced 11 bundles.
Apr 13, 2023 4:27:06 PM org.apache.beam.runners.dataflow.worker.WorkerCustomSources performSplitTyped
INFO: Splitting source <unknown> produced 11 bundles with total serialized response size 9824
```
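For readers skimming the logs: on master the 200 bundles are rebundled once into 100, which still serializes to 58815 bytes against the test limit of 10000; with the fix, rebundling repeats (16 bundles, then 11) until the response fits. The sketch below illustrates that general idea only. It is not the code in `WorkerCustomSources`; the class `RebundleSketch`, the helper `fitBundleCount`, and the linear size model are all hypothetical, chosen to roughly resemble the numbers above.

```java
import java.util.function.IntToLongFunction;

public class RebundleSketch {

    /**
     * Hypothetical helper (not a Beam API): returns a bundle count whose
     * serialized response size fits under limitBytes.
     *
     * @param initialCount number of bundles produced by the initial split
     * @param limitBytes   maximum allowed serialized response size
     * @param sizeOf       maps a bundle count to the serialized size of a
     *                     response containing that many bundles
     */
    static int fitBundleCount(int initialCount, long limitBytes, IntToLongFunction sizeOf) {
        int count = initialCount;
        long size = sizeOf.applyAsLong(count);
        while (size > limitBytes && count > 1) {
            // Shrink the count in proportion to limit/size, and always drop
            // at least one bundle so the loop is guaranteed to terminate.
            int scaled = (int) (count * limitBytes / size);
            count = Math.max(1, Math.min(count - 1, scaled));
            size = sizeOf.applyAsLong(count);
        }
        return count;
    }

    public static void main(String[] args) {
        // Assumed toy cost model: fixed response overhead plus a constant
        // number of bytes per bundle. Real serialized sizes will differ.
        IntToLongFunction sizeOf = n -> 500L + 580L * n;
        int count = fitBundleCount(200, 10_000L, sizeOf);
        System.out.println(count + " bundles, " + sizeOf.applyAsLong(count) + " bytes");
    }
}
```

Under this toy model a single proportional step is not enough because of the fixed overhead, so the loop needs a second pass, which is the same shape as the two "Re-bundle" warnings in the log with the fix.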
