[ https://issues.apache.org/jira/browse/BEAM-6349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lukasz Gajowy closed BEAM-6349.
-------------------------------
    Fix Version/s: Not applicable
       Resolution: Fixed

> Exceptions (IllegalArgumentException or NoClassDefFoundError) when running tests on Dataflow runner
> ---------------------------------------------------------------------------------------------------
>
>                 Key: BEAM-6349
>                 URL: https://issues.apache.org/jira/browse/BEAM-6349
>             Project: Beam
>          Issue Type: Improvement
>          Components: testing
>            Reporter: Lukasz Gajowy
>            Assignee: Craig Chambers
>            Priority: Major
>             Fix For: Not applicable
>
>          Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Running GroupByKeyLoadTest results in the following error on Dataflow runner:
>
> {code:java}
> java.lang.ExceptionInInitializerError
>     at org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory$2.typedApply(IntrinsicMapTaskExecutorFactory.java:344)
>     at org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory$2.typedApply(IntrinsicMapTaskExecutorFactory.java:338)
>     at org.apache.beam.runners.dataflow.worker.graph.Networks$TypeSafeNodeFunction.apply(Networks.java:63)
>     at org.apache.beam.runners.dataflow.worker.graph.Networks$TypeSafeNodeFunction.apply(Networks.java:50)
>     at org.apache.beam.runners.dataflow.worker.graph.Networks.replaceDirectedNetworkNodes(Networks.java:87)
>     at org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory.create(IntrinsicMapTaskExecutorFactory.java:120)
>     at org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.doWork(BatchDataflowWorker.java:337)
>     at org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.getAndPerformWork(BatchDataflowWorker.java:291)
>     at org.apache.beam.runners.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.doWork(DataflowBatchWorkerHarness.java:135)
>     at org.apache.beam.runners.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:115)
>     at org.apache.beam.runners.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:102)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.IllegalArgumentException: Multiple entries with same key: kind:varint=org.apache.beam.runners.dataflow.util.CloudObjectTranslators$8@39b69c48 and kind:varint=org.apache.beam.runners.dataflow.worker.RunnerHarnessCoderCloudObjectTranslatorRegistrar$1@7966f294
>     at org.apache.beam.repackaged.beam_runners_google_cloud_dataflow_java.com.google.common.collect.ImmutableMap.checkNoConflict(ImmutableMap.java:136)
>     at org.apache.beam.repackaged.beam_runners_google_cloud_dataflow_java.com.google.common.collect.RegularImmutableMap.checkNoConflictInKeyBucket(RegularImmutableMap.java:100)
>     at org.apache.beam.repackaged.beam_runners_google_cloud_dataflow_java.com.google.common.collect.RegularImmutableMap.fromEntryArray(RegularImmutableMap.java:86)
>     at org.apache.beam.repackaged.beam_runners_google_cloud_dataflow_java.com.google.common.collect.ImmutableMap$Builder.build(ImmutableMap.java:300)
>     at org.apache.beam.runners.dataflow.util.CloudObjects.populateCloudObjectTranslators(CloudObjects.java:60)
>     at org.apache.beam.runners.dataflow.util.CloudObjects.<clinit>(CloudObjects.java:39)
>     ... 15 more
> {code}
>
> Example command to run the tests (FWIW, it also runs the "clean" task although I don't know if it's necessary):
> {code:java}
> ./gradlew clean :beam-sdks-java-load-tests:run --info
> -PloadTest.mainClass=org.apache.beam.sdk.loadtests.GroupByKeyLoadTest
> -Prunner=:beam-runners-google-cloud-dataflow-java
> '-PloadTest.args=--sourceOptions={"numRecords":1000,"splitPointFrequencyRecords":1,"keySizeBytes":1,"valueSizeBytes":9,"numHotKeys":0,"hotKeyFraction":0,"seed":123456,"bundleSizeDistribution":{"type":"const","const":42},"forceNumInitialBundles":100,"progressShape":"LINEAR","initializeDelayDistribution":{"type":"const","const":42}}
> --stepOptions={"outputRecordsPerInputRecord":1,"preservesInputKeyDistribution":true,"perBundleDelay":1,"perBundleDelayType":"MIXED","cpuUtilizationInMixedDelay":0.5}
> --fanout=1 --iterations=1 --runner=DataflowRunner'
> {code}
>
> After reverting commit bac909b8e237ef8a2ab7e17ac986e5cc90143e5b ([PR: 7351|https://github.com/apache/beam/pull/7351]) I can no longer reproduce this issue.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)