[
https://issues.apache.org/jira/browse/BEAM-6349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16734375#comment-16734375
]
Craig Chambers commented on BEAM-6349:
--------------------------------------
Thanks, gcloud auth login was what I was missing. (I don't normally
develop on GitHub or GCP, so I'm a newbie with these tools and processes.)
But the GroupByKeyLoadTest passes for me, at least in the context of my PR
before it was merged in. Is there some other way to find the issue?
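For anyone else tripping over the same setup step, what I was missing boils down to the
following (the application-default login line is my assumption about what may also be
needed, depending on how credentials are picked up; only the first command is what I
actually ran):
{code}
# authenticate the gcloud CLI (this was the missing step for me)
gcloud auth login
# possibly also needed if the job reads application-default credentials
gcloud auth application-default login
{code}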
On Fri, Jan 4, 2019 at 12:16 PM Lukasz Gajowy (JIRA) <[email protected]> wrote:
> Exceptions (IllegalArgumentException or NoClassDefFoundError) when running
> tests on Dataflow runner
> ---------------------------------------------------------------------------------------------------
>
> Key: BEAM-6349
> URL: https://issues.apache.org/jira/browse/BEAM-6349
> Project: Beam
> Issue Type: Improvement
> Components: testing
> Reporter: Lukasz Gajowy
> Assignee: Craig Chambers
> Priority: Major
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Running GroupByKeyLoadTest results in the following error on Dataflow runner:
>
> {code:java}
> java.lang.ExceptionInInitializerError
>     at org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory$2.typedApply(IntrinsicMapTaskExecutorFactory.java:344)
>     at org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory$2.typedApply(IntrinsicMapTaskExecutorFactory.java:338)
>     at org.apache.beam.runners.dataflow.worker.graph.Networks$TypeSafeNodeFunction.apply(Networks.java:63)
>     at org.apache.beam.runners.dataflow.worker.graph.Networks$TypeSafeNodeFunction.apply(Networks.java:50)
>     at org.apache.beam.runners.dataflow.worker.graph.Networks.replaceDirectedNetworkNodes(Networks.java:87)
>     at org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory.create(IntrinsicMapTaskExecutorFactory.java:120)
>     at org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.doWork(BatchDataflowWorker.java:337)
>     at org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.getAndPerformWork(BatchDataflowWorker.java:291)
>     at org.apache.beam.runners.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.doWork(DataflowBatchWorkerHarness.java:135)
>     at org.apache.beam.runners.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:115)
>     at org.apache.beam.runners.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:102)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.IllegalArgumentException: Multiple entries with same key: kind:varint=org.apache.beam.runners.dataflow.util.CloudObjectTranslators$8@39b69c48 and kind:varint=org.apache.beam.runners.dataflow.worker.RunnerHarnessCoderCloudObjectTranslatorRegistrar$1@7966f294
>     at org.apache.beam.repackaged.beam_runners_google_cloud_dataflow_java.com.google.common.collect.ImmutableMap.checkNoConflict(ImmutableMap.java:136)
>     at org.apache.beam.repackaged.beam_runners_google_cloud_dataflow_java.com.google.common.collect.RegularImmutableMap.checkNoConflictInKeyBucket(RegularImmutableMap.java:100)
>     at org.apache.beam.repackaged.beam_runners_google_cloud_dataflow_java.com.google.common.collect.RegularImmutableMap.fromEntryArray(RegularImmutableMap.java:86)
>     at org.apache.beam.repackaged.beam_runners_google_cloud_dataflow_java.com.google.common.collect.ImmutableMap$Builder.build(ImmutableMap.java:300)
>     at org.apache.beam.runners.dataflow.util.CloudObjects.populateCloudObjectTranslators(CloudObjects.java:60)
>     at org.apache.beam.runners.dataflow.util.CloudObjects.<clinit>(CloudObjects.java:39)
>     ... 15 more
> {code}
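
As a side note on the trace above: the root "Caused by" is Guava's ImmutableMap.Builder
refusing to build a map with two entries for the same "kind:varint" key, which is what
CloudObjects hits while assembling its translator map. A minimal standalone sketch of
that failure mode, using plain Guava rather than Beam's repackaged copy and with
placeholder strings standing in for the real translator instances:

{code:java}
import com.google.common.collect.ImmutableMap;

public class DuplicateTranslatorKeyDemo {
  public static void main(String[] args) {
    // ImmutableMap.Builder accepts duplicate puts but rejects them at build() time,
    // which is the point where CloudObjects.populateCloudObjectTranslators fails:
    // two registrars both contribute a translator for the "kind:varint" key.
    ImmutableMap.Builder<String, String> translatorsByKind = ImmutableMap.builder();
    translatorsByKind.put("kind:varint", "translator from CloudObjectTranslators");
    translatorsByKind.put("kind:varint", "translator from RunnerHarnessCoderCloudObjectTranslatorRegistrar");
    // Throws java.lang.IllegalArgumentException: Multiple entries with same key: ...
    translatorsByKind.build();
  }
}
{code}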
>
> Example command to run the tests (FWIW, it also runs the "clean" task
> although I don't know if it's necessary):
> {code:java}
> ./gradlew clean :beam-sdks-java-load-tests:run --info \
>   -PloadTest.mainClass=org.apache.beam.sdk.loadtests.GroupByKeyLoadTest \
>   -Prunner=:beam-runners-google-cloud-dataflow-java \
>   '-PloadTest.args=--sourceOptions={"numRecords":1000,"splitPointFrequencyRecords":1,"keySizeBytes":1,"valueSizeBytes":9,"numHotKeys":0,"hotKeyFraction":0,"seed":123456,"bundleSizeDistribution":{"type":"const","const":42},"forceNumInitialBundles":100,"progressShape":"LINEAR","initializeDelayDistribution":{"type":"const","const":42}} --stepOptions={"outputRecordsPerInputRecord":1,"preservesInputKeyDistribution":true,"perBundleDelay":1,"perBundleDelayType":"MIXED","cpuUtilizationInMixedDelay":0.5} --fanout=1 --iterations=1 --runner=DataflowRunner'
> {code}
>
> After reverting commit bac909b8e237ef8a2ab7e17ac986e5cc90143e5b ([PR:
> 7351|https://github.com/apache/beam/pull/7351]) I can no longer reproduce
> this issue.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)