kennknowles opened a new issue, #19410:
URL: https://github.com/apache/beam/issues/19410
Fixing this requires updating 4 locations.
* Dataflow RunnerHarness
* FNAPDoFnRunner
* UnifiedWorker
* Shared libraries for this proto generation, which should cover OSS runners
**** Remove the workaround in ProcessBundleHandler.java which will assume
that all PCollections are bounded, if not set.
See PCollectionTranslation.fromProto which should be always passed a valid
value and not default to error or assume the PCollection is bounded.
Context
\===
When I was updating the java SDK to conditionally serialize some elements to
reported a sampled byte size metric, I encountered this.
Its due to to the refactoring in my
[PR/8416](https://github.com/apache/beam/pull/8416), the RehydratedComponents
was pulled up a level, and shared now among all the calls to
createRunnerForPTransform in the various PtransfomRunnerFactories.
I is now triggering some code paths which were not previously triggered for
all types of PTransforms/PCollections, causing this error to occur.
jsonPayload: {
exception:
"org.apache.beam.vendor.guava.v20_0.com.google.common.util.concurrent.UncheckedExecutionException:
java.lang.IllegalArgumentException: Cannot convert unknown
org.apache.beam.model.pipeline.v1.RunnerApi.IsBounded to
org.apache.beam.sdk.values.PCollection.IsBounded: UNSPECIFIED
at
org.apache.beam.vendor.guava.v20_0.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2214)
at
org.apache.beam.vendor.guava.v20_0.com.google.common.cache.LocalCache.get(LocalCache.java:4053)
at
org.apache.beam.vendor.guava.v20_0.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:4057)
at
org.apache.beam.vendor.guava.v20_0.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4986)
at
org.apache.beam.runners.core.construction.RehydratedComponents.getPCollection(RehydratedComponents.java:144)
at
org.apache.beam.fn.harness.data.PCollectionConsumerRegistry.getMultiplexingConsumer(PCollectionConsumerRegistry.java:145)
at
org.apache.beam.fn.harness.DoFnPTransformRunnerFactory$Context.<init\>(DoFnPTransformRunnerFactory.java:284)
at
org.apache.beam.fn.harness.DoFnPTransformRunnerFactory.createRunnerForPTransform(DoFnPTransformRunnerFactory.java:97)
at
org.apache.beam.fn.harness.DoFnPTransformRunnerFactory.createRunnerForPTransform(DoFnPTransformRunnerFactory.java:63)
at
org.apache.beam.fn.harness.control.ProcessBundleHandler.createRunnerAndConsumersForPTransformRecursively(ProcessBundleHandler.java:198)
at
org.apache.beam.fn.harness.control.ProcessBundleHandler.createRunnerAndConsumersForPTransformRecursively(ProcessBundleHandler.java:166)
at
org.apache.beam.fn.harness.control.ProcessBundleHandler.createRunnerAndConsumersForPTransformRecursively(ProcessBundleHandler.java:166)
at
org.apache.beam.fn.harness.control.ProcessBundleHandler.processBundle(ProcessBundleHandler.java:306)
at
org.apache.beam.fn.harness.control.BeamFnControlClient.delegateOnInstructionRequestType(BeamFnControlClient.java:160)
at
org.apache.beam.fn.harness.control.BeamFnControlClient.lambda$processInstructionRequests$0(BeamFnControlClient.java:144)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalArgumentException: Cannot convert unknown
org.apache.beam.model.pipeline.v1.RunnerApi.IsBounded to
org.apache.beam.sdk.values.PCollection.IsBounded: UNSPECIFIED
at
org.apache.beam.runners.core.construction.PCollectionTranslation.fromProto(PCollectionTranslation.java:88)
at
org.apache.beam.runners.core.construction.PCollectionTranslation.fromProto(PCollectionTranslation.java:56)
at
org.apache.beam.runners.core.construction.RehydratedComponents$3.load(RehydratedComponents.java:103)
at
org.apache.beam.runners.core.construction.RehydratedComponents$3.load(RehydratedComponents.java:93)
at
org.apache.beam.vendor.guava.v20_0.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3628)
at
org.apache.beam.vendor.guava.v20_0.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2336)
at
org.apache.beam.vendor.guava.v20_0.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2295)
at
org.apache.beam.vendor.guava.v20_0.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2208)
... 17 more
"
job: "2019-05-29_03_31_14-4799355109250203557"
logger: "org.apache.beam.fn.harness.control.BeamFnControlClient"
*message: "Exception while trying to handle InstructionRequest -28"*
portability_worker_id: "1"
thread: "16"
worker: "testpipeline-pabloem-0529-05290331-75o8-harness-htz8"
}
The root of the issue is that the ProcessBundleDescriptors are invalid. The
RunnerHarnesses are not setting the
org.apache.beam.model.pipeline.v1.RunnerApi.IsBounded which breaks the
specification and leads to this error.
Imported from Jira
[BEAM-7452](https://issues.apache.org/jira/browse/BEAM-7452). Original Jira may
contain additional context.
Reported by: [email protected].
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]