[
https://issues.apache.org/jira/browse/BEAM-6545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16757848#comment-16757848
]
Kenneth Knowles commented on BEAM-6545:
---------------------------------------
I have done a manual audit of the remaining uses in our codebase. Relevant
occurrences listed below:
{code}
$ grep -r decodeBase64 *
...
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/UngroupedShuffleReaderFactory.java:
decodeBase64(getString(spec,
WorkerPropertyNames.SHUFFLE_READER_CONFIG)),
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/GroupingShuffleReaderFactory.java:
decodeBase64(getString(spec,
WorkerPropertyNames.SHUFFLE_READER_CONFIG)),
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/GroupingShuffleReaderFactory.java:
decodeBase64(getString(spec,
WorkerPropertyNames.SHUFFLE_READER_CONFIG)),
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/ShuffleSinkFactory.java:
decodeBase64(getString(spec, WorkerPropertyNames.SHUFFLE_WRITER_CONFIG,
null)),
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/util/common/worker/ByteArrayShufflePosition.java:
return ByteArrayShufflePosition.of(decodeBase64(position));
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WorkerCustomSources.java:
decodeBase64(serializedSplits.get(index)), "UnboundedSource split");
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WorkerCustomSources.java:
Base64.decodeBase64(getString(spec, SERIALIZED_SOURCE)),
"Source");
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/PartitioningShuffleReaderFactory.java:
decodeBase64(getString(spec,
WorkerPropertyNames.SHUFFLE_READER_CONFIG)),
sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/FakeBigQueryServices.java:
ByteArrayInputStream input = new
ByteArrayInputStream(Base64.decodeBase64(query));
{code}
In each case, a misplaced {{null}} would already crash elsewhere, so
decodeBase64 rejecting {{null}} cannot cause a regression.
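For illustration only, the difference between a decoder that rejects {{null}} and one that passes it through can be sketched as below. This is a minimal sketch, not the code in Beam or in google-http-java-client: the class and method names ({{NullSafeBase64}}, {{strictDecode}}, {{lenientDecode}}) are hypothetical, and {{java.util.Base64}} stands in for the real decoders so the example has no external dependencies.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Objects;

// Hypothetical illustration; none of these names exist in Beam.
final class NullSafeBase64 {

    // Strict variant: rejects null up front, as a decoder that enforces
    // non-null input would. java.util.Base64 is only a stand-in here.
    static byte[] strictDecode(String encoded) {
        Objects.requireNonNull(encoded, "encoded");
        return Base64.getDecoder().decode(encoded);
    }

    // Lenient variant: maps null to null, which is the behavior a
    // @Nullable argument suggests to callers.
    static byte[] lenientDecode(String encoded) {
        return encoded == null ? null : strictDecode(encoded);
    }

    public static void main(String[] args) {
        // Null passes through the lenient variant untouched.
        System.out.println(lenientDecode(null) == null);

        // Non-null input decodes the same way in both variants.
        System.out.println(new String(lenientDecode("aGk="), StandardCharsets.UTF_8));

        // The strict variant fails fast on null.
        try {
            strictDecode(null);
        } catch (NullPointerException e) {
            System.out.println("NPE: " + e.getMessage());
        }
    }
}
```

A call site that can legitimately receive {{null}} would need the lenient form (or an explicit null check); the call sites audited above would fail on {{null}} regardless of which form is used.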
> NPE when decoding null base 64 strings
> --------------------------------------
>
> Key: BEAM-6545
> URL: https://issues.apache.org/jira/browse/BEAM-6545
> Project: Beam
> Issue Type: Bug
> Components: runner-dataflow
> Affects Versions: 2.9.0
> Reporter: Ahmet Altay
> Assignee: Kenneth Knowles
> Priority: Major
> Fix For: 2.10.0
>
> Time Spent: 40m
> Remaining Estimate: 0h
>
> ByteArrayShufflePosition.fromBase64 is marked with a @Nullable argument;
> however, it does not properly handle null inputs, resulting in an NPE.
> This seems like an unintended change we picked up from a dependency:
> google-http-java-client switched from Apache Commons to Guava
> ([https://github.com/googleapis/google-http-java-client/commit/990c534f0e5103a142b0639c12c90cb990a00cfd#diff-97264fba16d690a26d63fbbc992af937])
> and decodeBase64 behaves differently in the two. The former handles null by
> returning null; the latter throws an NPE.
>