[ 
https://issues.apache.org/jira/browse/BEAM-6545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16757848#comment-16757848
 ] 

Kenneth Knowles commented on BEAM-6545:
---------------------------------------

I have done a manual audit of the remaining uses of decodeBase64 in our 
codebase. The relevant occurrences are listed below:

{code}
$ grep -r decodeBase64 *
...
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/UngroupedShuffleReaderFactory.java:
        decodeBase64(getString(spec, WorkerPropertyNames.SHUFFLE_READER_CONFIG)),
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/GroupingShuffleReaderFactory.java:
          decodeBase64(getString(spec, WorkerPropertyNames.SHUFFLE_READER_CONFIG)),
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/GroupingShuffleReaderFactory.java:
        decodeBase64(getString(spec, WorkerPropertyNames.SHUFFLE_READER_CONFIG)),
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/ShuffleSinkFactory.java:
        decodeBase64(getString(spec, WorkerPropertyNames.SHUFFLE_WRITER_CONFIG, null)),
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/util/common/worker/ByteArrayShufflePosition.java:
    return ByteArrayShufflePosition.of(decodeBase64(position));
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WorkerCustomSources.java:
              decodeBase64(serializedSplits.get(index)), "UnboundedSource split");
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WorkerCustomSources.java:
                Base64.decodeBase64(getString(spec, SERIALIZED_SOURCE)), "Source");
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/PartitioningShuffleReaderFactory.java:
        decodeBase64(getString(spec, WorkerPropertyNames.SHUFFLE_READER_CONFIG)),
sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/FakeBigQueryServices.java:
    ByteArrayInputStream input = new ByteArrayInputStream(Base64.decodeBase64(query));
{code}

In each case, a misplaced {{null}} would already crash elsewhere, so these 
call sites cannot regress now that decodeBase64 rejects {{null}} input.
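
For reference, the null-tolerant behavior that a caller with a nullable 
argument would need can be sketched as below. This is only a minimal sketch 
under stated assumptions: it uses java.util.Base64 as a stand-in for the 
library decoder, and the class and method names (NullSafeBase64, 
decodeBase64OrNull) are hypothetical, not the code in the actual fix.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class NullSafeBase64 {

  /**
   * Hypothetical helper: returns null for a null input instead of throwing,
   * and otherwise delegates to the JDK decoder (an assumed stand-in for the
   * decodeBase64 used in the codebase).
   */
  public static byte[] decodeBase64OrNull(String encoded) {
    if (encoded == null) {
      return null;
    }
    return Base64.getDecoder().decode(encoded);
  }

  public static void main(String[] args) {
    // A null input is passed through as null rather than causing an NPE.
    System.out.println(decodeBase64OrNull(null) == null);

    // A non-null input is decoded normally.
    byte[] decoded = decodeBase64OrNull("aGVsbG8=");
    System.out.println(new String(decoded, StandardCharsets.UTF_8));
  }
}
```

A guard of this kind is only needed where the argument is genuinely allowed 
to be null; the call sites listed above do not fall in that category.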

> NPE when decoding null base 64 strings
> --------------------------------------
>
>                 Key: BEAM-6545
>                 URL: https://issues.apache.org/jira/browse/BEAM-6545
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-dataflow
>    Affects Versions: 2.9.0
>            Reporter: Ahmet Altay
>            Assignee: Kenneth Knowles
>            Priority: Major
>             Fix For: 2.10.0
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> ByteArrayShufflePosition.fromBase64 is marked with a @Nullable argument, 
> however it does not properly handle null inputs, resulting in an NPE.
> This seems like an unintended change we picked up from a dependency: 
> google-http-java-client switched from Apache Commons to Guava 
> (https://github.com/googleapis/google-http-java-client/commit/990c534f0e5103a142b0639c12c90cb990a00cfd#diff-97264fba16d690a26d63fbbc992af937),
> and decodeBase64 behaves differently in the two implementations. The former 
> handles null by returning null; the latter throws an NPE.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
