Josh Di Fabio created BEAM-2229:
-----------------------------------
Summary: GcsFileSystem can create invalid Metadata
Key: BEAM-2229
URL: https://issues.apache.org/jira/browse/BEAM-2229
Project: Beam
Issue Type: Bug
Components: sdk-java-gcp
Affects Versions: 2.0.0
Reporter: Josh Di Fabio
Assignee: Daniel Halperin
Priority: Trivial
This is the first issue I've raised on Apache's JIRA; if I have made any
mistakes in compiling this ticket then I apologise and would welcome any
feedback.
When matching a path spec, {{GcsFileSystem.toMetadata()}} will sometimes
attempt to build an instance of
{{org.apache.beam.sdk.io.fs.MatchResult.Metadata}} without first setting
{{sizeBytes}}\[1\]. This always results in an error in the AutoValue-generated
builder for {{MatchResult.Metadata}}, because {{sizeBytes}} is a required field\[2\].
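To illustrate the failure mode, here is a minimal sketch of how a builder with a required property behaves. It is a hand-written stand-in, not the generated {{AutoValue_MatchResult_Metadata}} class: the names {{RequiredFieldSketch}} and {{buildFailsWithoutSize}} are hypothetical, and only {{sizeBytes}} is modelled. The exception type and message follow what AutoValue-generated builders normally produce for a missing required property.

```java
public class RequiredFieldSketch {

    /** Hypothetical stand-in for MatchResult.Metadata; only sizeBytes is modelled. */
    static final class Metadata {
        private final long sizeBytes;

        private Metadata(long sizeBytes) {
            this.sizeBytes = sizeBytes;
        }

        long sizeBytes() {
            return sizeBytes;
        }

        static Builder builder() {
            return new Builder();
        }

        static final class Builder {
            // Boxed so that "never set" (null) can be told apart from an explicit 0.
            private Long sizeBytes;

            Builder setSizeBytes(long value) {
                this.sizeBytes = value;
                return this;
            }

            Metadata build() {
                // A required property that was never set makes build() fail.
                if (sizeBytes == null) {
                    throw new IllegalStateException("Missing required properties: sizeBytes");
                }
                return new Metadata(sizeBytes);
            }
        }
    }

    /** Returns true if build() rejects a builder whose sizeBytes was never set. */
    static boolean buildFailsWithoutSize() {
        try {
            Metadata.builder().build();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(buildFailsWithoutSize());
        System.out.println(Metadata.builder().setSizeBytes(0).build().sizeBytes());
    }
}
```

Note that an explicit {{setSizeBytes(0)}} is accepted; only the case where the setter is skipped altogether fails.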
I propose that {{GcsFileSystem}} set {{sizeBytes}} to {{0}} when GCS returns
no size, which presumably happens when the path spec refers either to a
directory or to a non-existent file.
{{GcsFileSystem.toMetadata()}} could be updated as follows:
*Before*
{code:java}
if (size != null) {
  ret.setSizeBytes(size.longValue());
}
{code}
*After*
{code:java}
if (size != null) {
  ret.setSizeBytes(size.longValue());
} else {
  ret.setSizeBytes(0);
}
{code}
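The same fallback could also be written as a small null-safe helper, which keeps the call site to a single {{setSizeBytes}} call. This is only a sketch: the class and method names ({{SizeFallbackSketch}}, {{sizeOrZero}}) are hypothetical and not part of Beam, and it assumes the size reported by GCS is a nullable {{java.math.BigInteger}}, as the {{size.longValue()}} call suggests.

```java
import java.math.BigInteger;

public class SizeFallbackSketch {

    /**
     * Returns the object size in bytes, or 0 when no size was reported
     * (assumed here to be the case for directories and missing files).
     */
    static long sizeOrZero(BigInteger size) {
        return size != null ? size.longValue() : 0L;
    }

    public static void main(String[] args) {
        System.out.println(sizeOrZero(null));
        System.out.println(sizeOrZero(BigInteger.valueOf(1024)));
    }
}
```

With such a helper, the call site would reduce to {{ret.setSizeBytes(sizeOrZero(size))}}, so the builder always receives a value.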
\[1\] https://github.com/apache/beam/blob/5bfd3e049c0ca0744165b0243a645e8e427032d5/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/storage/GcsFileSystem.java#L240-L242
\[2\] https://gist.github.com/joshdifabio/fe543b97e02e7ddac8edb73be38deb06#file-autovalue_matchresult_metadata-java-L102-L110