[ 
https://issues.apache.org/jira/browse/BEAM-2229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16003253#comment-16003253
 ] 

ASF GitHub Bot commented on BEAM-2229:
--------------------------------------

GitHub user dhalperi opened a pull request:

    https://github.com/apache/beam/pull/2992

    [BEAM-2229] GcsFileSystem: handle empty files

    And add tests.
    
    R: @lukecwik 

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dhalperi/beam b2229-gcs-empty-files

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/2992.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2992
    
----
commit cc60d62010b2ac3d06450fcb5241248543f24af0
Author: Dan Halperin <[email protected]>
Date:   2017-05-09T18:29:20Z

    [BEAM-2229] GcsFileSystem: handle empty files
    
    And add tests

----


> GcsFileSystem attempts to create invalid Metadata
> -------------------------------------------------
>
>                 Key: BEAM-2229
>                 URL: https://issues.apache.org/jira/browse/BEAM-2229
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-gcp
>    Affects Versions: 2.0.0
>            Reporter: Josh Di Fabio
>            Assignee: Daniel Halperin
>            Priority: Trivial
>             Fix For: 2.0.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> This is the first issue I've raised on Apache's JIRA; if I have made any 
> mistakes in compiling this ticket then I apologise and would welcome any 
> feedback.
>
> When matching a path spec, {{GcsFileSystem.toMetadata()}} will sometimes 
> attempt to build an instance of 
> {{org.apache.beam.sdk.io.fs.MatchResult.Metadata}} without first setting 
> {{sizeBytes}}\[1\]. This always results in an error in the 
> autovalue-generated builder for {{MatchResult.Metadata}} as {{sizeBytes}} is 
> a required field\[2\].
>
> I propose that {{GcsFileSystem}} set {{sizeBytes}} to {{0}} when there is no 
> size returned by GCS, which will presumably happen when the path spec refers 
> either to a directory, or to a non-existent file. 
> {{GcsFileSystem.toMetadata()}} could be updated as follows:
>
> *Before*
> {code:java}
>     if (size != null) {
>       ret.setSizeBytes(size.longValue());
>     }
> {code}
> *After*
> {code:java}
>     if (size != null) {
>       ret.setSizeBytes(size.longValue());
>     } else {
>       ret.setSizeBytes(0);
>     }
> {code}
> \[1\] 
> https://github.com/apache/beam/blob/5bfd3e049c0ca0744165b0243a645e8e427032d5/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/storage/GcsFileSystem.java#L240-L242
> \[2\] 
> https://gist.github.com/joshdifabio/fe543b97e02e7ddac8edb73be38deb06#file-autovalue_matchresult_metadata-java-L102-L110

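For readers without the Beam sources at hand, the failure described in the
issue can be reproduced in isolation. The following is only a sketch under
stated assumptions: MetadataSketch, its nested Metadata and Builder classes,
and the two toMetadata* helpers are hypothetical stand-ins written for
illustration, not the Beam or AutoValue code. The hand-written builder mimics
the way a generated builder rejects an unset required property, and the size
argument is assumed to be a nullable BigInteger, as the size.longValue() call
in the issue suggests.

```java
import java.math.BigInteger;

class MetadataSketch {
  /** Hypothetical stand-in for MatchResult.Metadata; not the Beam class. */
  static final class Metadata {
    final String resourceId;
    final long sizeBytes;

    private Metadata(String resourceId, long sizeBytes) {
      this.resourceId = resourceId;
      this.sizeBytes = sizeBytes;
    }
  }

  /** Hand-written builder that rejects unset required properties. */
  static final class Builder {
    private String resourceId;
    private Long sizeBytes; // boxed so that "never set" is detectable

    Builder setResourceId(String resourceId) {
      this.resourceId = resourceId;
      return this;
    }

    Builder setSizeBytes(long sizeBytes) {
      this.sizeBytes = sizeBytes;
      return this;
    }

    Metadata build() {
      String missing = "";
      if (resourceId == null) {
        missing += " resourceId";
      }
      if (sizeBytes == null) {
        missing += " sizeBytes";
      }
      if (!missing.isEmpty()) {
        throw new IllegalStateException("Missing required properties:" + missing);
      }
      return new Metadata(resourceId, sizeBytes);
    }
  }

  /** Mirrors the "Before" snippet: sizeBytes stays unset when size is null. */
  static Metadata toMetadataBefore(String path, BigInteger size) {
    Builder ret = new Builder().setResourceId(path);
    if (size != null) {
      ret.setSizeBytes(size.longValue());
    }
    return ret.build();
  }

  /** Mirrors the "After" snippet: a null size falls back to 0 bytes. */
  static Metadata toMetadataAfter(String path, BigInteger size) {
    Builder ret = new Builder().setResourceId(path);
    if (size != null) {
      ret.setSizeBytes(size.longValue());
    } else {
      ret.setSizeBytes(0);
    }
    return ret.build();
  }

  public static void main(String[] args) {
    // With the fallback in place, a null size yields a 0-byte entry.
    Metadata ok = toMetadataAfter("gs://bucket/object", null);
    System.out.println(ok.resourceId + " sizeBytes=" + ok.sizeBytes);

    // Without it, build() fails because sizeBytes was never set.
    try {
      toMetadataBefore("gs://bucket/object", null);
    } catch (IllegalStateException e) {
      System.out.println("build() failed: " + e.getMessage());
    }
  }
}
```

The sketch only illustrates why an unset required property makes build()
fail and how a default avoids that. Whether 0 is the right default for every
case in which GCS reports no size is a question for the review of the pull
request itself.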


--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
