[
https://issues.apache.org/jira/browse/BEAM-12857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kenneth Knowles updated BEAM-12857:
-----------------------------------
Labels: (was: stale-P2)
> Unable to write to GCS due to IndexOutOfBoundsException in FileSystems
> ----------------------------------------------------------------------
>
> Key: BEAM-12857
> URL: https://issues.apache.org/jira/browse/BEAM-12857
> Project: Beam
> Issue Type: Bug
> Components: io-java-gcp
> Affects Versions: 2.31.0, 2.32.0
> Environment: Beam 2.31.0/2.32.0, Java 11, GCP Dataflow
> Reporter: Patrick Lucas
> Priority: P2
>
> I have a simple batch job, running on Dataflow, that reads from a GCS bucket,
> filters the data, windows it, and writes the matching data back to a
> different path in the same bucket.
> The job seems to succeed in reading and filtering the data, as well as
> writing temporary files to GCS, but appears to fail when trying to rename the
> temporary files to their final destination.
> The IndexOutOfBoundsException is thrown from
> [FileSystems.java:429|https://github.com/apache/beam/blob/v2.32.0/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileSystems.java#L429]
> (in 2.32.0), when the code calls {{.get(0)}} on the list returned by a call
> to {{MatchResult#metadata()}}.
> The javadoc for
> [{{MatchResult#metadata()}}|https://github.com/apache/beam/blob/v2.32.0/sdks/java/core/src/main/java/org/apache/beam/sdk/io/fs/MatchResult.java#L75-L80]
> says,
> {code:java}
> /**
>  * {@link Metadata} of matched files. Note that if {@link #status()} is {@link Status#NOT_FOUND},
>  * this may either throw a {@link java.io.FileNotFoundException} or return an empty list,
>  * depending on the {@link EmptyMatchTreatment} used in the {@link FileSystems#match} call.
>  */
> {code}
> So is GCS possibly returning no metadata for the (missing) destination
> object? That seems unlikely, since I would expect many others to have run
> into this already, but I don't see how my user code could cause it.
> I have tested this on 2.31.0 and 2.32.0 and get the same error on both. It
> is worth noting that the logic in FileSystems.java changed a fair amount
> recently in [#15301|https://github.com/apache/beam/pull/15301], which may
> affect this. I haven't been able to test that change, since I'm working in a
> closed environment and can only easily use released versions of Beam. Once a
> version containing this change is released, I will upgrade and try again.
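
The failure mode described in the report can be illustrated with a minimal, self-contained sketch. It does not use Beam at all: `Match`, `Status`, `firstUnguarded`, and `firstOrNull` are hypothetical stand-ins invented for illustration, assuming only what the quoted javadoc says (a NOT_FOUND match may carry an empty metadata list). It is not the code in FileSystems.java, and it is not a proposed fix.

```java
import java.util.Collections;
import java.util.List;

public class EmptyMatchSketch {
    /** Hypothetical stand-in for a match status; not Beam's MatchResult.Status. */
    enum Status { OK, NOT_FOUND }

    /** Hypothetical stand-in for a match result: a status plus matched file names. */
    static final class Match {
        final Status status;
        final List<String> metadata;

        Match(Status status, List<String> metadata) {
            this.status = status;
            this.metadata = metadata;
        }
    }

    /** Unguarded access, analogous to calling .get(0) on the metadata list. */
    static String firstUnguarded(Match m) {
        // Throws IndexOutOfBoundsException when the list is empty.
        return m.metadata.get(0);
    }

    /** Guarded access: treat NOT_FOUND or an empty list as "no such file". */
    static String firstOrNull(Match m) {
        if (m.status == Status.NOT_FOUND || m.metadata.isEmpty()) {
            return null;
        }
        return m.metadata.get(0);
    }

    public static void main(String[] args) {
        // A missing destination modelled as NOT_FOUND with an empty list.
        Match missing = new Match(Status.NOT_FOUND, Collections.<String>emptyList());
        try {
            firstUnguarded(missing);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("unguarded access threw IndexOutOfBoundsException");
        }
        System.out.println("guarded access returned: " + firstOrNull(missing));
    }
}
```

The point of the sketch is only that an empty list plus an unconditional `.get(0)` is enough to produce the reported exception type. Whether the real match result is empty for the destination object is exactly the open question in the report.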