[ 
https://issues.apache.org/jira/browse/BEAM-8889?focusedWorklogId=618225&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-618225
 ]

ASF GitHub Bot logged work on BEAM-8889:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 02/Jul/21 19:32
            Start Date: 02/Jul/21 19:32
    Worklog Time Spent: 10m 
      Work Description: chamikaramj commented on a change in pull request #14817:
URL: https://github.com/apache/beam/pull/14817#discussion_r663208862



##########
File path: 
sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/util/GcsUtil.java
##########
@@ -107,6 +110,7 @@ public GcsUtil create(PipelineOptions options) {
           storageBuilder.getHttpRequestInitializer(),
           gcsOptions.getExecutorService(),
           hasExperiment(options, "use_grpc_for_gcs"),
+          gcsOptions.getGcpCredential(),

Review comment:
       This is not backwards compatible. What if gcpCredentials is not provided?
(I assume default credentials will be used, but we should make sure that this
does not result in a regression.)

##########
File path: 
sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/util/GcsUtil.java
##########
@@ -763,7 +751,7 @@ public void onSuccess(StorageObject response, HttpHeaders httpHeaders)
           @Override
           public void onFailure(GoogleJsonError e, HttpHeaders httpHeaders) throws IOException {
             IOException ioException;
-            if (errorExtractor.itemNotFound(e)) {
+            if (e.getCode() == HttpStatusCodes.STATUS_CODE_NOT_FOUND) {

Review comment:
       So is this a regression? Note that Beam file IO connectors are very
sensitive to changes in the behavior of rename/copy/delete etc., since the
current behavior is carefully implemented (after many bugs) to be correct when
there are step failures and retries.
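
For context, a minimal sketch of the kind of mapping this hunk touches, assuming the branch exists to distinguish a missing object from other failures. The class and method names (NotFoundMapping, toIOException) and the error message format are hypothetical, and only plain JDK types are used; the real callback receives a GoogleJsonError. Whether checking the status code and calling errorExtractor.itemNotFound(e) agree in every case is exactly what needs to be verified.

```java
import java.io.FileNotFoundException;
import java.io.IOException;

// Hypothetical stand-in for the onFailure branch; not the actual GcsUtil code.
class NotFoundMapping {
  // HTTP 404, the value the new code compares e.getCode() against.
  static final int STATUS_NOT_FOUND = 404;

  // Maps an HTTP status code to the exception type a caller would observe.
  static IOException toIOException(int statusCode, String path) {
    if (statusCode == STATUS_NOT_FOUND) {
      // A missing object is reported with a dedicated exception type so
      // callers can treat "already deleted" differently from real errors.
      return new FileNotFoundException(path);
    }
    return new IOException(
        String.format("Error trying to get %s: HTTP %d", path, statusCode));
  }

  public static void main(String[] args) {
    System.out.println(toIOException(404, "gs://bucket/missing").getClass().getSimpleName());
    System.out.println(toIOException(503, "gs://bucket/object").getClass().getSimpleName());
  }
}
```

A retried rename/copy/delete step typically relies on the "not found" case being reported consistently, so any difference between the two checks would surface as a behavior change on retry.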

##########
File path: 
sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/util/GcsUtil.java
##########
@@ -400,6 +402,22 @@ public SeekableByteChannel open(GcsPath path) throws IOException {
     return googleCloudStorage.open(new StorageResourceId(path.getBucket(), path.getObject()));
   }
 
+  /**
+   * Opens an object in GCS.
+   *
+   * <p>Returns a SeekableByteChannel that provides access to data in the bucket.
+   *
+   * @param path the GCS filename to read from
+   * @param readOptions Fine-grained options for behaviors of retries, buffering, etc.
+   * @return a SeekableByteChannel that can read the object data
+   */
+  @VisibleForTesting
+  SeekableByteChannel open(GcsPath path, GoogleCloudStorageReadOptions readOptions)

Review comment:
       Where is this used?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

            Worklog Id:     (was: 618225)
    Remaining Estimate: 126.5h  (was: 126h 40m)
            Time Spent: 41.5h  (was: 41h 20m)

> Make GcsUtil use GoogleCloudStorage
> -----------------------------------
>
>                 Key: BEAM-8889
>                 URL: https://issues.apache.org/jira/browse/BEAM-8889
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-java-gcp
>    Affects Versions: 2.16.0
>            Reporter: Esun Kim
>            Assignee: VASU NORI
>            Priority: P2
>              Labels: gcs
>             Fix For: 2.22.0
>
>   Original Estimate: 168h
>          Time Spent: 41.5h
>  Remaining Estimate: 126.5h
>
> [GcsUtil|https://github.com/apache/beam/blob/master/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/util/GcsUtil.java]
>  is a primary class for accessing Google Cloud Storage in Apache Beam. The current
> implementation directly creates GoogleCloudStorageReadChannel and
> GoogleCloudStorageWriteChannel itself to read and write GCS data rather
> than using
> [GoogleCloudStorage|https://github.com/GoogleCloudPlatform/bigdata-interop/blob/master/gcsio/src/main/java/com/google/cloud/hadoop/gcsio/GoogleCloudStorage.java]
>  which is an abstract class providing basic IO capability and eventually
> creates the channel objects. This request is about updating GcsUtil to use
> GoogleCloudStorage to create read and write channels, which is expected to be
> more flexible because it can easily pick up new changes, e.g. a new channel
> implementation using a new protocol, without code changes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
