[ 
https://issues.apache.org/jira/browse/FLINK-11838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17284518#comment-17284518
 ] 

Xintong Song edited comment on FLINK-11838 at 2/15/21, 12:05 AM:
-----------------------------------------------------------------

Thanks [~galenwarren], I think we are good to move forward to the PR review.

Concerning "compose on persist", I'm leaning towards keeping it simple and
stupid until we come up with a thorough plan. My concern with exposing
something where it is "not obvious which is better" to users as a
configuration option is that we don't know whether, once we have a thorough
plan, this option will still be needed by the plan or, worse, conflict with
it. Configuration options are considered public interfaces, so incompatible
changes (e.g., removing an existing option) are not forbidden but are best
avoided.

I think it doesn't hurt to implement "compose on persist" in a separate commit
and decide whether to merge it during the PR review. Even if we decide not to
merge it in the first step, the PR can serve as a place to keep the unmerged
implementation, and if we decide to add it later we can cherry-pick the
unmerged commit from the PR.



> Create RecoverableWriter for GCS
> --------------------------------
>
>                 Key: FLINK-11838
>                 URL: https://issues.apache.org/jira/browse/FLINK-11838
>             Project: Flink
>          Issue Type: New Feature
>          Components: Connectors / FileSystem
>    Affects Versions: 1.8.0
>            Reporter: Fokko Driesprong
>            Assignee: Galen Warren
>            Priority: Major
>              Labels: pull-request-available, usability
>             Fix For: 1.13.0
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> GCS supports resumable uploads, which we can use to create a
> RecoverableWriter similar to the S3 implementation:
> https://cloud.google.com/storage/docs/json_api/v1/how-tos/resumable-upload
> After using the Hadoop-compatible interface:
> https://github.com/apache/flink/pull/7519
> we noticed that the current implementation relies heavily on renaming
> files on commit:
> https://github.com/apache/flink/blob/master/flink-filesystems/flink-hadoop-fs/src/main/java/org/apache/flink/runtime/fs/hdfs/HadoopRecoverableFsDataOutputStream.java#L233-L259
> This is suboptimal on an object store such as GCS. Therefore, we would like
> to implement a more GCS-native RecoverableWriter.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
