1u0 commented on a change in pull request #9169: [FLINK-12998][docs] Update documentation for file systems loading as plugins
URL: https://github.com/apache/flink/pull/9169#discussion_r307721513
##########
File path: docs/ops/filesystems/s3.zh.md
##########
@@ -59,23 +59,24 @@ For some cases, however, e.g., for using S3 as YARN's resource storage dir, it m
 Flink provides two file systems to talk to Amazon S3, `flink-s3-fs-presto` and `flink-s3-fs-hadoop`. Both implementations are self-contained with no dependency footprint, so there is no need to add Hadoop to the classpath to use them.
 
-  - `flink-s3-fs-presto`, registered under the scheme *"s3://"* and *"s3p://"*, is based on code from the [Presto project](https://prestodb.io/).
+  - `flink-s3-fs-presto`, registered under the scheme *s3://* and *s3p://*, is based on code from the [Presto project](https://prestodb.io/).
    You can configure it the same way you can [configure the Presto file system](https://prestodb.io/docs/0.187/connector/hive.html#amazon-s3-configuration) by placing adding the configurations to your `flink-conf.yaml`. Presto is the recommended file system for checkpointing to S3.
- 
-  - `flink-s3-fs-hadoop`, registered under *"s3://"* and *"s3a://"*, based on code from the [Hadoop Project](https://hadoop.apache.org/).
+
+  - `flink-s3-fs-hadoop`, registered under *s3://* and *s3a://*, based on code from the [Hadoop Project](https://hadoop.apache.org/).
    The file system can be [configured exactly like Hadoop's s3a](https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html#S3A) by placing adding the configurations to your `flink-conf.yaml`. Shaded Hadoop is the only S3 file system with support for the [StreamingFileSink]({{ site.baseurl}}/dev/connectors/streamfile_sink.html).
- 
+
 Both `flink-s3-fs-hadoop` and `flink-s3-fs-presto` register default FileSystem
-wrappers for URIs with the `s3://` scheme, `flink-s3-fs-hadoop` also registers
-for `s3a://` and `flink-s3-fs-presto` also registers for `s3p://`, so you can
+wrappers for URIs with the *s3://* scheme, `flink-s3-fs-hadoop` also registers
+for *s3a://* and `flink-s3-fs-presto` also registers for *s3p://*, so you can
 use this to use both at the same time. For example, the job uses the [StreamingFileSink]({{ site.baseurl}}/dev/connectors/streamfile_sink.html) which only supports Hadoop, but uses Presto for checkpointing.
-In this case, it is advised to use explicitly *"s3a://"* as a scheme for the sink (Hadoop) and *"s3p://"* for checkpointing (Presto).
- 
-To use either `flink-s3-fs-hadoop` or `flink-s3-fs-presto`, copy the respective JAR file from the `opt` directory to the `lib` directory of your Flink distribution before starting Flink, e.g.
+In this case, it is advised to explicitly use *s3a://* as a scheme for the sink (Hadoop) and *s3p://* for checkpointing (Presto).
+
+To use `flink-s3-fs-hadoop` or `flink-s3-fs-presto`, copy the respective JAR file from the `opt` directory to the `plugins` directory of your Flink distribution before starting Flink, e.g.
 {% highlight bash %}
-cp ./opt/flink-s3-fs-presto-{{ site.version }}.jar ./lib/
+mkdir ./plugins/s3-fs-presto

Review comment:
   The name of the containing folder doesn't matter much. The only convention, afair, is that it should sit one folder deep inside Flink's `/plugins` dir, one folder per plugin.
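The layout convention the reviewer describes could be sketched as the following shell session. This is an illustrative mock, not the documented commands: `FLINK_HOME` defaulting to `/tmp/flink-demo` and the `1.9.0` version string are assumptions standing in for a real Flink distribution and the doc's `{{ site.version }}` placeholder.

```shell
# Illustrative sketch of the plugin layout convention: one folder per
# plugin, one level deep under Flink's plugins/ directory.
# FLINK_HOME and the 1.9.0 version are assumptions for this demo.
FLINK_HOME="${FLINK_HOME:-/tmp/flink-demo}"

# Mock the distribution layout so the commands run anywhere;
# in a real distribution the JAR already ships in opt/.
mkdir -p "$FLINK_HOME/opt"
touch "$FLINK_HOME/opt/flink-s3-fs-presto-1.9.0.jar"

# The folder name under plugins/ is arbitrary; only the depth matters.
mkdir -p "$FLINK_HOME/plugins/s3-fs-presto"
cp "$FLINK_HOME/opt/flink-s3-fs-presto-1.9.0.jar" \
   "$FLINK_HOME/plugins/s3-fs-presto/"

ls "$FLINK_HOME/plugins/s3-fs-presto"
```

With this layout, Flink discovers the file system at startup without the JAR ever touching `lib/`, which is the point of the `lib` to `plugins` change in the diff above.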
