Github user aljoscha commented on a diff in the pull request:
https://github.com/apache/flink/pull/4961#discussion_r149636089
--- Diff: flink-filesystems/flink-s3-fs-presto/README.md ---
@@ -0,0 +1,28 @@
+This project is a wrapper around the S3 file system from the Presto
project which shades all dependencies.
+Initial simple tests seem to indicate that it responds slightly faster
+and in a bit more lightweight manner to write/read/list requests, compared
+to the Hadoop s3a FS, but it has some semantic differences.
+
+We also relocate the shaded Hadoop version to allow running in a different
+setup. For this to work, however, we needed to adapt Hadoop's
`Configuration`
+class to load a (shaded) `core-default-shaded.xml` configuration with the
+relocated class names of classes loaded via reflection
+(in the future, we may need to extend this to `mapred-default.xml` and
`hdfs-default.xml` and their respective configuration classes).
+
+# Changing the Hadoop Version
+
+If you want to change the Hadoop version this project depends on, the
following
+steps are required to keep the shading correct:
+
+1. copy `org/apache/hadoop/conf/Configuration.java` from the respective
Hadoop jar file (from `com.facebook.presto.hadoop/hadoop-apache2`) to this
project
+ - adapt the `Configuration` class by replacing `core-default.xml` with
`core-default-shaded.xml`.
+2. copy `core-default.xml` from the respective Hadoop jar file (from
`com.facebook.presto.hadoop/hadoop-apache2`) to this project as
+ - `src/main/resources/core-default-shaded.xml` (replacing every
occurrence of `org.apache.hadoop` with
`org.apache.flink.fs.s3presto.shaded.org.apache.hadoop`)
+ - `src/test/resources/core-site.xml` (as is)
+3. verify the shaded jar:
+ - does not contain any unshaded classes except for
`org.apache.flink.fs.s3presto.S3FileSystemFactory`
+ - every other classes should be under
`org.apache.flink.fs.s3presto.shaded`
--- End diff --
nit: "classes"
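
Steps 2 and 3 of the README could probably be scripted rather than done by hand. A minimal Python sketch follows, not part of the PR: the function names are made up for illustration, and treating inner classes of `S3FileSystemFactory` (`S3FileSystemFactory$...`) as allowed is an assumption the README does not state.

```python
import re
import zipfile

SHADED_PACKAGE = "org.apache.flink.fs.s3presto.shaded"
SHADED_DIR = SHADED_PACKAGE.replace(".", "/") + "/"
# The only class the README allows outside the shaded package.
FACTORY = "org/apache/flink/fs/s3presto/S3FileSystemFactory"


def relocate_class_names(xml_text):
    """Step 2: prefix every `org.apache.hadoop` with the shaded package.

    The lookbehind skips names that are already relocated, so running
    this twice on the same text does not double the prefix.
    """
    return re.sub(
        r"(?<!\.shaded\.)org\.apache\.hadoop",
        SHADED_PACKAGE + ".org.apache.hadoop",
        xml_text,
    )


def jar_entries(jar):
    """Return all entry names of a jar (a path or a file-like object)."""
    with zipfile.ZipFile(jar) as archive:
        return archive.namelist()


def find_unshaded_classes(entry_names):
    """Step 3: list .class entries that are neither shaded nor the factory."""
    def allowed(name):
        return (
            name.startswith(SHADED_DIR)
            or name == FACTORY + ".class"
            # Assumption: inner classes of the factory are fine as well.
            or name.startswith(FACTORY + "$")
        )

    return sorted(
        name
        for name in entry_names
        if name.endswith(".class") and not allowed(name)
    )
```

Calling `find_unshaded_classes(jar_entries(path_to_shaded_jar))` on the built jar should return an empty list when the shading is correct; anything it returns is a class that leaked out of `org.apache.flink.fs.s3presto.shaded`.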
---