AHeise commented on a change in pull request #16559:
URL: https://github.com/apache/flink/pull/16559#discussion_r682825092
##########
File path:
flink-filesystems/flink-s3-fs-presto/src/main/resources/META-INF/NOTICE
##########
@@ -34,14 +34,25 @@ This project bundles the following dependencies under the
Apache Software Licens
- joda-time:joda-time:2.5
- org.apache.commons:commons-configuration2:2.1.1
- org.apache.commons:commons-lang3:3.3.2
-- org.apache.hadoop:hadoop-annotations:3.1.0
-- org.apache.hadoop:hadoop-aws:3.1.0
-- org.apache.hadoop:hadoop-auth:3.1.0
-- org.apache.hadoop:hadoop-common:3.1.0
+- org.apache.commons:commons-text:1.4
+- org.apache.hadoop:hadoop-annotations:3.3.1
+- org.apache.hadoop:hadoop-aws:3.3.1
+- org.apache.hadoop:hadoop-auth:3.3.1
+- org.apache.hadoop:hadoop-common:3.3.1
+- org.apache.hadoop.thirdparty:hadoop-shaded-guava:1.1.1
+- org.apache.hadoop.thirdparty:hadoop-shaded-protobuf_3_7:1.1.1
- org.apache.htrace:htrace-core4:4.1.0-incubating
-- org.apache.httpcomponents:httpcore:4.4.14
- org.apache.httpcomponents:httpclient:4.5.13
+- org.apache.httpcomponents:httpcore:4.4.14
+- org.apache.kerby:kerby-asn1:1.0.1
+- org.apache.kerby:kerb-core:1.0.1
+- org.apache.kerby:kerby-pkix:1.0.1
+- org.apache.kerby:kerby-util:1.0.1
+- org.codehaus.woodstox:stax2-api:4.2.1
+- org.xerial.snappy:snappy-java:1.1.8.3
- org.weakref:jmxutils:1.19
+- org.wildfly.openssl:wildfly-openssl:1.0.7.Final
+- dnsjava:dnsjava:2.1.7
Review comment:
So `dnsjava` is definitely declared incorrectly, then. Thanks for
checking!
##########
File path: flink-core/src/main/java/org/apache/flink/core/fs/FileSystem.java
##########
@@ -248,6 +248,8 @@
ImmutableMultimap.<String, String>builder()
.put("wasb", "flink-fs-azure-hadoop")
.put("wasbs", "flink-fs-azure-hadoop")
+ .put("abfs", "flink-fs-azure-hadoop")
+ .put("abfss", "flink-fs-azure-hadoop")
Review comment:
Just to clarify: adding the entries will not change anything if
everything is set up correctly. However, if the user forgets to add the fs
(as plugin or lib), this list lets Flink give a meaningful error message. So
thanks for updating it!
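To illustrate the point, here is a minimal sketch of how such a scheme table can turn a missing file system into an actionable message. The class name, the method name, and the message wording are hypothetical and not taken from `FileSystem.java`; only the scheme/module pairs come from the diff above.

```java
import java.util.List;
import java.util.Map;

final class SchemeHintSketch {
    // Hypothetical, trimmed-down stand-in for the scheme table in FileSystem.java.
    static final Map<String, List<String>> MODULES_BY_SCHEME = Map.of(
            "wasb", List.of("flink-fs-azure-hadoop"),
            "wasbs", List.of("flink-fs-azure-hadoop"),
            "abfs", List.of("flink-fs-azure-hadoop"),
            "abfss", List.of("flink-fs-azure-hadoop"));

    // Builds the message shown when no file system is registered for the scheme.
    // The wording is an assumption made for this sketch.
    static String missingFileSystemMessage(String scheme) {
        String base = "Could not find a file system implementation for scheme '" + scheme + "'.";
        List<String> modules = MODULES_BY_SCHEME.get(scheme);
        if (modules == null) {
            // Unknown scheme: no hint available.
            return base;
        }
        return base + " The scheme is supported through: " + String.join(", ", modules)
                + ". Please add it as a plugin or lib.";
    }

    public static void main(String[] args) {
        System.out.println(missingFileSystemMessage("abfs"));
        System.out.println(missingFileSystemMessage("foo"));
    }
}
```

Without the `abfs`/`abfss` entries, a user who forgot the module would only get the generic first sentence.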
##########
File path:
flink-filesystems/flink-s3-fs-hadoop/src/main/java/org/apache/flink/fs/s3hadoop/HadoopS3AccessHelper.java
##########
@@ -52,8 +55,22 @@
private final InternalWriteOperationHelper s3accessHelper;
public HadoopS3AccessHelper(S3AFileSystem s3a, Configuration conf) {
Review comment:
Yes, I think it would be best to end up with 2 commits.
In the first one, you bump the Hadoop version and adjust this class plus the
license/notice files. The second commit then contains your actual changes.
##########
File path:
flink-filesystems/flink-azure-fs-hadoop/src/main/java/org/apache/flink/fs/azurefs/AbstractAzureFSFactory.java
##########
@@ -75,18 +76,27 @@ public void configure(Configuration config) {
@Override
public FileSystem create(URI fsUri) throws IOException {
checkNotNull(fsUri, "passed file system URI object should not be null");
- LOG.info("Trying to load and instantiate Azure File System");
+ LOG.info("Trying to load and instantiate Azure File System for {}", fsUri);
return new HadoopFileSystem(createInitializedAzureFS(fsUri, flinkConfig));
}
- // uri is of the form: wasb(s)://[email protected]/testDir
+ // uri is of the form: wasb(s)://[email protected]/testDir (or)
+ // abfs(s):////[email protected]/testDir
private org.apache.hadoop.fs.FileSystem createInitializedAzureFS(
URI fsUri, Configuration flinkConfig) throws IOException {
org.apache.hadoop.conf.Configuration hadoopConfig =
configLoader.getOrLoadHadoopConfig();
-
- org.apache.hadoop.fs.FileSystem azureFS = new NativeAzureFileSystem();
- azureFS.initialize(fsUri, hadoopConfig);
-
- return azureFS;
+ String scheme = fsUri.getScheme();
+
+ if (scheme.startsWith("wasb")) {
Review comment:
Structurally, the code now looks much cleaner. We could go one step
further and pull the `initialize` call back into this method. Then all
`createAzureFS` implementations are really just one-liners.
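Roughly what I have in mind, as a self-contained sketch rather than the actual Flink code. `StubFileSystem` is a hypothetical stand-in for `org.apache.hadoop.fs.FileSystem`, the two factory classes and the `createAzureFS(String)` signature are assumptions, and the Hadoop configuration is left out for brevity.

```java
import java.net.URI;

final class AzureFactorySketch {
    // Hypothetical stand-in for org.apache.hadoop.fs.FileSystem.
    static class StubFileSystem {
        final String name;
        URI initializedWith; // stays null until initialize() is called

        StubFileSystem(String name) {
            this.name = name;
        }

        void initialize(URI uri) {
            this.initializedWith = uri;
        }
    }

    abstract static class AbstractFactory {
        // Subclasses only pick the implementation; no initialization here.
        abstract StubFileSystem createAzureFS(String scheme);

        // The shared method owns the initialize call, so it happens exactly once.
        final StubFileSystem createInitializedAzureFS(URI fsUri) {
            StubFileSystem fs = createAzureFS(fsUri.getScheme());
            fs.initialize(fsUri);
            return fs;
        }
    }

    static final class WasbFactory extends AbstractFactory {
        @Override
        StubFileSystem createAzureFS(String scheme) {
            return new StubFileSystem("wasb-stub");
        }
    }

    static final class AbfsFactory extends AbstractFactory {
        @Override
        StubFileSystem createAzureFS(String scheme) {
            return new StubFileSystem("abfs-stub");
        }
    }
}
```

With this shape, a new factory cannot forget to initialize the file system, because it never sees that step.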