François Wagner created BEAM-2429:
-------------------------------------
Summary: Conflicting filesystems with use of HadoopFileSystem
Key: BEAM-2429
URL: https://issues.apache.org/jira/browse/BEAM-2429
Project: Beam
Issue Type: Bug
Components: sdk-java-extensions
Affects Versions: 2.0.0
Reporter: François Wagner
Assignee: Davor Bonaci
I'm facing an issue when trying to use HadoopFileSystem in my pipeline. It looks
like HadoopFileSystem is registering itself under the `file` scheme
(https://github.com/apache/beam/pull/2777/files#diff-330bd0854dcab6037ef0e52c05d68eb2L79),
so the following exception is thrown when trying to register
HadoopFileSystem:

java.lang.IllegalStateException: Scheme: [file] has conflicting filesystems: [org.apache.beam.sdk.io.LocalFileSystem, org.apache.beam.sdk.io.hdfs.HadoopFileSystem]
    at org.apache.beam.sdk.io.FileSystems.verifySchemesAreUnique(FileSystems.java:498)

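To illustrate the kind of check that fails here, below is a minimal, self-contained sketch of a scheme-uniqueness check. It is not Beam's actual `FileSystems.verifySchemesAreUnique` code: the class `SchemeConflictSketch`, its nested `Registration` type, and the method body are hypothetical, and the two filesystems are represented only by their class names as strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SchemeConflictSketch {

  /** Hypothetical (scheme, filesystem class name) pair; not a Beam type. */
  static final class Registration {
    final String scheme;
    final String fileSystemClass;

    Registration(String scheme, String fileSystemClass) {
      this.scheme = scheme;
      this.fileSystemClass = fileSystemClass;
    }
  }

  /** Throws if two or more filesystems claim the same URI scheme. */
  static void verifySchemesAreUnique(List<Registration> registrations) {
    // Group the registered filesystem class names by scheme.
    Map<String, List<String>> byScheme = new TreeMap<>();
    for (Registration r : registrations) {
      byScheme.computeIfAbsent(r.scheme, k -> new ArrayList<>()).add(r.fileSystemClass);
    }
    // Any scheme with more than one filesystem is a conflict.
    for (Map.Entry<String, List<String>> e : byScheme.entrySet()) {
      if (e.getValue().size() > 1) {
        throw new IllegalStateException(
            String.format(
                "Scheme: [%s] has conflicting filesystems: %s", e.getKey(), e.getValue()));
      }
    }
  }

  public static void main(String[] args) {
    List<Registration> registrations = new ArrayList<>();
    // Both filesystems claim "file", as in the exception above.
    registrations.add(new Registration("file", "org.apache.beam.sdk.io.LocalFileSystem"));
    registrations.add(new Registration("file", "org.apache.beam.sdk.io.hdfs.HadoopFileSystem"));
    try {
      verifySchemesAreUnique(registrations);
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

In this sketch, registering the second filesystem under `hdfs` instead of `file` would let the check pass, which is the behaviour I would expect from HadoopFileSystem.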
What is the correct way to handle `hdfs` URLs out of the box with TextIO and
AvroIO?

// Attempt 1: pass the HDFS configuration as a pipeline argument.
String[] args = new String[] {
    "--hdfsConfiguration=[{\"dfs.client.use.datanode.hostname\": \"true\"}]"
};
HadoopFileSystemOptions options = PipelineOptionsFactory
    .fromArgs(args)
    .withValidation()
    .as(HadoopFileSystemOptions.class);
Pipeline pipeline = Pipeline.create(options);

// Attempt 2: set the HDFS configuration programmatically
// (the declarations of `config` and `configuration` are not shown).
configuration.add(config);
options.setHdfsConfiguration(configuration);
Pipeline pipeline = Pipeline.create(options);