[ https://issues.apache.org/jira/browse/ORC-508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17270349#comment-17270349 ]
Owen O'Malley commented on ORC-508:
-----------------------------------
OK, I'm going to take a look at this today. At a high level, here is how many
references we currently have to each of the Hadoop classes:
{{ 46 import org.apache.hadoop.fs.Path;
   46 import org.apache.hadoop.conf.Configuration;
   39 import org.apache.hadoop.fs.FileSystem;
   12 import org.apache.hadoop.fs.FSDataInputStream;
   10 import org.apache.hadoop.io.Text;
    6 import org.apache.hadoop.io.BytesWritable;
    5 import org.apache.hadoop.fs.FileStatus;
    5 import org.apache.hadoop.fs.FSDataOutputStream;
    3 import org.apache.hadoop.util.Progressable;
    3 import org.apache.hadoop.fs.permission.FsPermission;
    2 import org.apache.hadoop.io.DataOutputBuffer;
    2 import org.apache.hadoop.fs.Seekable;
    2 import org.apache.hadoop.fs.PositionedReadable;
    1 import org.apache.hadoop.util.VersionInfo;
    1 import org.apache.hadoop.io.WritableComparator;
    1 import org.apache.hadoop.io.IntWritable;}}
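
For anyone who wants to reproduce a tally like the one above, here is a rough sketch in plain Java. It is not necessarily how the numbers in this comment were produced. The class name {{HadoopImportTally}} and its methods are hypothetical, and the sketch only counts non-static import lines that start with {{import org.apache.hadoop.}} in {{.java}} files under a given directory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of ORC: counts Hadoop import statements.
final class HadoopImportTally {

  // Groups identical "import org.apache.hadoop..." lines and counts them.
  static Map<String, Long> tally(Stream<String> lines) {
    return lines
        .map(String::trim)
        .filter(line -> line.startsWith("import org.apache.hadoop."))
        .collect(Collectors.groupingBy(
            Function.identity(), TreeMap::new, Collectors.counting()));
  }

  // Walks a source tree and tallies the imports of every .java file in it.
  static Map<String, Long> tallyTree(Path root) throws IOException {
    try (Stream<Path> files = Files.walk(root)) {
      return tally(files
          .filter(p -> p.toString().endsWith(".java"))
          .flatMap(p -> {
            try {
              return Files.readAllLines(p).stream();
            } catch (IOException e) {
              throw new UncheckedIOException(e);
            }
          }));
    }
  }

  public static void main(String[] args) throws IOException {
    // Print the most frequently imported classes first.
    tallyTree(Paths.get(args.length > 0 ? args[0] : "."))
        .entrySet().stream()
        .sorted(Map.Entry.<String, Long>comparingByValue().reversed())
        .forEach(e -> System.out.printf("%4d %s%n", e.getValue(), e.getKey()));
  }
}
```

Each import line appears at most once per file, so every count is the number of source files that import the class. That makes it a reasonable first estimate of how much code each dependency touches.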
> Add a reader/writer that does not depend on Hadoop FileSystem
> -------------------------------------------------------------
>
> Key: ORC-508
> URL: https://issues.apache.org/jira/browse/ORC-508
> Project: ORC
> Issue Type: Improvement
> Components: Java
> Reporter: Ismaël Mejía
> Priority: Major
>
> It seems that the default implementation classes of ORC today depend on
> Hadoop FS objects to write. This is not ideal for APIs that do not rely on
> Hadoop. For some context, I was looking at adding support for Apache
> Beam, but Beam's API supports multiple filesystems through a more generic
> abstraction that relies on Java's Channels and Streams APIs and delegates
> directly to distributed filesystems, e.g. Google Cloud Storage or Amazon S3.
> It would be really nice to have such support in the core implementation and
> maybe to split the Hadoop-dependent implementation into its own module in
> the future.
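
To make the request in the description more concrete, here is a minimal sketch of what a Hadoop-free read abstraction could look like if it were built on {{java.nio}} channels. This is an illustration only, not ORC's actual API: the names {{PositionedInput}} and {{ChannelInput}} are hypothetical, and the sketch covers only positioned reads (roughly the role {{Seekable}} and {{PositionedReadable}} play on the Hadoop side).

```java
import java.io.Closeable;
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SeekableByteChannel;

// Hypothetical interface: the minimum a columnar reader needs from storage.
interface PositionedInput extends Closeable {
  // Total size of the underlying file or object, in bytes.
  long length() throws IOException;

  // Fills dst completely, starting at the given absolute position.
  void readFully(long position, ByteBuffer dst) throws IOException;
}

// Hypothetical adapter over any SeekableByteChannel (local file, cloud store, ...).
final class ChannelInput implements PositionedInput {
  private final SeekableByteChannel channel;

  ChannelInput(SeekableByteChannel channel) {
    this.channel = channel;
  }

  @Override
  public long length() throws IOException {
    return channel.size();
  }

  // Synchronized because seek-then-read is not atomic on a shared channel.
  @Override
  public synchronized void readFully(long position, ByteBuffer dst)
      throws IOException {
    channel.position(position);
    while (dst.hasRemaining()) {
      if (channel.read(dst) < 0) {
        throw new EOFException("Reached end of input before buffer was filled");
      }
    }
  }

  @Override
  public void close() throws IOException {
    channel.close();
  }
}
```

A caller could wrap a local file with {{new ChannelInput(Files.newByteChannel(path))}}, and a filesystem layer that exposes seekable channels could plug in the same way. A Hadoop-backed implementation of the same interface could then live in a separate module.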