Ismaël Mejía created ORC-508:
--------------------------------
Summary: Add a reader/writer that do not depend on Hadoop FileSystem
Key: ORC-508
URL: https://issues.apache.org/jira/browse/ORC-508
Project: ORC
Issue Type: Improvement
Components: Java
Reporter: Ismaël Mejía
It seems that the default implementation classes of ORC currently depend on
Hadoop FileSystem objects to write. This is not ideal for APIs that do not rely
on Hadoop. For some context, I was looking at adding ORC support to Apache Beam,
but Beam's API supports multiple filesystems through a more generic abstraction
that relies on Java's Channels and Streams APIs and delegates directly to
distributed filesystems and object stores, e.g. Google Cloud Storage, Amazon S3,
etc. It would be really nice to have such support in the core implementation,
and perhaps to split the Hadoop-dependent implementation into its own module in
the future.
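
To make the request more concrete, here is a rough sketch of what a
channel-based storage abstraction could look like. The names `ChannelStorage`,
`LocalChannelStorage` and `ChannelIo` are hypothetical and exist in neither ORC
nor Beam; only the `java.nio` calls are real. It is meant as an illustration of
the idea under those assumptions, not as a proposed API.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SeekableByteChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical abstraction (not an ORC or Beam API): storage is exposed only
// through java.nio channels, so no Hadoop FileSystem types are involved.
interface ChannelStorage {
    SeekableByteChannel openForRead(String location) throws IOException;

    WritableByteChannel openForWrite(String location) throws IOException;
}

// Local-disk implementation used for illustration. An object store such as
// GCS or S3 would supply its own channels behind the same interface.
final class LocalChannelStorage implements ChannelStorage {
    @Override
    public SeekableByteChannel openForRead(String location) throws IOException {
        return Files.newByteChannel(Paths.get(location), StandardOpenOption.READ);
    }

    @Override
    public WritableByteChannel openForWrite(String location) throws IOException {
        return Files.newByteChannel(Paths.get(location),
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING);
    }
}

// Hypothetical helpers showing the two access patterns a file format needs:
// sequential writes and random-access reads near the end of the file.
final class ChannelIo {
    private ChannelIo() {}

    // Writes the whole buffer; a single write() call may be partial.
    static void writeFully(WritableByteChannel out, ByteBuffer data) throws IOException {
        while (data.hasRemaining()) {
            out.write(data);
        }
    }

    // Reads up to the last `length` bytes of the channel, the kind of seek a
    // columnar reader performs to locate trailing metadata.
    static byte[] readTail(SeekableByteChannel in, int length) throws IOException {
        long size = in.size();
        int n = (int) Math.min(length, size);
        ByteBuffer buf = ByteBuffer.allocate(n);
        in.position(size - n);
        while (buf.hasRemaining()) {
            if (in.read(buf) < 0) {
                break; // unexpected end of channel
            }
        }
        return buf.array();
    }
}
```

A reader or writer written against `ChannelStorage` would only need a seekable
channel for reads and a writable channel for writes, which is roughly what
Beam's filesystem layer can provide.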
After a look at some parts of the `orc-core`
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)