Igor Suhorukov created ARROW-17303:
--------------------------------------
Summary: [Java] Read "arrow" (IPC and streaming) files using
org.apache.arrow.dataset.jni.NativeDatasetFactory
Key: ARROW-17303
URL: https://issues.apache.org/jira/browse/ARROW-17303
Project: Apache Arrow
Issue Type: Improvement
Components: Java
Affects Versions: 9.0.0
Reporter: Igor Suhorukov
Read "arrow" (IPC file and streaming) files using
org.apache.arrow.dataset.jni.NativeDatasetFactory in the Java API. I need this
functionality to implement an Arrow file/stream input format for my use
case: processing a large amount of existing geospatial data in Arrow format
through an Apache Spark data source. Optimized Analytics Package (OAP) for
Spark could also leverage this Dataset feature on the JVM. They use
FileSystemDatasetFactory in the [Spark
gazelle_plugin|https://github.com/oap-project/gazelle_plugin/blob/b28ec129211d4a4fb360b6b137847c36545e66f6/arrow-data-source/standard/src/main/scala/com/intel/oap/spark/sql/execution/datasources/v2/arrow/ArrowUtils.scala#L77]
adapter.
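
Since the request covers both the IPC file layout and the streaming layout, an input format would likely need to tell the two apart before choosing a reader. Below is a minimal sketch of one way to do that with only the JDK. It assumes the file layout starts with the ASCII magic "ARROW1" and that a stream written by Arrow 0.15 or later starts with the 0xFFFFFFFF continuation marker. The class and method names (ArrowFormatSniffer, sniff, Kind) are hypothetical and are not part of the Arrow Java API; older streams without the continuation marker are reported as UNKNOWN.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

/** Hypothetical helper, not part of Arrow: guesses the Arrow IPC layout from leading bytes. */
public class ArrowFormatSniffer {

    public enum Kind { IPC_FILE, IPC_STREAM, UNKNOWN }

    // The IPC file layout begins with this ASCII magic string.
    private static final byte[] MAGIC = "ARROW1".getBytes(StandardCharsets.US_ASCII);

    // Assumption: streams written by Arrow >= 0.15 begin with this continuation marker.
    private static final byte[] CONTINUATION =
            {(byte) 0xFF, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF};

    /** Classifies a buffer holding the first bytes of a file. */
    public static Kind sniff(byte[] head) {
        if (startsWith(head, MAGIC)) {
            return Kind.IPC_FILE;
        }
        if (startsWith(head, CONTINUATION)) {
            return Kind.IPC_STREAM;
        }
        return Kind.UNKNOWN;
    }

    /** Reads up to 8 leading bytes of the file at the given path and classifies them. */
    public static Kind sniff(Path path) throws IOException {
        byte[] buf = new byte[8];
        int filled = 0;
        try (InputStream in = Files.newInputStream(path)) {
            // A single read() may return fewer bytes than requested, so loop until full or EOF.
            while (filled < buf.length) {
                int n = in.read(buf, filled, buf.length - filled);
                if (n < 0) {
                    break;
                }
                filled += n;
            }
        }
        return sniff(Arrays.copyOf(buf, filled));
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }

    public static void main(String[] args) throws IOException {
        for (String arg : args) {
            System.out.println(arg + ": " + sniff(Paths.get(arg)));
        }
    }
}
```

Running it as `java ArrowFormatSniffer some.arrow` prints the guessed layout for each path given. This only inspects the leading bytes; it does not validate the footer or any message metadata.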
--
This message was sent by Atlassian Jira
(v8.20.10#820010)