[
https://issues.apache.org/jira/browse/ARROW-17303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated ARROW-17303:
-----------------------------------
Labels: pull-request-available (was: )
> [Java] Read "arrow" (IPC and streaming) files using
> org.apache.arrow.dataset.jni.NativeDatasetFactory
> ------------------------------------------------------------------------------------------------------
>
> Key: ARROW-17303
> URL: https://issues.apache.org/jira/browse/ARROW-17303
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Java
> Affects Versions: 9.0.0
> Reporter: Igor Suhorukov
> Priority: Major
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Read "arrow" (IPC and streaming) files using
> org.apache.arrow.dataset.jni.NativeDatasetFactory in the Java API. This
> functionality is required to implement an Arrow file/stream input format for
> my use case: processing a large amount of existing geospatial Arrow-format
> data in an Apache Spark data source. The Optimized Analytics Package (OAP) for
> Spark could also leverage this Dataset feature on the JVM. They use
> FileSystemDatasetFactory in the [Spark
> gazelle_plugin|https://github.com/oap-project/gazelle_plugin/blob/b28ec129211d4a4fb360b6b137847c36545e66f6/arrow-data-source/standard/src/main/scala/com/intel/oap/spark/sql/execution/datasources/v2/arrow/ArrowUtils.scala#L77]
> adapter.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)