pronzato commented on issue #315:
URL: https://github.com/apache/arrow-cookbook/issues/315#issuecomment-1631111519
Thank you David, I managed to get it working with your guidance - looks
great.
Now I'm on to the next task, which is to read Parquet files remotely from
HDFS, but I'm getting the following exception:
java.lang.RuntimeException: Got HDFS URI but Arrow compiled without HDFS support
    at org.apache.arrow.dataset.file.JniWrapper.makeFileSystemDatasetFactory(Native Method)
    at org.apache.arrow.dataset.file.FileSystemDatasetFactory.createNative(FileSystemDatasetFactory.java:35)
    at org.apache.arrow.dataset.file.FileSystemDatasetFactory.<init>(FileSystemDatasetFactory.java:31)
However, the latest 12.0.1 online docs say: "HDFS support is included in
the official Apache Arrow Java package releases and can be used directly
without re-building the source code."
https://arrow.apache.org/docs/java/dataset.html#read-data-from-hdfs
Am I missing a step, or is the doc incorrect and I need to rebuild the
libs? Or is there another place to download the libs with HDFS support?
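
For reference, the exception message suggests the native layer picked the
filesystem from the URI scheme. Below is a minimal sketch of a pre-flight
check that shows which scheme a dataset URI carries before it is passed to
FileSystemDatasetFactory. It uses only java.net.URI and does not call
Arrow. The class name, method names, and example URIs are hypothetical, and
treating only the "hdfs" scheme as HDFS is an assumption made for
illustration.

```java
import java.net.URI;
import java.util.Locale;

public class DatasetUriCheck {
    /** Returns the lower-cased scheme of a dataset URI, or "" if it has none. */
    public static String schemeOf(String uri) {
        String scheme = URI.create(uri).getScheme();
        return scheme == null ? "" : scheme.toLowerCase(Locale.ROOT);
    }

    /** True when the URI carries the "hdfs" scheme (assumption: no other scheme counts). */
    public static boolean isHdfs(String uri) {
        return schemeOf(uri).equals("hdfs");
    }

    public static void main(String[] args) {
        // Hypothetical URIs, for illustration only.
        String[] uris = {
            "hdfs://namenode:8020/data/example.parquet",
            "file:///tmp/example.parquet"
        };
        for (String u : uris) {
            System.out.println(u + " -> hdfs=" + isHdfs(u));
        }
    }
}
```

A check like this only confirms that the URI is the kind that triggers the
HDFS code path. It says nothing about whether the loaded native library was
built with HDFS support, which is the actual question here.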
On Mon, Jul 10, 2023 at 2:41 PM David Li wrote:
> You'd have to subclass ArrowReader and implement a facade over the
> iterator yourself, in this case