Re: Help with Java API and RecordBatch creation

Wes McKinney Sun, 05 Aug 2018 13:32:02 -0700

hi Alberto,

Have you looked at the relevant usage of Arrow in Apache Spark? See


https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowConverters.scala

and related modules.

On your first question, my understanding is that

* ArrowRecordBatch (in org.apache.arrow.vector.ipc.message) represents
the in-memory record batch, and
* RecordBatch (in org.apache.arrow.flatbuf) is the serialized record
batch metadata, commonly called the "data header" (defined in
Message.fbs)
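
For the second part of the question (building a batch and writing it to
a stream), here is a minimal sketch rather than a definitive recipe. It
assumes the arrow-vector and arrow-memory artifacts (plus a memory
implementation, where the release requires one) are on the classpath.
The class name ArrowStreamSketch, the column name "id" and the sample
values are made up for illustration. The idea is to fill vectors held by
a VectorSchemaRoot and let ArrowStreamWriter serialize them, instead of
constructing ArrowFieldNode and ArrowBuf lists by hand; VectorUnloader is
shown only to make the link to ArrowRecordBatch visible.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.channels.Channels;
import java.util.Collections;

import org.apache.arrow.memory.BufferAllocator;
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.IntVector;
import org.apache.arrow.vector.VectorSchemaRoot;
import org.apache.arrow.vector.VectorUnloader;
import org.apache.arrow.vector.ipc.ArrowStreamReader;
import org.apache.arrow.vector.ipc.ArrowStreamWriter;
import org.apache.arrow.vector.ipc.message.ArrowRecordBatch;
import org.apache.arrow.vector.types.pojo.ArrowType;
import org.apache.arrow.vector.types.pojo.Field;
import org.apache.arrow.vector.types.pojo.Schema;

// Hypothetical class name; not part of Arrow.
final class ArrowStreamSketch {

    // Writes one record batch with a single int32 column named "id"
    // (an illustrative name) and returns the stream-format bytes.
    static byte[] writeStream(BufferAllocator allocator, int[] values) throws IOException {
        Schema schema = new Schema(Collections.singletonList(
                Field.nullable("id", new ArrowType.Int(32, true))));
        ByteArrayOutputStream out = new ByteArrayOutputStream();

        try (VectorSchemaRoot root = VectorSchemaRoot.create(schema, allocator)) {
            // Fill the column vector owned by the root.
            IntVector id = (IntVector) root.getVector("id");
            id.allocateNew(values.length);
            for (int i = 0; i < values.length; i++) {
                id.setSafe(i, values[i]);
            }
            id.setValueCount(values.length);
            root.setRowCount(values.length);

            // Optional: view the root's current contents as an ArrowRecordBatch
            // (field nodes plus buffers). The writer below does not need this.
            try (ArrowRecordBatch batch = new VectorUnloader(root).getRecordBatch()) {
                System.out.println("rows in batch: " + batch.getLength());
            }

            // No dictionary-encoded columns here, so the provider is null.
            try (ArrowStreamWriter writer =
                     new ArrowStreamWriter(root, null, Channels.newChannel(out))) {
                writer.start();      // writes the schema message
                writer.writeBatch(); // writes the root's current contents
                writer.end();        // writes the end-of-stream marker
            }
        }
        return out.toByteArray();
    }

    // Reads the stream back and returns the total number of rows seen.
    static int countRows(BufferAllocator allocator, byte[] stream) throws IOException {
        int rows = 0;
        try (ArrowStreamReader reader =
                 new ArrowStreamReader(new ByteArrayInputStream(stream), allocator)) {
            while (reader.loadNextBatch()) {
                rows += reader.getVectorSchemaRoot().getRowCount();
            }
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        try (BufferAllocator allocator = new RootAllocator(Long.MAX_VALUE)) {
            byte[] stream = writeStream(allocator, new int[] {1, 2, 3});
            System.out.println("rows read back: " + countRows(allocator, stream));
        }
    }
}
```

To write to a file instead, the same writer should accept the channel of
a FileOutputStream in place of the in-memory one. From Scala the calls
are the same, since this is plain Java API.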

- Wes

On Sun, Aug 5, 2018 at 9:13 AM, ALBERTO Bocchinfuso
<[email protected]> wrote:
>
> Good morning,
>
> I have to use Apache Arrow with Scala, so I’m using the Java API from Scala, 
> but I’m confused and hope that someone can clarify a few things for me.
>
> First of all, what is the difference between ArrowRecordBatch (in 
> org.apache.arrow.vector.ipc.message) and RecordBatch (in 
> org.apache.arrow.flatbuf)?
> In this regard, if a coder wants to use arrow just for IPC, should she 
> consider only the classes in the package org.apache.arrow.vector, or should 
> she learn also how to use the other packages, particularly io.netty.buffer 
> and org.apache.arrow.memory and org.apache.arrow.flatbuf?
>
> I don’t understand how to do in Java everything that is done in Python 
> in the documentation pages:
>              http://arrow.apache.org/docs/python/data.html
>              http://arrow.apache.org/docs/python/ipc.html
>
> I’d like to understand how I can create what in Python is called a 
> RecordBatch, and serialize it to a stream, for example to write it to a file 
> or whatever.
> I think ArrowRecordBatch can be created by using the constructors, once you 
> have built a list of ArrowFieldNode (I haven’t understood what this class stands 
> for, to be honest) and ArrowBuf (I haven’t understood how to create one; I 
> think that I should instantiate an ArrowByteBufAllocator through alloc(), but 
> then I wouldn’t know how to proceed...), but I’m not sure.
> I hope that my doubts can be cleared up.
>
> Thank you,
> Alberto
>
