Example of writing to and reading from a file:
https://github.com/apache/arrow/blob/master/java/vector/src/test/java/org/apache/arrow/vector/file/TestArrowFile.java

Similarly, in case you don't want to go through a file, unloading a vector into buffers and loading from buffers:
https://github.com/apache/arrow/blob/master/java/vector/src/test/java/org/apache/arrow/vector/TestVectorUnloadLoad.java

The VectorLoader/VectorUnloader are used to read/write files.
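For what it's worth, here is a minimal sketch of the file round trip, kept in memory so it also covers the "send the bytes" case. It is an illustration under assumptions, not the code from the tests linked above: the class name ArrowFileRoundTrip and the column name "value" are made up, and the org.apache.arrow.vector.ipc package names are those of newer Arrow Java releases (the test linked above lives under org.apache.arrow.vector.file).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.channels.Channels;
import java.util.Arrays;
import java.util.Collections;

import org.apache.arrow.memory.BufferAllocator;
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.IntVector;
import org.apache.arrow.vector.VectorSchemaRoot;
import org.apache.arrow.vector.ipc.ArrowFileReader;
import org.apache.arrow.vector.ipc.ArrowFileWriter;
import org.apache.arrow.vector.types.pojo.ArrowType;
import org.apache.arrow.vector.types.pojo.Field;
import org.apache.arrow.vector.types.pojo.Schema;
import org.apache.arrow.vector.util.ByteArrayReadableSeekableByteChannel;

// Hypothetical helper class; the names here are illustrative only.
public class ArrowFileRoundTrip {

    // Single nullable 32-bit signed int column called "value" (assumed name).
    static final Schema SCHEMA = new Schema(Collections.singletonList(
            Field.nullable("value", new ArrowType.Int(32, true))));

    /** Writes the values as one record batch in the Arrow file format. */
    static byte[] write(int[] values, BufferAllocator allocator) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (VectorSchemaRoot root = VectorSchemaRoot.create(SCHEMA, allocator)) {
            IntVector vector = (IntVector) root.getVector("value");
            vector.allocateNew(values.length);
            for (int i = 0; i < values.length; i++) {
                vector.setSafe(i, values[i]);
            }
            // The row count tells the writer how many rows the batch holds.
            root.setRowCount(values.length);
            try (ArrowFileWriter writer =
                         new ArrowFileWriter(root, null, Channels.newChannel(out))) {
                writer.start();      // writes the magic bytes and the schema
                writer.writeBatch(); // writes the current contents of root
                writer.end();        // writes the footer
            }
        }
        return out.toByteArray();
    }

    /** Reads every record batch back and concatenates the "value" column. */
    static int[] read(byte[] bytes, BufferAllocator allocator) throws IOException {
        int[] result = new int[0];
        try (ArrowFileReader reader = new ArrowFileReader(
                new ByteArrayReadableSeekableByteChannel(bytes), allocator)) {
            VectorSchemaRoot root = reader.getVectorSchemaRoot();
            while (reader.loadNextBatch()) {
                IntVector vector = (IntVector) root.getVector("value");
                int rows = root.getRowCount();
                int offset = result.length;
                result = Arrays.copyOf(result, offset + rows);
                for (int i = 0; i < rows; i++) {
                    result[offset + i] = vector.get(i);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        try (BufferAllocator allocator = new RootAllocator()) {
            byte[] bytes = write(new int[] {1, 2, 3}, allocator);
            System.out.println(Arrays.toString(read(bytes, allocator)));
        }
    }
}
```

To write several batches, refill the vectors in the same VectorSchemaRoot and call writeBatch() again before end(). The byte array returned by write() is a complete Arrow file, so it should be what the file reader on the Python side expects.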

On Wed, Apr 26, 2017 at 10:31 AM, Li Jin <ice.xell...@gmail.com> wrote:

> Thanks for the various pointers. I was looking at ArrowFileWriter/Reader
> and got a little bit confused.
>
> So what I am trying to do is to convert a list of Spark rows into some
> Arrow format in Java (I will probably go with the file format for now),
> send the bytes to Python, and deserialize them into a pyarrow table.
>
> This is what I currently plan to do:
> (1) convert the rows to one or more Arrow record batches (use the
> ValueVectors)
> (2) serialize the Arrow record batches and send them over to Python (not
> sure what to use here, ArrowFileWriter?)
> (3) deserialize the bytes into a pyarrow.Table using pyarrow.FileReader
>
> I *think* ArrowFileWriter is what I should use to send data over in (2),
> but:
> (1) I would need to turn the Arrow record batches into a VectorSchemaRoot
> by doing something like this:
> https://github.com/icexelloss/spark/blob/pandas-udf/sql/core/src/test/scala/org/apache/spark/sql/ArrowConvertersSuite.scala#L226
> (2) I am not sure how to write all the data in a VectorSchemaRoot using
> ArrowFileWriter.
>
> Does this sound like the right thing to do?
>
> Thanks,
> Li
>
> On Tue, Apr 25, 2017 at 8:52 PM, Wes McKinney <wesmck...@gmail.com> wrote:
>
> > Also, now that we have a website that is easier to write content for (in
> > Markdown), it would be great if some Java developers could volunteer some
> > time to write user-facing documentation to go with the Javadocs.
> >
> > On Tue, Apr 25, 2017 at 8:51 PM, Wes McKinney <wesmck...@gmail.com> wrote:
> >
> > > There is also
> > > https://github.com/apache/arrow/blob/master/java/vector/src/test/java/org/apache/arrow/vector/file/TestArrowStreamPipe.java
> > >
> > > On Tue, Apr 25, 2017 at 8:46 PM, Li Jin <ice.xell...@gmail.com> wrote:
> > >
> > >> Thanks Julien. I will follow
> > >> https://github.com/apache/arrow/blob/990e2bde758ac8bc6e4497ae1bc37f89b71bb5cf/java/vector/src/test/java/org/apache/arrow/vector/stream/MessageSerializerTest.java#L91

--
Julien