trxcllnt commented on pull request #2035:
URL: https://github.com/apache/arrow/pull/2035#issuecomment-694369864


   @t829702 no, the Arrow JSON IPC representation is used only by the 
integration tests, to validate the different Arrow implementations against each 
other. It is _not_ an optimized or ergonomic way to interact with Arrow.
   
   This [`csv-to-arrow-js` 
example](https://github.com/trxcllnt/csv-to-arrow-js/blob/f2596045474ce1742e3089da48a5c83a6005be90/index.js#L28-L38)
 is closer to what you'd need. It uses a CSV parsing library to yield rows as 
JSON objects, then transforms the JSON rows into Arrow RecordBatches, which are 
then serialized and flushed to stdout.
   
   There are a few strategies to convert arbitrary JavaScript types into Arrow 
tables, and the strategy you pick depends on your needs. They all use the 
Builder classes under the hood, and generally follow this pattern:
   1. Define the types of the data you will be constructing
   2. Construct a Builder for that type (via `Builder.new()`, or related stream 
equivalents)
   3. Write values to the Builder
   4. Flush the Builder to yield a Vector of the values written up to that point
   5. Repeat steps 3 and 4 as necessary for all your data
   6. Once all your data has been written to the Builder, call 
`builder.finish()` to yield the last chunk
   
   See [this 
comment](https://github.com/apache/arrow/blob/7a532edeabc6f30838e5a53dfef35f37fdf99737/js/src/builder.ts#L88-L104)
 on the Builder constructor for a basic example, or the [`throughIterable()` 
implementation](https://github.com/apache/arrow/blob/7a532edeabc6f30838e5a53dfef35f37fdf99737/js/src/builder.ts#L501-L510)
 for how to handle things like flushing after reaching a row count or byte size 
`highWaterMark`.
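   
   As a rough illustration of the write/flush/finish steps and of flushing on a 
row-count `highWaterMark`, here is a minimal, library-free sketch. `ToyBuilder` 
and `buildChunks` are hypothetical names invented for this example; the class 
only buffers plain JavaScript values and is _not_ the `apache-arrow` Builder 
API, and the default `highWaterMark` of 1000 is an arbitrary choice for the 
sketch. The linked sources show the real implementation.

```javascript
// Hypothetical stand-in for an Arrow Builder: it buffers plain JS values
// instead of writing Arrow buffers. This is NOT the apache-arrow API.
class ToyBuilder {
    constructor(type) {
        this.type = type;       // step 1: the declared type of the values
        this.values = [];       // values written since the last flush
        this.finished = false;
    }
    get length() { return this.values.length; }
    // Step 3: write one value, checking it against the declared type.
    append(value) {
        if (this.finished) throw new Error('builder is finished');
        if (value !== null && typeof value !== this.type) {
            throw new TypeError(`expected ${this.type}, got ${typeof value}`);
        }
        this.values.push(value);
        return this;
    }
    // Step 4: yield everything written so far and reset the buffer.
    flush() {
        const chunk = { type: this.type, values: this.values };
        this.values = [];
        return chunk;
    }
    // Step 6: mark the builder as done; a final flush() yields the last chunk.
    finish() {
        this.finished = true;
        return this;
    }
}

// Steps 2-6 as a generator: flush whenever `highWaterMark` rows are buffered,
// then flush whatever is left once the source is exhausted.
function* buildChunks(source, { type, highWaterMark = 1000 }) {
    const builder = new ToyBuilder(type);   // step 2
    for (const value of source) {
        if (builder.append(value).length >= highWaterMark) {
            yield builder.flush();          // steps 3-5
        }
    }
    if (builder.finish().length > 0) {
        yield builder.flush();              // step 6
    }
}

const chunks = [...buildChunks(['a', 'b', null, 'c', 'd'], {
    type: 'string',
    highWaterMark: 2,
})];
console.log(chunks.map((chunk) => chunk.values));
```

   With a `highWaterMark` of 2, the five input values come out as two full 
chunks plus a final, shorter chunk from the flush after `finish()`.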
   
   I also have this [higher-level 
example](https://codepen.io/trxcllnt/pen/NWPMpPN?editors=0010) that uses the 
`Vector.from()` method to convert existing in-memory JSON data into an Arrow 
StructVector (and `Vector.from()` 
[uses](https://github.com/apache/arrow/blob/7a532edeabc6f30838e5a53dfef35f37fdf99737/js/src/vector/index.ts#L120-L121)
 the Builder classes internally).



