[
https://issues.apache.org/jira/browse/ARROW-6574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16934903#comment-16934903
]
Paul Taylor commented on ARROW-6574:
------------------------------------
[~akre54] This is the JSON IPC format, which is only suitable for integration
tests between the different Arrow implementations.
You can use the Vector
[Builders|https://github.com/apache/arrow/blob/b2785d38a110c8fd8a3d7c957cd78d8911607a5e/js/src/builder.ts#L54]
to encode arbitrary JS objects into Arrow Vectors and Tables.
The raw Builder APIs allow you to control every aspect of the chunking and
flushing behavior, but as a consequence are relatively low-level. There are
higher-level APIs for transforming values from iterables, async iterables, node
streams, or DOM streams. You can see examples of usage [in the tests
here|https://github.com/apache/arrow/blob/b2785d38a110c8fd8a3d7c957cd78d8911607a5e/js/test/unit/builders/builder-tests.ts#L261],
or see [this example|https://github.com/trxcllnt/csv-to-arrow-js] converting a
CSV row stream to Arrow.
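To make the chunking and flushing idea concrete, here is a minimal standalone sketch in plain JavaScript. It does not use the Arrow Builder APIs and is not how they are implemented; `chunkFloat32` and its `highWaterMark` parameter are hypothetical names chosen for illustration. It buffers values from an iterable and flushes a typed-array chunk each time the buffer fills:

```javascript
// Hypothetical stand-in for a builder (not part of apache-arrow): buffers
// values from `source` and flushes a Float32Array chunk whenever
// `highWaterMark` values have accumulated.
function* chunkFloat32(source, highWaterMark = 1000) {
  let buffer = [];
  for (const value of source) {
    buffer.push(value);
    if (buffer.length >= highWaterMark) {
      yield Float32Array.from(buffer); // flush a full chunk
      buffer = [];
    }
  }
  if (buffer.length > 0) {
    yield Float32Array.from(buffer); // flush whatever is left at the end
  }
}

// Five values with a high-water mark of 2 produce three chunks.
const chunks = [...chunkFloat32([1, 2, 3, 4, 5], 2)];
const chunkLengths = chunks.map((c) => c.length);
```

Because the sketch is a generator, the consumer pulls one chunk at a time, so memory use is bounded by the chunk size rather than by the length of the input.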
Lastly, if your values are already in memory, you can call `Vector.from()` with
an Arrow type and an iterable (or async-iterable) of JS values, and it'll use
the Builders to return a Vector of the specified type:
{code:javascript}
const Arrow = require('apache-arrow');
const { Vector, Float32Vector } = Arrow;

// create from a list of numbers or a Float32Array (zero-copy) -- all values
// will be valid
const f32 = Float32Vector.from([1.1, 2.5, 3.7]);

// or a different style, handy if inferring the types at runtime
// values in the `nullValues` array will be treated as NULL, and written in the
// validity bitmap
const f32WithNulls = Vector.from({
  nullValues: [-1, NaN],
  type: new Arrow.Float32(),
  values: [1.1, -1, 2.5, 3.7, NaN],
});
// ^ result: [1.1, null, 2.5, 3.7, null]

// or with values from an AsyncIterator (`await` must run inside an async function)
const f32Async = await Vector.from({
  type: new Arrow.Float32(),
  values: (async function*() { yield* [1.1, 2.5, 3.7]; }())
});
{code}
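To show what treating `nullValues` as NULL means for the validity bitmap, here is a minimal standalone sketch in plain JavaScript. It is an illustration under stated assumptions, not the Arrow implementation: `encodeFloat32WithNulls` and `getValue` are hypothetical helpers. The only thing taken from the Arrow columnar format is the bitmap layout, which is least-significant-bit first with a set bit meaning the slot is valid.

```javascript
// Hypothetical helper (not part of apache-arrow): packs values into a
// Float32Array plus a validity bitmap, treating any value found in
// `nullValues` (and null/undefined) as NULL.
function encodeFloat32WithNulls(values, nullValues = []) {
  // NaN !== NaN, so compare with Object.is to let NaN act as a null sentinel.
  const isNull = (v) => v == null || nullValues.some((n) => Object.is(n, v));
  const data = new Float32Array(values.length);
  const validity = new Uint8Array(Math.ceil(values.length / 8));
  let nullCount = 0;
  values.forEach((v, i) => {
    if (isNull(v)) {
      nullCount++; // data slot stays 0 and the validity bit stays unset
    } else {
      data[i] = v;
      validity[i >> 3] |= 1 << (i % 8); // LSB-first: bit i of the bitmap
    }
  });
  return { data, validity, nullCount, length: values.length };
}

// Reads slot `i` back, returning null where the validity bit is unset.
function getValue(encoded, i) {
  const valid = (encoded.validity[i >> 3] >> (i % 8)) & 1;
  return valid ? encoded.data[i] : null;
}

const encoded = encodeFloat32WithNulls([1.1, -1, 2.5, 3.7, NaN], [-1, NaN]);
```

With this input, slots 1 and 4 are null, so only bits 0, 2, and 3 of the first bitmap byte are set. Note that non-null values come back at 32-bit float precision, so `1.1` reads back as its nearest float32 value rather than exactly `1.1`.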
> [JS] TypeError with utf8 and JSONVectorLoader.readData
> ------------------------------------------------------
>
> Key: ARROW-6574
> URL: https://issues.apache.org/jira/browse/ARROW-6574
> Project: Apache Arrow
> Issue Type: Bug
> Components: JavaScript
> Affects Versions: 0.14.1
> Environment: node v10.16.0, OSX 10.14.5
> Reporter: Adam M Krebs
> Priority: Major
>
> Minimal repro:
>
> {code:javascript}
> const fields = [
> {
> name: 'first_name',
> type: {name: 'utf8'},
> nullable: false,
> children: [],
> },
> ];
> Table.from({
> schema: {fields},
> batches: [{
> count: 1,
> columns: [{
> name: 'first_name',
> count: 1,
> VALIDITY: [],
> DATA: ['Fred']
> }]
> }]
> });{code}
>
> Output:
> {code:java}
> /[snip]/node_modules/apache-arrow/visitor/vectorloader.js:92
> readData(type, { offset } = this.nextBufferRange()) {
> ^
> TypeError: Cannot destructure property `offset` of 'undefined' or 'null'.
> at JSONVectorLoader.readData
> (/[snip]/node_modules/apache-arrow/visitor/vectorloader.js:92:38)
> at JSONVectorLoader.visitUtf8
> (/[snip]/node_modules/apache-arrow/visitor/vectorloader.js:46:188)
> at JSONVectorLoader.visit
> (/[snip]/node_modules/apache-arrow/visitor.js:28:48)
> at JSONVectorLoader.visit
> (/[snip]/node_modules/apache-arrow/visitor/vectorloader.js:40:22)
> at nodes.map (/[snip]/node_modules/apache-arrow/visitor.js:25:44)
> at Array.map (<anonymous>)
> at JSONVectorLoader.visitMany
> (/[snip]/node_modules/apache-arrow/visitor.js:25:22)
> at RecordBatchJSONReaderImpl._loadVectors
> (/[snip]/node_modules/apache-arrow/ipc/reader.js:523:107)
> at RecordBatchJSONReaderImpl._loadRecordBatch
> (/[snip]/node_modules/apache-arrow/ipc/reader.js:209:79)
> at RecordBatchJSONReaderImpl.next
> (/[snip]/node_modules/apache-arrow/ipc/reader.js:280:42){code}
>
>
> Looks like the `nextBufferRange` call is returning `undefined`, due to an
> out-of-bounds `buffersIndex`.
>
> Happy to provide more info if needed. Seems to only affect utf8 types and
> nothing else.
>