Samuel Schneck created ARROW-18208:
--------------------------------------
Summary: JS: tableFromJSON cannot handle nested objects containing strings
Key: ARROW-18208
URL: https://issues.apache.org/jira/browse/ARROW-18208
Project: Apache Arrow
Issue Type: Bug
Reporter: Samuel Schneck
```
$ node
const g = require('apache-arrow')
g.tableFromJSON([{a: [ { b: "hi" } ]}])
```
The dictionary types being compared (console output, OTHER logged at typecomparator.ts:191):
TYPE  Dictionary {indices: Int32, dictionary: Utf8, isOrdered: false, id: 12}
OTHER Dictionary {indices: Int32, dictionary: Utf8, isOrdered: false, id: 14}
In both, indices is Int32 {isSigned: true, bitWidth: 32} and dictionary is Utf8; the two types differ only in id (12 vs. 14).
This happens here:
    else if (arraysCount + nullsCount === value.length) {
        const array = value;
        const childType = inferType(array[array.findIndex((ary) => ary != null)]);
        if (array.every((ary) => ary == null || (0, typecomparator_js_1.compareTypes)(childType, inferType(ary)))) {
            return new dtypes.List(new schema_js_1.Field('', childType, true));
        }
    }
Every call to inferType(ary) instantiates a new dictionary type with a new id, so
compareTypes(childType, inferType(ary)) compares two dictionaries with different ids
and is never going to succeed for string values.
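To illustrate the mechanism without depending on the library, here is a minimal self-contained sketch. FakeDictionary, inferTypeSketch, compareWithId and compareIgnoringId are hypothetical stand-ins invented for this example, not apache-arrow APIs; the sketch only assumes, as the console output above suggests, that each inferred dictionary type receives a fresh id and that the comparison takes the id into account.

```javascript
// Hypothetical stand-in for a dictionary type; NOT the apache-arrow class.
let nextId = 0;
class FakeDictionary {
  constructor() {
    this.id = nextId++;      // assumption: every new instance gets a fresh id
    this.indices = 'Int32';
    this.dictionary = 'Utf8';
    this.isOrdered = false;
  }
}

// Hypothetical inference: each string value yields a brand-new dictionary type.
function inferTypeSketch(value) {
  if (typeof value === 'string') return new FakeDictionary();
  throw new TypeError('this sketch only handles strings');
}

// Structural comparison that ignores the id.
function compareIgnoringId(a, b) {
  return a.indices === b.indices &&
    a.dictionary === b.dictionary &&
    a.isOrdered === b.isOrdered;
}

// Comparison that also requires matching ids, mirroring the failing behaviour.
function compareWithId(a, b) {
  return a.id === b.id && compareIgnoringId(a, b);
}

const first = inferTypeSketch('hi');
const second = inferTypeSketch('hi');

const strictResult = compareWithId(first, second);     // false: ids 0 and 1 differ
const looseResult = compareIgnoringId(first, second);  // true: same structure

console.log({ strictResult, looseResult });
```

Whether the right fix is to ignore the id during inference or to reuse a single dictionary type for all inferred string columns is a design question for the maintainers; the sketch only shows why the current comparison cannot match.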