Samuel Schneck created ARROW-18208:
--------------------------------------

             Summary: JS: tableFromJSON cannot handle nested objects containing strings
                 Key: ARROW-18208
                 URL: https://issues.apache.org/jira/browse/ARROW-18208
             Project: Apache Arrow
          Issue Type: Bug
            Reporter: Samuel Schneck


```
$ node
const g = require('apache-arrow')
g.tableFromJSON([{a: [ { b: "hi" } ]}])
```

The two dictionary types being compared (logged at typecomparator.ts:191)
differ only in their id, 12 versus 14:

    TYPE Dictionary {indices: Int32, dictionary: Utf8, isOrdered: false, id: 12}
        dictionary: Utf8 {}
        id: 12
        indices: Int32 {isSigned: true, bitWidth: 32}
        isOrdered: false
        ArrayType: (...)
        children: (...)
        typeId: (...)
        valueType: (...)
        [[Prototype]]: Dictionary

    OTHER Dictionary {indices: Int32, dictionary: Utf8, isOrdered: false, id: 14}
        dictionary: Utf8
            typeId: (...)
            [[Prototype]]: Utf8
        id: 14
        indices: Int32 {isSigned: true, bitWidth: 32}
        isOrdered: false
        ArrayType: (...)
        children: (...)
        typeId: (...)
        valueType: (...)
        [[Prototype]]: Dictionary

The failing comparison happens here:

    else if (arraysCount + nullsCount === value.length) {
        const array = value;
        const childType = inferType(array[array.findIndex((ary) => ary != null)]);
        if (array.every((ary) => ary == null || (0, typecomparator_js_1.compareTypes)(childType, inferType(ary)))) {
            return new dtypes.List(new schema_js_1.Field('', childType, true));
        }
    }

 

Every call to inferType(ary) instantiates a new dictionary type with a new id,
so compareTypes(childType, inferType(ary)) can never succeed when the nested
values contain strings.
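
To make the failure mode concrete, here is a minimal toy model of it. This is
a sketch, not the apache-arrow code: inferStringType, compareWithId and
compareIgnoringId are hypothetical names, and the assumption is that the real
type comparison takes the dictionary id into account, as the log output above
suggests.

```javascript
// Toy model only: these names are hypothetical and are not apache-arrow APIs.
let nextDictionaryId = 0;

// Stand-in for inferring the type of a string value: every call
// returns a dictionary type carrying a fresh id.
function inferStringType() {
  return {
    kind: 'Dictionary',
    id: nextDictionaryId++,
    indices: 'Int32',
    dictionary: 'Utf8',
    isOrdered: false,
  };
}

// Comparison that includes the id (assumed to mirror the reported behaviour).
function compareWithId(a, b) {
  return a.id === b.id && compareIgnoringId(a, b);
}

// Comparison on structure alone, shown only for contrast.
function compareIgnoringId(a, b) {
  return a.kind === b.kind
    && a.indices === b.indices
    && a.dictionary === b.dictionary
    && a.isOrdered === b.isOrdered;
}

const childType = inferStringType(); // first inference, id 0
const otherType = inferStringType(); // second inference, id 1

const withId = compareWithId(childType, otherType);
const ignoringId = compareIgnoringId(childType, otherType);

console.log(withId);     // false
console.log(ignoringId); // true
```

In this model the two inferred types are structurally identical, yet the
id-sensitive comparison rejects them, which matches the TYPE/OTHER pair shown
above. Whether the right fix is to reuse one dictionary type or to relax the
comparison is left open here.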



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
