[
https://issues.apache.org/jira/browse/ARROW-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16250811#comment-16250811
]
ASF GitHub Bot commented on ARROW-1693:
---------------------------------------
trxcllnt commented on issue #1294: ARROW-1693: [JS] Fix reading C++
dictionary-encoded vectors
URL: https://github.com/apache/arrow/pull/1294#issuecomment-344140344
@wesm the Jest docs on snapshot testing highlight its utility for testing React
components, but it's really just a form of test code generation. The tests
evaluate [all
combinations](https://github.com/trxcllnt/arrow/blob/generate-js-test-files/js/test/table-tests.ts#L22)
of `source lib x arrow format` (in reality: `[c++, java] x [file, stream]`)
for each of the generated files (nested, simple, decimal, datetime, primitive,
primitive-empty, dictionary, and struct_example), so there are quite a few
assertions.
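
As a rough illustration of how the combinations multiply, here is a minimal sketch that enumerates them. The names `sources`, `formats`, `cases`, `allCombinations`, and `testTitle` are hypothetical and are not taken from `table-tests.ts`; only the list values come from the description above.

```typescript
// Hypothetical sketch: enumerate every (source lib, arrow format, file) triple.
const sources = ["cpp", "java"] as const;
const formats = ["file", "stream"] as const;
const cases = [
  "nested", "simple", "decimal", "datetime",
  "primitive", "primitive-empty", "dictionary", "struct_example",
] as const;

type Combo = {
  source: typeof sources[number];
  format: typeof formats[number];
  name: typeof cases[number];
};

function allCombinations(): Combo[] {
  const combos: Combo[] = [];
  for (const source of sources) {
    for (const format of formats) {
      for (const name of cases) {
        combos.push({ source, format, name });
      }
    }
  }
  return combos;
}

// Builds a title fragment in the style seen in the snapshot names, e.g. "cpp stream primitive".
function testTitle(combo: Combo): string {
  return `${combo.source} ${combo.format} ${combo.name}`;
}

const combos = allCombinations();
console.log(combos.length); // 2 sources x 2 formats x 8 files = 32
console.log(testTitle(combos[0])); // "cpp file nested"
```

Each combination then runs several assertions of its own, which is where the large snapshot counts come from.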
<details><summary>
Snapshots capture a bit of runtime type info that would otherwise have to be
asserted explicitly, for example that calling `uint64Vector.get(i)` returns a
`Uint32Array` of two elements:</summary>
```
exports[`readBuffers cpp stream primitive reads each batch as an Array of Vectors 167`] = `
Uint32Array [
  12840890,
  0,
]
`;
```
</details>
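
To make the two-element representation concrete, here is a minimal sketch of turning such a pair back into a plain number. It assumes the low 32-bit word comes first, which is consistent with the `[12840890, 0]` snapshot above but is not confirmed by it; `uint64WordsToNumber` is a hypothetical helper, not part of the Arrow JS API.

```typescript
// Hypothetical helper. Assumption: words[0] is the low 32 bits, words[1] the high 32 bits.
function uint64WordsToNumber(words: Uint32Array): number {
  if (words.length !== 2) {
    throw new RangeError("expected exactly two 32-bit words");
  }
  const low = words[0];
  const high = words[1];
  // Multiply instead of shifting: JS bitwise operators truncate to 32 bits.
  const value = high * 0x100000000 + low;
  if (!Number.isSafeInteger(value)) {
    throw new RangeError("value does not fit in a JS number without losing precision");
  }
  return value;
}

console.log(uint64WordsToNumber(new Uint32Array([12840890, 0]))); // 12840890
```

Values above `Number.MAX_SAFE_INTEGER` cannot round-trip through a JS number, which is one reason to hand back the raw word pair.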
<details><summary>
They're also helpful for catching regressions (or comparing against pandas) in
`Table.toString()`:
</summary><p>
```
exports[`Table cpp file nested toString({ index: true }) prints a pretty Table with an Index column 1`] = `
"Index, list_nullable, struct_nullable
0, null, [null,\\"tmo7qBM\\"]
1, [1685103474], [-583988484,null]
2, [1981297353], [-749108100,\\"yGRfkmw\\"]
3, [-2032422645,-2111456179,-895490422], [820115077,null]
4, null, null
5, [null,-434891054,-864560986], null
6, null, [986507083,\\"U6xvhr7\\"]
7, null, null
8, null, [null,null]
9, null, null
10, [-498865952], null
11, null, [null,\\"ctyWPJf\\"]
12, null, [null,null]
13, [-1076160763,-792439045,-656549144,null], null
14, null, [1234093448,null]
15, [null,null,1882910932], null
16, null, [934007407,\\"9QUyEm5\\"]"
`;
```
</p></details>
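
For anyone comparing output by hand, the layout in that snapshot can be approximated with a few lines of code. This is a sketch under assumptions, not the actual `Table.toString()` implementation: `formatTable` and `Cell` are hypothetical names, and using `JSON.stringify` per cell is simply the easiest way to mimic the cell formatting shown above.

```typescript
// Hypothetical sketch of an index-prefixed, comma-separated table printer.
type Cell = null | number | string | Cell[];

function formatTable(columns: string[], rows: Cell[][]): string {
  const header = ["Index", ...columns].join(", ");
  const lines = rows.map((row, i) =>
    [String(i), ...row.map((cell) => JSON.stringify(cell))].join(", ")
  );
  return [header, ...lines].join("\n");
}

// First two rows of the nested example above.
const text = formatTable(
  ["list_nullable", "struct_nullable"],
  [
    [null, [null, "tmo7qBM"]],
    [[1685103474], [-583988484, null]],
  ]
);
console.log(text);
```

In the snapshot file the inner double quotes appear as `\\"` because Jest escapes the serialized string a second time when it writes it into a template literal.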
<h6></h6>
It also gives reviewers a chance to see what the tests produce, so if `get`
on a `Uint64Vector` starts returning a `Long` object instead of a `Uint32Array`,
we can flag that in a code review. That said, it sounds like the JSON reader
should be able to do most of this validation.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
> [JS] Error reading dictionary-encoded integration test files
> ------------------------------------------------------------
>
> Key: ARROW-1693
> URL: https://issues.apache.org/jira/browse/ARROW-1693
> Project: Apache Arrow
> Issue Type: Bug
> Components: JavaScript
> Reporter: Brian Hulette
> Assignee: Brian Hulette
> Labels: pull-request-available
> Fix For: 0.8.0
>
> Attachments: dictionary-cpp.arrow, dictionary-java.arrow,
> dictionary.json
>
>
> The JS implementation crashes when reading the dictionary test case from the
> integration tests.
> To replicate, first generate the test files with java and cpp impls:
> {code}
> $ cd ${ARROW_HOME}/integration/
> $ python -c 'from integration_test import generate_dictionary_case; generate_dictionary_case().write("dictionary.json")'
> $ ../cpp/debug/debug/json-integration-test --integration \
>     --json=dictionary.json --arrow=dictionary-cpp.arrow --mode=JSON_TO_ARROW
> $ java -cp ../java/tools/target/arrow-tools-0.8.0-SNAPSHOT-jar-with-dependencies.jar \
>     org.apache.arrow.tools.Integration -c JSON_TO_ARROW -a dictionary-java.arrow \
>     -j dictionary.json
> {code}
> Attempt to read the files with the JS impl:
> {code}
> $ cd ${ARROW_HOME}/js/
> $ ./bin/arrow2csv.js -s dict1_0 -f ../integration/dictionary-{java,cpp}.arrow
> {code}
> Both files result in an error for me on
> [a8f51858|https://github.com/apache/arrow/commit/a8f518588fda471b2e3cc8e0f0064e7c4bb99899]:
> {{TypeError: Cannot read property 'buffer' of undefined}}
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)