[ 
https://issues.apache.org/jira/browse/ARROW-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16250811#comment-16250811
 ] 

ASF GitHub Bot commented on ARROW-1693:
---------------------------------------

trxcllnt commented on issue #1294: ARROW-1693: [JS] Fix reading C++ 
dictionary-encoded vectors
URL: https://github.com/apache/arrow/pull/1294#issuecomment-344140344
 
 
   @wesm the Jest docs on snapshot testing highlight its utility for testing 
React components, but it's really just a form of test code generation. The tests 
evaluate [all 
combinations](https://github.com/trxcllnt/arrow/blob/generate-js-test-files/js/test/table-tests.ts#L22)
 of `source lib x arrow format` (in reality: `[c++, java] x [file, stream]`) 
for each of the generated files (nested, simple, decimal, datetime, primitive, 
primitive-empty, dictionary, and struct_example), so there are quite a few 
assertions.
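
As a rough illustration of how that matrix expands, here is a minimal sketch. The names (`sources`, `formats`, `files`, `testMatrix`) are hypothetical and this is not the code in `table-tests.ts`; it only enumerates the combinations described above.

```typescript
// Hypothetical names throughout; this is not the code in table-tests.ts.
const sources = ['cpp', 'java'] as const;
const formats = ['file', 'stream'] as const;
const files = [
    'nested', 'simple', 'decimal', 'datetime',
    'primitive', 'primitive-empty', 'dictionary', 'struct_example',
] as const;

type TestCase = { source: string; format: string; file: string };

// Cartesian product: every source x format pair for every generated file.
function testMatrix(): TestCase[] {
    const result: TestCase[] = [];
    for (const source of sources) {
        for (const format of formats) {
            for (const file of files) {
                result.push({ source, format, file });
            }
        }
    }
    return result;
}

const cases = testMatrix();
console.log(cases.length); // 2 sources x 2 formats x 8 files = 32
```

Each of those 32 cases then runs its own set of snapshot assertions, which is where the large assertion count comes from.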
   
   <details><summary>
   Snapshots capture a bit of runtime type info that would otherwise have to be 
asserted explicitly, for example that calling `uint64Vector.get(i)` returns a 
`Uint32Array` of two elements:</summary>
   
   ```
   exports[`readBuffers cpp stream primitive reads each batch as an Array of 
Vectors 167`] = `
   Uint32Array [
     12840890,
     0,
   ]
   `;
   ```
   </details>
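
To make the "runtime type info" point concrete, here is a small sketch. It assumes each 64-bit value is stored as two consecutive 32-bit words with the low word first (consistent with `[12840890, 0]` above, but still an assumption), and both helper names are hypothetical rather than part of the Arrow JS API. The `serialize` function only imitates the output format shown in the snapshot.

```typescript
// Hypothetical helpers; assumes each 64-bit value occupies two consecutive
// 32-bit words, low word first.
function getUint64Words(data: Uint32Array, index: number): Uint32Array {
    // subarray returns a view over the same buffer, not a copy
    return data.subarray(index * 2, index * 2 + 2);
}

// Rough imitation of the snapshot format above:
// constructor name first, then one element per line.
function serialize(value: Uint32Array): string {
    const body = Array.from(value, (x) => `  ${x},`).join('\n');
    return `${value.constructor.name} [\n${body}\n]`;
}

const data = new Uint32Array([12840890, 0, 7, 1]); // two 64-bit values
console.log(serialize(getUint64Words(data, 0)));
```

Because the constructor name is part of the serialized text, the snapshot pins down the return type without a separate `instanceof` assertion.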
   
   <details><summary>
   They're also helpful for catching regressions (or comparing against pandas) in 
`Table.toString()`:
   </summary><p>
   
   ```
   exports[`Table cpp file nested toString({ index: true }) prints a pretty 
Table with an Index column 1`] = `
   "Index,                            list_nullable,        struct_nullable
       0,                                     null,       [null,\\"tmo7qBM\\"]
       1,                             [1685103474],      [-583988484,null]
       2,                             [1981297353], [-749108100,\\"yGRfkmw\\"]
       3,     [-2032422645,-2111456179,-895490422],       [820115077,null]
       4,                                     null,                   null
       5,             [null,-434891054,-864560986],                   null
       6,                                     null,  [986507083,\\"U6xvhr7\\"]
       7,                                     null,                   null
       8,                                     null,            [null,null]
       9,                                     null,                   null
      10,                             [-498865952],                   null
      11,                                     null,       [null,\\"ctyWPJf\\"]
      12,                                     null,            [null,null]
      13, [-1076160763,-792439045,-656549144,null],                   null
      14,                                     null,      [1234093448,null]
      15,                   [null,null,1882910932],                   null
      16,                                     null,  [934007407,\\"9QUyEm5\\"]"
   `;
   ```
   </p></details>
   <h6></h6>
   
   It also gives reviewers a chance to see what the tests produce, so if `get` 
on a `Uint64Vector` starts returning a `Long` object instead of a `Uint32Array`, 
we can flag that in a code review. That said, it sounds like the JSON reader 
should be able to do most of this validation.
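
A minimal sketch of how such a type change would surface as a snapshot mismatch. The `Long` class here is a hypothetical stand-in (not a real library class), and `describeValue` and `checkSnapshot` are illustrative names, not Jest's API.

```typescript
// Hypothetical stand-in for a Long-style wrapper; not a real library class.
class Long {
    constructor(public low: number, public high: number) {}
}

// Describe a value by constructor name plus contents, as a snapshot would.
function describeValue(value: Uint32Array | Long): string {
    const parts = value instanceof Long ? [value.low, value.high] : Array.from(value);
    return `${value.constructor.name} [${parts.join(', ')}]`;
}

// Compare fresh output to a stored snapshot; return a message on mismatch.
function checkSnapshot(stored: string, actual: Uint32Array | Long): string | null {
    const current = describeValue(actual);
    return current === stored ? null : `expected ${stored}, received ${current}`;
}

const stored = describeValue(new Uint32Array([12840890, 0]));
console.log(checkSnapshot(stored, new Uint32Array([12840890, 0]))); // null
console.log(checkSnapshot(stored, new Long(12840890, 0)));
// expected Uint32Array [12840890, 0], received Long [12840890, 0]
```

The numeric contents are identical in both calls; only the recorded constructor name differs, which is exactly the kind of change that would show up in a snapshot diff during review.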

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


> [JS] Error reading dictionary-encoded integration test files
> ------------------------------------------------------------
>
>                 Key: ARROW-1693
>                 URL: https://issues.apache.org/jira/browse/ARROW-1693
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: JavaScript
>            Reporter: Brian Hulette
>            Assignee: Brian Hulette
>              Labels: pull-request-available
>             Fix For: 0.8.0
>
>         Attachments: dictionary-cpp.arrow, dictionary-java.arrow, 
> dictionary.json
>
>
> The JS implementation crashes when reading the dictionary test case from the 
> integration tests.
> To replicate, first generate the test files with java and cpp impls:
> {code}
> $ cd ${ARROW_HOME}/integration/
> $ python -c 'from integration_test import generate_dictionary_case; generate_dictionary_case().write("dictionary.json")'
> $ ../cpp/debug/debug/json-integration-test --integration \
>     --json=dictionary.json --arrow=dictionary-cpp.arrow --mode=JSON_TO_ARROW
> $ java -cp \
>     ../java/tools/target/arrow-tools-0.8.0-SNAPSHOT-jar-with-dependencies.jar \
>     org.apache.arrow.tools.Integration -c JSON_TO_ARROW -a dictionary-java.arrow \
>     -j dictionary.json
> {code}
> Attempt to read the files with the JS impl:
> {code}
> $ cd ${ARROW_HOME}/js/
> $ ./bin/arrow2csv.js -s dict1_0 -f ../integration/dictionary-{java,cpp}.arrow
> {code}
> Both files result in an error for me on 
> [a8f51858|https://github.com/apache/arrow/commit/a8f518588fda471b2e3cc8e0f0064e7c4bb99899]:
> {{TypeError: Cannot read property 'buffer' of undefined}}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
