Yngve Kristiansen created ARROW-5274:
----------------------------------------

             Summary: [Javascript] Wrong array type for countBy
                 Key: ARROW-5274
                 URL: https://issues.apache.org/jira/browse/ARROW-5274
             Project: Apache Arrow
          Issue Type: Bug
            Reporter: Yngve Kristiansen


The {{countBy}} function does not return correct histograms, as it seems to 
select the wrong typed array for storing the counts.

The following line in countBy seems to be causing the problems:

{{const countByteLength = Math.ceil(Math.log(vector.dictionary.length) / 
Math.log(256));}}

For example, if the dictionary length is 3 but the indices length is 1 
million, this expression evaluates to 1, so a Uint8Array is used for the 
counts. Any count above 255 then overflows and wraps around.
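
The overflow can be reproduced without Arrow. The following is a minimal sketch, not the code from dataframe.ts: the names dictionaryLength, indices and counts are stand-ins chosen for illustration, and the mapping from a byte length of 1 to a Uint8Array is assumed from the description above.

```typescript
// Hypothetical stand-ins for the dictionary size and the index data.
const dictionaryLength = 3;
// One million rows, all zero, so every row refers to dictionary entry 0.
const indices = new Uint8Array(1000000);

// The expression from countBy, applied to the dictionary length.
const countByteLength = Math.ceil(Math.log(dictionaryLength) / Math.log(256));

// Assumed mapping from byte length to typed array: 1 byte means Uint8Array.
const counts = countByteLength === 1
    ? new Uint8Array(dictionaryLength)
    : new Uint32Array(dictionaryLength);

// Tally each index. Uint8Array elements wrap modulo 256 on overflow.
for (let i = 0; i < indices.length; i++) {
    counts[indices[i]]++;
}

// The true count is 1000000, but the stored value is 1000000 % 256 = 64.
console.log(countByteLength, counts[0]);
```

Because typed arrays wrap silently instead of throwing, the histogram looks plausible but is wrong.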

Codepen example
[https://codepen.io/Yngve92/pen/mYdWrr]

If I switch the expression to: {{const countByteLength = 
Math.ceil(Math.log(vector.length) / Math.log(256));}} it seems to be working 
all right, but I am not sure if this is correct.
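
One way to express the same idea is to size the counts array from the largest value a single bucket can reach, which is the number of rows. The sketch below is only an illustration under that assumption; makeCountsArray and countIndices are hypothetical helpers, not part of Arrow. It uses explicit thresholds rather than the logarithm, because Math.log(256) / Math.log(256) is exactly 1, so a length of exactly 256 would still select a Uint8Array, which cannot hold the value 256.

```typescript
type CountsArray = Uint8Array | Uint16Array | Uint32Array;

// Hypothetical helper: picks the narrowest unsigned typed array that can
// hold maxCount, the largest value any single bucket may reach.
function makeCountsArray(bucketCount: number, maxCount: number): CountsArray {
    if (maxCount <= 0xff) { return new Uint8Array(bucketCount); }
    if (maxCount <= 0xffff) { return new Uint16Array(bucketCount); }
    return new Uint32Array(bucketCount);
}

// Hypothetical helper: tallies dictionary indices into per-entry counts.
// The array width depends on the number of rows, not the dictionary size.
function countIndices(indices: ArrayLike<number>, dictionaryLength: number): CountsArray {
    const counts = makeCountsArray(dictionaryLength, indices.length);
    for (let i = 0; i < indices.length; i++) {
        counts[indices[i]]++;
    }
    return counts;
}

// One million rows that all refer to dictionary entry 0.
const fixedCounts = countIndices(new Uint8Array(1000000), 3);
console.log(fixedCounts[0]);
```

Sizing by the row count is conservative: it can allocate a wider array than strictly needed, but it cannot overflow for inputs below 2^32 rows.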

The expression appears on lines 63 and 189 of src/compute/dataframe.ts.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
