[ 
https://issues.apache.org/jira/browse/ARROW-10221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17476930#comment-17476930
 ] 

Dominik Moritz commented on ARROW-10221:
----------------------------------------

I think it's nice to have the fast {{toArray}}. We do have {{toJSON}}, which 
always takes the slow path. However, I agree that accidentally missing nulls 
could be confusing. And always using {{toJSON}} isn't good either, since it 
would take the slow path even when there are no nulls. 

Btw, I found that {{NaN}} is a valid value in typed arrays even though null is 
not. 
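
A small illustration of that point, using only standard typed arrays (no Arrow 
API involved; the variable names are just for this example). Note that NaN only 
survives in floating-point typed arrays, while integer arrays coerce it to 0, 
the same as null:

```javascript
// Float typed arrays can store NaN as an ordinary value.
const floatValues = Float64Array.of(1.5, NaN);

// null is coerced to a number (0) when written into any typed array.
const floatFromNull = Float64Array.from([null]);

// Integer typed arrays have no NaN representation, so NaN becomes 0 as well.
const intValues = Int32Array.of(1, NaN);

console.log(Number.isNaN(floatValues[1])); // true
console.log(floatFromNull[0]);             // 0
console.log(intValues[1]);                 // 0
```

So NaN could only serve as a null marker for float vectors; integer vectors 
would need some other sentinel.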

I think the two best solutions are:
* Leave things as they are and ask people to use {{toJSON}} if they want 
nulls to be guaranteed. We could add a way for users to get the null mask 
(there already is {{isValid(index)}}, but you need to call it for each value). 
Or we could have a way for people to define null values (e.g. as -1 or NaN).
* Take the slow path when the data type for the vector is nullable. 
Unfortunately, a user would then have no easy way to take the fast path. 
Maybe we could have an option to force the fast path?
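
A rough sketch of how the fast/slow split could look, written against plain 
typed arrays rather than the real Arrow JS classes. {{toArrayNullAware}} and 
{{isValidAt}} are hypothetical names, not existing Arrow functions. The only 
Arrow-specific assumption is the validity bitmap layout from the columnar 
format: bit i is set when value i is valid, least-significant bit first.

```javascript
// Hypothetical helpers for illustration; not part of the Arrow JS API.

// Read bit i of an Arrow-style validity bitmap (1 = valid, LSB first).
function isValidAt(validity, i) {
  return (validity[i >> 3] & (1 << (i & 7))) !== 0;
}

// Return the typed array untouched when there are no nulls (fast path),
// otherwise copy into a plain array so nulls can be represented (slow path).
function toArrayNullAware(values, validity, nullCount) {
  if (validity === null || nullCount === 0) {
    return values;
  }
  const out = new Array(values.length);
  for (let i = 0; i < values.length; i++) {
    out[i] = isValidAt(validity, i) ? values[i] : null;
  }
  return out;
}

// Same data as in the issue description: index 5 is null.
const rawValues = Int32Array.of(1, 2, 3, 4, 5, 0, 6);
const validityBitmap = Uint8Array.of(0b01011111); // bit 5 is clear

const withNulls = toArrayNullAware(rawValues, validityBitmap, 1);
const withoutNulls = toArrayNullAware(rawValues, null, 0);

console.log(withNulls[5]);                       // null
console.log(withoutNulls instanceof Int32Array); // true
```

The catch is that the return type then depends on the data (typed array vs. 
plain array), which is part of why neither option feels clean.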

I'm not happy with either but 

> [JS] toArray() method ignores nulls on some types.
> --------------------------------------------------
>
>                 Key: ARROW-10221
>                 URL: https://issues.apache.org/jira/browse/ARROW-10221
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: JavaScript
>    Affects Versions: 0.17.1
>            Reporter: Ben Schmidt
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The .toArray() javascript method of vectors includes a shortcut to return the 
> underlying typed array; but this doesn't respect null values, and so can 
> return the wrong number.
>  
> ```
> v = arrow.Vector.from({values: [1, 2, 3, 4, 5, null, 6], type: new arrow.Int32()})
> v.toArray()[5] // Incorrectly returns '0'
> v.get(5) // Correctly returns null
> ```
>  
> Solution: Eliminate the fast method, always return Javascript arrays. It 
> might be better to keep the old method in cases where there are guaranteed no 
> nulls.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
