[jira] [Commented] (ARROW-1952) [JS] 32b dense vector coercion

Brian Hulette (JIRA) Tue, 02 Jan 2018 07:36:31 -0800

    [ 
https://issues.apache.org/jira/browse/ARROW-1952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16308244#comment-16308244
 ]


Brian Hulette commented on ARROW-1952:
--------------------------------------

[~lmeyerov] agreed that this is something we should address somehow; we 
definitely want the API to be easy to pick up.

[~paul.e.taylor] and I had a relevant discussion regarding accessors for 
Vectors like Int64. We decided that {{Vector.get(idx)}} should always return 
the most basic data types possible without asserting any specific wrapper class 
or conversion (like the \[hi, lo\] array for Int64). If we want to add any 
other specialized accessors, they can follow a {{get<Special>(idx)}} convention 
(sort of like 
[{{DictionaryVector.getKey(idx)}}|https://github.com/apache/arrow/blob/master/js/src/vector/dictionary.ts#L34]).
That way users can always extend the Vector classes to add their own 
specialized accessors, and we could also add our own specialized accessors for 
convenience, like {{Int64Vector.getBigInt(idx)}} which uses some BigInteger 
class and/or {{Int64Vector.getLow(idx)}} which returns just the lo bits.
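
To make the convention concrete, here is a rough sketch. The class names, the {{getNumber}} accessor, and the hi-word-first storage are all assumptions made up for illustration; this is not the actual Arrow JS implementation.

```typescript
// Hypothetical stand-in for an Int64 vector; NOT the real Arrow JS class.
// Assumption: each value is stored as two consecutive 32-bit words, hi word first.
class Int64VectorSketch {
  readonly length: number;

  constructor(protected readonly words: Int32Array) {
    this.length = words.length / 2;
  }

  // Basic accessor: a plain [hi, lo] pair, no wrapper class.
  get(idx: number): [number, number] {
    return [this.words[2 * idx], this.words[2 * idx + 1]];
  }

  // Specialized accessor following the get<Special>(idx) convention.
  getLow(idx: number): number {
    return this.words[2 * idx + 1];
  }
}

// A user-side extension that adds its own specialized accessor (hypothetical name).
class MyInt64Vector extends Int64VectorSketch {
  // Lossy: exact only while the value fits in a double's 53-bit mantissa.
  getNumber(idx: number): number {
    const [hi, lo] = this.get(idx);
    return hi * 4294967296 + (lo >>> 0);
  }
}

const sketchVec = new MyInt64Vector(new Int32Array([0, 7, 1, 2]));
sketchVec.get(1);       // [1, 2]
sketchVec.getLow(0);    // 7
sketchVec.getNumber(1); // 4294967298
```

The point is only that {{get(idx)}} stays minimal, while anything opinionated lives in a separately named accessor that a subclass can add.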

All of that being said, I like your idea of elevating those specialized 
accessors to the vector level (#2). What would the API look like? Perhaps a 
method like {{Int64Vector.asInt32()}}, or {{Int32Vector.from()}} that returns a 
Vector with a {{get(idx)}} accessor that returns low bits?
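
As a strawman for that question, a vector-level coercion might look roughly like the following. Everything here ({{Int64Sketch}}, {{LowBitsView}}, {{toArray}}, and the hi-word-first storage) is hypothetical and only meant to show the shape of the API, not an existing one.

```typescript
// Hypothetical classes for illustration only; NOT the Arrow JS API.
// Assumption: each 64-bit value is stored as two 32-bit words, hi word first.
class Int64Sketch {
  readonly length: number;

  constructor(private readonly words: Int32Array) {
    this.length = words.length / 2;
  }

  // Basic accessor: plain [hi, lo] pair.
  get(idx: number): [number, number] {
    return [this.words[2 * idx], this.words[2 * idx + 1]];
  }

  // Lossy vector-level coercion: a view that keeps only the low word of each value.
  asInt32(): LowBitsView {
    return new LowBitsView(this);
  }
}

class LowBitsView {
  readonly length: number;

  constructor(private readonly source: Int64Sketch) {
    this.length = source.length;
  }

  // get(idx) now returns a single 32-bit number: the low bits of value idx.
  get(idx: number): number {
    return this.source.get(idx)[1];
  }

  // Flatten to a typed array, the kind of step a 32b consumer would reuse.
  toArray(): Int32Array {
    const out = new Int32Array(this.length);
    for (let i = 0; i < this.length; i++) {
      out[i] = this.get(i);
    }
    return out;
  }
}

const lowView = new Int64Sketch(new Int32Array([0, 10, 5, 20])).asInt32();
lowView.get(1); // 20
```

The view does no copying until {{toArray()}} is called, which is one way to stay in the Vector abstraction for as long as possible.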


(After re-reading your initial description I _think_ you may be suggesting a 
Vector that accesses the hi/lo bits consecutively, rather than just the lo bits 
for each value as I initially thought (i.e. {{.get(0)}} returns 0's hi bits, 
{{.get(1)}} returns 0's lo bits, etc...). Either way, I like the model of doing 
this type of conversion at the Vector level.)
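
For that second reading, a sketch of the interleaved view could look like this. The class name is made up, and it takes plain [hi, lo] pairs as input rather than a real Arrow Vector, purely to keep the example self-contained.

```typescript
// Hypothetical view for illustration only; NOT part of Arrow JS.
// Input: the [hi, lo] pairs that a basic get(idx) accessor would return.
class InterleavedInt32View {
  readonly length: number;

  constructor(private readonly pairs: Array<[number, number]>) {
    // Twice as many 32-bit slots as there are 64-bit values.
    this.length = pairs.length * 2;
  }

  // Even indices yield the hi word, odd indices the lo word of value idx >> 1.
  get(idx: number): number {
    const pair = this.pairs[idx >> 1];
    return pair[idx & 1];
  }
}

const interleaved = new InterleavedInt32View([[1, 2], [3, 4]]);
interleaved.get(0); // 1 (value 0's hi bits)
interleaved.get(1); // 2 (value 0's lo bits)
interleaved.get(2); // 3 (value 1's hi bits)
```

Unlike a lo-bits-only view, this one is information-preserving: the view is twice as long and nothing is dropped.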

> [JS] 32b dense vector coercion
> ------------------------------
>
>                 Key: ARROW-1952
>                 URL: https://issues.apache.org/jira/browse/ARROW-1952
>             Project: Apache Arrow
>          Issue Type: New Feature
>            Reporter: Leo Meyerovich
>            Priority: Minor
>
> JS APIs, for better or worse, are quite 32b centric. Currently, JS Arrow does 
> a good job of information-preserving flattening, e.g., 64i vector into an 
> array of [hi, lo] int32s.  Something similar for timestamps. ... However .... 
> in getting some Arrow code to load into a legacy system, I'm finding myself 
> writing a _lot_ of lossy flatteners in userland.  Doing it there seems 
> brittle and error-prone and adds friction for adoption, whereas putting it in 
> the core lib would enable reuse across libs.
> I can imagine at least 2 reasonable interfaces for this:
> (1) 64b Vector -> 32b flat array (typed or otherwise). This is the naive, 
> simple thing.
> (2) 64b Vector -> 32b Vector, and reuse whatever 32b vector -> flat array 
> logic will be available anyway. This helps stay in the symbolic abstraction 
> longer, so may be smarter.
> Thoughts?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
