[
https://issues.apache.org/jira/browse/ARROW-1952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16308244#comment-16308244
]
Brian Hulette commented on ARROW-1952:
--------------------------------------
[~lmeyerov] agreed, this is something we should address; we
definitely want the API to be easy to pick up.
[~paul.e.taylor] and I had a relevant discussion regarding accessors for
Vectors like Int64. We decided that {{Vector.get(idx)}} should always return
the most basic data types possible without asserting any specific wrapper class
or conversion (like the \[hi, lo\] array for Int64). If we want to add any
other specialized accessors, they can follow a {{get<Special>(idx)}} convention
(sort of like
[{{DictionaryVector.getKey(idx)}}|https://github.com/apache/arrow/blob/master/js/src/vector/dictionary.ts#L34]).
That way users can always extend the Vector classes to add their own
specialized accessors, and we could also add our own specialized accessors for
convenience, like {{Int64Vector.getBigInt(idx)}} which uses some BigInteger
class and/or {{Int64Vector.getLow(idx)}} which returns just the lo bits.
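To make the {{get<Special>(idx)}} convention concrete, here is a minimal sketch. The classes {{SketchInt64Vector}} and {{SketchInt64WithNumber}} and the accessor {{getNumber}} are hypothetical names for illustration, not the actual Arrow JS classes, and the lo-word-first storage layout is an assumption:

```typescript
// Hypothetical stand-in for an Int64 vector; NOT the actual Arrow JS class.
// Assumption: each 64-bit value is stored as two 32-bit words, lo word first.
class SketchInt64Vector {
  protected readonly words: Int32Array;

  constructor(words: Int32Array) {
    this.words = words;
  }

  get length(): number {
    return this.words.length >> 1;
  }

  // Basic accessor: the raw [hi, lo] pair, with no wrapper class or conversion.
  get(idx: number): [number, number] {
    return [this.words[2 * idx + 1], this.words[2 * idx]];
  }

  // Specialized accessor: just the lo bits (lossy).
  getLow(idx: number): number {
    return this.words[2 * idx];
  }
}

// A user-defined extension that adds its own specialized accessor.
class SketchInt64WithNumber extends SketchInt64Vector {
  // Combines hi and lo into a JS number; exact only for values within +/-2^53.
  getNumber(idx: number): number {
    const [hi, lo] = this.get(idx);
    return hi * 4294967296 + (lo >>> 0);
  }
}
```

Under this shape, {{get(idx)}} stays conversion-free, and anything lossy or opinionated lives behind an explicitly named accessor.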
All of that being said, I like your idea of elevating those specialized
accessors to the vector level (#2). What would the API look like? Perhaps a
method like {{Int64Vector.asInt32()}}, or {{Int32Vector.from()}} that returns a
Vector with a {{get(idx)}} accessor that returns low bits?
(After re-reading your initial description I _think_ you may be suggesting a
Vector that accesses the hi/lo bits consecutively, rather than just the lo bits
for each value as I initially thought (i.e. {{.get(0)}} returns 0's hi bits,
{{.get(1)}} returns 0's lo bits, etc...). Either way, I like the model of doing
this type of conversion at the Vector level.)
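The two readings of a vector-level conversion could be sketched roughly as follows. {{SketchVector}}, {{asLowInt32}} and {{asInterleavedInt32}} are hypothetical names, not existing Arrow JS APIs, and the lo-word-first storage of each 64-bit value is an assumption:

```typescript
// Hypothetical minimal vector interface; NOT the Arrow JS API.
interface SketchVector<T> {
  readonly length: number;
  get(idx: number): T;
}

// Assumption for both views: `words` holds each 64-bit value as two
// 32-bit words, lo word first, then hi word.

// Reading A: one element per 64-bit value, exposing only its lo bits (lossy).
function asLowInt32(words: Int32Array): SketchVector<number> {
  return {
    length: words.length >> 1,
    get: (idx: number): number => words[2 * idx],
  };
}

// Reading B: two elements per 64-bit value, hi bits first and then lo bits,
// so get(0) is value 0's hi bits and get(1) is value 0's lo bits.
function asInterleavedInt32(words: Int32Array): SketchVector<number> {
  return {
    length: words.length,
    // idx ^ 1 swaps the two positions within each pair (0<->1, 2<->3, ...),
    // turning the lo-first storage order into hi-first access order.
    get: (idx: number): number => words[idx ^ 1],
  };
}
```

Both views wrap the same underlying buffer without copying, which is the appeal of doing the conversion at the Vector level.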
> [JS] 32b dense vector coercion
> ------------------------------
>
> Key: ARROW-1952
> URL: https://issues.apache.org/jira/browse/ARROW-1952
> Project: Apache Arrow
> Issue Type: New Feature
> Reporter: Leo Meyerovich
> Priority: Minor
>
> JS APIs, for better or worse, are quite 32b-centric. Currently, JS Arrow does
> a good job of information-preserving flattening, e.g., a 64i vector into an
> array of [hi, lo] int32s, and something similar for timestamps. However,
> in getting some Arrow code to load into a legacy system, I'm finding myself
> writing a _lot_ of lossy flatteners in userland. Doing it there seems
> brittle and error-prone, and it adds friction for adoption; putting it in the
> core lib would enable reuse across libs.
> I can imagine at least 2 reasonable interfaces for this:
> (1) 64b Vector -> 32b flat array (typed or otherwise). This is the naive,
> simple thing.
> (2) 64b Vector -> 32b Vector, and reuse whatever 32b vector -> flat array
> logic will be available anyway. This helps stay in the symbolic abstraction
> longer, so may be smarter.
> Thoughts?
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)