[
https://issues.apache.org/jira/browse/ARROW-17257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17575456#comment-17575456
]
Aldrin Montana commented on ARROW-17257:
----------------------------------------
Discussed initial directions of this with [~sakras] and [~michalno].
In general, _KeyColumnArray_ tries to minimize its memory footprint as much as
possible, whereas _ArraySpan_ tries to be a non-owning version of _ArrayData_.
Work in ARROW-8991 is going to take the following direction: _KeyColumnArray_
is essentially a flattened _ArraySpan_, so when using the _Hashing32_ and
_Hashing64_ functions, a _KeyColumnArray_ will be constructed from an
_ArraySpan_.
Otherwise, the two classes are very similar and it's okay if _ArraySpan_
eventually replaces/subsumes _KeyColumnArray_ as long as performance
regressions are not introduced.
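To make the "flattened _ArraySpan_" idea concrete, here is a rough sketch of what such a conversion could look like. The types and the Flatten function below are hypothetical stand-ins, not Arrow's actual _ArraySpan_ or _KeyColumnArray_ definitions. The sketch assumes fixed-width values only, and assumes that "flattening" means dropping child data and folding the logical offset into the buffer pointers.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a non-owning, ArraySpan-like view.
// NOT Arrow's real definition; fixed-width values only.
struct SpanSketch {
  int64_t length = 0;
  int64_t offset = 0;                 // logical offset, in elements
  int byte_width = 0;                 // width of one value, in bytes
  const uint8_t* validity = nullptr;  // validity bitmap, may be null
  const uint8_t* values = nullptr;    // fixed-width value buffer
  std::vector<SpanSketch> children;   // nested data, unused by the flat view
};

// Hypothetical stand-in for a flattened, KeyColumnArray-like view:
// no children, and the element offset is folded into the pointers.
struct FlatColumnSketch {
  int64_t length = 0;
  int byte_width = 0;
  const uint8_t* validity = nullptr;  // advanced to the byte holding the first bit
  int validity_bit_offset = 0;        // remaining bit offset within that byte
  const uint8_t* values = nullptr;    // advanced past the offset
};

// Build the flat view from the span without copying any buffer data.
FlatColumnSketch Flatten(const SpanSketch& span) {
  FlatColumnSketch col;
  col.length = span.length;
  col.byte_width = span.byte_width;
  if (span.validity != nullptr) {
    col.validity = span.validity + span.offset / 8;
    col.validity_bit_offset = static_cast<int>(span.offset % 8);
  }
  if (span.values != nullptr) {
    col.values = span.values + span.offset * span.byte_width;
  }
  return col;
}
```

Under these assumptions the flat view carries only what a hashing kernel would need to walk one column, which is one way to read the "minimize its memory footprint" goal. Whether the real conversion handles offsets and nested types this way is not stated in this thread.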
> [C++] Unify KeyColumnArray and ArraySpan
> ----------------------------------------
>
> Key: ARROW-17257
> URL: https://issues.apache.org/jira/browse/ARROW-17257
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++
> Reporter: Weston Pace
> Priority: Major
>
> Both of these are essentially non-owning views into ArrayData. They were
> developed somewhat independently but share a pretty similar structure. I
> don't think we need both and we should unify on a common type for simplicity
> provided we can show no real performance difference.