[ 
https://issues.apache.org/jira/browse/ARROW-17257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17575456#comment-17575456
 ] 

Aldrin Montana commented on ARROW-17257:
----------------------------------------

Discussed initial directions of this with [~sakras] and [~michalno].

In general, _KeyColumnArray_ tries to minimize its memory footprint as much as 
possible, whereas _ArraySpan_ tries to be a non-owning version of _ArrayData_.

Work in ARROW-8991 is going to take the following direction: _KeyColumnArray_ 
is essentially a flattened _ArraySpan_, so when using the _Hashing32_ and 
_Hashing64_ functions, a _KeyColumnArray_ will be constructed from an 
_ArraySpan_.
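
To make "flattened" concrete, here is a rough sketch of the idea for a fixed-width column. It does not use the real Arrow classes: _SpanSketch_, _FlatColumnSketch_ and _Flatten_ are hypothetical stand-ins, and their field layout is an assumption for illustration only. The point is just that the flat view keeps the few fields a hashing loop needs and folds the element offset into the pointers up front.

```cpp
#include <cstdint>

// Hypothetical stand-in for a general non-owning array view.
// This is NOT the real ArraySpan layout.
struct SpanSketch {
  int64_t length = 0;                // number of elements in the view
  int64_t offset = 0;                // element offset into the buffers
  int64_t null_count = -1;           // metadata a general view carries
  uint32_t fixed_width_bytes = 0;    // element width in bytes
  const uint8_t* validity = nullptr; // validity bitmap, may be null
  const uint8_t* data = nullptr;     // fixed-width values
};

// Hypothetical stand-in for a flattened, minimal-footprint column view.
// This is NOT the real KeyColumnArray layout.
struct FlatColumnSketch {
  int64_t length = 0;
  uint32_t fixed_width_bytes = 0;
  uint32_t validity_bit_offset = 0;  // remaining bit offset, 0..7
  const uint8_t* validity = nullptr; // advanced to the first relevant byte
  const uint8_t* data = nullptr;     // advanced to the first element
};

// Build the flat view from the general view. No buffers are copied;
// only pointers are adjusted so that later loops can start at index 0.
inline FlatColumnSketch Flatten(const SpanSketch& span) {
  FlatColumnSketch flat;
  flat.length = span.length;
  flat.fixed_width_bytes = span.fixed_width_bytes;
  // Bitmaps are addressed per bit: skip whole bytes, keep the remainder.
  flat.validity_bit_offset = static_cast<uint32_t>(span.offset % 8);
  flat.validity = span.validity ? span.validity + span.offset / 8 : nullptr;
  // Values are addressed per element: skip offset * width bytes.
  flat.data = span.data ? span.data + span.offset * span.fixed_width_bytes
                        : nullptr;
  return flat;
}
```

Under these assumptions the conversion is a handful of pointer adjustments per column, so building the flat view right before hashing should be cheap compared with the hashing itself.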

Otherwise, the two classes are very similar, and it's okay if _ArraySpan_ 
eventually replaces or subsumes _KeyColumnArray_, as long as no performance 
regressions are introduced.

> [C++] Unify KeyColumnArray and ArraySpan
> ----------------------------------------
>
>                 Key: ARROW-17257
>                 URL: https://issues.apache.org/jira/browse/ARROW-17257
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Weston Pace
>            Priority: Major
>
> Both of these are essentially non-owning views into ArrayData.  They were 
> developed somewhat independently but share a pretty similar structure.  I 
> don't think we need both and we should unify on a common type for simplicity 
> provided we can show no real performance difference.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
