[
https://issues.apache.org/jira/browse/ARROW-12301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17342843#comment-17342843
]
Neal Richardson commented on ARROW-12301:
-----------------------------------------
Anything relating to hashing should be coordinated with the ongoing query
engine work (ARROW-12633); cc [~michalno]
> [C++][Compute] Use generic hash-aggregate for DictionaryArrays
> --------------------------------------------------------------
>
> Key: ARROW-12301
> URL: https://issues.apache.org/jira/browse/ARROW-12301
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++
> Reporter: Rok Mihevc
> Priority: Major
>
> When computing unique on chunked DictionaryArrays, we currently iterate over
> all chunks, unify their dictionaries, and then collect the chunk indices. We
> could avoid the dictionary unification by using a generic hash.
> [See discussion here|https://github.com/apache/arrow/pull/9683] and
> [here|https://issues.apache.org/jira/browse/ARROW-10403]
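
To make the trade-off in the description concrete, here is a minimal sketch in plain Python. It is a toy model, not Arrow code: each chunk is assumed to be a (dictionary, indices) pair with None standing in for a null, and both function names are hypothetical. The second function is only one possible reading of "using a generic hash" (hashing the decoded values directly); the real kernel may work differently.

```python
# Toy model, NOT the Arrow API: a chunk is (dictionary, indices),
# where indices point into that chunk's own dictionary and None is a null.

def unique_via_unification(chunks):
    """Unify all chunk dictionaries first, then collect the remapped indices."""
    unified = []      # unified dictionary, values in first-seen order
    position = {}     # value -> index in the unified dictionary
    transposes = []   # per chunk: local index -> unified index

    # Pass 1: dictionary unification across every chunk.
    for dictionary, _ in chunks:
        transpose = []
        for value in dictionary:
            if value not in position:
                position[value] = len(unified)
                unified.append(value)
            transpose.append(position[value])
        transposes.append(transpose)

    # Pass 2: collect the distinct unified indices that are actually used.
    seen = set()
    order = []
    for (_, indices), transpose in zip(chunks, transposes):
        for i in indices:
            if i is None:
                continue  # nulls are skipped in this sketch
            u = transpose[i]
            if u not in seen:
                seen.add(u)
                order.append(u)
    return [unified[u] for u in order]


def unique_via_generic_hash(chunks):
    """Hash the decoded values directly; no unified dictionary is built."""
    seen = set()
    result = []
    for dictionary, indices in chunks:
        for i in indices:
            if i is None:
                continue  # nulls are skipped in this sketch
            value = dictionary[i]
            if value not in seen:
                seen.add(value)
                result.append(value)
    return result


chunks = [
    (["a", "b", "c"], [0, 2, 2]),
    (["c", "d", "a"], [0, 1, None, 2]),
]
print(unique_via_unification(chunks))
print(unique_via_generic_hash(chunks))
```

In this model both functions return the same distinct values in the same order. The hash-based variant touches only values that are actually referenced, while the unification variant also processes dictionary entries that no index uses.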