[ 
https://issues.apache.org/jira/browse/ARROW-38?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney closed ARROW-38.
-----------------------------
    Resolution: Won't Fix

This issue is ill-defined. We will have to address hashing of nested types in 
the course of implementing kernels like Unique or hash joins.

> C++: Algorithms for using nested types in a hash table context
> --------------------------------------------------------------
>
>                 Key: ARROW-38
>                 URL: https://issues.apache.org/jira/browse/ARROW-38
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: C++
>            Reporter: Wes McKinney
>            Priority: Major
>
> Computing hash values (and performing equality comparisons) for top-level 
> slots in nested-type data (for example, computing DISTINCT on a 
> {{List<List<Int32>>}}, related: ARROW-32) can be fairly complex. 
> Additionally, value slots at any level of the type tree can be null. 
> We should explore various algorithms for their performance and memory use in 
> practical settings. For example, one can compute a contiguous "record" / byte 
> array resulting from a depth-first traversal of a single value slot for the 
> purposes of computing a hash value or comparing with another slot. If anyone 
> has other ideas from past experiences I would be keen to learn more.
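
For illustration only, here is a minimal sketch of the "contiguous record" idea
from the description above. It is written in plain Python rather than against
any Arrow API, and it assumes a slot is modeled as a nested Python list of
int32 values with None standing in for a null at any level. The tag bytes and
the helper names (encode_slot, slot_key, distinct) are hypothetical and chosen
just for this example. Each list is prefixed with its length, so slots that
flatten to the same values but differ in shape still get different records.

```python
import struct

# Hypothetical one-byte tags marking what each slot holds.
NULL_TAG = 0   # slot is null
INT_TAG = 1    # slot holds an int32 value
LIST_TAG = 2   # slot holds a list, followed by its child count


def encode_slot(value, out):
    """Append a depth-first byte record for one value slot to `out`."""
    if value is None:
        out.append(NULL_TAG)
    elif isinstance(value, list):
        out.append(LIST_TAG)
        # The length prefix keeps [[1, 2], [3]] distinct from [[1], [2, 3]].
        out.extend(struct.pack("<I", len(value)))
        for child in value:
            encode_slot(child, out)
    else:
        out.append(INT_TAG)
        out.extend(struct.pack("<i", value))


def slot_key(value):
    """Return the contiguous record for a slot as immutable bytes."""
    out = bytearray()
    encode_slot(value, out)
    return bytes(out)


def distinct(slots):
    """Return the first occurrence of each distinct slot, in input order."""
    seen = set()
    result = []
    for slot in slots:
        key = slot_key(slot)  # bytes hash and compare by content
        if key not in seen:
            seen.add(key)
            result.append(slot)
    return result
```

Materializing a record per slot trades extra memory and copying for simple
hashing and equality (a plain byte comparison), which is the kind of trade-off
the description suggests measuring in practical settings.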



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
