[ 
https://issues.apache.org/jira/browse/ARROW-13532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michal Nowakiewicz updated ARROW-13532:
---------------------------------------
    Labels: pull-request-available query-engine  (was: pull-request-available)

> [C++][Compute] Join: add set membership test method to the grouper
> ------------------------------------------------------------------
>
>                 Key: ARROW-13532
>                 URL: https://issues.apache.org/jira/browse/ARROW-13532
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Michal Nowakiewicz
>            Assignee: Michal Nowakiewicz
>            Priority: Major
>              Labels: pull-request-available, query-engine
>             Fix For: 6.0.0
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> The hash table used in group by provides one main method: map.
> If the key has already been inserted into the hash table, this method finds 
> the matching key and outputs the corresponding group id. Otherwise it inserts 
> the new key and assigns a new group id to it.
> This interface is tailored for group by. To reuse the same hash table 
> implementation in join, there must be a way to skip insertion of new keys 
> when looking up existing keys. When the join processes the probe side, it 
> needs to filter input rows based on whether a match is found in the hash 
> table, while keeping the hash table immutable and not automatically adding 
> missing keys to it.
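
The difference between the two lookups described above can be sketched with a toy grouper. This is only an illustration built on std::unordered_map: the names ToyGrouper, Map, Find and ProbeMatches are hypothetical and are not Arrow's actual grouper API, and a real implementation would operate on batches of columnar keys rather than single strings.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>

// Toy stand-in for a grouper (hypothetical, not Arrow's class).
class ToyGrouper {
 public:
  // Group-by path: return the group id of `key`, inserting the key with a
  // freshly assigned id if it is not in the table yet.
  std::uint32_t Map(const std::string& key) {
    const auto next_id = static_cast<std::uint32_t>(ids_.size());
    return ids_.try_emplace(key, next_id).first->second;
  }

  // Join probe path: pure membership test. The method is const, so a miss
  // can never add the key to the table.
  std::optional<std::uint32_t> Find(const std::string& key) const {
    auto it = ids_.find(key);
    if (it == ids_.end()) return std::nullopt;
    return it->second;
  }

  std::size_t num_groups() const { return ids_.size(); }

 private:
  std::unordered_map<std::string, std::uint32_t> ids_;
};

// Filter probe-side rows: return the indices of the rows whose key is
// present in the build-side table. The table is taken by const reference
// and therefore stays unchanged.
std::vector<std::size_t> ProbeMatches(const ToyGrouper& build,
                                      const std::vector<std::string>& probe_keys) {
  std::vector<std::size_t> matches;
  for (std::size_t row = 0; row < probe_keys.size(); ++row) {
    if (build.Find(probe_keys[row]).has_value()) matches.push_back(row);
  }
  return matches;
}
```

In this sketch the build side of the join would populate the table through Map, and the probe side would call only Find (or ProbeMatches), so probing with unseen keys leaves the number of groups unchanged.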



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
