[ https://issues.apache.org/jira/browse/ARROW-8464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Lamb updated ARROW-8464:
-------------------------------
Description:
Use cases: Efficiently process large columns of low-cardinality strings
* BatchIterator should accept both DictionaryBatch and RecordBatch
* Type Coercion optimizer rule should inject an expression for converting
dictionary value types to index types (for equality expressions and
IN(values, ...))
* Physical expression would look up the index for dictionary values referenced
in the query so that at runtime, only indices are compared per batch
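
A minimal sketch of the index-comparison idea in the last bullet, using only the
Rust standard library. DictColumn, encode, index_of, eq_literal and in_list are
hypothetical names chosen for illustration; they are not Arrow or DataFusion
APIs, and a real implementation would presumably operate on Arrow dictionary
arrays instead. The literal from the query is resolved to a dictionary index
once per batch, and the per-row work is then an integer comparison.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a dictionary-encoded string column:
/// `values` holds each distinct string once, `keys` holds one index per row.
pub struct DictColumn {
    pub values: Vec<String>,
    pub keys: Vec<u32>,
}

impl DictColumn {
    /// Build a dictionary-encoded column from plain strings.
    pub fn encode(rows: &[&str]) -> Self {
        let mut values: Vec<String> = Vec::new();
        let mut lookup: HashMap<String, u32> = HashMap::new();
        let mut keys: Vec<u32> = Vec::with_capacity(rows.len());
        for row in rows {
            let key = match lookup.get(*row) {
                Some(k) => *k,
                None => {
                    // First time this string is seen: append it to the dictionary.
                    let k = values.len() as u32;
                    values.push(row.to_string());
                    lookup.insert(row.to_string(), k);
                    k
                }
            };
            keys.push(key);
        }
        DictColumn { values, keys }
    }

    /// Resolve a query literal to its dictionary index (done once per batch).
    pub fn index_of(&self, literal: &str) -> Option<u32> {
        self.values
            .iter()
            .position(|v| v.as_str() == literal)
            .map(|i| i as u32)
    }

    /// Evaluate `column = literal` by comparing indices only.
    pub fn eq_literal(&self, literal: &str) -> Vec<bool> {
        match self.index_of(literal) {
            Some(target) => self.keys.iter().map(|k| *k == target).collect(),
            // Literal is absent from the dictionary, so no row can match.
            None => vec![false; self.keys.len()],
        }
    }

    /// Evaluate `column IN (literals...)` by comparing indices only.
    pub fn in_list(&self, literals: &[&str]) -> Vec<bool> {
        let targets: Vec<u32> = literals
            .iter()
            .filter_map(|l| self.index_of(l))
            .collect();
        self.keys.iter().map(|k| targets.contains(k)).collect()
    }
}
```

With this sketch, DictColumn::encode(&["a", "b", "a", "c"]).eq_literal("a")
performs one string lookup against the three dictionary values and then four
integer comparisons, rather than four string comparisons.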
was:
* BatchIterator should accept both DictionaryBatch and RecordBatch
* Type Coercion optimizer rule should inject expression for converting
dictionary value types to index types (for equality expressions, and IN(values,
...)
* Physical expression would lookup index for dictionary values referenced in
the query so that at runtime, only indices are being compared per batch
> [Rust] [DataFusion] Add support for dictionary types
> ----------------------------------------------------
>
> Key: ARROW-8464
> URL: https://issues.apache.org/jira/browse/ARROW-8464
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Rust - DataFusion
> Reporter: Andy Grove
> Priority: Major
>
> Use cases: Efficiently process large columns of low-cardinality strings
>
> * BatchIterator should accept both DictionaryBatch and RecordBatch
> * Type Coercion optimizer rule should inject an expression for converting
> dictionary value types to index types (for equality expressions and
> IN(values, ...))
> * Physical expression would look up the index for dictionary values referenced
> in the query so that at runtime, only indices are compared per batch