[ https://issues.apache.org/jira/browse/ARROW-7047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Neal Richardson resolved ARROW-7047.
------------------------------------
Fix Version/s: 1.0.0
Resolution: Fixed
Issue resolved by pull request 5813
[https://github.com/apache/arrow/pull/5813]
> [C++][Dataset] Filter expressions should not require exact type match
> ---------------------------------------------------------------------
>
> Key: ARROW-7047
> URL: https://issues.apache.org/jira/browse/ARROW-7047
> Project: Apache Arrow
> Issue Type: New Feature
> Components: C++ - Dataset
> Reporter: Neal Richardson
> Assignee: Ben Kietzman
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.0.0
>
> Time Spent: 7h 10m
> Remaining Estimate: 0h
>
> It's not trivial for users to ensure that scalars have exactly the same
> type as the fields they are compared against in Expressions. For one,
> FieldExpressions don't contain a type reference, so at the time when I
> construct {{field_ref("col1") > scalar(42)}}, I don't know what type col1 is
> and so cannot ensure that scalar(42) matches. Even if the type were
> available, I wouldn't be able to determine what type to make the scalar if
> the expression were
> {{(field_ref("col1") + field_ref("col2")) > scalar(42)}}.
> We should allow CompareExpressions to cast the inputs as necessary. Casting
> should work among integer types, among floating point types, and across
> integers and floats. Likewise among date/timestamp types, and probably when
> comparing a string scalar against a date/timestamp column, the string should
> be parsed as a datetime. We also need to think about DictionaryTypes (though
> in practice this is moot until we have comparison kernels that work on
> strings).
> [~fsaintjacques][~bkietz]
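
To make the proposed casting rules concrete, here is a minimal sketch of how a common comparison type might be chosen once the field's type is known. It is not the code from pull request 5813 and does not use the Arrow API: `Kind` and `CommonComparisonType` are hypothetical names, and the promotion policy (widen integers to int64, widen any integer/float mix to float64, parse a string scalar as a timestamp) is an assumption made for illustration.

```cpp
// Hypothetical, simplified type tags for illustration only. These are not
// Arrow's DataType classes, and the promotion rules below are an assumption.
enum class Kind { kInt32, kInt64, kFloat32, kFloat64, kString, kTimestamp, kNone };

constexpr bool IsInteger(Kind k) { return k == Kind::kInt32 || k == Kind::kInt64; }
constexpr bool IsFloating(Kind k) { return k == Kind::kFloat32 || k == Kind::kFloat64; }

// Returns the type both comparison inputs would be cast to, or Kind::kNone
// when this sketch defines no implicit cast for the pair.
constexpr Kind CommonComparisonType(Kind field, Kind scalar) {
  if (field == scalar) return field;  // exact match, nothing to cast

  const bool both_numeric = (IsInteger(field) || IsFloating(field)) &&
                            (IsInteger(scalar) || IsFloating(scalar));
  if (both_numeric) {
    // Any floating point operand widens the comparison to float64; otherwise
    // both are integers of different widths, so int64 can hold either.
    return (IsFloating(field) || IsFloating(scalar)) ? Kind::kFloat64
                                                     : Kind::kInt64;
  }

  // A string scalar compared against a timestamp column would be parsed as a
  // timestamp rather than rejected.
  if (field == Kind::kTimestamp && scalar == Kind::kString) return Kind::kTimestamp;

  return Kind::kNone;
}
```

Under these assumed rules, {{field_ref("col1") > scalar(42)}} with an int32 col1 and an int64 scalar would resolve to an int64 comparison once the schema is known, so the user never has to pick the scalar's type up front.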