Neal Richardson created ARROW-7047:
--------------------------------------
Summary: [C++][Dataset] Filter expressions should not require exact type match
Key: ARROW-7047
URL: https://issues.apache.org/jira/browse/ARROW-7047
Project: Apache Arrow
Issue Type: New Feature
Components: C++ - Dataset
Reporter: Neal Richardson
It's not trivial for users to ensure that scalars are of identical type to the
fields they are compared against in Expressions. For one, FieldExpressions
don't contain a type reference, so when I construct
{{field_ref("col1") > scalar(42)}}, I don't know what type col1 is and so
can't ensure that scalar(42) matches. Even if the type were available, I
wouldn't be able to determine what type to make the scalar if the expression were
{{(field_ref("col1") + field_ref("col2")) > scalar(42)}}.
We should allow CompareExpressions to cast the inputs as necessary. Casting
should work among integer types, among floating point types, and across
integers and floats. Likewise among date/timestamp types; and when comparing a
string scalar against a date/timestamp column, the string should probably be
parsed as a datetime. We also need to think about DictionaryTypes (though in
practice this is moot until we have comparison kernels that work on strings).
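
To make the proposed casting rules concrete, here is a rough sketch of how a common comparison type could be chosen. Everything in it is hypothetical: {{TypeId}}, the helper predicates and {{CommonComparisonType}} are simplified stand-ins made up for illustration, not Arrow's actual type system or API, and the specific promotion targets (widest integer, widest float, timestamp) are assumptions rather than decisions made in this issue.

```cpp
#include <optional>

// Hypothetical, simplified type ids for illustration; not Arrow's DataType.
enum class TypeId { Int32, Int64, Float32, Float64, Date32, Timestamp, String };

constexpr bool IsInteger(TypeId t) {
  return t == TypeId::Int32 || t == TypeId::Int64;
}
constexpr bool IsFloating(TypeId t) {
  return t == TypeId::Float32 || t == TypeId::Float64;
}
constexpr bool IsTemporal(TypeId t) {
  return t == TypeId::Date32 || t == TypeId::Timestamp;
}

// Returns the type both operands of a comparison would be cast to,
// or std::nullopt if no implicit cast is allowed under these assumed rules.
constexpr std::optional<TypeId> CommonComparisonType(TypeId field, TypeId scalar) {
  // Exact match: nothing to cast.
  if (field == scalar) return field;
  // Among integers: widen to the largest integer type in this sketch.
  if (IsInteger(field) && IsInteger(scalar)) return TypeId::Int64;
  // Among floats, or across integers and floats: compare as Float64.
  if ((IsInteger(field) || IsFloating(field)) &&
      (IsInteger(scalar) || IsFloating(scalar))) {
    return TypeId::Float64;
  }
  // Among date/timestamp types: compare as timestamps.
  if (IsTemporal(field) && IsTemporal(scalar)) return TypeId::Timestamp;
  // String scalar against a temporal column: parse the string as the
  // column's temporal type.
  if (IsTemporal(field) && scalar == TypeId::String) return field;
  return std::nullopt;
}
```

Under these assumptions, {{CommonComparisonType(TypeId::Int32, TypeId::Float32)}} yields {{Float64}}, while a string column compared against an integer scalar yields no common type and would be rejected. One caveat of comparing mixed integer/float operands as Float64 is that large 64-bit integers cannot all be represented exactly as doubles, so a real implementation might need a more careful rule.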
[~fsaintjacques][~bkietz]