[
https://issues.apache.org/jira/browse/ARROW-18334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Weston Pace updated ARROW-18334:
--------------------------------
Description:
The expression simplification currently has a small set of functions which it
knows are commutative (IsBinaryAssociativeCommutative). "add" (and
"add_checked" are in this list. This should be ok for add(timestamp,duration)
since this boils down to add(int64,int64) which is commutative. However, the
way the kernels are currently implemented, we are getting the incorrect output
type.
Concretely, we have kernels:
{noformat}
add_checked<Timestamp,Duration>() -> types[0]
add_checked<Duration,Timestamp>() -> types[1]
{noformat}
A call is made with expression {{field_ref("x") + duration_literal}}. This
call is bound to {{add_checked<Timestamp, Duration>}}. However, the expression
is then simplified to {{duration_literal + field_ref("x")}}. Oddly enough, the
math in this case is correct, since it is just addition, but the output type is
not. It assigns an output type of duration instead of timestamp.
was:
In general, we get away with this, since most kernels are of the shape {{FOO(T,
T)}}. However, when the input types differ, this can be a problem.
Concretely, we have kernels:
{noformat}
add_checked<Timestamp,Duration>() -> types[0]
add_checked<Duration,Timestamp>() -> types[1]
{noformat}
A call is made with expression {{field_ref("x") + duration_literal}}. This
call is bound to {{add_checked<Timestamp, Duration>}}. However, the expression
is then simplified to {{duration_literal + field_ref("x")}}. Oddly enough, the
math in this case is correct, since it is just addition, but the output type is
not. It assigns an output type of duration instead of timestamp.
> add function for timestamp/duration is not commutative
> ------------------------------------------------------
>
> Key: ARROW-18334
> URL: https://issues.apache.org/jira/browse/ARROW-18334
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++
> Reporter: Weston Pace
> Priority: Major
>
> The expression simplification currently has a small set of functions which it
> knows are commutative (IsBinaryAssociativeCommutative). "add" (and
> "add_checked" are in this list. This should be ok for
> add(timestamp,duration) since this boils down to add(int64,int64) which is
> commutative. However, the way the kernels are currently implemented, we are
> getting the incorrect output type.
> Concretely, we have kernels:
> {noformat}
> add_checked<Timestamp,Duration>() -> types[0]
> add_checked<Duration,Timestamp>() -> types[1]
> {noformat}
> A call is made with expression {{field_ref("x") + duration_literal}}. This
> call is bound to {{add_checked<Timestamp, Duration>}}. However, the
> expression is then simplified to {{duration_literal + field_ref("x")}}.
> Oddly enough, the math in this case is correct, since it is just addition,
> but the output type is not. It assigns an output type of duration instead of
> timestamp.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)