Hi Stefano,
I implemented OVERLAPS following Calcite's implementation. Maybe
they changed the behavior in the meantime. I agree that we should try to
stay in sync with Calcite. What do other DB vendors do? Feel free to
open an issue about this.
Regards,
Timo
On 30.05.17 at 14:24, Stefano Bortoli wrote:
Hi all,
I am playing around with the Table API, and I have a question about the
temporal OVERLAPS operator. In particular, ScalarFunctionsTest.testOverlaps
expects the following comparison to return false:
testAllApis(
  temporalOverlaps("2011-03-10 05:02:02".toTimestamp, 0.second,
    "2011-03-10 05:02:02".toTimestamp, "2011-03-10 05:02:01".toTimestamp),
  "temporalOverlaps(toTimestamp('2011-03-10 05:02:02'), 0.second, " +
    "'2011-03-10 05:02:02'.toTimestamp, '2011-03-10 05:02:01'.toTimestamp)",
  "(TIMESTAMP '2011-03-10 05:02:02', INTERVAL '0' SECOND) OVERLAPS " +
    "(TIMESTAMP '2011-03-10 05:02:02', TIMESTAMP '2011-03-10 05:02:01')",
  "false")
Basically, the two intervals touch only at one endpoint. The
implementation in time.scala translates the expression to
AND(
  >=(DATETIME_PLUS(CAST('2011-03-10 05:02:02'):TIMESTAMP(3) NOT NULL, 0),
     CAST('2011-03-10 05:02:02'):TIMESTAMP(3) NOT NULL),
  >=(CAST('2011-03-10 05:02:01'):TIMESTAMP(3) NOT NULL,
     CAST('2011-03-10 05:02:02'):TIMESTAMP(3) NOT NULL)
)
The result is false because the second clause is not satisfied.
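To make the two clauses concrete, here is a minimal Scala sketch that evaluates them with java.time. It is only an illustration of the expression above, not the Flink code itself; the object name NaiveOverlaps and the helper overlaps are hypothetical.

```scala
import java.time.{Duration, LocalDateTime}

// Hypothetical illustration: evaluates the two clauses exactly as written,
// without reordering the endpoints of either period.
object NaiveOverlaps {
  def overlaps(s1: LocalDateTime, e1: LocalDateTime,
               s2: LocalDateTime, e2: LocalDateTime): Boolean =
    !e1.isBefore(s2) && // first clause:  e1 >= s2
      !e2.isBefore(s1)  // second clause: e2 >= s1

  def main(args: Array[String]): Unit = {
    val s1 = LocalDateTime.parse("2011-03-10T05:02:02")
    val e1 = s1.plus(Duration.ofSeconds(0)) // INTERVAL '0' SECOND
    val s2 = LocalDateTime.parse("2011-03-10T05:02:02")
    val e2 = LocalDateTime.parse("2011-03-10T05:02:01")
    println(overlaps(s1, e1, s2, e2)) // prints false
  }
}
```

In this sketch the first clause holds (05:02:02 >= 05:02:02), while the second compares 05:02:01 >= 05:02:02 and fails. Note that the second period is given with its later timestamp first.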
However, the latest Calcite master compiles OVERLAPS as follows:
[AND(
  >=(
    CASE(<=(2011-03-10 05:02:02, DATETIME_PLUS(2011-03-10 05:02:02, 0)),
         DATETIME_PLUS(2011-03-10 05:02:02, 0),
         2011-03-10 05:02:02),
    CASE(<=(2011-03-10 05:02:02, 2011-03-10 05:02:01),
         2011-03-10 05:02:02,
         2011-03-10 05:02:01)
  ),
  >=(
    CASE(<=(2011-03-10 05:02:02, 2011-03-10 05:02:01),
         2011-03-10 05:02:01,
         2011-03-10 05:02:02),
    CASE(<=(2011-03-10 05:02:02, DATETIME_PLUS(2011-03-10 05:02:02, 0)),
         2011-03-10 05:02:02,
         DATETIME_PLUS(2011-03-10 05:02:02, 0))
  )
)]
Here the result is true.
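Reading the plan above, each CASE picks either the earlier or the later endpoint of a period, so both periods appear to be put in order before they are compared. The Scala sketch below mirrors that structure with java.time. It is an illustration under that reading, not Calcite's code; the names NormalizedOverlaps, earlier, later and overlaps are hypothetical.

```scala
import java.time.LocalDateTime

// Hypothetical illustration of the CASE-based plan: order each period's
// endpoints first, then compare with inclusive (>=) bounds.
object NormalizedOverlaps {
  // CASE(<=(a, b), a, b): the earlier of the two endpoints
  private def earlier(a: LocalDateTime, b: LocalDateTime): LocalDateTime =
    if (!a.isAfter(b)) a else b

  // CASE(<=(a, b), b, a): the later of the two endpoints
  private def later(a: LocalDateTime, b: LocalDateTime): LocalDateTime =
    if (!a.isAfter(b)) b else a

  def overlaps(s1: LocalDateTime, e1: LocalDateTime,
               s2: LocalDateTime, e2: LocalDateTime): Boolean =
    !later(s1, e1).isBefore(earlier(s2, e2)) && // later1 >= earlier2
      !later(s2, e2).isBefore(earlier(s1, e1))  // later2 >= earlier1

  def main(args: Array[String]): Unit = {
    val t = LocalDateTime.parse("2011-03-10T05:02:02")
    // (t, t + 0s) OVERLAPS (t, t - 1s)
    println(overlaps(t, t.plusSeconds(0), t, t.minusSeconds(1))) // prints true
  }
}
```

With the second period reordered to (05:02:01, 05:02:02), both comparisons hold because the shared timestamp 05:02:02 satisfies the inclusive bounds.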
I believe the question is whether the endpoints count as part of the
intervals. Flink does not treat these intervals as overlapping (as the
test shows), whereas Calcite includes the endpoints.
Which behavior should we keep?
I think the Calcite implementation is correct, and intervals that share an
endpoint should count as overlapping. What do you think?
Best,
Stefano