Hi Haisheng,

I spent some time debugging and found a possible root cause.

In BasicSqlType, the result of `toString` and the instance member `digest` can differ. For a TIMESTAMP without precision (precision = -1), `toString` returns TIMESTAMP while `digest` is TIMESTAMP(0), which collides with the digest of a real TIMESTAMP(0) whose precision is 0.

`Interner<RelDataType> DATATYPE_CACHE = Interners.newWeakInterner()` keeps each SqlType a singleton across invocations, so a request for TIMESTAMP(0) may return the cached TIMESTAMP-without-precision instance, which is wrong. Because the global cache uses weak references, its entries are usually cleared after a Java GC, which mitigates the impact of this hidden bug.

I cannot yet give a reasonable explanation for why this shows up specifically on Linux with JDK 11, but I think the cause itself is clear. I found that `org.apache.calcite.rel.rules.DateRangeRulesTest` can create a TIMESTAMP without precision right before the errors occur, and in both of the failed builds from your mail the errors start from org.apache.calcite.test.SqlLimitsTest > testPrintLimits().

If the analysis above makes sense, I can create a JIRA issue and fix the problem by making digest and toString return the same value for TIMESTAMP.

Best,
Xu

Haisheng Yuan wrote on Tue, Apr 14, 2020 at 10:00 AM:

> Hi,
>
> There are frequently flaky tests on Linux (JDK 11) related to
> TIMESTAMP(0) vs TIMESTAMP.
>
> Below is the failing test from [1].
> FAILURE 0.0sec, org.apache.calcite.test.SqlToRelConverterExtendedTest > testTableValuedFunctionTumbleWithSubQueryParam()
> org.opentest4j.AssertionFailedError: plan ==> expected: <
> LogicalProject(ORDERID=[$0], ROWTIME=[$1], window_start=[$2], window_end=[$3])
>   LogicalTableFunctionScan(invocation=[TUMBLE($1, DESCRIPTOR($1), 60000:INTERVAL MINUTE)], rowType=[RecordType(INTEGER ORDERID, TIMESTAMP(0) ROWTIME, TIMESTAMP(0) window_start, TIMESTAMP(0) window_end)])
>     LogicalProject(ORDERID=[$0], ROWTIME=[$1])
>       LogicalTableScan(table=[[CATALOG, SALES, SHIPMENTS]])
> > but was: <
> LogicalProject(ORDERID=[$0], ROWTIME=[$1], window_start=[$2], window_end=[$3])
>   LogicalTableFunctionScan(invocation=[TUMBLE($1, DESCRIPTOR($1), 60000:INTERVAL MINUTE)], rowType=[RecordType(INTEGER ORDERID, TIMESTAMP ROWTIME, TIMESTAMP window_start, TIMESTAMP window_end)])
>     LogicalProject(ORDERID=[$0], ROWTIME=[$1])
>       LogicalTableScan(table=[[CATALOG, SALES, SHIPMENTS]])
> >
>
> Below is the failing test from [2]:
> testPrintLimits
> The only diff is TIMESTAMP(0) vs TIMESTAMP.
>
> Any clue?
>
> [1] https://github.com/apache/calcite/runs/576436249
> [2] https://github.com/apache/calcite/runs/584340845
>
> - Haisheng

--
Best regards,

Xu Zhang (张旭)
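
P.S. In case it helps the discussion, here is a minimal, self-contained sketch of the collision described above. It is not Calcite code: `TimestampType`, `CACHE` and `intern` are hypothetical names, a plain HashMap stands in for the Guava weak interner, and I am assuming that type equality is based on the digest string.

```java
import java.util.HashMap;
import java.util.Map;

public class DigestCollisionSketch {
    /** Hypothetical stand-in for BasicSqlType; -1 means "no precision given". */
    static final class TimestampType {
        static final int PRECISION_NOT_SPECIFIED = -1;
        static final int DEFAULT_PRECISION = 0;
        final int precision;

        TimestampType(int precision) {
            this.precision = precision;
        }

        /** The digest fills in the default precision when none was given. */
        String digest() {
            int p = precision == PRECISION_NOT_SPECIFIED ? DEFAULT_PRECISION : precision;
            return "TIMESTAMP(" + p + ")";
        }

        /** toString omits the precision when none was given. */
        @Override public String toString() {
            return precision == PRECISION_NOT_SPECIFIED
                ? "TIMESTAMP"
                : "TIMESTAMP(" + precision + ")";
        }

        /** Assumption: equality and hashing are based on the digest only. */
        @Override public boolean equals(Object o) {
            return o instanceof TimestampType
                && digest().equals(((TimestampType) o).digest());
        }

        @Override public int hashCode() {
            return digest().hashCode();
        }
    }

    /** Plain map standing in for the weak interner (no GC-driven eviction here). */
    static final Map<TimestampType, TimestampType> CACHE = new HashMap<>();

    /** Returns the canonical instance for the given type, caching it if new. */
    static TimestampType intern(TimestampType t) {
        TimestampType existing = CACHE.putIfAbsent(t, t);
        return existing != null ? existing : t;
    }

    public static void main(String[] args) {
        // A test that creates TIMESTAMP without precision runs first...
        TimestampType unspecified = intern(new TimestampType(-1));
        // ...then a later test asks for a real TIMESTAMP(0).
        TimestampType zero = intern(new TimestampType(0));

        System.out.println(zero == unspecified); // true: same cached instance
        System.out.println(zero);                // TIMESTAMP, not TIMESTAMP(0)
    }
}
```

If the two `intern` calls are swapped, the second print shows TIMESTAMP(0) instead, which would be consistent with the failures depending on which test populates the cache first and on whether a GC has cleared the weak entries in between.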
