[
https://issues.apache.org/jira/browse/CALCITE-5133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
itxiangkui updated CALCITE-5133:
--------------------------------
Description:
I have a piece of data, the example is as follows:
||channel||gettime||
|c1|2022-04-27 00:00:00|
|c2|2022-04-27 00:00:00|
|c3|2022-04-27 00:00:00|
|c4|2022-04-27 00:00:00|
|c5|2022-04-27 00:00:00|
|c6|2022-04-27 00:00:00|
{color:#00875a}it will be ok to query the dataset...{color}
{code:java}
select channel
from`tidb_noah_mars`.`noah_global`.`mpays`
where gettime>='2022-04-27 00:00:00'
and gettime<='2022-04-27 14:00:00'
and channel in ('c1','c2'...'c19')
| JdbcToEnumerableConverter
JdbcProject(channel=[$11])
JdbcFilter(condition=[AND(SEARCH($38, Sarg[[2022-04-27 00:00:00..2022-04-27
14:00:00]]), SEARCH($11, Sarg[_UTF-8'360_assistant':VARCHAR(14) CHARACTER SET
"UTF-8" |{code}
{color:#ff0000}it can not query the dataset...{color}
{code:java}
select channel
from xxx
where gettime>='2022-04-27 00:00:00'
and gettime<='2022-04-27 14:00:00'
and channel in ('c1','c2'....'c20')
| JdbcToEnumerableConverter
JdbcProject(channel=[$0])
JdbcJoin(condition=[=($0, $2)], joinType=[inner])
JdbcProject(channel=[$11], gettime=[$38])
JdbcFilter(condition=[SEARCH($38, Sarg[[2022-04-27 00:00 |{code}
The difference is that the in sub-collection of the latter will have more than
20 elements, while the former has only 19
This leads to a huge difference in execution plans
When the latter sql was executed, I found out when I debugged,
org.apache.calcite.rel.rel2sql.visit(Values e)
in the method:
{code:java}
} else {
query = SqlStdOperatorTable.UNION_ALL.createCall(
new SqlNodeList(list, POS));
} {code}
Here list.size = 20, but the result of the *query* is
{code:java}
SELECT 'wsss' AS `ROW_VALUE`
UNION ALL
SELECT 'xxx' AS `ROW_VALUE`
{code}
*There are two lines,not 20 lines...*
It sounds weird and I can't explain more details...
was:
I have a piece of data, the example is as follows:
||channel||gettime||
|c1|2022-04-27 00:00:00|
|c2|2022-04-27 00:00:00|
|c3|2022-04-27 00:00:00|
|c4|2022-04-27 00:00:00|
|c5|2022-04-27 00:00:00|
|c6|2022-04-27 00:00:00|
{color:#00875a}it will be ok to query the dataset...{color}
{code:java}
select channel
from`tidb_noah_mars`.`noah_global`.`mpays`
where gettime>='2022-04-27 00:00:00'
and gettime<='2022-04-27 14:00:00'
and channel in ('c1','c2'...'c19')
| JdbcToEnumerableConverter
JdbcProject(channel=[$11])
JdbcFilter(condition=[AND(SEARCH($38, Sarg[[2022-04-27 00:00:00..2022-04-27
14:00:00]]), SEARCH($11, Sarg[_UTF-8'360_assistant':VARCHAR(14) CHARACTER SET
"UTF-8" |{code}
{color:#ff0000}it can not query the dataset...{color}
{code:java}
select channel
from xxx
where gettime>='2022-04-27 00:00:00'
and gettime<='2022-04-27 14:00:00'
and channel in ('c1','c2'....'c20')
| JdbcToEnumerableConverter
JdbcProject(channel=[$0])
JdbcJoin(condition=[=($0, $2)], joinType=[inner])
JdbcProject(channel=[$11], gettime=[$38])
JdbcFilter(condition=[SEARCH($38, Sarg[[2022-04-27 00:00 |{code}
The difference is that the in sub-collection of the latter will have more than
20 elements, while the former has only 19
This leads to a huge difference in execution plans
When the latter sql was executed, I found out when I debugged,
org.apache.calcite.rel.rel2sql.visit(Values e)
in the method:
{code:java}
} else {
query = SqlStdOperatorTable.UNION_ALL.createCall(
new SqlNodeList(list, POS));
} {code}
Here list.size = 20, but the result of the *query* is
{code:java}
SELECT 'wsss' AS `ROW_VALUE`
UNION ALL
SELECT 'xxx' AS `ROW_VALUE`
{code}
*There are two lines*
It sounds weird and I can't explain more details...
> JdbcValues lost some values when the value of in was more than 20
> -----------------------------------------------------------------
>
> Key: CALCITE-5133
> URL: https://issues.apache.org/jira/browse/CALCITE-5133
> Project: Calcite
> Issue Type: Bug
> Reporter: itxiangkui
> Priority: Major
>
>
> I have a piece of data, the example is as follows:
> ||channel||gettime||
> |c1|2022-04-27 00:00:00|
> |c2|2022-04-27 00:00:00|
> |c3|2022-04-27 00:00:00|
> |c4|2022-04-27 00:00:00|
> |c5|2022-04-27 00:00:00|
> |c6|2022-04-27 00:00:00|
>
> {color:#00875a}it will be ok to query the dataset...{color}
> {code:java}
> select channel
> from`tidb_noah_mars`.`noah_global`.`mpays`
> where gettime>='2022-04-27 00:00:00'
> and gettime<='2022-04-27 14:00:00'
> and channel in ('c1','c2'...'c19')
> | JdbcToEnumerableConverter
> JdbcProject(channel=[$11])
> JdbcFilter(condition=[AND(SEARCH($38, Sarg[[2022-04-27
> 00:00:00..2022-04-27 14:00:00]]), SEARCH($11,
> Sarg[_UTF-8'360_assistant':VARCHAR(14) CHARACTER SET "UTF-8" |{code}
>
> {color:#ff0000}it can not query the dataset...{color}
> {code:java}
> select channel
> from xxx
> where gettime>='2022-04-27 00:00:00'
> and gettime<='2022-04-27 14:00:00'
> and channel in ('c1','c2'....'c20')
> | JdbcToEnumerableConverter
> JdbcProject(channel=[$0])
> JdbcJoin(condition=[=($0, $2)], joinType=[inner])
> JdbcProject(channel=[$11], gettime=[$38])
> JdbcFilter(condition=[SEARCH($38, Sarg[[2022-04-27 00:00 |{code}
>
> The difference is that the in sub-collection of the latter will have more
> than 20 elements, while the former has only 19
> This leads to a huge difference in execution plans
>
> When the latter sql was executed, I found out when I debugged,
> org.apache.calcite.rel.rel2sql.visit(Values e)
> in the method:
> {code:java}
> } else {
> query = SqlStdOperatorTable.UNION_ALL.createCall(
> new SqlNodeList(list, POS));
> } {code}
>
> Here list.size = 20, but the result of the *query* is
> {code:java}
> SELECT 'wsss' AS `ROW_VALUE`
> UNION ALL
> SELECT 'xxx' AS `ROW_VALUE`
> {code}
> *There are two lines,not 20 lines...*
> It sounds weird and I can't explain more details...
>
--
This message was sent by Atlassian Jira
(v8.20.7#820007)