[
https://issues.apache.org/jira/browse/CALCITE-5133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17532149#comment-17532149
]
itxiangkui commented on CALCITE-5133:
-------------------------------------
{code:java}
0: jdbc:avatica:remote:url=http://localhost:6> !outputformat vertical
0: jdbc:avatica:remote:url=http://localhost:6>
0: jdbc:avatica:remote:url=http://localhost:6> explain plan for
. . . . . . . . . . . . . . . . . . semicolon> select
aid,concat(substring_index(gettime,' ',1),' 00:00:00') as
gettime,paymoney,currency,gameid,channel
. . . . . . . . . . . . . . . . . . semicolon>
from`xxxx_catalog`.`xxx_database`.`xxx_table`
. . . . . . . . . . . . . . . . . . semicolon> where gettime>='2022-04-27
00:00:00'
. . . . . . . . . . . . . . . . . . semicolon> and gettime<='2022-04-27
14:00:00'
. . . . . . . . . . . . . . . . . . semicolon> and channel in
('common','meitushop','bilibili_sdk','nearme_vivo','xiaomi_app','huawei','uc_platform','4399com','myapp','oppo',
'steamchannel', 'asd','xxxxx_test1', 'TestA', 'firefly03', 'juefeng',
'test_hnr_1', 'dev01', '360_assistant', 'noahgame')
. . . . . . . . . . . . . . . . . . semicolon> ;
PLAN EnumerableCalc(expr#0..6=[{inputs}], expr#7=[CAST($t5):VARCHAR CHARACTER
SET "UTF-8"], expr#8=[_UTF-8' '], expr#9=[1], expr#10=[substring_index($t7,
$t8, $t9)], expr#11=[_UTF-8' 00:00:00'], expr#12=[CONCAT($t10, $t11)],
aid=[$t2], gettime=[$t12], paymoney=[$t3], currency=[$t4], gameid=[$t1],
channel=[$t0])
JdbcToEnumerableConverter
JdbcJoin(condition=[=($0, $6)], joinType=[inner])
JdbcProject(channel=[$11], gameid=[$13], aid=[$17], paymoney=[$29],
currency=[$31], gettime=[$38])
JdbcFilter(condition=[SEARCH($38, Sarg[[2022-04-27 00:00:00..2022-04-27
14:00:00]])])
TidbJdbcTableScan(table=[[catalog, database, table]])
JdbcValues(tuples=[[{ _UTF-8'common' }, { _UTF-8'meitushop' }, {
_UTF-8'bilibili_sdk' }, { _UTF-8'nearme_vivo' }, { _UTF-8'xiaomi_app' }, {
_UTF-8'huawei' }, { _UTF-8'uc_platform' }, { _UTF-8'4399com' }, { _UTF-8'myapp'
}, { _UTF-8'oppo' }, { _UTF-8'steamchannel' }, { _UTF-8'asd' }, {
_UTF-8'xxxxx_test1' }, { _UTF-8'TestA' }, { _UTF-8'firefly03' }, {
_UTF-8'juefeng' }, { _UTF-8'test_hnr_1' }, { _UTF-8'dev01' }, {
_UTF-8'360_assistant' }, { _UTF-8'noahgame' }]]) {code}
Remark:
1. I have defined a three-level structure of catalog->database->table, the
back-end database uses Mysql, and I want to use calcite's SQL to query MySQL
data
2. I rewrote TidbJdbcTableScan based on
JdbcCatalog/JdbcSchema/JdbcTable/JdbcTableScan, the code is almost the same,
and I made some custom bug fixes
3. There is no problem on the explain plan, but there is a union all problem in
the sql identified by mysql on the physical execution plan of the back-end mysql
I'm not sure if it was my 2nd clause that broke calcite's execution plan, but
personally it shouldn't be the problem....
> JdbcValues lost some values when the value of in was more than 20
> -----------------------------------------------------------------
>
> Key: CALCITE-5133
> URL: https://issues.apache.org/jira/browse/CALCITE-5133
> Project: Calcite
> Issue Type: Bug
> Components: core
> Reporter: itxiangkui
> Priority: Major
> Fix For: 1.30.0
>
>
>
> I have a piece of data, the example is as follows:
> ||channel||gettime||
> |c1|2022-04-27 00:00:00|
> |c2|2022-04-27 00:00:00|
> |c3|2022-04-27 00:00:00|
> |c4|2022-04-27 00:00:00|
> |c5|2022-04-27 00:00:00|
> |c6|2022-04-27 00:00:00|
>
> {color:#00875a}it will be ok to query the dataset...{color}
> {code:java}
> select channel
> from`tidb_noah_mars`.`noah_global`.`mpays`
> where gettime>='2022-04-27 00:00:00'
> and gettime<='2022-04-27 14:00:00'
> and channel in ('c1','c2'...'c19')
> | JdbcToEnumerableConverter
> JdbcProject(channel=[$11])
> JdbcFilter(condition=[AND(SEARCH($38, Sarg[[2022-04-27
> 00:00:00..2022-04-27 14:00:00]]), SEARCH($11,
> Sarg[_UTF-8'360_assistant':VARCHAR(14) CHARACTER SET "UTF-8" |{code}
>
> {color:#ff0000}it can not query the dataset...{color}
> {code:java}
> select channel
> from xxx
> where gettime>='2022-04-27 00:00:00'
> and gettime<='2022-04-27 14:00:00'
> and channel in ('c1','c2'....'c20')
> | JdbcToEnumerableConverter
> JdbcProject(channel=[$0])
> JdbcJoin(condition=[=($0, $2)], joinType=[inner])
> JdbcProject(channel=[$11], gettime=[$38])
> JdbcFilter(condition=[SEARCH($38, Sarg[[2022-04-27 00:00 |{code}
>
> The difference is that the in sub-collection of the latter will have more
> than 20 elements, while the former has only 19
> This leads to a huge difference in execution plans
>
> When the latter sql was executed, I found out when I debugged,
> org.apache.calcite.rel.rel2sql.visit(Values e)
> in the method:
> {code:java}
> } else {
> query = SqlStdOperatorTable.UNION_ALL.createCall(
> new SqlNodeList(list, POS));
> } {code}
>
> Here list.size = 20, but the result of the *query* is
> {code:java}
> SELECT 'wsss' AS `ROW_VALUE`
> UNION ALL
> SELECT 'xxx' AS `ROW_VALUE`
> {code}
> *There are two lines,not 20 lines...*
> It sounds weird and I can't explain more details...
>
--
This message was sent by Atlassian Jira
(v8.20.7#820007)