[
https://issues.apache.org/jira/browse/KYLIN-5007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yaqian Zhang updated KYLIN-5007:
--------------------------------
Fix Version/s: v3.1.3
> queries with limit clause may fail when string dimension is encoded in
> integer type
> -----------------------------------------------------------------------------------
>
> Key: KYLIN-5007
> URL: https://issues.apache.org/jira/browse/KYLIN-5007
> Project: Kylin
> Issue Type: Bug
> Components: Query Engine
> Affects Versions: v3.0.2
> Reporter: Congling Xia
> Assignee: Congling Xia
> Priority: Major
> Fix For: v3.1.3
>
> Attachments: image-2021-06-10-10-03-54-775.png
>
>
> Hi, team.
> Recently we encountered a problem where queries may fail if the SQL contains
> a LIMIT clause. The SQL looks like:
> {code}
> select gid from some_table group by gid limit 100
> {code}
> The error message is like the following:
> {code:java}
> Not sorted! last: source_v1=null,...,gid=276,... fetched:
> source_v1=null,...,gid=100506,...
> {code}
> After searching the issue list, we found that it is similar to KYLIN-2425,
> KYLIN-3089, and KYLIN-4942. We noticed that these problems are not completely
> resolved.
> It is a row-key encoding problem: the cube uses integer:4 to encode the
> string column _gid_:
> !image-2021-06-10-10-03-54-775.png|width=571,height=141!
> As [~kangkaisen] mentioned in KYLIN-3089, the comparator in
> SortMergedPartitionResultIterator is different from the one in
> SortedIteratorMergerWithLimit. SortedIteratorMergerWithLimit compares tuples
> of dimensions in their original data type "string" rather than the encoded
> data type "integer" used in rowkeys. In the exception message above,
> 276<100506 is false because the values are compared as strings.
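
The ordering mismatch described above can be reproduced outside Kylin. The following is a minimal sketch, not Kylin code: the class GidOrderDemo and its methods are hypothetical names, and it assumes only that an integer-encoded rowkey yields values in numeric order while the merging comparator checks them with plain string comparison.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical demo class; it does not use any Kylin API.
public class GidOrderDemo {

    // Order the values numerically, as an integer-encoded rowkey would (assumption).
    static List<String> sortAsEncodedInteger(List<String> gids) {
        List<String> out = new ArrayList<>(gids);
        out.sort(Comparator.comparingLong((String s) -> Long.parseLong(s)));
        return out;
    }

    // Order the values lexicographically, as a string comparator would.
    static List<String> sortAsString(List<String> gids) {
        List<String> out = new ArrayList<>(gids);
        out.sort(Comparator.naturalOrder());
        return out;
    }

    // Check that no element is smaller than its predecessor under string comparison.
    static boolean isSortedAsString(List<String> values) {
        for (int i = 1; i < values.size(); i++) {
            if (values.get(i - 1).compareTo(values.get(i)) > 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The two gid values taken from the error message.
        List<String> gids = Arrays.asList("100506", "276");

        List<String> storageOrder = sortAsEncodedInteger(gids);
        System.out.println("numeric order: " + storageOrder);
        System.out.println("string order:  " + sortAsString(gids));
        System.out.println("numeric order passes string check: "
                + isSortedAsString(storageOrder));
    }
}
```

With the two gid values from the error message, the numeric order is 276 then 100506, but as strings "276" sorts after "100506", so a string-based sortedness check rejects the numerically sorted stream.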
> It may be resolved by skipping limit pushdown when the column type and the
> encoding type may produce different comparison results, but that may make
> such queries slower.
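
A guard of the kind proposed above might look like the following. This is only a sketch under stated assumptions: LimitPushdownGuard, mayReorder, and canPushDownLimit are hypothetical names rather than Kylin classes or methods, the type and encoding names are passed as plain strings, and only the single case from this report (a string column with an integer encoding) is modelled.

```java
import java.util.Locale;

// Hypothetical helper; not part of Kylin.
public class LimitPushdownGuard {

    // Returns true when the declared column type and the rowkey encoding
    // may order values differently. Only the case from this report is
    // covered: a string-typed column stored with an integer encoding.
    static boolean mayReorder(String columnType, String encoding) {
        String type = columnType.toLowerCase(Locale.ROOT);
        String enc = encoding.toLowerCase(Locale.ROOT);
        boolean stringColumn = type.equals("string")
                || type.startsWith("varchar")
                || type.startsWith("char");
        boolean integerEncoding = enc.startsWith("integer");
        return stringColumn && integerEncoding;
    }

    // Limit pushdown is allowed only when the two orders cannot diverge.
    static boolean canPushDownLimit(String columnType, String encoding) {
        return !mayReorder(columnType, encoding);
    }

    public static void main(String[] args) {
        // The combination from this issue: string column gid, encoding integer:4.
        System.out.println(canPushDownLimit("string", "integer:4"));
        // An integer column with the same encoding keeps its order.
        System.out.println(canPushDownLimit("bigint", "integer:4"));
    }
}
```

The cost noted in the description still applies: when the guard returns false, the limit is not pushed down and the query reads more rows before the limit is applied.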