[jira] [Commented] (KYLIN-6047) Error Occurs When the Number of Values in an IN Clause Reaches 20

Guoliang Sun (Jira) Wed, 19 Feb 2025 01:36:09 -0800


    [ 
https://issues.apache.org/jira/browse/KYLIN-6047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17928363#comment-17928363
 ]


Guoliang Sun commented on KYLIN-6047:
-------------------------------------

h3. Dev Design

After clarifying the issues above, the only definitive fix is to **implement 
the conversion of the `ROW` operator into a logical plan that Spark can 
recognize**.

To see how Spark itself handles the user's SQL, we push the query directly to 
Spark for execution and use its logical plan as the reference. From the issues 
highlighted above, we can derive the following two key tasks:

1. Add Support for Matching the `ROW` Operator:
   - Extend the existing column-value matching to handle the `ROW` operator.
   - This ensures that a `ROW` expression (e.g., `(a, b)`) is correctly 
interpreted and processed.

2. Enhance Handling of the `IN` Operator:
   - Similarly, extend the `IN` operator so that a group of columns can be 
matched against tuples of values (i.e., `ROW` operands on both sides).
   - This ensures compatibility with cases like `(a, b) IN ((1, 2), (3, 4))`.
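
To illustrate the semantics that the converted plan has to preserve for a case 
like `(a, b) IN ((1, 2), (3, 4))`, here is a minimal Python sketch. It is not 
Kylin or Spark code: the `Row` class and the two helper functions are 
hypothetical names introduced only for this example, and SQL NULL handling is 
deliberately left out.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Sequence, Tuple


@dataclass(frozen=True)
class Row:
    """Hypothetical stand-in for a ROW operand such as (a, b)."""
    columns: Tuple[str, ...]


def _check_width(row: Row, candidates: Sequence[Sequence[Any]]) -> None:
    # Every value tuple must have exactly one value per column in the ROW.
    width = len(row.columns)
    for cand in candidates:
        if len(cand) != width:
            raise ValueError(f"expected {width} values per tuple, got {len(cand)}")


def eval_row_in(row: Row, candidates: Sequence[Sequence[Any]],
                record: Mapping[str, Any]) -> bool:
    """Evaluate (c1, ..., cn) IN ((v1, ..., vn), ...) against one record.

    NULL semantics are ignored in this sketch.
    """
    _check_width(row, candidates)
    left = tuple(record[name] for name in row.columns)
    return left in {tuple(cand) for cand in candidates}


def expand_row_in_to_or(row: Row, candidates: Sequence[Sequence[Any]]) -> str:
    """Render the equivalent OR-of-ANDs predicate as text."""
    _check_width(row, candidates)
    disjuncts = []
    for cand in candidates:
        equalities = " AND ".join(
            f"{name} = {value!r}" for name, value in zip(row.columns, cand)
        )
        disjuncts.append(f"({equalities})")
    return " OR ".join(disjuncts)


if __name__ == "__main__":
    row = Row(("a", "b"))
    candidates = [(1, 2), (3, 4)]
    print(expand_row_in_to_or(row, candidates))
    print(eval_row_in(row, candidates, {"a": 3, "b": 4}))
    print(eval_row_in(row, candidates, {"a": 1, "b": 4}))
```

The last call returns `False` even though `a = 1` and `b = 4` each appear in 
some tuple, which is why a multi-column `IN` cannot be rewritten as two 
independent single-column `IN` conditions.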

By implementing these two enhancements, we ensure that Kylin can properly 
convert Calcite's logical plan into Spark's logical plan, avoiding errors and 
performance issues caused by unsupported operators or excessive condition 
values.  

> Error Occurs When the Number of Values in an IN Clause Reaches 20
> -----------------------------------------------------------------
>
>                 Key: KYLIN-6047
>                 URL: https://issues.apache.org/jira/browse/KYLIN-6047
>             Project: Kylin
>          Issue Type: Bug
>    Affects Versions: 5.0.0
>            Reporter: Guoliang Sun
>            Priority: Major
>         Attachments: image-2025-02-19-17-21-17-187.png
>
>
> h3. Temporary Solution
> Increase the value of `kylin.query.convert-in-to-or-threshold`. However, 
> this is only a workaround: an `IN` clause may contain more than 100 values, 
> and setting the threshold high enough to cover such cases may cause 
> performance issues. A proper fix is required.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
