baibaichen opened a new issue, #6561:
URL: https://github.com/apache/incubator-gluten/issues/6561

   ### Backend
   
   CH (ClickHouse)
   
   ### Bug description
   
   The following two SQL queries fail:
   
   1.  **Map cannot have a key of type Nullable(String)**
   ```sql
   select
    map_from_arrays(
      transform(map_keys(map('t1','a','t2','b')), v->v),
      array('a','b')) as b from range(10)
   ```
   
   2. **Logical error: 'Cannot capture column 1 because it has incompatible type: got String, but Nullable(String) is expected.**
   ```sql
   select  transform(map_values(map('t1','a','t2','b')), v->v) from range(10)
   ```
   
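   I investigated problem 2. On the Spark side, `map_values(map('t1','a','t2','b'))` is folded into an array constant whose final type is `Array(String)`, but the `LambdaFunction v->v` expects an argument of type `Nullable(String)`, because the return type of map_values is `Array(Nullable(String))`.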
   
   
![image](https://github.com/user-attachments/assets/6d430f76-0d94-4025-8f02-498d93611cb5)
   
![image](https://github.com/user-attachments/assets/f300bd28-b545-40fe-8bc9-998e15a00a6a)
   
   The **difficulty** is that the List (array) provided by Substrait does not carry type information, so we cannot
   
   ```c++
   class ArrayTransform : public FunctionParser
   {
   public:
       static constexpr auto name = "transform";
       //...
       const DB::ActionsDAG::Node * parse(const substrait::Expression_ScalarFunction & substrait_func,
           DB::ActionsDAGPtr & actions_dag) const
       {
           auto ch_func_name = getCHFunctionName(substrait_func);
           auto lambda_args = collectLambdaArguments(*plan_parser, substrait_func.arguments()[1].value().scalar_function());
           auto parsed_args = parseFunctionArguments(substrait_func, actions_dag);
           assert(parsed_args.size() == 2);
           if (lambda_args.size() == 1)
           {
               // lambda_args    => the input type(s) of the LambdaFunction?
               // parsed_args[0] => the array; its type is something like Array(String)
               // parsed_args[1] => the LambdaFunction
               return toFunctionNode(actions_dag, ch_func_name, {parsed_args[1], parsed_args[0]});
           }
        ....
   ```
   
   It looks like the type of parsed_args[0] needs to be adjusted to match the type of lambda_args.
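
To make the idea concrete, here is a minimal, self-contained sketch of the type adjustment. It does not use the real ClickHouse or Gluten classes: `ToyType`, `makeString`, `makeNullable`, `makeArray` and `alignArrayWithLambdaArg` are all hypothetical names invented for this illustration. It assumes the only adjustment needed is widening the array element type from `T` to `Nullable(T)`; how the corresponding cast would be added to the `ActionsDAG` is not shown.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

// Toy stand-in for a data type; NOT the real DB::IDataType.
struct ToyType;
using ToyTypePtr = std::shared_ptr<const ToyType>;

struct ToyType
{
    enum class Kind { String, Nullable, Array };
    Kind kind;
    ToyTypePtr nested; // wrapped type for Nullable, element type for Array

    std::string name() const
    {
        switch (kind)
        {
            case Kind::String: return "String";
            case Kind::Nullable: return "Nullable(" + nested->name() + ")";
            case Kind::Array: return "Array(" + nested->name() + ")";
        }
        return "";
    }
};

ToyTypePtr makeString() { return std::make_shared<ToyType>(ToyType{ToyType::Kind::String, nullptr}); }
ToyTypePtr makeNullable(ToyTypePtr t) { return std::make_shared<ToyType>(ToyType{ToyType::Kind::Nullable, std::move(t)}); }
ToyTypePtr makeArray(ToyTypePtr t) { return std::make_shared<ToyType>(ToyType{ToyType::Kind::Array, std::move(t)}); }

// Hypothetical helper: returns the array type that the first argument of
// `transform` should have so that its elements match the lambda's input type.
// Only the widening T -> Nullable(T) is accepted; anything else is an error.
ToyTypePtr alignArrayWithLambdaArg(const ToyTypePtr & array_type, const ToyTypePtr & lambda_arg_type)
{
    if (array_type->kind != ToyType::Kind::Array)
        throw std::invalid_argument("expected an Array type, got " + array_type->name());

    const ToyTypePtr & element = array_type->nested;
    if (element->name() == lambda_arg_type->name())
        return array_type; // already compatible, nothing to do

    if (lambda_arg_type->kind == ToyType::Kind::Nullable
        && lambda_arg_type->nested->name() == element->name())
        return makeArray(lambda_arg_type); // Array(T) -> Array(Nullable(T))

    throw std::invalid_argument(
        "cannot align " + array_type->name() + " with lambda argument " + lambda_arg_type->name());
}
```

For the failing query in problem 2, the array constant would be `makeArray(makeString())` and the lambda argument `makeNullable(makeString())`, so the helper yields a type named `Array(Nullable(String))`, which is what the lambda expects.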
   
   Problem 1 is probably a similar type-mismatch issue.
   
   ### Spark version
   
   None
   
   ### Spark configurations
   
   _No response_
   
   ### System information
   
   _No response_
   
   ### Relevant logs
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

