Flyangz opened a new issue, #1604: URL: https://github.com/apache/auron/issues/1604
**Describe the bug**

As mentioned in this issue, https://github.com/apache/datafusion/issues/16903, after upgrading DataFusion to version 49, the output of the MD5 function defaults to `Utf8View` due to the changes in https://github.com/apache/datafusion/pull/16290. This data type is not fully supported in Auron. If the result of the MD5 function is used as the input for a hash, an error will occur.

**To Reproduce**

The following test case can be added in AuronFunctionSuite:

```
test("md5 function") {
  withTable("t1") {
    sql("create table t1 using parquet as select 'spark' as c1, '3.x' as version")
    val functions =
      """
        |select b.md5
        |from (
        |  select c1, version from t1
        |) a join (
        |  select md5(concat(c1, version)) as md5 from t1
        |) b on md5(concat(a.c1, a.version)) = b.md5
        |""".stripMargin

    val df = sql(functions)
    checkAnswer(df, Seq(Row("9ff36a3857e29335d03cf6bef2147119")))
  }
}
```

This will result in the following error:

Caused by: java.lang.RuntimeException: task panics: Execution error: Execution error: output_with_sender[Project] error: Execution error: output_with_sender[BroadcastJoin] error: Execution error: Unsupported data type in hasher: Utf8View

**Expected behavior**

The MD5 function should work correctly. Perhaps full support for `Utf8View` in Auron will not be available soon. If that's the case, the MD5 function could revert to the old logic that does not convert the return value to a StringViewArray.
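
For anyone who wants to cross-check the expected row in the test case independently of Spark and Auron, here is a minimal sketch in plain Scala. It assumes that `md5(concat(c1, version))` is the lowercase hex MD5 of the UTF-8 bytes of `"spark3.x"`. The names `Md5Check` and `md5Hex` are hypothetical helpers, not part of Auron or DataFusion.

```scala
import java.nio.charset.StandardCharsets
import java.security.MessageDigest

// Hypothetical helper, used only to cross-check the test's expected value.
object Md5Check {
  // Lowercase hex MD5 of the UTF-8 bytes of `s` (always 32 characters).
  def md5Hex(s: String): String = {
    val digest = MessageDigest.getInstance("MD5")
      .digest(s.getBytes(StandardCharsets.UTF_8))
    // Mask each byte to an unsigned value before formatting as two hex digits.
    digest.map(b => f"${b & 0xff}%02x").mkString
  }

  def main(args: Array[String]): Unit = {
    // Same input the test builds with concat(c1, version).
    val input = "spark" + "3.x"
    println(md5Hex(input))
  }
}
```

The printed value can be compared with the string passed to `checkAnswer` above. This only validates the expected result; it does not exercise the `Utf8View` code path that triggers the hasher error.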
