[ 
https://issues.apache.org/jira/browse/HUDI-2259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17398063#comment-17398063
 ] 

ASF GitHub Bot commented on HUDI-2259:
--------------------------------------

dongkelun commented on pull request #3380:
URL: https://github.com/apache/hudi/pull/3380#issuecomment-897632707


   @pengzhiwei2018 Hi, when I tested Spark 3, I found that Spark SQL for Hudi on 
Spark 3 reuses Spark's own source code, but column aliases in MERGE INTO are not 
supported in Spark 3; it throws the following exception: 'Columns aliases 
are not allowed in MERGE.' I see two possible solutions: one is to modify 
the Spark 3 source code so that Spark supports this; the other is to implement 
Spark SQL for Hudi directly in hudi-spark3, but I personally feel that would be 
a big change. I am not sure whether I understand this correctly, so I was hoping 
you could help with some advice.
   
   ```scala
   // org.apache.spark.sql.catalyst.parser.AstBuilder
   val sourceTableAlias = getTableAliasWithoutColumnAlias(ctx.sourceAlias, "MERGE")

   private def getTableAliasWithoutColumnAlias(
       ctx: TableAliasContext, op: String): Option[String] = {
     if (ctx == null) {
       None
     } else {
       val ident = ctx.strictIdentifier()
       if (ctx.identifierList() != null) {
         throw new ParseException(s"Columns aliases are not allowed in $op.",
           ctx.identifierList())
       }
       if (ident != null) Some(ident.getText) else None
     }
   }
   ```
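
   For what it's worth, one stopgap on Spark 3 is to drop the column-alias list 
and alias each expression inside the subquery's SELECT instead, which the parser 
accepts. A minimal sketch of that rewrite (hypothetical helper for illustration 
only, not Hudi or Spark code):

   ```scala
   // Hypothetical helper, not part of Hudi or Spark: builds an inline-aliased
   // SELECT list so that
   //   using (select 1, 'a1', 12, 1003) s0 (id, name, price, ts)
   // can instead be written as
   //   using (select 1 as id, 'a1' as name, 12 as price, 1003 as ts) s0
   // which Spark 3's MERGE parser accepts.
   object MergeAliasRewrite {
     def inlineAliases(exprs: Seq[String], cols: Seq[String]): String = {
       require(exprs.length == cols.length, "alias list must match select arity")
       // Pair each positional expression with its alias and join them back
       // into a single SELECT list.
       exprs.zip(cols).map { case (e, c) => s"$e as $c" }.mkString(", ")
     }
   }
   ```

   For example, `MergeAliasRewrite.inlineAliases(Seq("1", "'a1'", "12", "1003"), Seq("id", "name", "price", "ts"))` produces `1 as id, 'a1' as name, 12 as price, 1003 as ts`.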


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


> [SQL]Support referencing subquery with column aliases by table alias in merge 
> into
> ----------------------------------------------------------------------------------
>
>                 Key: HUDI-2259
>                 URL: https://issues.apache.org/jira/browse/HUDI-2259
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: Spark Integration
>            Reporter: 董可伦
>            Assignee: 董可伦
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.9.0
>
>
>  
>  Example:
> {code:java}
> val tableName = "test_hudi_table"
> spark.sql(
>   s"""
>      |create table ${tableName} (
>      |  id int,
>      |  name string,
>      |  price double,
>      |  ts long
>      |) using hudi
>      |options (
>      |  primaryKey = 'id',
>      |  type = 'cow'
>      |)
>      |location '/tmp/${tableName}'
>      |""".stripMargin)
> spark.sql(
>   s"""
>      |merge into $tableName as t0
>      |using (
>      |  select 1, 'a1', 12, 1003
>      |) s0 (id, name, price, ts)
>      |on s0.id = t0.id
>      |when matched and id != 1 then update set *
>      |when matched and s0.id = 1 then delete
>      |when not matched then insert *
>      |""".stripMargin)
> {code}
> It will throw an exception:
> {code:java}
> Exception in thread "main" org.apache.spark.sql.AnalysisException: Cannot 
> resolve 's0.id in (`s0.id` = `t0.id`), the input columns is: id#4, name#5, 
> price#6, ts#7, _hoodie_commit_time#8, _hoodie_commit_seqno#9, 
> _hoodie_record_key#10, _hoodie_partition_path#11, _hoodie_file_name#12, 
> id#13, name#14, price#15, ts#16L; at 
> org.apache.spark.sql.hudi.analysis.HoodieResolveReferences.org$apache$spark$sql$hudi$analysis$HoodieResolveReferences$$resolveExpressionFrom(HoodieAnalysis.scala:292)
>  at 
> org.apache.spark.sql.hudi.analysis.HoodieResolveReferences$$anonfun$apply$1.applyOrElse(HoodieAnalysis.scala:160)
>  at 
> org.apache.spark.sql.hudi.analysis.HoodieResolveReferences$$anonfun$apply$1.applyOrElse(HoodieAnalysis.scala:103)
>  at 
> org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsUp$1$$anonfun$apply$1.apply(AnalysisHelper.scala:90)
>  at 
> org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsUp$1$$anonfun$apply$1.apply(AnalysisHelper.scala:90)
>  at 
> org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:69)
>  at 
> org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsUp$1.apply(AnalysisHelper.scala:89)
>  at 
> org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsUp$1.apply(AnalysisHelper.scala:86)
>  at 
> org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:194)
>  at 
> org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$class.resolveOperatorsUp(AnalysisHelper.scala:86)
>  at 
> org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUp(LogicalPlan.scala:29){code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
