[jira] [Updated] (HUDI-4799) improve analyzer exception tip when can not resolve expression

KnightChess (Jira) Wed, 07 Sep 2022 07:24:40 -0700


     [ https://issues.apache.org/jira/browse/HUDI-4799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


KnightChess updated HUDI-4799:
------------------------------
    Description: 
SQL:
merge into hudi_mor_pk_cbfield_tbl3 as target
using (select id, dt from delete_sync_test where concat_ws('1', '2') and id = 1) as source
on target.id = source.id
when matched then update set target.ts = source.dt

The source subquery cannot be resolved, but the exception message is misleading and blames the merge condition instead:
{noformat}
Exception in thread "main" org.apache.spark.sql.AnalysisException: Cannot resolve 'target.id,'source.id in (target.id = source.id), the input columns is: ['id, 'dt, _hoodie_commit_time#0, _hoodie_commit_seqno#1, _hoodie_record_key#2, _hoodie_partition_path#3, _hoodie_file_name#4, id#5L, ts#6L, name#7]
        at org.apache.spark.sql.hudi.analysis.HoodieResolveReferences.org$apache$spark$sql$hudi$analysis$HoodieResolveReferences$$resolveExpressionFrom(HoodieAnalysis.scala:568)
        at org.apache.spark.sql.hudi.analysis.HoodieResolveReferences$$anonfun$apply$1.applyOrElse(HoodieAnalysis.scala:351)
        at org.apache.spark.sql.hudi.analysis.HoodieResolveReferences$$anonfun$apply$1.applyOrElse(HoodieAnalysis.scala:265)
{noformat}


After the improvement, the real error is easy to find:

{noformat}
Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot resolve '(concat_ws('1', '2') AND (spark_catalog.hudi_test.delete_sync_test.id = 1))' due to data type mismatch: differing types in '(concat_ws('1', '2') AND (spark_catalog.hudi_test.delete_sync_test.id = 1))' (string and boolean).; line 2 pos 49;
'SubqueryAlias source
+- 'Project ['id, 'dt]
   +- 'Filter (concat_ws(1, 2) AND (id#13 = 1))
      +- SubqueryAlias spark_catalog.hudi_test.delete_sync_test
         +- Relation hudi_test.delete_sync_test[_hoodie_commit_time#8,_hoodie_commit_seqno#9,_hoodie_record_key#10,_hoodie_partition_path#11,_hoodie_file_name#12,id#13,age#14,name#15,dt#16] parquet

        at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
        at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$$nestedInanonfun$checkAnalysis$1$2.applyOrElse(CheckAnalysis.scala:190)
{noformat}
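
The ticket does not include the patch itself. As a rough illustration of why the second message is more useful, here is a toy Python model (not Hudi or Spark code; `Leaf`, `And`, `check_types`, and `analyze_merge` are hypothetical names invented for this sketch). It assumes the analyzer validates the source subquery's filter before it tries to resolve the ON condition, so the type mismatch is reported instead of the downstream "cannot resolve" symptom.

```python
from dataclasses import dataclass


class AnalysisError(Exception):
    """Stand-in for an analyzer failure; not Spark's AnalysisException."""


@dataclass
class Leaf:
    sql: str    # textual form of the expression
    dtype: str  # result type, e.g. "string" or "boolean"


@dataclass
class And:
    left: object
    right: object

    @property
    def sql(self):
        return f"({self.left.sql} AND {self.right.sql})"

    @property
    def dtype(self):
        return "boolean"


def check_types(expr):
    """Fail on the first AND whose operands are not both boolean."""
    if isinstance(expr, And):
        check_types(expr.left)
        check_types(expr.right)
        if (expr.left.dtype, expr.right.dtype) != ("boolean", "boolean"):
            raise AnalysisError(
                f"cannot resolve '{expr.sql}' due to data type mismatch: "
                f"differing types in '{expr.sql}' "
                f"({expr.left.dtype} and {expr.right.dtype})."
            )


def analyze_merge(source_filter, condition_refs, resolved_columns):
    """Validate the source subquery first, then resolve the ON condition."""
    check_types(source_filter)  # the root cause is reported here
    missing = [ref for ref in condition_refs if ref not in resolved_columns]
    if missing:
        raise AnalysisError(f"Cannot resolve {missing} in the merge condition")


# Mirrors the filter from the ticket: a string expression AND a boolean one.
source_filter = And(
    Leaf("concat_ws('1', '2')", "string"),
    Leaf("(delete_sync_test.id = 1)", "boolean"),
)

try:
    # source.id is deliberately absent from the resolved columns, as in the ticket.
    analyze_merge(source_filter, ["target.id", "source.id"], ["target.id"])
except AnalysisError as err:
    message = str(err)
    print(message)
```

In this sketch the merge-condition check is never reached, because the invalid filter in the source subquery fails first; that ordering is the assumption being illustrated.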


 

 

  was:
SQL:
merge into hudi_mor_pk_cbfield_tbl3 as target
using (select id, dt from delete_sync_test where concat_ws('1', '2') and id = 1) as source
on target.id = source.id
when matched then update set target.ts = source.dt


The resulting exception message is misleading:

{noformat}
Exception in thread "main" org.apache.spark.sql.AnalysisException: Cannot resolve 'target.id,'source.id in (target.id = source.id), the input columns is: ['id, 'dt, _hoodie_commit_time#0, _hoodie_commit_seqno#1, _hoodie_record_key#2, _hoodie_partition_path#3, _hoodie_file_name#4, id#5L, ts#6L, name#7]
        at org.apache.spark.sql.hudi.analysis.HoodieResolveReferences.org$apache$spark$sql$hudi$analysis$HoodieResolveReferences$$resolveExpressionFrom(HoodieAnalysis.scala:568)
        at org.apache.spark.sql.hudi.analysis.HoodieResolveReferences$$anonfun$apply$1.applyOrElse(HoodieAnalysis.scala:351)
        at org.apache.spark.sql.hudi.analysis.HoodieResolveReferences$$anonfun$apply$1.applyOrElse(HoodieAnalysis.scala:265)
{noformat}


After the improvement, the real error is easy to find:
{code:shell}
Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot resolve '(concat_ws('1', '2') AND (spark_catalog.hudi_test.delete_sync_test.id = 1))' due to data type mismatch: differing types in '(concat_ws('1', '2') AND (spark_catalog.hudi_test.delete_sync_test.id = 1))' (string and boolean).; line 2 pos 49;
'SubqueryAlias source
+- 'Project ['id, 'dt]
   +- 'Filter (concat_ws(1, 2) AND (id#13 = 1))
      +- SubqueryAlias spark_catalog.hudi_test.delete_sync_test
         +- Relation hudi_test.delete_sync_test[_hoodie_commit_time#8,_hoodie_commit_seqno#9,_hoodie_record_key#10,_hoodie_partition_path#11,_hoodie_file_name#12,id#13,age#14,name#15,dt#16] parquet

        at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
        at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$$nestedInanonfun$checkAnalysis$1$2.applyOrElse(CheckAnalysis.scala:190)

{code}



 

 


> improve analyzer exception tip when can not resolve expression
> --------------------------------------------------------------
>
>                 Key: HUDI-4799
>                 URL: https://issues.apache.org/jira/browse/HUDI-4799
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: spark-sql
>            Reporter: KnightChess
>            Assignee: KnightChess
>            Priority: Major
>
> SQL:
> merge into hudi_mor_pk_cbfield_tbl3 as target
> using (select id, dt from delete_sync_test where concat_ws('1', '2') and id = 1) as source
> on target.id = source.id
> when matched then update set target.ts = source.dt
> The source subquery cannot be resolved, but the exception message is misleading and blames the merge condition instead:
> {noformat}
> Exception in thread "main" org.apache.spark.sql.AnalysisException: Cannot resolve 'target.id,'source.id in (target.id = source.id), the input columns is: ['id, 'dt, _hoodie_commit_time#0, _hoodie_commit_seqno#1, _hoodie_record_key#2, _hoodie_partition_path#3, _hoodie_file_name#4, id#5L, ts#6L, name#7]
>       at org.apache.spark.sql.hudi.analysis.HoodieResolveReferences.org$apache$spark$sql$hudi$analysis$HoodieResolveReferences$$resolveExpressionFrom(HoodieAnalysis.scala:568)
>       at org.apache.spark.sql.hudi.analysis.HoodieResolveReferences$$anonfun$apply$1.applyOrElse(HoodieAnalysis.scala:351)
>       at org.apache.spark.sql.hudi.analysis.HoodieResolveReferences$$anonfun$apply$1.applyOrElse(HoodieAnalysis.scala:265)
> {noformat}
> After the improvement, the real error is easy to find:
> {noformat}
> Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot resolve '(concat_ws('1', '2') AND (spark_catalog.hudi_test.delete_sync_test.id = 1))' due to data type mismatch: differing types in '(concat_ws('1', '2') AND (spark_catalog.hudi_test.delete_sync_test.id = 1))' (string and boolean).; line 2 pos 49;
> 'SubqueryAlias source
> +- 'Project ['id, 'dt]
>    +- 'Filter (concat_ws(1, 2) AND (id#13 = 1))
>       +- SubqueryAlias spark_catalog.hudi_test.delete_sync_test
>          +- Relation hudi_test.delete_sync_test[_hoodie_commit_time#8,_hoodie_commit_seqno#9,_hoodie_record_key#10,_hoodie_partition_path#11,_hoodie_file_name#12,id#13,age#14,name#15,dt#16] parquet
>       at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
>       at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$$nestedInanonfun$checkAnalysis$1$2.applyOrElse(CheckAnalysis.scala:190)
> {noformat}
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
