yihua commented on PR #10826:
URL: https://github.com/apache/hudi/pull/10826#issuecomment-1982004268

   > @yihua @danny0405 @beyond1920:
   > 
   > Without adding the optimizer, we get the exception: `java.lang.RuntimeException: After applying rule org.apache.spark.sql.catalyst.optimizer.FoldablePropagation in batch Operator Optimization before Inferring Filters, the structural integrity of the plan is broken.`
   > 
   > Here is the plan before and after the FoldablePropagation optimizer step when running the test "Test ignoring case for MOR table":
   > 
   > ```
   > Before:
   > Project [ID#10, NAME#11, price#12, TS#13, dt#14]
   > +- Join LeftOuter, (id#20 = id#10)
   >    :- Project [1 AS id#10, a1 AS NAME#11, 111 AS price#12, 1111 AS ts#13, 2021-05-05 AS DT#14]
   >    :  +- OneRowRelation
   >    +- Project [ID#20]
   >       +- Relation default.h0[_hoodie_commit_time#15,_hoodie_commit_seqno#16,_hoodie_record_key#17,_hoodie_partition_path#18,_hoodie_file_name#19,ID#20,NAME#21,price#22,TS#23L,dt#24] org.apache.hudi.EmptyRelation@67cd84f9
   > 
   > After:
   > Project [1 AS id#10, a1 AS NAME#11, 111 AS price#12, 1111 AS ts#13, 2021-05-05 AS DT#14]
   > +- Join LeftOuter, (id#20 = 1)
   >    :- Project [1 AS id#10, a1 AS NAME#11, 111 AS price#12, 1111 AS ts#13, 2021-05-05 AS DT#14]
   >    :  +- OneRowRelation
   >    +- Project [ID#20]
   >       +- Relation default.h0[_hoodie_commit_time#15,_hoodie_commit_seqno#16,_hoodie_record_key#17,_hoodie_partition_path#18,_hoodie_file_name#19,ID#20,NAME#21,price#22,TS#23L,dt#24] org.apache.hudi.EmptyRelation@67cd84f9
   > ```
   > 
   > @KnightChess can we avoid running the optimizer here if we find a way to rename all id#10 to ID#10 and not just the outermost projection?
   
   If the optimizer overhead is not low and it makes the plan harder to read, do we really need this fix?  We could instead state explicitly in the docs that field names in the SQL query are case-sensitive.  Or is this a really critical issue?
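
To make the documentation option concrete, here is a minimal sketch of the kind of check a user (or a test) could run to spot field names that differ from the table schema only by case. It is plain Java with no Spark or Hudi dependency; `FieldCaseCheck` and `caseMismatches` are hypothetical names for illustration, not existing APIs. The field lists are taken from the plan quoted above (table `h0` versus the source projection).

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Hudi or Spark.
final class FieldCaseCheck {

    // Returns query field -> table field for every query field that matches
    // a table field case-insensitively but not exactly.
    static Map<String, String> caseMismatches(List<String> tableFields,
                                              List<String> queryFields) {
        // Index the table fields by their lower-cased name.
        Map<String, String> canonical = new LinkedHashMap<>();
        for (String field : tableFields) {
            canonical.put(field.toLowerCase(Locale.ROOT), field);
        }

        Map<String, String> mismatches = new LinkedHashMap<>();
        for (String field : queryFields) {
            String expected = canonical.get(field.toLowerCase(Locale.ROOT));
            // Unknown fields are skipped; only case differences are reported.
            if (expected != null && !expected.equals(field)) {
                mismatches.put(field, expected);
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        // Field names as they appear in the quoted plan.
        List<String> tableFields = Arrays.asList("ID", "NAME", "price", "TS", "dt");
        List<String> queryFields = Arrays.asList("id", "NAME", "price", "ts", "DT");

        // Prints {id=ID, ts=TS, DT=dt}
        System.out.println(caseMismatches(tableFields, queryFields));
    }
}
```

This only reports the mismatches; whether to reject such a query or just warn about it would be a separate decision.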


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

