cloud-fan commented on issue #24637: [SPARK-27707][SQL] Prune unnecessary 
nested fields from Generate
URL: https://github.com/apache/spark/pull/24637#issuecomment-571943552
 
 
   We hit an exception caused by this rule: the plan becomes invalid after 
optimization.
   ```
   +- *(2) !Project [_gen_alias_68718#68718L AS cardinality#68575L, 
_gen_alias_68719#68719 AS durationSec#68576, _gen_alias_68720#68720 AS 
group#68578, _gen_alias_68721#68721 AS jobUuid#68579, _gen_alias_68722#68722 AS 
suite#68584, _gen_alias_68723#68723 AS testcase#68585, 
sha1(cast(_gen_alias_68721#68721 as binary)) AS jobSha#68600, 
sha1(cast(concat(_gen_alias_68722#68722, -, _gen_alias_68720#68720, -, 
cast(_gen_alias_68718#68718L as string), -, _gen_alias_68723#68723) as binary)) 
AS caseSha#68615]
         +- *(2) Generate explode(results#64594), false, [flattenRuns#68572]
            +- *(2) Project [results#64594]
               +- *(2) Sort [startTime#68717 DESC NULLS LAST], true, 0
   ```
   
   We generate `_gen_alias` attributes in the parent `Project`, but the child 
`Generate` does not output them: it only produces `flattenRuns#68572`.
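
To make the mismatch concrete, here is a minimal sketch in plain Python, not Spark code. It models each operator only by the attributes it references or outputs and computes the parent's missing input as a set difference. The function name `missing_input` is hypothetical, the attribute IDs are copied from the plan above without their type suffix, and the `Generate` output is assumed to be just the list shown in the excerpt.

```python
# Toy model, NOT Spark's API: an operator is valid only if every attribute
# it references is produced by its child.

def missing_input(references, child_output):
    """Return the referenced attributes that the child does not output."""
    return set(references) - set(child_output)

# Attributes referenced by the parent Project in the plan above.
project_refs = {
    "_gen_alias_68718#68718",
    "_gen_alias_68719#68719",
    "_gen_alias_68720#68720",
    "_gen_alias_68721#68721",
    "_gen_alias_68722#68722",
    "_gen_alias_68723#68723",
}

# Output of the child Generate, as far as the excerpt shows (assumption).
generate_output = {"flattenRuns#68572"}

missing = missing_input(project_refs, generate_output)
for attr in sorted(missing):
    print("not produced by child:", attr)
```

In this model every `_gen_alias` reference is reported as missing, which matches the invalid `Project` in the plan.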
   
   @viirya Could you take a look? Thanks!

