peter-toth commented on issue #28041: [SPARK-30564][SQL] Improved extra new 
line and comment remove
URL: https://github.com/apache/spark/pull/28041#issuecomment-606644695
 
 
   I opened https://github.com/apache/spark/pull/28083 with your suggested 
changes, which basically revert `Block.length` and change `HashAggregateExec` 
to place comments properly. It's much smaller and fixes the performance issue 
observed in `WideSchemaBenchmark`.
   
   But I'm not sure that the changes
   - to `Block.length`, to strip comments before returning the length, and
   - to `CodeFormatter.stripExtraNewLinesAndComments` in this PR, to strip 
comments much faster
   
   are useless, as these two functions are used in other places too: 
   - https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala#L1053
   - https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala#L160
   - https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala#L926
   
   These usages look more general and, as far as I can see, we have many 
comments scattered around in the generated code.
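
To make the idea of a comment-stripped length concrete, here is a minimal sketch. It is not the code from this PR or from Spark: the object name, the method names, and the regular expressions are all hypothetical, and it deliberately ignores comment-like sequences inside string literals.

```scala
// Illustrative sketch only; names and patterns are assumptions,
// not Spark's actual CodeFormatter or Block implementation.
object CommentStrippingSketch {
  // (?s) lets '.' match newlines so multi-line /* ... */ comments are covered.
  private val blockComment = """(?s)/\*.*?\*/""".r
  // A // comment runs to the end of its line; the newline itself is kept.
  private val lineComment = """//[^\n]*""".r
  // Runs of blank (or whitespace-only) lines collapse to a single newline.
  private val blankLines = """\n\s*\n""".r

  /** Removes comments, then collapses the blank lines they leave behind. */
  def stripComments(code: String): String = {
    val noBlock = blockComment.replaceAllIn(code, "")
    val noLine = lineComment.replaceAllIn(noBlock, "")
    blankLines.replaceAllIn(noLine, "\n")
  }

  /** Length of the code once comments no longer count toward it. */
  def lengthWithoutComments(code: String): Int = stripComments(code).length
}
```

With something along these lines, any caller that compares code length against a threshold would not be affected by how many comments happen to be embedded in the generated code.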
   

