peter-toth commented on issue #28041: [SPARK-30564][SQL] Improved extra new line and comment remove
URL: https://github.com/apache/spark/pull/28041#issuecomment-606644695

I opened https://github.com/apache/spark/pull/28083 with your suggested changes, which basically revert `Block.length` and change `HashAggregateExec` to place comments properly. It's much smaller and fixes the performance issue observed in `WideSchemaBenchmark`.

But I'm not sure that the changes in this PR:
- at `Block.length`, to strip comments before returning the length, and
- at `CodeFormatter.stripExtraNewLinesAndComments`, to strip comments much faster

are useless, as these 2 functions are used at other places too:
- https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala#L1053
- https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala#L160
- https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala#L926

These usages look more general and, as far as I can see, we have many comments scattered around in the generated code.
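To make the idea concrete, here is a rough sketch of what "strip comments before returning the length" could look like. It is not the code from this PR or from Spark: the object name, the regex patterns and `effectiveLength` are all hypothetical, and the patterns are deliberately naive (for example, they would also strip `//` inside a string literal).

```scala
object CommentStrippingSketch {
  // Hypothetical patterns for illustration only, not copied from Spark.
  // (?s) lets '.' match newlines so block comments may span several lines.
  private val blockComment = """(?s)/\*.*?\*/""".r
  private val lineComment = """//[^\n]*""".r
  private val blankLines = """\n\s*\n""".r

  /** Removes block comments, then line comments, then collapses blank lines. */
  def stripCommentsAndExtraNewLines(code: String): String = {
    val withoutBlock = blockComment.replaceAllIn(code, "")
    val withoutLine = lineComment.replaceAllIn(withoutBlock, "")
    blankLines.replaceAllIn(withoutLine, "\n")
  }

  /** Length of the code once comments and extra new lines are gone. */
  def effectiveLength(code: String): Int =
    stripCommentsAndExtraNewLines(code).length
}
```

With something like this, a length check on generated code would not count the comments, which is why where the comments are placed and how fast they can be stripped both matter for the call sites listed above.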
