peter-toth commented on issue #28041: [SPARK-30564][SQL] Improved extra new 
line and comment remove
URL: https://github.com/apache/spark/pull/28041#issuecomment-612643258
 
 
   I looked into this issue a bit deeper. Let me share my thoughts:
   - I still don't think reverting the comment removal chage that @maropu added 
to `Block.length` would be a good idea for 2 reasons:
     - Pretty big amount of the source code we currently generate is comment or 
extra whitespace/newline. I admit that if we slowly update all comments to use 
`CodegenContext.registerComment` can help to reduce code size, but why don't we 
remove the comment removal logic only when we reach that desired state?
     - `CodegenContext.registerComment` has some issue here as it also adds 
some comment placeholders to the source code. If we don't remove those comment 
placeholders in `Block.length` then method splitting can be different when 
`spark.sql.codegen.comments` is enabled or disabled, which can be very 
confusing.
   - Since `Block.length` is still a costly operation in this PR, this commit:
   
https://github.com/apache/spark/pull/28041/commits/f7fbe1777651cd93c1f8a33547854c5ba61963ee
 helped to further reduce run time:
   ```
   deeply nested struct field r/w:           Best Time(ms)   Avg Time(ms)   
Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   
------------------------------------------------------------------------------------------------------------------------
   250 deep x 400 rows (read in-mem)                   137            164       
   27          0.7        1367.6       0.2X
   ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to