Re: [PR] [SPARK-53652][SQL]Codegen for MergeRowExec [spark]

via GitHub Mon, 02 Feb 2026 13:43:53 -0800


karuppayya commented on PR #52399:
URL: https://github.com/apache/spark/pull/52399#issuecomment-3837531978


   @juliuszsompolski @viirya Thanks for your reviews, I will have them handled.
   I agree that we need to validate that the ionterpreted path and codegen path 
generate the same results.
   Like mentioned, I didnt see that a general pattern and didnot add it 
initially.
   
   I see this 
`org.apache.spark.sql.catalyst.expressions.aggregate.TestWithAndWithoutCodegen`,
  under `expressions.aggregaate`.
   Should we move this at a top level in `sql/catalyst` or create a new one for 
use in `sql/core`?
   (I think we could reuse sql/catalyst by moving to top level)
   
   > Is there benchmark numbers showing actual performance improvements?
   
   @viirya This implementation follows the standard pattern of codegen-enabling 
operators to allow for **operator fusing** and to eliminate the **overhead of 
virtual function calls**. Since most Spark operators are codegen-enabled, it 
makes sense to have MergeRows follow suit.
   But if you feel strongly that we need empirical data to justify the 
implementation, I’m happy to add a benchmark file to verify the gains
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] [SPARK-53652][SQL]Codegen for MergeRowExec [spark]

Reply via email to