karuppayya commented on PR #52399: URL: https://github.com/apache/spark/pull/52399#issuecomment-3837531978
@juliuszsompolski @viirya Thanks for your reviews, I will have them handled. I agree that we need to validate that the ionterpreted path and codegen path generate the same results. Like mentioned, I didnt see that a general pattern and didnot add it initially. I see this `org.apache.spark.sql.catalyst.expressions.aggregate.TestWithAndWithoutCodegen`, under `expressions.aggregaate`. Should we move this at a top level in `sql/catalyst` or create a new one for use in `sql/core`? (I think we could reuse sql/catalyst by moving to top level) > Is there benchmark numbers showing actual performance improvements? @viirya This implementation follows the standard pattern of codegen-enabling operators to allow for **operator fusing** and to eliminate the **overhead of virtual function calls**. Since most Spark operators are codegen-enabled, it makes sense to have MergeRows follow suit. But if you feel strongly that we need empirical data to justify the implementation, I’m happy to add a benchmark file to verify the gains -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
