cloud-fan commented on a change in pull request #28105: [SPARK-31316][SQL]
SQLQueryTestSuite: Display the total generate time for generated java code.
URL: https://github.com/apache/spark/pull/28105#discussion_r402820156
##########
File path:
sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala
##########
@@ -567,6 +567,26 @@ object WholeStageCodegenExec {
def isTooManyFields(conf: SQLConf, dataType: DataType): Boolean = {
numOfNestedFields(dataType) > conf.wholeStageMaxNumFields
}
+
+ // The whole codegen generates java code on the driver side and sends it to the Executor side
+ // for execution after compilation. The whole codegen can bring significant performance
+ // improvements in large data and distributed environments. However, in the test environment,
+ // due to the small amount of data, the time to generate Java code takes up a major part of the
+ // entire runtime. So we summarize the total generation time and output it to the execution log
+ // for easy analysis and view.
+ private val _generateJavaTime = new LongAccumulator
Review comment:
#28081 is different as we compile the code at the executor side. Yea let's
use `AtomicLong`
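
A minimal sketch of what the `AtomicLong` suggestion could look like, assuming the counter is only ever updated on the driver, where the Java source is generated. `CodegenTimeTracker`, `timed`, and `resetGenerateJavaTime` are hypothetical names for illustration and are not taken from the PR:

```scala
import java.util.concurrent.atomic.AtomicLong

// Hypothetical stand-in for the companion object in the diff; only the
// counter handling is shown.
object CodegenTimeTracker {
  // Total time spent generating Java source, in nanoseconds. An AtomicLong is
  // enough because every update happens in the driver JVM, so there is nothing
  // to merge back from executors.
  private val _generateJavaTime = new AtomicLong(0L)

  def generateJavaTime: Long = _generateJavaTime.get()

  def resetGenerateJavaTime(): Unit = _generateJavaTime.set(0L)

  // Runs `body` and adds its elapsed time to the total, even if it throws.
  def timed[T](body: => T): T = {
    val start = System.nanoTime()
    try body finally _generateJavaTime.addAndGet(System.nanoTime() - start)
  }
}
```

A test suite could call `resetGenerateJavaTime()` before a run, wrap each code generation call in `timed { ... }`, and log `generateJavaTime` at the end.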