[GitHub] spark pull request #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId i...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/20419 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId i...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/20419#discussion_r167958486 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -1226,14 +1226,21 @@ class CodegenContext { /** * Register a comment and return the corresponding place holder + * + * @param placeholderId a string for a place holder + * @param force whether to force registering the comments */ - def registerComment(text: => String): String = { + def registerComment( + text: => String, + placeholderId: String = "", + force: Boolean = false): String = { // By default, disable comments in generated code because computing the comments themselves can // be extremely expensive in certain cases, such as deeply-nested expressions which operate over // inputs with wide schemas. For more details on the performance issues that motivated this // flat, see SPARK-15680. -if (SparkEnv.get != null && SparkEnv.get.conf.getBoolean("spark.sql.codegen.comments", false)) { - val name = freshName("c") +if (force || + SparkEnv.get != null && SparkEnv.get.conf.getBoolean("spark.sql.codegen.comments", false)) { --- End diff -- IIUC, the answer seems to be No. See [this discussion](https://github.com/apache/spark/pull/19449#discussion_r143291724). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId i...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/20419#discussion_r167947791 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -1226,14 +1226,21 @@ class CodegenContext { /** * Register a comment and return the corresponding place holder + * + * @param placeholderId a string for a place holder + * @param force whether to force registering the comments */ - def registerComment(text: => String): String = { + def registerComment( + text: => String, + placeholderId: String = "", + force: Boolean = false): String = { // By default, disable comments in generated code because computing the comments themselves can // be extremely expensive in certain cases, such as deeply-nested expressions which operate over // inputs with wide schemas. For more details on the performance issues that motivated this // flat, see SPARK-15680. -if (SparkEnv.get != null && SparkEnv.get.conf.getBoolean("spark.sql.codegen.comments", false)) { - val name = freshName("c") +if (force || + SparkEnv.get != null && SparkEnv.get.conf.getBoolean("spark.sql.codegen.comments", false)) { --- End diff -- Now, the question is whether we can use `SQLConf.get` in this case. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId i...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20419#discussion_r167853559 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -1226,14 +1226,21 @@ class CodegenContext { /** * Register a comment and return the corresponding place holder + * + * @param placeholderId a string for a place holder + * @param force whether to force registering the comments */ - def registerComment(text: => String): String = { + def registerComment( + text: => String, + placeholderId: String = "", + force: Boolean = false): String = { // By default, disable comments in generated code because computing the comments themselves can // be extremely expensive in certain cases, such as deeply-nested expressions which operate over // inputs with wide schemas. For more details on the performance issues that motivated this // flat, see SPARK-15680. -if (SparkEnv.get != null && SparkEnv.get.conf.getBoolean("spark.sql.codegen.comments", false)) { - val name = freshName("c") +if (force || + SparkEnv.get != null && SparkEnv.get.conf.getBoolean("spark.sql.codegen.comments", false)) { --- End diff -- unrelated, but should this be a SQL conf? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId i...
Github user rednaxelafx commented on a diff in the pull request: https://github.com/apache/spark/pull/20419#discussion_r167775177 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -1226,14 +1226,21 @@ class CodegenContext { /** * Register a comment and return the corresponding place holder + * + * @param placeholderId a string for a place holder --- End diff -- Nit: can we rephrase this ScalaDoc a bit, maybe like: ```scala /** * ... * @param placeholderId an optionally specified identifier for the comment's placeholder. The caller should make sure this identifier is unique within the compilation unit. If this argument is not specified, a fresh identifier will be automatically created and used as the placeholder. * ... */ ``` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId i...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/20419#discussion_r164951129 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -1227,12 +1227,13 @@ class CodegenContext { /** * Register a comment and return the corresponding place holder */ - def registerComment(text: => String): String = { + def registerComment(text: => String, forceComment: Boolean = false): String = { --- End diff -- Nit: rename it to `force`? Also adding the text? ``` @param force whether to force registering the comments ``` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId i...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/20419#discussion_r164509380 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala --- @@ -542,6 +542,7 @@ case class WholeStageCodegenExec(child: SparkPlan)(val codegenStageId: Int) s"""Codegend pipeline for stage (id=$codegenStageId) |${this.treeString.trim}""".stripMargin)} final class $className extends ${classOf[BufferedRowIterator].getName} { +${ctx.registerComment(s"codegenStageId=$codegenStageId", true)} --- End diff -- sure, done --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId i...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/20419#discussion_r164321458 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala --- @@ -542,6 +542,7 @@ case class WholeStageCodegenExec(child: SparkPlan)(val codegenStageId: Int) s"""Codegend pipeline for stage (id=$codegenStageId) |${this.treeString.trim}""".stripMargin)} final class $className extends ${classOf[BufferedRowIterator].getName} { +${ctx.registerComment(s"codegenStageId=$codegenStageId", true)} --- End diff -- Instead of putting this comment inside the class, how about let it on top of it? Like: ```/* 001 */ public Object generate(Object[] references) { /* 002 */ return new GeneratedIteratorForCodegenStage1(references); /* 003 */ } /* 004 */ /* 005 */ // codegenStageId=1 /* 006 */ final class GeneratedIteratorForCodegenStage1 extends org.apache.spark.sql.execution.BufferedRowIterator { /* 007 */ private Object[] references; ... ``` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId i...
GitHub user kiszk opened a pull request: https://github.com/apache/spark/pull/20419 [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comment ## What changes were proposed in this pull request? This PR always adds `codegenStageId` in comment of the generated class. ``` /* 001 */ public Object generate(Object[] references) { /* 002 */ return new GeneratedIteratorForCodegenStage1(references); /* 003 */ } /* 004 */ /* 005 */ final class GeneratedIteratorForCodegenStage1 extends org.apache.spark.sql.execution.BufferedRowIterator { /* 006 */ // id=1 /* 007 */ private Object[] references; ... ``` ## How was this patch tested? Existing tests You can merge this pull request into a Git repository by running: $ git pull https://github.com/kiszk/spark SPARK-23032-follow Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/20419.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #20419 commit 6ee910611e7b6311f4ec1c5afb0ce5274fcf6805 Author: Kazuaki Ishizaki Date: 2018-01-29T00:35:06Z initial commit --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org