[GitHub] spark pull request #20612: [SPARK-23424][SQL]Add codegenStageId in comment

2018-02-20 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/20612


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20612: [SPARK-23424][SQL]Add codegenStageId in comment

2018-02-20 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/20612#discussion_r169313393
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala
 ---
@@ -1226,14 +1226,24 @@ class CodegenContext {
 
   /**
* Register a comment and return the corresponding place holder
+   *
+   * @param placeholderId an optionally specified identifier for the 
comment's placeholder.
+   *  The caller should make sure this identifier is 
unique within the
+   *  compilation unit. If this argument is not 
specified, a fresh identifier
+   *  will be automatically created and used as the 
placeholder.
+   * @param force whether to force registering the comments
*/
-  def registerComment(text: => String): String = {
+   def registerComment(
+   text: => String,
+   placeholderId: String = "",
+   force: Boolean = false): String = {
 // By default, disable comments in generated code because computing 
the comments themselves can
 // be extremely expensive in certain cases, such as deeply-nested 
expressions which operate over
 // inputs with wide schemas. For more details on the performance 
issues that motivated this
 // flat, see SPARK-15680.
-if (SparkEnv.get != null && 
SparkEnv.get.conf.getBoolean("spark.sql.codegen.comments", false)) {
-  val name = freshName("c")
+if (force ||
+  SparkEnv.get != null && 
SparkEnv.get.conf.getBoolean("spark.sql.codegen.comments", false)) {
+  val name = if (placeholderId != "") placeholderId else freshName("c")
--- End diff --

+1


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20612: [SPARK-23424][SQL]Add codegenStageId in comment

2018-02-19 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/20612#discussion_r169224985
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala
 ---
@@ -1226,14 +1226,24 @@ class CodegenContext {
 
   /**
* Register a comment and return the corresponding place holder
+   *
+   * @param placeholderId an optionally specified identifier for the 
comment's placeholder.
+   *  The caller should make sure this identifier is 
unique within the
+   *  compilation unit. If this argument is not 
specified, a fresh identifier
+   *  will be automatically created and used as the 
placeholder.
+   * @param force whether to force registering the comments
*/
-  def registerComment(text: => String): String = {
+   def registerComment(
+   text: => String,
+   placeholderId: String = "",
+   force: Boolean = false): String = {
 // By default, disable comments in generated code because computing 
the comments themselves can
 // be extremely expensive in certain cases, such as deeply-nested 
expressions which operate over
 // inputs with wide schemas. For more details on the performance 
issues that motivated this
 // flat, see SPARK-15680.
-if (SparkEnv.get != null && 
SparkEnv.get.conf.getBoolean("spark.sql.codegen.comments", false)) {
-  val name = freshName("c")
+if (force ||
+  SparkEnv.get != null && 
SparkEnv.get.conf.getBoolean("spark.sql.codegen.comments", false)) {
+  val name = if (placeholderId != "") placeholderId else freshName("c")
--- End diff --

although the caller should guarantee `placeholderId` is unique, shall we 
add a check here for safe? e.g. 
`assert(!placeHolderToComments.containsKey("placeholderId"))`


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20612: [SPARK-23424][SQL]Add codegenStageId in comment

2018-02-14 Thread kiszk
GitHub user kiszk opened a pull request:

https://github.com/apache/spark/pull/20612

[SPARK-23424][SQL]Add codegenStageId in comment

## What changes were proposed in this pull request?

This PR always adds `codegenStageId` in comment of the generated class. 
This is a replication of #20419  for post-Spark 2.3.

```
/* 001 */ public Object generate(Object[] references) {
/* 002 */   return new GeneratedIteratorForCodegenStage1(references);
/* 003 */ }
/* 004 */
/* 005 */ // codegenStageId=1
/* 006 */ final class GeneratedIteratorForCodegenStage1 extends 
org.apache.spark.sql.execution.BufferedRowIterator {
/* 007 */   private Object[] references;
...
```

## How was this patch tested?

Existing tests

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kiszk/spark SPARK-23424

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20612.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20612


commit 1bf0640264533e0830cc339c37931328e2a622eb
Author: Kazuaki Ishizaki 
Date:   2018-02-14T19:22:20Z

initial commit




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org