[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20419
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20419
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87405/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-02-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20419
  
**[Test build #87405 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87405/testReport)**
 for PR 20419 at commit 
[`0ad1e72`](https://github.com/apache/spark/commit/0ad1e72c5ec874d688e08b0cfa0a297867be5275).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-02-13 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/20419
  
I see. I will open a new JIRA tomorrow.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-02-13 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/20419
  
@kiszk This is unable to merge to 2.3. Could you open a new JIRA for this?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20419
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20419
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/863/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-02-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20419
  
**[Test build #87405 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87405/testReport)**
 for PR 20419 at commit 
[`0ad1e72`](https://github.com/apache/spark/commit/0ad1e72c5ec874d688e08b0cfa0a297867be5275).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-02-12 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/20419
  
ping @rednaxelafx


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-02-06 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/20419
  
ping @rednaxelafx


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-02-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20419
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87084/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-02-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20419
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-02-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20419
  
**[Test build #87084 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87084/testReport)**
 for PR 20419 at commit 
[`cb7a16b`](https://github.com/apache/spark/commit/cb7a16b1e1abdb7dcb45f2a18085dda0cae8c12f).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-02-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20419
  
**[Test build #87084 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87084/testReport)**
 for PR 20419 at commit 
[`cb7a16b`](https://github.com/apache/spark/commit/cb7a16b1e1abdb7dcb45f2a18085dda0cae8c12f).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-02-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20419
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/605/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-02-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20419
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-02-05 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/20419
  
Retest this please


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-02-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20419
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-02-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20419
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87076/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-02-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20419
  
**[Test build #87076 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87076/testReport)**
 for PR 20419 at commit 
[`cb7a16b`](https://github.com/apache/spark/commit/cb7a16b1e1abdb7dcb45f2a18085dda0cae8c12f).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-02-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20419
  
**[Test build #87076 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87076/testReport)**
 for PR 20419 at commit 
[`cb7a16b`](https://github.com/apache/spark/commit/cb7a16b1e1abdb7dcb45f2a18085dda0cae8c12f).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-02-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20419
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/598/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-02-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20419
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-02-01 Thread rednaxelafx
Github user rednaxelafx commented on the issue:

https://github.com/apache/spark/pull/20419
  
@kiszk For this specific kind of usage, I don't think using a hardcoded 
stable ID will be a problem.
The comment we're talking about is the kind the can only appear once in a 
single whole-stage codegen unit -- there's only one `GeneratedIterator` class 
with the `processNext()` method as the main entry point. Providing such a 
mechanism allows codegen developers to mark important information without 
risking to affect codegen cache behavior.

For safety, we can add a runtime check. Let's assume this new method is 
called `ctx.registerCommentWithId()`, then inside this method we can implement 
something very similar to `ctx.freshName()`, but instead of producing a new 
name with a potential integer suffix appended, we can either log a warning 
message or throw an exception. That way the developer would have an easy way to 
catch this.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-02-01 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/20419
  
@rednaxelafx I understand your concern when `ctx.registerComment()` is 
conditionally called.

If we call `ctx.registerComment()` with the specific identified multiple 
times, how do we handle? (e.g. `Expression.genCode()` may be called multiple 
times from different contexts.

I am sorry that I will slowly response next few days due to a short 
vacation.



---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-01-31 Thread rednaxelafx
Github user rednaxelafx commented on the issue:

https://github.com/apache/spark/pull/20419
  
I gave it a bit more thought. Here's an alternative proposal: instead of 
using a "force comment" mechanism in the current form (which still gets a 
`ctx.freshName("c")` that the caller has no control over), why don't we provide 
an alternative API that also forces comment, but allows the caller to specify a 
(within the `CodegenContext`) unique ID for the placeholder comment?

In this case, instead of using `/*wholestagecodegen_c*/` as the 
placeholder, which can be brittle if someone adds new calls to 
`ctx.registerComment()` in `WholeStageCodegenExec`, we can call 
`ctx.registerComment(comment, placeholderId = "wsc_codegenStageId")`, which we 
can be sure that it'll end up creating a placeholder comment of 
`/*wsc_codegenStageId*/`, stable across all whole-stage generated main classes, 
that would never affect the codegen cache behavior.

@kiszk WDYT?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-01-31 Thread rednaxelafx
Github user rednaxelafx commented on the issue:

https://github.com/apache/spark/pull/20419
  
I did the experiment that I asked about at 
https://github.com/apache/spark/pull/20419#issuecomment-361359825 , and 
verified that under the current implementation, this PR will not affect the 
codegen cache hit behavior.

But the way it's working is a bit brittle. The `body` field of the 
`CodeAndComment` returned from `WholeStageCodegenExec.doCodeGen()` looks like 
the following, when `spark.sql.codegen.comments=false && 
spark.sql.codegen.useIdInClassName=false`:
```java
/*wholestagecodegen_c*/
final class GeneratedIterator extends 
org.apache.spark.sql.execution.BufferedRowIterator {
```
Notice the `/*wholestagecodegen_c*/` placeholder comment. It's the artifact 
produced by the `ctx.registerComment()` call, where `wholestagecodegen_` is a 
prefix automatically derived from the calling class, and `c` is from a 
`ctx.freshName("c")`, subject to an integer suffix when names collide.

The way it's used right now, because from `WholeStageCodegenExec` we're 
only making very few calls to `ctx.registerComment()` and all of them happen in 
a deterministic order, that's how this identifier embedded in the placeholder 
comment would be the same no matter what codegen stage it is.

But let's assume if someone doesn't know about this, and then accidentally 
added code in `WholeStageCodegenExec` that conditionally calls 
`ctx.registerComment()`, then this placeholder comment is NOT guaranteed to be 
the same, and that will affect the codegen cache hit behavior.

I'd say functional wise this PR is good to go; maybe we want to add a 
comment somewhere in `WholeStageCodegenExec` to note the subtlety, or maybe 
think about an alternative sometime in the future.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-01-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20419
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86882/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-01-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20419
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-01-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20419
  
**[Test build #86882 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86882/testReport)**
 for PR 20419 at commit 
[`f07dc2d`](https://github.com/apache/spark/commit/f07dc2dffb625c44b5c33d97b56b719564c61d51).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-01-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20419
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/436/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-01-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20419
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-01-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20419
  
**[Test build #86882 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86882/testReport)**
 for PR 20419 at commit 
[`f07dc2d`](https://github.com/apache/spark/commit/f07dc2dffb625c44b5c33d97b56b719564c61d51).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-01-30 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/20419
  
Then, we should add a test case to ensure it will not be broken. 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-01-30 Thread rednaxelafx
Github user rednaxelafx commented on the issue:

https://github.com/apache/spark/pull/20419
  
@gatorsmile Not directly. The `CodeAndComment` case class is just a 
"container", it doesn't handle what gets into the `body` field. When we force 
embed a comment, it'll leave a comment as a placeholder in the generated code, 
which is in the `body` (the actual comment contents are in the `comment` map). 
I was just curious if any reviews knows off the top of their heads whether or 
not this placeholder may affect equality.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-01-30 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/20419
  
@rednaxelafx Does the following 
[codes](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala#L1286-L1294)
 address your concern?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20419
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20419
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86774/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20419
  
**[Test build #86774 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86774/testReport)**
 for PR 20419 at commit 
[`4a09eac`](https://github.com/apache/spark/commit/4a09eac999ba223d379b96ac8335a2fc50ff52c9).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-01-29 Thread rednaxelafx
Github user rednaxelafx commented on the issue:

https://github.com/apache/spark/pull/20419
  
@kiszk SGTM and LGTM. Let's ship it!

One more question on the side: with the `forceComment = true`, are we fully 
sure that won't affect the equality of `CodeAndComment`?
The whole point of this PR is to have a way to embed the ID into the 
non-executable code of the generated code so that it won't affect the codegen 
cache hit, right?

Could you please post an example of either the generated code or the 
metrics, e.g. a `SortMergeJoin` on two identical `spark.range(3)`s, and confirm 
that even when the two `range()`s are codegen's into different 
`codegenStageId`s, with the `spark.sql.codegen.useIdInClassName` turned off, 
the two stages would still hit the codegen cache? Basically to verify the 
example I gave in 
https://github.com/apache/spark/pull/20224#issuecomment-357091842 still hits 
the codegen cache when `spark.sql.codegen.useIdInClassName=false`.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20419
  
**[Test build #86774 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86774/testReport)**
 for PR 20419 at commit 
[`4a09eac`](https://github.com/apache/spark/commit/4a09eac999ba223d379b96ac8335a2fc50ff52c9).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-01-29 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/20419
  
I always leave the comment regardless of the 
`spark.sql.codegen.useIdInClassName` for a unified way to access the ID from 
the comment.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20419
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/341/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20419
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-01-28 Thread rednaxelafx
Github user rednaxelafx commented on the issue:

https://github.com/apache/spark/pull/20419
  
LGTM, and +1 on @viirya 's idea. I like it better for the comment to be on 
top of the class declaration instead of inside it; but I'm okay either way if 
others have strong opinion otherwise. As long as the comment line is right next 
to the class declaration line, that's good enough for easy pattern matching.

BTW, to save a few characters from the generated code, maybe we don't need 
to generate this comment when the `spark.sql.codegen.useIdInClassName` is 
turned on, since the `codegenStageId` is in the class name already.
But for the ease of having a unified way to access the ID from the comment, 
I like having this comment always there. I guess @kiszk and @viirya are on the 
same side on this one for always having the comment, right? In that case I'd 
agree and let's get this in :-)


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-01-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20419
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86745/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-01-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20419
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-01-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20419
  
**[Test build #86745 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86745/testReport)**
 for PR 20419 at commit 
[`6ee9106`](https://github.com/apache/spark/commit/6ee910611e7b6311f4ec1c5afb0ce5274fcf6805).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-01-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20419
  
**[Test build #86745 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86745/testReport)**
 for PR 20419 at commit 
[`6ee9106`](https://github.com/apache/spark/commit/6ee910611e7b6311f4ec1c5afb0ce5274fcf6805).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-01-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20419
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/316/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-01-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20419
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-01-28 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/20419
  
ping @rednaxelafx


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org