[GitHub] spark pull request #20036: [SPARK-18016][SQL][FOLLOW-UP] Code Generation: Co...

2017-12-27 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/20036


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20036: [SPARK-18016][SQL][FOLLOW-UP] Code Generation: Co...

2017-12-27 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/20036#discussion_r158776742
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala
 ---
@@ -440,8 +440,9 @@ case class SortMergeJoinExec(
 val spillThreshold = getSpillThreshold
 val inMemoryThreshold = getInMemoryThreshold
 
+// inline mutable state since not many join operations in a task
--- End diff --

nit: `Inline`


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20036: [SPARK-18016][SQL][FOLLOW-UP] Code Generation: Co...

2017-12-27 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/20036#discussion_r158776770
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala
 ---
@@ -587,20 +587,24 @@ case class HashAggregateExec(
   fastHashMapClassName, groupingKeySchema, bufferSchema).generate()
 ctx.addInnerClass(generatedMap)
 
+// inline mutable state since not many aggregation operations in a 
task
--- End diff --

nit: `Inline`


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20036: [SPARK-18016][SQL][FOLLOW-UP] Code Generation: Co...

2017-12-27 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/20036#discussion_r158776762
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala
 ---
@@ -587,20 +587,24 @@ case class HashAggregateExec(
   fastHashMapClassName, groupingKeySchema, bufferSchema).generate()
 ctx.addInnerClass(generatedMap)
 
+// inline mutable state since not many aggregation operations in a 
task
 fastHashMapTerm = ctx.addMutableState(fastHashMapClassName, 
"vectorizedHastHashMap",
-  v => s"$v = new $fastHashMapClassName();")
-ctx.addMutableState(s"java.util.Iterator", 
"vectorizedFastHashMapIter")
+  v => s"$v = new $fastHashMapClassName();", forceInline = true)
+ctx.addMutableState(s"java.util.Iterator", 
"vectorizedFastHashMapIter",
+  forceInline = true)
   } else {
 val generatedMap = new RowBasedHashMapGenerator(ctx, 
aggregateExpressions,
   fastHashMapClassName, groupingKeySchema, bufferSchema).generate()
 ctx.addInnerClass(generatedMap)
 
+// inline mutable state since not many aggregation operations in a 
task
--- End diff --

nit: `Inline`


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20036: [SPARK-18016][SQL][FOLLOW-UP] Code Generation: Co...

2017-12-27 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/20036#discussion_r158776541
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala
 ---
@@ -283,7 +283,7 @@ case class InputAdapter(child: SparkPlan) extends 
UnaryExecNode with CodegenSupp
 
   override def doProduce(ctx: CodegenContext): String = {
 // Right now, InputAdapter is only used when there is one input RDD.
-// inline mutable state since an inputAdaptor in a task
+// inline mutable state since an InputAdapter is used once in a task 
for WholeStageCodegen
--- End diff --

nit: `inline` -> `Inline`


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20036: [SPARK-18016][SQL][FOLLOW-UP] Code Generation: Co...

2017-12-26 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/20036#discussion_r158761989
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala
 ---
@@ -283,7 +283,7 @@ case class InputAdapter(child: SparkPlan) extends 
UnaryExecNode with CodegenSupp
 
   override def doProduce(ctx: CodegenContext): String = {
 // Right now, InputAdapter is only used when there is one input RDD.
-// inline mutable state since an inputAdaptor in a task
+// inline mutable state since an InputAdapter in a task
--- End diff --

sure, done


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20036: [SPARK-18016][SQL][FOLLOW-UP] Code Generation: Co...

2017-12-26 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/20036#discussion_r158753678
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala
 ---
@@ -299,7 +299,7 @@ class CodegenContext {
   def initMutableStates(): String = {
 // It's possible that we add same mutable state twice, e.g. the 
`mergeExpressions` in
 // `TypedAggregateExpression`, we should call `distinct` here to 
remove the duplicated ones.
-val initCodes = mutableStateInitCode.distinct
+val initCodes = mutableStateInitCode.distinct.map(_ + "\n")
--- End diff --

ah, good catch!


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20036: [SPARK-18016][SQL][FOLLOW-UP] Code Generation: Co...

2017-12-26 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/20036#discussion_r158752728
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala
 ---
@@ -283,7 +283,7 @@ case class InputAdapter(child: SparkPlan) extends 
UnaryExecNode with CodegenSupp
 
   override def doProduce(ctx: CodegenContext): String = {
 // Right now, InputAdapter is only used when there is one input RDD.
-// inline mutable state since an inputAdaptor in a task
+// inline mutable state since an InputAdapter in a task
--- End diff --

you miss some words...


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20036: [SPARK-18016][SQL][FOLLOW-UP] Code Generation: Co...

2017-12-21 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/20036#discussion_r158439773
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ---
@@ -118,9 +118,8 @@ case class Like(left: Expression, right: Expression) 
extends StringRegexExpressi
   if (rVal != null) {
 val regexStr =
   
StringEscapeUtils.escapeJava(escape(rVal.asInstanceOf[UTF8String].toString()))
-// inline mutable state since not many Like operations in a task
 val pattern = ctx.addMutableState(patternClass, "patternLike",
-  v => s"""$v = ${patternClass}.compile("$regexStr");""", 
forceInline = true)
+  v => s"""$v = ${patternClass}.compile("$regexStr");""")
--- End diff --

Sure, done for other places, too.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20036: [SPARK-18016][SQL][FOLLOW-UP] Code Generation: Co...

2017-12-21 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/20036#discussion_r158430491
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ---
@@ -194,9 +193,8 @@ case class RLike(left: Expression, right: Expression) 
extends StringRegexExpress
   if (rVal != null) {
 val regexStr =
   
StringEscapeUtils.escapeJava(rVal.asInstanceOf[UTF8String].toString())
-// inline mutable state since not many RLike operations in a task
 val pattern = ctx.addMutableState(patternClass, "patternRLike",
-  v => s"""$v = ${patternClass}.compile("$regexStr");""", 
forceInline = true)
+  v => s"""$v = ${patternClass}.compile("$regexStr");""")
--- End diff --

ditto.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20036: [SPARK-18016][SQL][FOLLOW-UP] Code Generation: Co...

2017-12-21 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/20036#discussion_r158430481
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ---
@@ -118,9 +118,8 @@ case class Like(left: Expression, right: Expression) 
extends StringRegexExpressi
   if (rVal != null) {
 val regexStr =
   
StringEscapeUtils.escapeJava(escape(rVal.asInstanceOf[UTF8String].toString()))
-// inline mutable state since not many Like operations in a task
 val pattern = ctx.addMutableState(patternClass, "patternLike",
-  v => s"""$v = ${patternClass}.compile("$regexStr");""", 
forceInline = true)
+  v => s"""$v = ${patternClass}.compile("$regexStr");""")
--- End diff --

nit: we can remove `{` and `}` around `patternClass`.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20036: [SPARK-18016][SQL][FOLLOW-UP] Code Generation: Co...

2017-12-20 Thread kiszk
GitHub user kiszk opened a pull request:

https://github.com/apache/spark/pull/20036

[SPARK-18016][SQL][FOLLOW-UP] Code Generation: Constant Pool Limit - reduce 
entries for mutable state

## What changes were proposed in this pull request?

This PR addresses additional review comments in #19811

## How was this patch tested?

Existing test suites

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kiszk/spark SPARK-18066-followup

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20036.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20036


commit 53661eb72bba55376bc6112b51c25489522d309c
Author: Kazuaki Ishizaki 
Date:   2017-12-20T16:21:53Z

initial commit




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org