[GitHub] spark pull request #14277: [SPARK-16640][SQL] Add codegen for Elt function

2016-07-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/14277


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14277: [SPARK-16640][SQL] Add codegen for Elt function

2016-07-21 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14277#discussion_r71651284
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
 ---
@@ -204,6 +204,32 @@ case class Elt(children: Seq[Expression])
   }
 }
   }
+
+  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): 
ExprCode = {
+val index = indexExpr.genCode(ctx)
+val strings = stringExprs.map(_.genCode(ctx))
+val assignStringValue = strings.zipWithIndex.map { case (eval, index) 
=>
+  s"""
+case ${index + 1}:
+  ${ev.value} = ${eval.isNull} ? null : ${eval.value};
+   break;
+  """
+}.mkString("\n")
+val indexVal = ctx.freshName("index")
+val stringArray = ctx.freshName("strings");
+
+ev.copy(index.code + "\n" + strings.map(_.code).mkString("\n") + s"""
+  int $indexVal = ${index.value};
+  boolean ${ev.isNull} = false;
--- End diff --

we can define this after the switch: `final boolean ${ev.isNull} = 
${ev.value} == null;`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14277: [SPARK-16640][SQL] Add codegen for Elt function

2016-07-21 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14277#discussion_r71651050
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
 ---
@@ -204,6 +204,32 @@ case class Elt(children: Seq[Expression])
   }
 }
   }
+
+  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): 
ExprCode = {
+val index = indexExpr.genCode(ctx)
+val strings = stringExprs.map(_.genCode(ctx))
+val assignStringValue = strings.zipWithIndex.map { case (eval, index) 
=>
+  s"""
+case ${index + 1}:
+  ${ev.value} = ${eval.isNull} ? null : ${eval.value};
+   break;
+  """
+}.mkString("\n")
+val indexVal = ctx.freshName("index")
+val stringArray = ctx.freshName("strings");
+
+ev.copy(index.code + "\n" + strings.map(_.code).mkString("\n") + s"""
+  int $indexVal = ${index.value};
--- End diff --

this can be `final`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14277: [SPARK-16640][SQL] Add codegen for Elt function

2016-07-21 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14277#discussion_r71650993
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
 ---
@@ -204,6 +204,32 @@ case class Elt(children: Seq[Expression])
   }
 }
   }
+
+  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): 
ExprCode = {
+val index = indexExpr.genCode(ctx)
+val strings = stringExprs.map(_.genCode(ctx))
+val assignStringValue = strings.zipWithIndex.map { case (eval, index) 
=>
+  s"""
+case ${index + 1}:
+  ${ev.value} = ${eval.isNull} ? null : ${eval.value};
+   break;
--- End diff --

ident is wrong here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14277: [SPARK-16640][SQL] Add codegen for Elt function

2016-07-20 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/14277#discussion_r71649272
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
 ---
@@ -204,6 +204,29 @@ case class Elt(children: Seq[Expression])
   }
 }
   }
+
+  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): 
ExprCode = {
+val index = children.head.map(_.genCode(ctx))(0)
+val strings = children.tail.map(_.genCode(ctx))
+val stringValues = strings.map { eval =>
+  s"${eval.isNull} ? null : ${eval.value}"
+}.mkString(", ")
+val indexVal = ctx.freshName("index")
+val stringArray = ctx.freshName("strings");
+
+ev.copy(index.code + "\n" + strings.map(_.code).mkString("\n") + s"""
+  int $indexVal = ${index.value} - 1;
+  UTF8String[] $stringArray = {$stringValues};
--- End diff --

After re-checking the wholestage code, I find that is not correct. If any 
expression in a sub plan will fall back, then the sub plan will not be 
wholestage codegen. The whole plan can be wholestage codegen, but for these no 
wholestage codegen sub plans, they will be wrapped with input adapter.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14277: [SPARK-16640][SQL] Add codegen for Elt function

2016-07-20 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/14277#discussion_r71648748
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
 ---
@@ -204,6 +204,29 @@ case class Elt(children: Seq[Expression])
   }
 }
   }
+
+  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): 
ExprCode = {
+val index = children.head.map(_.genCode(ctx))(0)
+val strings = children.tail.map(_.genCode(ctx))
+val stringValues = strings.map { eval =>
+  s"${eval.isNull} ? null : ${eval.value}"
+}.mkString(", ")
+val indexVal = ctx.freshName("index")
+val stringArray = ctx.freshName("strings");
+
+ev.copy(index.code + "\n" + strings.map(_.code).mkString("\n") + s"""
+  int $indexVal = ${index.value} - 1;
+  UTF8String[] $stringArray = {$stringValues};
--- End diff --

Let me do a simple benchmark for this. Actually I think it is no harm to 
have codegen for this. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14277: [SPARK-16640][SQL] Add codegen for Elt function

2016-07-20 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/14277#discussion_r71648516
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
 ---
@@ -204,6 +204,29 @@ case class Elt(children: Seq[Expression])
   }
 }
   }
+
+  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): 
ExprCode = {
+val index = children.head.map(_.genCode(ctx))(0)
+val strings = children.tail.map(_.genCode(ctx))
+val stringValues = strings.map { eval =>
+  s"${eval.isNull} ? null : ${eval.value}"
+}.mkString(", ")
+val indexVal = ctx.freshName("index")
+val stringArray = ctx.freshName("strings");
+
+ev.copy(index.code + "\n" + strings.map(_.code).mkString("\n") + s"""
+  int $indexVal = ${index.value} - 1;
+  UTF8String[] $stringArray = {$stringValues};
--- End diff --

ok. `CodegenFallback` does it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14277: [SPARK-16640][SQL] Add codegen for Elt function

2016-07-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14277#discussion_r71648156
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
 ---
@@ -204,6 +204,29 @@ case class Elt(children: Seq[Expression])
   }
 }
   }
+
+  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): 
ExprCode = {
+val index = children.head.map(_.genCode(ctx))(0)
+val strings = children.tail.map(_.genCode(ctx))
+val stringValues = strings.map { eval =>
+  s"${eval.isNull} ? null : ${eval.value}"
+}.mkString(", ")
+val indexVal = ctx.freshName("index")
+val stringArray = ctx.freshName("strings");
+
+ev.copy(index.code + "\n" + strings.map(_.code).mkString("\n") + s"""
+  int $indexVal = ${index.value} - 1;
+  UTF8String[] $stringArray = {$stringValues};
--- End diff --

no, expressions can always codegen, by default we just pass the expression 
reference into generated java code and call its `eval` method. So it's only 
about performance for the expression.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14277: [SPARK-16640][SQL] Add codegen for Elt function

2016-07-20 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/14277#discussion_r71646763
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
 ---
@@ -204,6 +204,29 @@ case class Elt(children: Seq[Expression])
   }
 }
   }
+
+  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): 
ExprCode = {
+val index = children.head.map(_.genCode(ctx))(0)
+val strings = children.tail.map(_.genCode(ctx))
+val stringValues = strings.map { eval =>
+  s"${eval.isNull} ? null : ${eval.value}"
+}.mkString(", ")
+val indexVal = ctx.freshName("index")
+val stringArray = ctx.freshName("strings");
+
+ev.copy(index.code + "\n" + strings.map(_.code).mkString("\n") + s"""
+  int $indexVal = ${index.value} - 1;
+  UTF8String[] $stringArray = {$stringValues};
--- End diff --

My main concern is not performance improvement for elt function. As you 
know, if any expression in a plan doesn't have codegen, the plan can't be 
wholestage codegen. So I think it is better to have as many codegen expressions 
as we can.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14277: [SPARK-16640][SQL] Add codegen for Elt function

2016-07-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14277#discussion_r71646550
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
 ---
@@ -204,6 +204,29 @@ case class Elt(children: Seq[Expression])
   }
 }
   }
+
+  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): 
ExprCode = {
+val index = children.head.map(_.genCode(ctx))(0)
+val strings = children.tail.map(_.genCode(ctx))
+val stringValues = strings.map { eval =>
+  s"${eval.isNull} ? null : ${eval.value}"
+}.mkString(", ")
+val indexVal = ctx.freshName("index")
+val stringArray = ctx.freshName("strings");
+
+ev.copy(index.code + "\n" + strings.map(_.code).mkString("\n") + s"""
+  int $indexVal = ${index.value} - 1;
+  UTF8String[] $stringArray = {$stringValues};
--- End diff --

a `switch` or `if` should work, or we can just fallback to `eval`. Can you 
benchmark how much speedup we can get by codegen for this expression? If it's 
only a little, I think it's ok to not have codegen here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14277: [SPARK-16640][SQL] Add codegen for Elt function

2016-07-20 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/14277#discussion_r71638204
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
 ---
@@ -204,6 +204,29 @@ case class Elt(children: Seq[Expression])
   }
 }
   }
+
+  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): 
ExprCode = {
+val index = children.head.map(_.genCode(ctx))(0)
+val strings = children.tail.map(_.genCode(ctx))
+val stringValues = strings.map { eval =>
+  s"${eval.isNull} ? null : ${eval.value}"
+}.mkString(", ")
+val indexVal = ctx.freshName("index")
+val stringArray = ctx.freshName("strings");
+
+ev.copy(index.code + "\n" + strings.map(_.code).mkString("\n") + s"""
+  int $indexVal = ${index.value} - 1;
+  UTF8String[] $stringArray = {$stringValues};
--- End diff --

Because the index is only evaluated runtime, I think we can' just evaluate 
the string we need? Or we can wrap these string expressions in an if/else if or 
switch block, then we can only evaluate the string expression specified by the 
index. What do you think?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14277: [SPARK-16640][SQL] Add codegen for Elt function

2016-07-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14277#discussion_r71512431
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
 ---
@@ -204,6 +204,29 @@ case class Elt(children: Seq[Expression])
   }
 }
   }
+
+  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): 
ExprCode = {
+val index = children.head.map(_.genCode(ctx))(0)
+val strings = children.tail.map(_.genCode(ctx))
+val stringValues = strings.map { eval =>
+  s"${eval.isNull} ? null : ${eval.value}"
+}.mkString(", ")
+val indexVal = ctx.freshName("index")
+val stringArray = ctx.freshName("strings");
+
+ev.copy(index.code + "\n" + strings.map(_.code).mkString("\n") + s"""
+  int $indexVal = ${index.value} - 1;
+  UTF8String[] $stringArray = {$stringValues};
--- End diff --

so we will evaluate all string expressions? In `eval` we only evaluate the 
string expression we need.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14277: [SPARK-16640][SQL] Add codegen for Elt function

2016-07-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14277#discussion_r71511742
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
 ---
@@ -204,6 +204,29 @@ case class Elt(children: Seq[Expression])
   }
 }
   }
+
+  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): 
ExprCode = {
+val index = children.head.map(_.genCode(ctx))(0)
+val strings = children.tail.map(_.genCode(ctx))
--- End diff --

`stringExprs.map(_.genCode(ctx))`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14277: [SPARK-16640][SQL] Add codegen for Elt function

2016-07-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14277#discussion_r71511689
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
 ---
@@ -204,6 +204,29 @@ case class Elt(children: Seq[Expression])
   }
 }
   }
+
+  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): 
ExprCode = {
+val index = children.head.map(_.genCode(ctx))(0)
--- End diff --

`indexExpr.genCode(ctx)`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14277: [SPARK-16640][SQL] Add codegen for Elt function

2016-07-19 Thread viirya
GitHub user viirya opened a pull request:

https://github.com/apache/spark/pull/14277

[SPARK-16640][SQL] Add codegen for Elt function

## What changes were proposed in this pull request?

Elt function doesn't support codegen execution now. We should add the 
support.

## How was this patch tested?

Jenkins tests.




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/viirya/spark-1 elt-codegen

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14277.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14277


commit c517addc2a00fca7578b4fcb1f47a7ef6f337e5c
Author: Liang-Chi Hsieh 
Date:   2016-07-20T05:06:27Z

Add codegen for Elt function.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org