[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...

2014-09-19 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/2368#issuecomment-56244170
  
Thanks!  Merged to master.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...

2014-09-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/2368


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...

2014-09-16 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/2368#discussion_r17619745
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
 ---
@@ -244,18 +244,22 @@ case class Cast(child: Expression, dataType: 
DataType) extends UnaryExpression {
   b = x.numeric.asInstanceOf[Numeric[Any]].toFloat(b)
   }
 
-  private[this] lazy val cast: Any = Any = dataType match {
-case StringType = castToString
-case BinaryType = castToBinary
-case DecimalType = castToDecimal
-case TimestampType = castToTimestamp
-case BooleanType = castToBoolean
-case ByteType = castToByte
-case ShortType = castToShort
-case IntegerType = castToInt
-case FloatType = castToFloat
-case LongType = castToLong
-case DoubleType = castToDouble
+  private[this] lazy val cast: Any = Any = if (child.dataType == 
dataType) { 
+(e: Any) = e
+  } else {
+dataType match {
+  case StringType = castToString
--- End diff --

Perhaps:

```scala
  private[this] lazy val cast: Any = Any = dataType match {
case dt if dt == child.dataType = identity[Any] _
case StringType = castToString
case BinaryType = castToBinary
case DecimalType = castToDecimal
case TimestampType = castToTimestamp
case BooleanType = castToBoolean
case ByteType = castToByte
case ShortType = castToShort
case IntegerType = castToInt
case FloatType = castToFloat
case LongType = castToLong
case DoubleType = castToDouble
  }
```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...

2014-09-16 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/2368#issuecomment-55788087
  
Minor style suggestion, otherwise LGTM.  Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...

2014-09-16 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/2368#discussion_r17639971
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
 ---
@@ -244,18 +244,22 @@ case class Cast(child: Expression, dataType: 
DataType) extends UnaryExpression {
   b = x.numeric.asInstanceOf[Numeric[Any]].toFloat(b)
   }
 
-  private[this] lazy val cast: Any = Any = dataType match {
-case StringType = castToString
-case BinaryType = castToBinary
-case DecimalType = castToDecimal
-case TimestampType = castToTimestamp
-case BooleanType = castToBoolean
-case ByteType = castToByte
-case ShortType = castToShort
-case IntegerType = castToInt
-case FloatType = castToFloat
-case LongType = castToLong
-case DoubleType = castToDouble
+  private[this] lazy val cast: Any = Any = if (child.dataType == 
dataType) { 
+(e: Any) = e
+  } else {
+dataType match {
+  case StringType = castToString
--- End diff --

Thank you @marmbrus , it's more clean. :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2368#issuecomment-55843681
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20445/consoleFull)
 for   PR 2368 at commit 
[`5c9c3a5`](https://github.com/apache/spark/commit/5c9c3a59d423383885151ff9cd17e0744776baf0).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...

2014-09-15 Thread chenghao-intel
Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/2368#issuecomment-6637
  
retest this please.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...

2014-09-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2368#issuecomment-6842
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20329/consoleFull)
 for   PR 2368 at commit 
[`b804abd`](https://github.com/apache/spark/commit/b804abd5be4161531db38193be310cf628674cec).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...

2014-09-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2368#issuecomment-55564638
  
**[Tests timed 
out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20329/consoleFull)**
 after a configured wait of `120m`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...

2014-09-15 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/2368#discussion_r17562206
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
 ---
@@ -32,12 +32,12 @@ object Optimizer extends RuleExecutor[LogicalPlan] {
 Batch(Combine Limits, FixedPoint(100),
   CombineLimits) ::
 Batch(ConstantFolding, FixedPoint(100),
+  SimplifyCasts,
--- End diff --

 I guess we do not want to rely on optimization rules to avoid exceptions?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...

2014-09-15 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/2368#discussion_r17562464
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
 ---
@@ -32,12 +32,12 @@ object Optimizer extends RuleExecutor[LogicalPlan] {
 Batch(Combine Limits, FixedPoint(100),
   CombineLimits) ::
 Batch(ConstantFolding, FixedPoint(100),
+  SimplifyCasts,
--- End diff --

I'm not proposing that we don't use simplify casts.  I'm proposing that it 
should be possible to evaluate expressions even if we don't perform certain 
optimizations.  Any valid expression should be evaluate-able and there is 
nothing invalid about casting a type to itself (just like there is nothing 
invalid with an expression like `x + 0` even though the `+ 0` is a no-op).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...

2014-09-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2368#issuecomment-55685521
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20361/consoleFull)
 for   PR 2368 at commit 
[`3f15731`](https://github.com/apache/spark/commit/3f1573133ec7e9515f61361f4c37d65376a2).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...

2014-09-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2368#issuecomment-55690559
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20361/consoleFull)
 for   PR 2368 at commit 
[`3f15731`](https://github.com/apache/spark/commit/3f1573133ec7e9515f61361f4c37d65376a2).
 * This patch **passes** unit tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...

2014-09-15 Thread chenghao-intel
Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/2368#issuecomment-55693692
  
You're right @marmbrus . the expression should work even without being 
optimized. I've updated the no-op case in `Cast` and revert the change in the 
`Optimizer`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...

2014-09-14 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/2368#discussion_r17524564
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
 ---
@@ -32,12 +32,12 @@ object Optimizer extends RuleExecutor[LogicalPlan] {
 Batch(Combine Limits, FixedPoint(100),
   CombineLimits) ::
 Batch(ConstantFolding, FixedPoint(100),
+  SimplifyCasts,
--- End diff --

For this case, I don't think we have to move `SimplifyCasts` ahead, I am 
wondering if people construct the expressions with DSL API may jump into this 
trap accidentally.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...

2014-09-14 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/2368#discussion_r17524775
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
 ---
@@ -32,12 +32,12 @@ object Optimizer extends RuleExecutor[LogicalPlan] {
 Batch(Combine Limits, FixedPoint(100),
   CombineLimits) ::
 Batch(ConstantFolding, FixedPoint(100),
+  SimplifyCasts,
--- End diff --

I'm inclined to leave it as it is.  All of the rules in this batch are 
optimizations, and thus it should be possible to apply them at anytime without 
an exception being thrown.  If there are other bugs with the casting operators 
we should fix those instead of rearranging to mask the problem.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2368#issuecomment-55548796
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20318/consoleFull)
 for   PR 2368 at commit 
[`330a5c8`](https://github.com/apache/spark/commit/330a5c8331e22ba122dbc8b09c592cf60f97c919).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...

2014-09-14 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/2368#discussion_r17525387
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
 ---
@@ -32,12 +32,12 @@ object Optimizer extends RuleExecutor[LogicalPlan] {
 Batch(Combine Limits, FixedPoint(100),
   CombineLimits) ::
 Batch(ConstantFolding, FixedPoint(100),
+  SimplifyCasts,
--- End diff --

I just added an new unit test ```SELECT CAST(CAST('123' AS binary) AS 
binary) FROM src LIMIT 1```, this test will fail because the `ConstantFolding` 
will try to compute the `foldable` expression prior to `SimplifyCasts`.
Sorry, I should add this test case earlier.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2368#issuecomment-55549403
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20319/consoleFull)
 for   PR 2368 at commit 
[`b804abd`](https://github.com/apache/spark/commit/b804abd5be4161531db38193be310cf628674cec).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...

2014-09-14 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/2368#discussion_r17525906
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
 ---
@@ -32,12 +32,12 @@ object Optimizer extends RuleExecutor[LogicalPlan] {
 Batch(Combine Limits, FixedPoint(100),
   CombineLimits) ::
 Batch(ConstantFolding, FixedPoint(100),
+  SimplifyCasts,
--- End diff --

Okay, I see.  I would argue that's a bug in the casting operator, not a
reason to change the ordering here though.
On Sep 14, 2014 8:09 PM, Cheng Hao notificati...@github.com wrote:

 In
 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala:

  @@ -32,12 +32,12 @@ object Optimizer extends RuleExecutor[LogicalPlan] {
   Batch(Combine Limits, FixedPoint(100),
 CombineLimits) ::
   Batch(ConstantFolding, FixedPoint(100),
  +  SimplifyCasts,

 I just added an new unit test SELECT CAST(CAST('123' AS binary) AS
 binary) FROM src LIMIT 1, this test will fail because the ConstantFolding
 will try to compute the foldable expression prior to SimplifyCasts.
 Sorry, I should add this test case earlier.

 —
 Reply to this email directly or view it on GitHub
 https://github.com/apache/spark/pull/2368/files#r17525387.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...

2014-09-14 Thread adrian-wang
Github user adrian-wang commented on a diff in the pull request:

https://github.com/apache/spark/pull/2368#discussion_r17526186
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
 ---
@@ -32,12 +32,12 @@ object Optimizer extends RuleExecutor[LogicalPlan] {
 Batch(Combine Limits, FixedPoint(100),
   CombineLimits) ::
 Batch(ConstantFolding, FixedPoint(100),
+  SimplifyCasts,
--- End diff --

 So maybe it is better to add a match case in the function `lazy val cast: 
Any = Any`, to eliminate those meaningless casts. And in this way we'll no 
longer need `SimplifyCasts` here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2368#issuecomment-2752
  
**[Tests timed 
out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20318/consoleFull)**
 after a configured wait of `120m`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...

2014-09-14 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/2368#discussion_r17526456
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
 ---
@@ -32,12 +32,12 @@ object Optimizer extends RuleExecutor[LogicalPlan] {
 Batch(Combine Limits, FixedPoint(100),
   CombineLimits) ::
 Batch(ConstantFolding, FixedPoint(100),
+  SimplifyCasts,
--- End diff --

Usually we don't change the expression tree within an expression node, but 
utilizing an external rule object like `SimplifyCasts`, thus they can achieve 
the same goal, the later will make the code more clean and effective in runtime.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2368#issuecomment-3380
  
**[Tests timed 
out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20319/consoleFull)**
 after a configured wait of `120m`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...

2014-09-13 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/2368#discussion_r17513558
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUdfs.scala 
---
@@ -51,13 +51,13 @@ private[hive] abstract class HiveFunctionRegistry
   val function = 
functionInfo.getFunctionClass.newInstance().asInstanceOf[UDF]
   val method = 
function.getResolver.getEvalMethod(children.map(_.dataType.toTypeInfo))
 
-  lazy val expectedDataTypes = 
method.getParameterTypes.map(javaClassToDataType)
+  val expectedDataTypes = 
method.getParameterTypes.map(javaClassToDataType)
 
   HiveSimpleUdf(
 functionClassName,
 children.zip(expectedDataTypes).map {
   case (e, NullType) = e
-  case (e, t) = Cast(e, t)
+  case (e, t) = if (e.dataType == t) e else Cast(e, t)
--- End diff --

Nit: I'd structure this as a guard to call out the two cases more clearly:
```scala
case (e, t) if (e.dataType == t) = e
case (e, t) = Cast(e, t)
```



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...

2014-09-13 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/2368#discussion_r17513564
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
 ---
@@ -32,12 +32,12 @@ object Optimizer extends RuleExecutor[LogicalPlan] {
 Batch(Combine Limits, FixedPoint(100),
   CombineLimits) ::
 Batch(ConstantFolding, FixedPoint(100),
+  SimplifyCasts,
--- End diff --

Do we need to move this still?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...

2014-09-13 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/2368#issuecomment-55502187
  
Thanks for fixing this!  While I think eliminating the problem is great, it 
also seems like a bug to me that casting a timestamp to a timestamp throws an 
exception since none of the other datatypes do that.  Mind adding a no-op case 
to the `eval` path as well?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...

2014-09-12 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2368#issuecomment-55369750
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20208/consoleFull)
 for   PR 2368 at commit 
[`b834ed4`](https://github.com/apache/spark/commit/b834ed40363b19f924d51277880300d2f60f0524).
 * This patch **passes** unit tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...

2014-09-11 Thread chenghao-intel
Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/2368#issuecomment-55363030
  
test this please.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...

2014-09-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2368#issuecomment-55363315
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20208/consoleFull)
 for   PR 2368 at commit 
[`b834ed4`](https://github.com/apache/spark/commit/b834ed40363b19f924d51277880300d2f60f0524).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org