[GitHub] spark pull request #15059: [SPARK-17506][SQL] Improve the check double value...

2016-09-21 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/15059#discussion_r79788242
  
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ArithmeticExpressionSuite.scala ---
@@ -170,11 +170,9 @@ class ArithmeticExpressionSuite extends SparkFunSuite with ExpressionEvalHelper
 checkEvaluation(Remainder(positiveLongLit, positiveLongLit), 0L)
 checkEvaluation(Remainder(negativeLongLit, negativeLongLit), 0L)
 
-// TODO: the following lines would fail the test due to inconsistency result of interpret
--- End diff --

nvm, it's fixed in https://github.com/apache/spark/pull/15171


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #15059: [SPARK-17506][SQL] Improve the check double value...

2016-09-21 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/15059#discussion_r79779543
  
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ArithmeticExpressionSuite.scala ---
@@ -170,11 +170,9 @@ class ArithmeticExpressionSuite extends SparkFunSuite with ExpressionEvalHelper
 checkEvaluation(Remainder(positiveLongLit, positiveLongLit), 0L)
 checkEvaluation(Remainder(negativeLongLit, negativeLongLit), 0L)
 
-// TODO: the following lines would fail the test due to inconsistency result of interpret
--- End diff --

Ah, it seemed worth removing because the change does apparently make this 
test pass, and that's what the comment refers to. If it's still an issue, we 
can restore a modified version of the comment.





[GitHub] spark pull request #15059: [SPARK-17506][SQL] Improve the check double value...

2016-09-20 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/15059#discussion_r79755313
  
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ArithmeticExpressionSuite.scala ---
@@ -170,11 +170,9 @@ class ArithmeticExpressionSuite extends SparkFunSuite with ExpressionEvalHelper
 checkEvaluation(Remainder(positiveLongLit, positiveLongLit), 0L)
 checkEvaluation(Remainder(negativeLongLit, negativeLongLit), 0L)
 
-// TODO: the following lines would fail the test due to inconsistency result of interpret
--- End diff --

The interpreted and codegen results for the remainder of very large values 
are now equal within the relative tolerance, so maybe this no longer needs to 
be resolved. Thanks!





[GitHub] spark pull request #15059: [SPARK-17506][SQL] Improve the check double value...

2016-09-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/15059#discussion_r79749191
  
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ArithmeticExpressionSuite.scala ---
@@ -170,11 +170,9 @@ class ArithmeticExpressionSuite extends SparkFunSuite with ExpressionEvalHelper
 checkEvaluation(Remainder(positiveLongLit, positiveLongLit), 0L)
 checkEvaluation(Remainder(negativeLongLit, negativeLongLit), 0L)
 
-// TODO: the following lines would fail the test due to inconsistency result of interpret
--- End diff --

this TODO is not fixed yet, why remove it?





[GitHub] spark pull request #15059: [SPARK-17506][SQL] Improve the check double value...

2016-09-18 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/15059





[GitHub] spark pull request #15059: [SPARK-17506][SQL] Improve the check double value...

2016-09-18 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/15059#discussion_r79284753
  
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala ---
@@ -289,13 +290,37 @@ trait ExpressionEvalHelper extends GeneratorDrivenPropertyChecks {
 (result, expected) match {
   case (result: Array[Byte], expected: Array[Byte]) =>
 java.util.Arrays.equals(result, expected)
-  case (result: Double, expected: Spread[Double @unchecked]) =>
--- End diff --

Yes, it should have been replaced by the new case below.





[GitHub] spark pull request #15059: [SPARK-17506][SQL] Improve the check double value...

2016-09-16 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/15059#discussion_r79133345
  
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala ---
@@ -289,13 +290,37 @@ trait ExpressionEvalHelper extends GeneratorDrivenPropertyChecks {
 (result, expected) match {
   case (result: Array[Byte], expected: Array[Byte]) =>
 java.util.Arrays.equals(result, expected)
-  case (result: Double, expected: Spread[Double @unchecked]) =>
--- End diff --

BTW do you mean to remove this case? does it not apply now?





[GitHub] spark pull request #15059: [SPARK-17506][SQL] Improve the check double value...

2016-09-14 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/15059#discussion_r78725648
  
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala ---
@@ -289,13 +290,32 @@ trait ExpressionEvalHelper extends GeneratorDrivenPropertyChecks {
 (result, expected) match {
   case (result: Array[Byte], expected: Array[Byte]) =>
 java.util.Arrays.equals(result, expected)
-  case (result: Double, expected: Spread[Double @unchecked]) =>
-expected.asInstanceOf[Spread[Double]].isWithin(result)
   case (result: Double, expected: Double) if result.isNaN && expected.isNaN =>
 true
+  case (result: Double, expected: Double) =>
+relativeErrorComparison(result, expected)
   case (result: Float, expected: Float) if result.isNaN && expected.isNaN =>
 true
   case _ => result == expected
 }
   }
+
+  /**
+   * Private helper function for comparing two values using relative tolerance.
+   * Note that if x or y is extremely close to zero, i.e., smaller than Double.MinPositiveValue,
+   * the relative tolerance is meaningless, so the exception will be raised to warn users.
+   */
+  private def relativeErrorComparison(x: Double, y: Double, eps: Double = 1E-8): Boolean = {
--- End diff --

I've added a comment to indicate the problem. Thank you!
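
For readers of the archive: the diff above shows only the signature and Scaladoc of `relativeErrorComparison`, not its body. The following is a minimal sketch of how such a helper and the surrounding match might behave, assuming the rules stated in the Scaladoc. The object name `DoubleCompareSketch`, the choice of `IllegalArgumentException`, and the use of the smaller magnitude as the scale are assumptions for illustration, not the code merged in the PR.

```scala
object DoubleCompareSketch {
  // Hypothetical body: compares x and y using a relative tolerance eps.
  def relativeErrorComparison(x: Double, y: Double, eps: Double = 1e-8): Boolean = {
    val absX = math.abs(x)
    val absY = math.abs(y)
    if (x == y) {
      // Exactly equal values (including equal infinities) always match.
      true
    } else if (absX < Double.MinPositiveValue || absY < Double.MinPositiveValue) {
      // A relative tolerance is meaningless next to zero, so fail loudly.
      // The exception type is an assumption; the thread does not name one.
      throw new IllegalArgumentException(
        s"$x or $y is extremely close to zero, so the relative tolerance is meaningless.")
    } else {
      // Scale the allowed difference by the smaller magnitude (an assumption).
      math.abs(x - y) < eps * math.min(absX, absY)
    }
  }

  // Mirrors the order of the Double cases in the quoted diff:
  // NaN equals NaN first, then the relative comparison, then plain equality.
  def compareResults(result: Any, expected: Any): Boolean = (result, expected) match {
    case (r: Double, e: Double) if r.isNaN && e.isNaN => true
    case (r: Double, e: Double) => relativeErrorComparison(r, e)
    case _ => result == expected
  }
}
```

The NaN case has to come before the general `Double` case, because `NaN == NaN` is false and the relative comparison would otherwise report a mismatch.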





[GitHub] spark pull request #15059: [SPARK-17506][SQL] Improve the check double value...

2016-09-14 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/15059#discussion_r78717016
  
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala ---
@@ -289,13 +290,32 @@ trait ExpressionEvalHelper extends GeneratorDrivenPropertyChecks {
 (result, expected) match {
   case (result: Array[Byte], expected: Array[Byte]) =>
 java.util.Arrays.equals(result, expected)
-  case (result: Double, expected: Spread[Double @unchecked]) =>
-expected.asInstanceOf[Spread[Double]].isWithin(result)
   case (result: Double, expected: Double) if result.isNaN && expected.isNaN =>
 true
+  case (result: Double, expected: Double) =>
+relativeErrorComparison(result, expected)
   case (result: Float, expected: Float) if result.isNaN && expected.isNaN =>
 true
   case _ => result == expected
 }
   }
+
+  /**
+   * Private helper function for comparing two values using relative tolerance.
+   * Note that if x or y is extremely close to zero, i.e., smaller than Double.MinPositiveValue,
+   * the relative tolerance is meaningless, so the exception will be raised to warn users.
+   */
+  private def relativeErrorComparison(x: Double, y: Double, eps: Double = 1E-8): Boolean = {
--- End diff --

Seems fine. You could refer to the source of this code and explain the 
duplication but it's not a big deal.





[GitHub] spark pull request #15059: [SPARK-17506][SQL] Improve the check double value...

2016-09-12 Thread jiangxb1987
GitHub user jiangxb1987 opened a pull request:

https://github.com/apache/spark/pull/15059

[SPARK-17506][SQL] Improve the check double values equality rule.

## What changes were proposed in this pull request?

In `ExpressionEvalHelper`, we check the equality of two double values by 
testing whether the expected value is within the range [target - 
tolerance, target + tolerance], but this can cause a false negative when the 
compared values are very large. 
Before:
```
val1 = 1.6358558070241E306
val2 = 1.6358558070240974E306
ExpressionEvalHelper.compareResults(val1, val2)
false
```
In fact, `val1` and `val2` are the same value but with different precisions; 
we should tolerate this case by comparing with a percentage range, e.g., 
expected is within the range [target - target * tolerance_percentage, 
target + target * tolerance_percentage].
After:
```
val1 = 1.6358558070241E306
val2 = 1.6358558070240974E306
ExpressionEvalHelper.compareResults(val1, val2)
true
```
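
As a rough illustration of the two rules described above, here is a self-contained sketch that applies a fixed absolute tolerance and a percentage-based tolerance to the same pair of values. The object and method names (`ToleranceDemo`, `withinAbsolute`, `withinRelative`) and the `1e-8` tolerances are hypothetical and are not part of `ExpressionEvalHelper`. The sketch takes `math.abs(expected)` so the range stays valid for negative targets, which the range as written above does not spell out.

```scala
object ToleranceDemo {
  // Absolute rule: actual must lie in [expected - tol, expected + tol].
  def withinAbsolute(actual: Double, expected: Double, tol: Double): Boolean =
    math.abs(actual - expected) <= tol

  // Percentage rule: the allowed gap grows with the magnitude of expected.
  def withinRelative(actual: Double, expected: Double, tolFraction: Double): Boolean =
    math.abs(actual - expected) <= math.abs(expected) * tolFraction

  def main(args: Array[String]): Unit = {
    val val1 = 1.6358558070241E306
    val val2 = 1.6358558070240974E306
    // The two doubles differ by roughly 2.6E291, far more than any small fixed tolerance.
    println(withinAbsolute(val1, val2, 1e-8)) // false
    // Relative to a magnitude of about 1.6E306, the same gap is negligible.
    println(withinRelative(val1, val2, 1e-8)) // true
  }
}
```

At this magnitude adjacent doubles are about 3E290 apart, so no practical fixed tolerance can accept two values that differ only in their last digits.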

## How was this patch tested?

Existing test cases.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jiangxb1987/spark deq

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/15059.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #15059


commit 78f37334164a015605d5c23ff7217a131c3ea3a7
Author: jiangxingbo 
Date:   2016-09-12T15:14:28Z

check the equality of double values with tolerance within percentage range.



