[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-13 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-58950576
  
Thanks for working on this huge feature!  Merged to master.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/2344


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-11 Thread liancheng
Github user liancheng commented on a diff in the pull request:

https://github.com/apache/spark/pull/2344#discussion_r18739656
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala
 ---
@@ -220,20 +220,44 @@ trait HiveTypeCoercion {
   case a: BinaryArithmetic if a.right.dataType == StringType =
 a.makeCopy(Array(a.left, Cast(a.right, DoubleType)))
 
+  // we should cast all timestamp/date/string compare into string 
compare
+  case p: BinaryPredicate if p.left.dataType == StringType
+ p.right.dataType == DateType =
+p.makeCopy(Array(p.left, Cast(p.right, StringType)))
+  case p: BinaryPredicate if p.left.dataType == DateType
+ p.right.dataType == StringType =
+p.makeCopy(Array(Cast(p.left, StringType), p.right))
   case p: BinaryPredicate if p.left.dataType == StringType
  p.right.dataType == TimestampType =
-p.makeCopy(Array(Cast(p.left, TimestampType), p.right))
+p.makeCopy(Array(p.left, Cast(p.right, StringType)))
   case p: BinaryPredicate if p.left.dataType == TimestampType
  p.right.dataType == StringType =
-p.makeCopy(Array(p.left, Cast(p.right, TimestampType)))
+p.makeCopy(Array(Cast(p.left, StringType), p.right))
+  case p: BinaryPredicate if p.left.dataType == TimestampType
+ p.right.dataType == DateType =
+p.makeCopy(Array(Cast(p.left, StringType), Cast(p.right, 
StringType)))
+  case p: BinaryPredicate if p.left.dataType == DateType
+ p.right.dataType == TimestampType =
+p.makeCopy(Array(Cast(p.left, StringType), Cast(p.right, 
StringType)))
--- End diff --

OK... verified this behavior with Hive, I've no idea about this :(


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-11 Thread adrian-wang
Github user adrian-wang commented on a diff in the pull request:

https://github.com/apache/spark/pull/2344#discussion_r18739665
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala
 ---
@@ -220,20 +220,44 @@ trait HiveTypeCoercion {
   case a: BinaryArithmetic if a.right.dataType == StringType =
 a.makeCopy(Array(a.left, Cast(a.right, DoubleType)))
 
+  // we should cast all timestamp/date/string compare into string 
compare
+  case p: BinaryPredicate if p.left.dataType == StringType
+ p.right.dataType == DateType =
+p.makeCopy(Array(p.left, Cast(p.right, StringType)))
+  case p: BinaryPredicate if p.left.dataType == DateType
+ p.right.dataType == StringType =
+p.makeCopy(Array(Cast(p.left, StringType), p.right))
   case p: BinaryPredicate if p.left.dataType == StringType
  p.right.dataType == TimestampType =
-p.makeCopy(Array(Cast(p.left, TimestampType), p.right))
+p.makeCopy(Array(p.left, Cast(p.right, StringType)))
   case p: BinaryPredicate if p.left.dataType == TimestampType
  p.right.dataType == StringType =
-p.makeCopy(Array(p.left, Cast(p.right, TimestampType)))
+p.makeCopy(Array(Cast(p.left, StringType), p.right))
+  case p: BinaryPredicate if p.left.dataType == TimestampType
+ p.right.dataType == DateType =
+p.makeCopy(Array(Cast(p.left, StringType), Cast(p.right, 
StringType)))
+  case p: BinaryPredicate if p.left.dataType == DateType
+ p.right.dataType == TimestampType =
+p.makeCopy(Array(Cast(p.left, StringType), Cast(p.right, 
StringType)))
--- End diff --

So Michael agreed to leave the whole ordering and comparing stuff in a 
separated PR :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-10 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-58616217
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21577/consoleFull)
 for   PR 2344 at commit 
[`0df6ea1`](https://github.com/apache/spark/commit/0df6ea1964710e6376daac282686da7fb35c29b2).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-58619034
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21577/Test 
PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-10 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-58619031
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21577/consoleFull)
 for   PR 2344 at commit 
[`0df6ea1`](https://github.com/apache/spark/commit/0df6ea1964710e6376daac282686da7fb35c29b2).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-10 Thread liancheng
Github user liancheng commented on a diff in the pull request:

https://github.com/apache/spark/pull/2344#discussion_r18695986
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala
 ---
@@ -220,20 +220,44 @@ trait HiveTypeCoercion {
   case a: BinaryArithmetic if a.right.dataType == StringType =
 a.makeCopy(Array(a.left, Cast(a.right, DoubleType)))
 
+  // we should cast all timestamp/date/string compare into string 
compare
+  case p: BinaryPredicate if p.left.dataType == StringType
+ p.right.dataType == DateType =
+p.makeCopy(Array(p.left, Cast(p.right, StringType)))
+  case p: BinaryPredicate if p.left.dataType == DateType
+ p.right.dataType == StringType =
+p.makeCopy(Array(Cast(p.left, StringType), p.right))
   case p: BinaryPredicate if p.left.dataType == StringType
  p.right.dataType == TimestampType =
-p.makeCopy(Array(Cast(p.left, TimestampType), p.right))
+p.makeCopy(Array(p.left, Cast(p.right, StringType)))
   case p: BinaryPredicate if p.left.dataType == TimestampType
  p.right.dataType == StringType =
-p.makeCopy(Array(p.left, Cast(p.right, TimestampType)))
+p.makeCopy(Array(Cast(p.left, StringType), p.right))
+  case p: BinaryPredicate if p.left.dataType == TimestampType
+ p.right.dataType == DateType =
+p.makeCopy(Array(Cast(p.left, StringType), Cast(p.right, 
StringType)))
+  case p: BinaryPredicate if p.left.dataType == DateType
+ p.right.dataType == TimestampType =
+p.makeCopy(Array(Cast(p.left, StringType), Cast(p.right, 
StringType)))
--- End diff --

How about turning `Date`/`Timestamp` comparison to `Long` comparison? 
String and long representations of `Timestamp` are both accurate to seconds.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-10 Thread liancheng
Github user liancheng commented on a diff in the pull request:

https://github.com/apache/spark/pull/2344#discussion_r18696379
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala
 ---
@@ -220,20 +220,44 @@ trait HiveTypeCoercion {
   case a: BinaryArithmetic if a.right.dataType == StringType =
 a.makeCopy(Array(a.left, Cast(a.right, DoubleType)))
 
+  // we should cast all timestamp/date/string compare into string 
compare
+  case p: BinaryPredicate if p.left.dataType == StringType
+ p.right.dataType == DateType =
+p.makeCopy(Array(p.left, Cast(p.right, StringType)))
+  case p: BinaryPredicate if p.left.dataType == DateType
+ p.right.dataType == StringType =
+p.makeCopy(Array(Cast(p.left, StringType), p.right))
   case p: BinaryPredicate if p.left.dataType == StringType
  p.right.dataType == TimestampType =
-p.makeCopy(Array(Cast(p.left, TimestampType), p.right))
+p.makeCopy(Array(p.left, Cast(p.right, StringType)))
   case p: BinaryPredicate if p.left.dataType == TimestampType
  p.right.dataType == StringType =
-p.makeCopy(Array(p.left, Cast(p.right, TimestampType)))
+p.makeCopy(Array(Cast(p.left, StringType), p.right))
+  case p: BinaryPredicate if p.left.dataType == TimestampType
+ p.right.dataType == DateType =
+p.makeCopy(Array(Cast(p.left, StringType), Cast(p.right, 
StringType)))
+  case p: BinaryPredicate if p.left.dataType == DateType
+ p.right.dataType == TimestampType =
+p.makeCopy(Array(Cast(p.left, StringType), Cast(p.right, 
StringType)))
 
   case p: BinaryPredicate if p.left.dataType == StringType  
p.right.dataType != StringType =
 p.makeCopy(Array(Cast(p.left, DoubleType), p.right))
   case p: BinaryPredicate if p.left.dataType != StringType  
p.right.dataType == StringType =
 p.makeCopy(Array(p.left, Cast(p.right, DoubleType)))
 
-  case i @ In(a,b) if a.dataType == TimestampType  
b.forall(_.dataType == StringType) =
-i.makeCopy(Array(a,b.map(Cast(_,TimestampType
+  case i @ In(a, b) if a.dataType == DateType  b.forall(_.dataType 
== StringType) =
+i.makeCopy(Array(Cast(a, StringType), b))
+  case i @ In(a, b) if a.dataType == TimestampType  
b.forall(_.dataType == StringType) =
+i.makeCopy(Array(Cast(a, StringType), b))
+  case i @ In(a, b) if a.dataType == DateType  b.forall(_.dataType 
== TimestampType) =
+i.makeCopy(Array(Cast(a, StringType), b.map(Cast(_, StringType
+  case i @ In(a, b) if a.dataType == TimestampType  
b.forall(_.dataType == DateType) =
+i.makeCopy(Array(Cast(a, StringType), b.map(Cast(_, StringType
+  case i @ In(a, b) if a.dataType == DateType  b.forall(_.dataType 
== DateType) =
+i.makeCopy(Array(Cast(a, StringType), b.map(Cast(_, StringType
+  case i @ In(a, b) if a.dataType == TimestampType  
b.forall(_.dataType == TimestampType) =
+i.makeCopy(Array(Cast(a, StringType), b.map(Cast(_, StringType
--- End diff --

The 4 case clauses above can also use long instead of string (and they 
probably should be moved to some rule other than `PromoteStrings`).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-10 Thread liancheng
Github user liancheng commented on a diff in the pull request:

https://github.com/apache/spark/pull/2344#discussion_r18697748
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
 ---
@@ -56,7 +60,9 @@ case class Cast(child: Expression, dataType: DataType) 
extends UnaryExpression {
 case StringType =
   buildCast[String](_, _.length() != 0)
 case TimestampType =
-  buildCast[Timestamp](_, b = b.getTime() != 0 || b.getNanos() != 0)
+  buildCast[Timestamp](_, t = t.getTime() != 0 || t.getNanos() != 0)
+case DateType =
+  buildCast[Date](_, d = null)
--- End diff --

Leaving a comment here would be good. It's really unintuitive here to see a 
timestamp can be casted to a boolean while a date has to be null.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-10 Thread liancheng
Github user liancheng commented on a diff in the pull request:

https://github.com/apache/spark/pull/2344#discussion_r18698063
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/columnar/ColumnStats.scala ---
@@ -190,6 +190,24 @@ private[sql] class StringColumnStats extends 
ColumnStats {
   def collectedStatistics = Row(lower, upper, nullCount)
 }
 
+private[sql] class DateColumnStats extends ColumnStats {
+  var upper: Date = null
+  var lower: Date = null
+  var nullCount = 0
+
+  override def gatherStats(row: Row, ordinal: Int) {
+if (!row.isNullAt(ordinal)) {
+  val value = row(ordinal).asInstanceOf[Date]
+  if (upper == null || value.toString.compareTo(upper.toString)  0) 
upper = value
+  if (lower == null || value.toString.compareTo(lower.toString)  0) 
lower = value
--- End diff --

Using `.getTime` to do the comparison would be much more efficient. When 
caching a `DateType` column, this function is really a critical path.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-10 Thread liancheng
Github user liancheng commented on a diff in the pull request:

https://github.com/apache/spark/pull/2344#discussion_r18698218
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/columnar/ColumnType.scala ---
@@ -376,8 +395,8 @@ private[sql] sealed abstract class 
ByteArrayColumnType[T : DataType](
   }
 }
 
-private[sql] object BINARY extends ByteArrayColumnType[BinaryType.type](9, 
16) {
-  override def setField(row: MutableRow, ordinal: Int, value: 
Array[Byte]): Unit = {
+private[sql] object BINARY extends 
ByteArrayColumnType[BinaryType.type](10, 16) {
+  override def setField(row: MutableRow, ordinal: Int, value: Array[Byte]) 
{
--- End diff --

Nit: Hmm, actually `: Unit =` is more recommended :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-10 Thread liancheng
Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-58639081
  
Left some minor comments, otherwise LGTM.  Also, do we want to add Python 
API in this PR or a separate one?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-10 Thread adrian-wang
Github user adrian-wang commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-58646718
  
I can see most of the reviews are focused on comparing and ordering. I'd 
like to fix those comparing rules in a separated PR. I tested what you declared 
here the other day, that was also different from what Hive did. Also Python API 
is can be a separated PR.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-10 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-58651729
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21591/consoleFull)
 for   PR 2344 at commit 
[`00fe81f`](https://github.com/apache/spark/commit/00fe81f5586118694642422f94846f48e6ca50fc).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-10 Thread adrian-wang
Github user adrian-wang commented on a diff in the pull request:

https://github.com/apache/spark/pull/2344#discussion_r18704345
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala
 ---
@@ -220,20 +220,44 @@ trait HiveTypeCoercion {
   case a: BinaryArithmetic if a.right.dataType == StringType =
 a.makeCopy(Array(a.left, Cast(a.right, DoubleType)))
 
+  // we should cast all timestamp/date/string compare into string 
compare
+  case p: BinaryPredicate if p.left.dataType == StringType
+ p.right.dataType == DateType =
+p.makeCopy(Array(p.left, Cast(p.right, StringType)))
+  case p: BinaryPredicate if p.left.dataType == DateType
+ p.right.dataType == StringType =
+p.makeCopy(Array(Cast(p.left, StringType), p.right))
   case p: BinaryPredicate if p.left.dataType == StringType
  p.right.dataType == TimestampType =
-p.makeCopy(Array(Cast(p.left, TimestampType), p.right))
+p.makeCopy(Array(p.left, Cast(p.right, StringType)))
   case p: BinaryPredicate if p.left.dataType == TimestampType
  p.right.dataType == StringType =
-p.makeCopy(Array(p.left, Cast(p.right, TimestampType)))
+p.makeCopy(Array(Cast(p.left, StringType), p.right))
+  case p: BinaryPredicate if p.left.dataType == TimestampType
+ p.right.dataType == DateType =
+p.makeCopy(Array(Cast(p.left, StringType), Cast(p.right, 
StringType)))
+  case p: BinaryPredicate if p.left.dataType == DateType
+ p.right.dataType == TimestampType =
+p.makeCopy(Array(Cast(p.left, StringType), Cast(p.right, 
StringType)))
 
   case p: BinaryPredicate if p.left.dataType == StringType  
p.right.dataType != StringType =
 p.makeCopy(Array(Cast(p.left, DoubleType), p.right))
   case p: BinaryPredicate if p.left.dataType != StringType  
p.right.dataType == StringType =
 p.makeCopy(Array(p.left, Cast(p.right, DoubleType)))
 
-  case i @ In(a,b) if a.dataType == TimestampType  
b.forall(_.dataType == StringType) =
-i.makeCopy(Array(a,b.map(Cast(_,TimestampType
+  case i @ In(a, b) if a.dataType == DateType  b.forall(_.dataType 
== StringType) =
+i.makeCopy(Array(Cast(a, StringType), b))
+  case i @ In(a, b) if a.dataType == TimestampType  
b.forall(_.dataType == StringType) =
+i.makeCopy(Array(Cast(a, StringType), b))
+  case i @ In(a, b) if a.dataType == DateType  b.forall(_.dataType 
== TimestampType) =
+i.makeCopy(Array(Cast(a, StringType), b.map(Cast(_, StringType
+  case i @ In(a, b) if a.dataType == TimestampType  
b.forall(_.dataType == DateType) =
+i.makeCopy(Array(Cast(a, StringType), b.map(Cast(_, StringType
+  case i @ In(a, b) if a.dataType == DateType  b.forall(_.dataType 
== DateType) =
+i.makeCopy(Array(Cast(a, StringType), b.map(Cast(_, StringType
+  case i @ In(a, b) if a.dataType == TimestampType  
b.forall(_.dataType == TimestampType) =
+i.makeCopy(Array(Cast(a, StringType), b.map(Cast(_, StringType
--- End diff --

Using long value of `java.sql.Date` is way too far from Hive. I have 
discussed with @chenghao-intel and We think we can leave it as it is and fix 
the ordering and comparing in a separate PR.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-10 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-58652251
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21592/consoleFull)
 for   PR 2344 at commit 
[`2038085`](https://github.com/apache/spark/commit/203808565ccb6f1e503b4feb2a8eac4fe9b59408).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-10 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-58654642
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21593/consoleFull)
 for   PR 2344 at commit 
[`f15074a`](https://github.com/apache/spark/commit/f15074a614281d3fe4de4f0529ddc53994b4c0d9).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-10 Thread adrian-wang
Github user adrian-wang commented on a diff in the pull request:

https://github.com/apache/spark/pull/2344#discussion_r18705174
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala
 ---
@@ -220,20 +220,44 @@ trait HiveTypeCoercion {
   case a: BinaryArithmetic if a.right.dataType == StringType =
 a.makeCopy(Array(a.left, Cast(a.right, DoubleType)))
 
+  // we should cast all timestamp/date/string compare into string 
compare
+  case p: BinaryPredicate if p.left.dataType == StringType
+ p.right.dataType == DateType =
+p.makeCopy(Array(p.left, Cast(p.right, StringType)))
+  case p: BinaryPredicate if p.left.dataType == DateType
+ p.right.dataType == StringType =
+p.makeCopy(Array(Cast(p.left, StringType), p.right))
   case p: BinaryPredicate if p.left.dataType == StringType
  p.right.dataType == TimestampType =
-p.makeCopy(Array(Cast(p.left, TimestampType), p.right))
+p.makeCopy(Array(p.left, Cast(p.right, StringType)))
   case p: BinaryPredicate if p.left.dataType == TimestampType
  p.right.dataType == StringType =
-p.makeCopy(Array(p.left, Cast(p.right, TimestampType)))
+p.makeCopy(Array(Cast(p.left, StringType), p.right))
+  case p: BinaryPredicate if p.left.dataType == TimestampType
+ p.right.dataType == DateType =
+p.makeCopy(Array(Cast(p.left, StringType), Cast(p.right, 
StringType)))
+  case p: BinaryPredicate if p.left.dataType == DateType
+ p.right.dataType == TimestampType =
+p.makeCopy(Array(Cast(p.left, StringType), Cast(p.right, 
StringType)))
--- End diff --

It seems cast('1970-01-01' as date)  cast('1970-01-01 00:00:00' as 
timestamp)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-58655469
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21591/Test 
FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-10 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-58655457
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21591/consoleFull)
 for   PR 2344 at commit 
[`00fe81f`](https://github.com/apache/spark/commit/00fe81f5586118694642422f94846f48e6ca50fc).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class Cast(child: Expression, dataType: DataType) extends 
UnaryExpression with Logging `
  * `public class DateType extends DataType `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-10 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-58656118
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21592/consoleFull)
 for   PR 2344 at commit 
[`2038085`](https://github.com/apache/spark/commit/203808565ccb6f1e503b4feb2a8eac4fe9b59408).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class Cast(child: Expression, dataType: DataType) extends 
UnaryExpression with Logging `
  * `public class DateType extends DataType `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-58656123
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21592/Test 
FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-10 Thread adrian-wang
Github user adrian-wang commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-58657404
  
Seems we are getting something wrong with orc in jenkins... My local test 
is fine.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-10 Thread liancheng
Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-58657652
  
I guess this one is related: 
https://github.com/apache/spark/commit/411cf29fff011561f0093bb6101af87842828369


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-58658721
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21593/Test 
FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-10 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-58658714
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21593/consoleFull)
 for   PR 2344 at commit 
[`f15074a`](https://github.com/apache/spark/commit/f15074a614281d3fe4de4f0529ddc53994b4c0d9).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class Cast(child: Expression, dataType: DataType) extends 
UnaryExpression with Logging `
  * `public class DateType extends DataType `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-10 Thread adrian-wang
Github user adrian-wang commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-58659730
  
retest this please.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-10 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-58660519
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21594/consoleFull)
 for   PR 2344 at commit 
[`f15074a`](https://github.com/apache/spark/commit/f15074a614281d3fe4de4f0529ddc53994b4c0d9).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-10 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-58665073
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21594/consoleFull)
 for   PR 2344 at commit 
[`f15074a`](https://github.com/apache/spark/commit/f15074a614281d3fe4de4f0529ddc53994b4c0d9).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class Cast(child: Expression, dataType: DataType) extends 
UnaryExpression with Logging `
  * `public class DateType extends DataType `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-58665080
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21594/Test 
FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-10 Thread adrian-wang
Github user adrian-wang commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-58735806
  
retest this please.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-10 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-58735889
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21615/consoleFull)
 for   PR 2344 at commit 
[`f15074a`](https://github.com/apache/spark/commit/f15074a614281d3fe4de4f0529ddc53994b4c0d9).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-58736899
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21615/Test 
PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-10 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-58736896
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21615/consoleFull)
 for   PR 2344 at commit 
[`f15074a`](https://github.com/apache/spark/commit/f15074a614281d3fe4de4f0529ddc53994b4c0d9).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-07 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-58299209
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21442/consoleFull)
 for   PR 2344 at commit 
[`0c339b0`](https://github.com/apache/spark/commit/0c339b0b4b5572327ff1d61d9c441dabde6d1b45).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-07 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-58303274
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21442/consoleFull)
 for   PR 2344 at commit 
[`0c339b0`](https://github.com/apache/spark/commit/0c339b0b4b5572327ff1d61d9c441dabde6d1b45).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class Cast(child: Expression, dataType: DataType) extends 
UnaryExpression with Logging `
  * `public class DateType extends DataType `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-58303279
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21442/Test 
PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-01 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-57550528
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/238/consoleFull)
 for   PR 2344 at commit 
[`f4058ab`](https://github.com/apache/spark/commit/f4058ab1a185b3dc3fb5fe1522ef5b481601d873).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-01 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-57554545
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/238/consoleFull)
 for   PR 2344 at commit 
[`f4058ab`](https://github.com/apache/spark/commit/f4058ab1a185b3dc3fb5fe1522ef5b481601d873).
 * This patch **fails** unit tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-01 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-57555702
  
I'm fine with finishing this in another PR, but can you remove the failing 
test from the whitelist?  Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-28 Thread adrian-wang
Github user adrian-wang commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-57079061
  
Hive is really strange dealing with date, there's still some inconsistency 
between, but I think it is better to update in following PRs.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-28 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-57079131
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20933/consoleFull)
 for   PR 2344 at commit 
[`f4058ab`](https://github.com/apache/spark/commit/f4058ab1a185b3dc3fb5fe1522ef5b481601d873).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-28 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-57081729
  
**[Tests timed 
out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20933/consoleFull)**
 after a configured wait of `120m`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-57081730
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/20933/


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-27 Thread adrian-wang
Github user adrian-wang commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-57057116
  
@marmbrus , just in case it will be out of date soon...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-27 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/2344#discussion_r18125250
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala
 ---
@@ -220,20 +220,52 @@ trait HiveTypeCoercion {
   case a: BinaryArithmetic if a.right.dataType == StringType =
 a.makeCopy(Array(a.left, Cast(a.right, DoubleType)))
 
+  // we should cast all timestamp/date/string compare into string 
compare,
+  // even if both sides are of same type, as Hive use xxxwritable to 
compare.
--- End diff --

Can you explain this more?  It seems more expensive to convert to strings 
and then compare strings instead of just comparing the underlying types.  What 
does writables have to do with this?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-27 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-57062345
  
Sorry for the delay, this week has been very busy!  I'd like to merge this 
soon, only one small question.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-27 Thread adrian-wang
Github user adrian-wang commented on a diff in the pull request:

https://github.com/apache/spark/pull/2344#discussion_r18126726
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala
 ---
@@ -220,20 +220,52 @@ trait HiveTypeCoercion {
   case a: BinaryArithmetic if a.right.dataType == StringType =
 a.makeCopy(Array(a.left, Cast(a.right, DoubleType)))
 
+  // we should cast all timestamp/date/string compare into string 
compare,
+  // even if both sides are of same type, as Hive use xxxwritable to 
compare.
--- End diff --

I considered this question again and now think when comparing same types, 
it is better not to convert to string but write `compareTo` methods for them, 
since the native `compareTo` method of `java.sql.date` seems not consistent 
with `DateWritable`. I'll do a quick follow up.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-27 Thread adrian-wang
Github user adrian-wang commented on a diff in the pull request:

https://github.com/apache/spark/pull/2344#discussion_r18126862
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala
 ---
@@ -220,20 +220,52 @@ trait HiveTypeCoercion {
   case a: BinaryArithmetic if a.right.dataType == StringType =
 a.makeCopy(Array(a.left, Cast(a.right, DoubleType)))
 
+  // we should cast all timestamp/date/string compare into string 
compare,
+  // even if both sides are of same type, as Hive use xxxwritable to 
compare.
--- End diff --

The native `compareTo` of both `java.sql.Date` point to `java.util.Date`, 
which compares the millis since epoch, but `DateWritable` compares days since 
epoch, so here is the gap.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-23 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-56575927
  
Yeah, I think this is probably ready to merge.  Mind updating / resolving 
conflicts?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-23 Thread adrian-wang
Github user adrian-wang commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-56616986
  
@marmbrus , I have updated on current master and run sbt tests locally.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-23 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-56617051
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20737/consoleFull)
 for   PR 2344 at commit 
[`be6a958`](https://github.com/apache/spark/commit/be6a958284932b0685656869b6586f1159cdb599).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-23 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-56619774
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/20737/


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-23 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-56619771
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20737/consoleFull)
 for   PR 2344 at commit 
[`be6a958`](https://github.com/apache/spark/commit/be6a958284932b0685656869b6586f1159cdb599).
 * This patch **passes** unit tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class Cast(child: Expression, dataType: DataType) extends 
UnaryExpression with Logging `
  * `public class DateType extends DataType `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-22 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-56332589
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20647/consoleFull)
 for   PR 2344 at commit 
[`8e91db0`](https://github.com/apache/spark/commit/8e91db07e5df95fa7962be892185e5430d17b793).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-22 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-56336063
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20647/consoleFull)
 for   PR 2344 at commit 
[`8e91db0`](https://github.com/apache/spark/commit/8e91db07e5df95fa7962be892185e5430d17b793).
 * This patch **passes** unit tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class Cast(child: Expression, dataType: DataType) extends 
UnaryExpression with Logging `
  * `public class DateType extends DataType `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-22 Thread chenghao-intel
Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-56342088
  
@adrian-wang clarified some of my concerns offline, I think generally it's 
ready to be merged. @marmbrus , @liancheng , any more comments on this?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-56327106
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20640/consoleFull)
 for   PR 2344 at commit 
[`e3b12af`](https://github.com/apache/spark/commit/e3b12afca1a06ccf37d032187f7ff60f19793fef).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-56327272
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20642/consoleFull)
 for   PR 2344 at commit 
[`92b31b4`](https://github.com/apache/spark/commit/92b31b4ef9f86ff05c0d0429eb382e27da1fe18e).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-21 Thread adrian-wang
Github user adrian-wang commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-56327262
  
have rebased and addressed comments from @chenghao-intel .


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-21 Thread adrian-wang
Github user adrian-wang commented on a diff in the pull request:

https://github.com/apache/spark/pull/2344#discussion_r17832080
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/dataTypes.scala 
---
@@ -170,6 +170,18 @@ case object TimestampType extends NativeType {
   def simpleString: String = timestamp
 }
 
+case object DateType extends NativeType {
+  private[sql] type JvmType = Date
+
+  @transient private[sql] lazy val tag = ScalaReflectionLock.synchronized 
{ typeTag[JvmType] }
+
+  private[sql] val ordering = new Ordering[JvmType] {
+def compare(x: Date, y: Date) = x.compareTo(y)
--- End diff --

This function is only used for ordering, not for data comparison.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-56328916
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20640/consoleFull)
 for   PR 2344 at commit 
[`e3b12af`](https://github.com/apache/spark/commit/e3b12afca1a06ccf37d032187f7ff60f19793fef).
 * This patch **passes** unit tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class Cast(child: Expression, dataType: DataType) extends 
UnaryExpression with Logging `
  * `public class DateType extends DataType `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-56329240
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20642/consoleFull)
 for   PR 2344 at commit 
[`92b31b4`](https://github.com/apache/spark/commit/92b31b4ef9f86ff05c0d0429eb382e27da1fe18e).
 * This patch **passes** unit tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class Cast(child: Expression, dataType: DataType) extends 
UnaryExpression with Logging `
  * `abstract class LogicalPlan extends QueryPlan[LogicalPlan] with Logging 
`
  * `public class DateType extends DataType `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-18 Thread adrian-wang
Github user adrian-wang commented on a diff in the pull request:

https://github.com/apache/spark/pull/2344#discussion_r17711980
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala
 ---
@@ -221,19 +221,35 @@ trait HiveTypeCoercion {
 a.makeCopy(Array(a.left, Cast(a.right, DoubleType)))
 
   case p: BinaryPredicate if p.left.dataType == StringType
+ p.right.dataType == DateType =
+p.makeCopy(Array(Cast(p.left, DateType), p.right))
+  case p: BinaryPredicate if p.left.dataType == DateType
+ p.right.dataType == StringType =
+p.makeCopy(Array(p.left, Cast(p.right, DateType)))
+  case p: BinaryPredicate if p.left.dataType == StringType
  p.right.dataType == TimestampType =
 p.makeCopy(Array(Cast(p.left, TimestampType), p.right))
   case p: BinaryPredicate if p.left.dataType == TimestampType
  p.right.dataType == StringType =
 p.makeCopy(Array(p.left, Cast(p.right, TimestampType)))
+  case p: BinaryPredicate if p.left.dataType == TimestampType
+ p.right.dataType == DateType =
+p.makeCopy(Array(Cast(p.left, DateType), p.right))
--- End diff --

To get the same result from hive, seems we have to change everything to 
string here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-18 Thread adrian-wang
Github user adrian-wang commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-56018809
  
Though Hive would throw exception when cast int to date, I think it doesn't 
matter if not obey, because query exit with exception is considered a failure 
in `runHive`. Let's keep cast int to date as null, exactly as what hive does 
with cast string to int.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-18 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-56018966
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20537/consoleFull)
 for   PR 2344 at commit 
[`458c113`](https://github.com/apache/spark/commit/458c11394c2e8788cee7664ed8c6f094a3df9f97).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-18 Thread adrian-wang
Github user adrian-wang commented on a diff in the pull request:

https://github.com/apache/spark/pull/2344#discussion_r17718951
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
 ---
@@ -177,6 +223,8 @@ case class Cast(child: Expression, dataType: DataType) 
extends UnaryExpression {
   })
 case BooleanType =
   buildCast[Boolean](_, b = if (b) 1.toShort else 0.toShort)
+case DateType =
--- End diff --

Hive returns NULL when cast from DATE to INT., etc


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-18 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-56022043
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20537/consoleFull)
 for   PR 2344 at commit 
[`458c113`](https://github.com/apache/spark/commit/458c11394c2e8788cee7664ed8c6f094a3df9f97).
 * This patch **fails** unit tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class JavaSparkContext(val sc: SparkContext)`
  * `  class ArrayConstructor extends 
net.razorvine.pickle.objects.ArrayConstructor `
  * `  throw new IllegalStateException(The main method in the given 
main class must be static)`
  * `class TaskCompletionListenerException(errorMessages: Seq[String]) 
extends Exception `
  * `class NonASCIICharacterChecker extends ScalariformChecker `
  * `class Dummy(object):`
  * `class RatingDeserializer(FramedSerializer):`
  * `class SCCallSiteSync(object):`
  * `case class Cast(child: Expression, dataType: DataType) extends 
UnaryExpression with Logging `
  * `case class CreateTableAsSelect(`
  * `public class DateType extends DataType `
  * `  class Encoder[T : NativeType](columnType: NativeColumnType[T]) 
extends compression.Encoder[T] `
  * `  class Encoder[T : NativeType](columnType: NativeColumnType[T]) 
extends compression.Encoder[T] `
  * `  class Encoder[T : NativeType](columnType: NativeColumnType[T]) 
extends compression.Encoder[T] `
  * `  class Encoder extends compression.Encoder[IntegerType.type] `
  * `  class Decoder(buffer: ByteBuffer, columnType: 
NativeColumnType[IntegerType.type])`
  * `  class Encoder extends compression.Encoder[LongType.type] `
  * `  class Decoder(buffer: ByteBuffer, columnType: 
NativeColumnType[LongType.type])`
  * `case class CreateTableAsSelect(`
  * `class JavaStreamingContext(val ssc: StreamingContext) extends 
Closeable `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-18 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-56028120
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20539/consoleFull)
 for   PR 2344 at commit 
[`252e84a`](https://github.com/apache/spark/commit/252e84a185191be5bb07681561e4f16c80fc8bf9).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-18 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-56032559
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20539/consoleFull)
 for   PR 2344 at commit 
[`252e84a`](https://github.com/apache/spark/commit/252e84a185191be5bb07681561e4f16c80fc8bf9).
 * This patch **passes** unit tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class Cast(child: Expression, dataType: DataType) extends 
UnaryExpression with Logging `
  * `public class DateType extends DataType `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-18 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/2344#discussion_r17765755
  
--- Diff: 
sql/core/src/main/java/org/apache/spark/sql/api/java/DateType.java ---
@@ -0,0 +1,27 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the License); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.api.java;
+
+/**
+ * The data type representing java.sql.Timestamp values.
+ *
+ * {@code TimestampType} is represented by the singleton object {@link 
DataType#TimestampType}.
--- End diff --

Update the comment, please. :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-18 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/2344#discussion_r17765733
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/dataTypes.scala 
---
@@ -170,6 +170,18 @@ case object TimestampType extends NativeType {
   def simpleString: String = timestamp
 }
 
+case object DateType extends NativeType {
+  private[sql] type JvmType = Date
+
+  @transient private[sql] lazy val tag = ScalaReflectionLock.synchronized 
{ typeTag[JvmType] }
+
+  private[sql] val ordering = new Ordering[JvmType] {
+def compare(x: Date, y: Date) = x.compareTo(y)
--- End diff --

I've checked the logic of `java.sql.Date`.`compareTo`, and it is not the 
same as `DateWritable`.`compareTo`, which is the internal representation in 
Hive. The former will compare its milliseconds, but the later only compare the 
`days since Epoch`, probably we need to follow the same semantic with Hive here.
BTW: If we change the logic here, does that also mean we needn't cast the 
`date` to `string` for `BinaryPredicate` in `HiveTypeCoercion`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-18 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/2344#discussion_r17765726
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/dataTypes.scala 
---
@@ -170,6 +170,18 @@ case object TimestampType extends NativeType {
   def simpleString: String = timestamp
 }
 
+case object DateType extends NativeType {
+  private[sql] type JvmType = Date
+
+  @transient private[sql] lazy val tag = ScalaReflectionLock.synchronized 
{ typeTag[JvmType] }
+
+  private[sql] val ordering = new Ordering[JvmType] {
+def compare(x: Date, y: Date) = x.compareTo(y)
--- End diff --

I've checked the logic of `java.sql.Date`.`compareTo`, and it is not the 
same as `DateWritable`.`compareTo`, which is the internal representation in 
Hive. The former will compare its milliseconds, but the later only compare the 
`days since Epoch`, probably we need to follow the same semantic with Hive here.
BTW: If we change the logic here, does that also mean we needn't cast the 
`date` to `string` for `BinaryPredicate` in `HiveTypeCoercion`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-18 Thread adrian-wang
Github user adrian-wang commented on a diff in the pull request:

https://github.com/apache/spark/pull/2344#discussion_r17766373
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/dataTypes.scala 
---
@@ -170,6 +170,18 @@ case object TimestampType extends NativeType {
   def simpleString: String = timestamp
 }
 
+case object DateType extends NativeType {
+  private[sql] type JvmType = Date
+
+  @transient private[sql] lazy val tag = ScalaReflectionLock.synchronized 
{ typeTag[JvmType] }
+
+  private[sql] val ordering = new Ordering[JvmType] {
+def compare(x: Date, y: Date) = x.compareTo(y)
--- End diff --

Do we also need to modify `compareTo` from `TimestampType`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-55870259
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20463/consoleFull)
 for   PR 2344 at commit 
[`413f946`](https://github.com/apache/spark/commit/413f946ea2d9488b08186307bed550fa298a8a23).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-55877253
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20463/consoleFull)
 for   PR 2344 at commit 
[`413f946`](https://github.com/apache/spark/commit/413f946ea2d9488b08186307bed550fa298a8a23).
 * This patch **fails** unit tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class Cast(child: Expression, dataType: DataType) extends 
UnaryExpression with Logging `
  * `public class DateType extends DataType `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-17 Thread adrian-wang
Github user adrian-wang commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-55889454
  
Just rebase code, the failure is in spark-core and not in this patch.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-17 Thread liancheng
Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-55967788
  
test this please


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-55968540
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20495/consoleFull)
 for   PR 2344 at commit 
[`413f946`](https://github.com/apache/spark/commit/413f946ea2d9488b08186307bed550fa298a8a23).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-55969213
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20495/consoleFull)
 for   PR 2344 at commit 
[`413f946`](https://github.com/apache/spark/commit/413f946ea2d9488b08186307bed550fa298a8a23).
 * This patch **fails** unit tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class Cast(child: Expression, dataType: DataType) extends 
UnaryExpression with Logging `
  * `public class DateType extends DataType `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-17 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/2344#discussion_r17703765
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala
 ---
@@ -221,19 +221,35 @@ trait HiveTypeCoercion {
 a.makeCopy(Array(a.left, Cast(a.right, DoubleType)))
 
   case p: BinaryPredicate if p.left.dataType == StringType
+ p.right.dataType == DateType =
+p.makeCopy(Array(Cast(p.left, DateType), p.right))
+  case p: BinaryPredicate if p.left.dataType == DateType
+ p.right.dataType == StringType =
+p.makeCopy(Array(p.left, Cast(p.right, DateType)))
+  case p: BinaryPredicate if p.left.dataType == StringType
  p.right.dataType == TimestampType =
 p.makeCopy(Array(Cast(p.left, TimestampType), p.right))
   case p: BinaryPredicate if p.left.dataType == TimestampType
  p.right.dataType == StringType =
 p.makeCopy(Array(p.left, Cast(p.right, TimestampType)))
+  case p: BinaryPredicate if p.left.dataType == TimestampType
+ p.right.dataType == DateType =
+p.makeCopy(Array(Cast(p.left, DateType), p.right))
--- End diff --

I didn't check this in Hive, but probably it's better to convert the `Date` 
= `Timestamp`, since `Timestamp` is more precise. The same for the rule `In` .


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-17 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/2344#discussion_r17703775
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala
 ---
@@ -221,19 +221,35 @@ trait HiveTypeCoercion {
 a.makeCopy(Array(a.left, Cast(a.right, DoubleType)))
 
   case p: BinaryPredicate if p.left.dataType == StringType
+ p.right.dataType == DateType =
+p.makeCopy(Array(Cast(p.left, DateType), p.right))
+  case p: BinaryPredicate if p.left.dataType == DateType
+ p.right.dataType == StringType =
+p.makeCopy(Array(p.left, Cast(p.right, DateType)))
+  case p: BinaryPredicate if p.left.dataType == StringType
  p.right.dataType == TimestampType =
 p.makeCopy(Array(Cast(p.left, TimestampType), p.right))
   case p: BinaryPredicate if p.left.dataType == TimestampType
  p.right.dataType == StringType =
 p.makeCopy(Array(p.left, Cast(p.right, TimestampType)))
+  case p: BinaryPredicate if p.left.dataType == TimestampType
+ p.right.dataType == DateType =
+p.makeCopy(Array(Cast(p.left, DateType), p.right))
+  case p: BinaryPredicate if p.left.dataType == DateType
+ p.right.dataType == TimestampType =
+p.makeCopy(Array(p.left, Cast(p.right, DateType)))
 
   case p: BinaryPredicate if p.left.dataType == StringType  
p.right.dataType != StringType =
 p.makeCopy(Array(Cast(p.left, DoubleType), p.right))
   case p: BinaryPredicate if p.left.dataType != StringType  
p.right.dataType == StringType =
 p.makeCopy(Array(p.left, Cast(p.right, DoubleType)))
 
+  case i @ In(a,b) if a.dataType == DateType  b.forall(_.dataType == 
StringType) =
+i.makeCopy(Array(a,b.map(Cast(_,DateType
--- End diff --

Nit: space after `,`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-17 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/2344#discussion_r17703854
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
 ---
@@ -95,6 +100,8 @@ case class Cast(child: Expression, dataType: DataType) 
extends UnaryExpression {
   buildCast[Short](_, s = new Timestamp(s * 1000))
 case ByteType =
   buildCast[Byte](_, b = new Timestamp(b * 1000))
+case DateType =
+  buildCast[Date](_, d = Timestamp.valueOf(dateToString(d) +  
00:00:00))
--- End diff --

Will that be more efficient `new Timestamp(d.getTime())`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-17 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/2344#discussion_r17703950
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
 ---
@@ -140,6 +147,39 @@ case class Cast(child: Expression, dataType: DataType) 
extends UnaryExpression {
 }
   }
 
+  // Converts Timestamp to string according to Hive TimestampWritable 
convention
+  private[this] def timestampToDateString(ts: Timestamp): String = {
+Cast.threadLocalDateFormat.get.format(ts)
+  }
+
+  // DateConverter
+  private[this] def castToDate: Any = Any = child.dataType match {
+case StringType =
+  buildCast[String](_, s = if (s.contains( )) {
+try castToDate(castToTimestamp(s))
+catch { case _: java.lang.IllegalArgumentException = null }
+  } else {
+try Date.valueOf(s) catch { case _: 
java.lang.IllegalArgumentException = null }
+  })
+case TimestampType =
+  buildCast[Timestamp](_, t = Date.valueOf(timestampToDateString(t)))
+// TimestampWritable.decimalToDate
+case _ =
--- End diff --

Nit: Keep in the same line.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-17 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/2344#discussion_r17704012
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
 ---
@@ -140,6 +147,39 @@ case class Cast(child: Expression, dataType: DataType) 
extends UnaryExpression {
 }
   }
 
+  // Converts Timestamp to string according to Hive TimestampWritable 
convention
+  private[this] def timestampToDateString(ts: Timestamp): String = {
+Cast.threadLocalDateFormat.get.format(ts)
+  }
+
+  // DateConverter
+  private[this] def castToDate: Any = Any = child.dataType match {
--- End diff --

We probably need to compare the date conversion with Hive, 
https://github.com/apache/hive/blob/branch-0.12/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorUtils.java#L1032

Particularly for `null` returns or throws exception.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-17 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/2344#discussion_r17704067
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
 ---
@@ -140,6 +147,39 @@ case class Cast(child: Expression, dataType: DataType) 
extends UnaryExpression {
 }
   }
 
+  // Converts Timestamp to string according to Hive TimestampWritable 
convention
+  private[this] def timestampToDateString(ts: Timestamp): String = {
+Cast.threadLocalDateFormat.get.format(ts)
+  }
+
+  // DateConverter
+  private[this] def castToDate: Any = Any = child.dataType match {
+case StringType =
+  buildCast[String](_, s = if (s.contains( )) {
+try castToDate(castToTimestamp(s))
+catch { case _: java.lang.IllegalArgumentException = null }
+  } else {
+try Date.valueOf(s) catch { case _: 
java.lang.IllegalArgumentException = null }
+  })
+case TimestampType =
+  buildCast[Timestamp](_, t = Date.valueOf(timestampToDateString(t)))
+// TimestampWritable.decimalToDate
+case _ =
+  _ = null
+  }
+
+  // Date cannot be cast to long, according to hive
+  private[this] def dateToLong(d: Date) = null
--- End diff --

Throw exception instead? Seems Hive does that.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-17 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/2344#discussion_r17704236
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
 ---
@@ -177,6 +223,8 @@ case class Cast(child: Expression, dataType: DataType) 
extends UnaryExpression {
   })
 case BooleanType =
   buildCast[Boolean](_, b = if (b) 1.toShort else 0.toShort)
+case DateType =
--- End diff --

If `Datetype` can not be casted into `IntegerType` / `ShortType` / 
`LongType`, let's remove this, raise exception in `compile` time probably 
better than in `runtime`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-17 Thread chenghao-intel
Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-55983850
  
A few comments on returns null V.S. raise exception in `Cast`, and which is 
the more wider type for `Date` and `Timestamp`, the other LGTM.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-17 Thread adrian-wang
Github user adrian-wang commented on a diff in the pull request:

https://github.com/apache/spark/pull/2344#discussion_r17710004
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
 ---
@@ -95,6 +100,8 @@ case class Cast(child: Expression, dataType: DataType) 
extends UnaryExpression {
   buildCast[Short](_, s = new Timestamp(s * 1000))
 case ByteType =
   buildCast[Byte](_, b = new Timestamp(b * 1000))
+case DateType =
+  buildCast[Date](_, d = Timestamp.valueOf(dateToString(d) +  
00:00:00))
--- End diff --

If the date `d` is polluted, for example `d.setTime(d.getTime + 1)`, this 
would be in a turmoil. I'm not sure if there's any way to get into that. Maybe 
before that really occurs, we can just use `new Timestamp(d.getTime)`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-17 Thread adrian-wang
Github user adrian-wang commented on a diff in the pull request:

https://github.com/apache/spark/pull/2344#discussion_r17710025
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
 ---
@@ -140,6 +147,39 @@ case class Cast(child: Expression, dataType: DataType) 
extends UnaryExpression {
 }
   }
 
+  // Converts Timestamp to string according to Hive TimestampWritable 
convention
+  private[this] def timestampToDateString(ts: Timestamp): String = {
+Cast.threadLocalDateFormat.get.format(ts)
+  }
+
+  // DateConverter
+  private[this] def castToDate: Any = Any = child.dataType match {
+case StringType =
+  buildCast[String](_, s = if (s.contains( )) {
+try castToDate(castToTimestamp(s))
+catch { case _: java.lang.IllegalArgumentException = null }
+  } else {
+try Date.valueOf(s) catch { case _: 
java.lang.IllegalArgumentException = null }
+  })
+case TimestampType =
+  buildCast[Timestamp](_, t = Date.valueOf(timestampToDateString(t)))
+// TimestampWritable.decimalToDate
+case _ =
+  _ = null
+  }
+
+  // Date cannot be cast to long, according to hive
+  private[this] def dateToLong(d: Date) = null
--- End diff --

Hive returns null if `cast(cast('2014-09-18' as date) as int)`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-10 Thread adrian-wang
GitHub user adrian-wang opened a pull request:

https://github.com/apache/spark/pull/2344

[SPARK-3407][SQL]Add Date type support



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/adrian-wang/spark date

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/2344.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2344


commit 8edb30dfa0ffbdc18f87ce042badd8e52fe7c625
Author: Daoyuan Wang daoyuan.w...@intel.com
Date:   2014-09-10T11:42:00Z

support date type

commit ce576b3730fb5942f78ebc7f3ee6472777d75fbd
Author: Daoyuan Wang daoyuan.w...@intel.com
Date:   2014-09-10T12:00:01Z

set test list




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-10 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-55110094
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20103/consoleFull)
 for   PR 2344 at commit 
[`ce576b3`](https://github.com/apache/spark/commit/ce576b3730fb5942f78ebc7f3ee6472777d75fbd).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-10 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-55123057
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20103/consoleFull)
 for   PR 2344 at commit 
[`ce576b3`](https://github.com/apache/spark/commit/ce576b3730fb5942f78ebc7f3ee6472777d75fbd).
 * This patch **fails** unit tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class Cast(child: Expression, dataType: DataType) extends 
UnaryExpression with Logging `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-09-10 Thread adrian-wang
Github user adrian-wang commented on the pull request:

https://github.com/apache/spark/pull/2344#issuecomment-55222163
  
Ha, first pass in recent days!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org