[GitHub] spark pull request: [SPARK-4293][SQL] Make Cast be able to handle ...

2014-12-11 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/3150


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-4293][SQL] Make Cast be able to handle ...

2014-11-26 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3150#issuecomment-64673009
  
  [Test build #23899 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23899/consoleFull) for PR 3150 at commit [`e935939`](https://github.com/apache/spark/commit/e935939ac829ecaa887df4bcbb6c65027876a210).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4293][SQL] Make Cast be able to handle ...

2014-11-26 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3150#issuecomment-64682672
  
  [Test build #23899 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23899/consoleFull) for PR 3150 at commit [`e935939`](https://github.com/apache/spark/commit/e935939ac829ecaa887df4bcbb6c65027876a210).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4293][SQL] Make Cast be able to handle ...

2014-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3150#issuecomment-64682678
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23899/





[GitHub] spark pull request: [SPARK-4293][SQL] Make Cast be able to handle ...

2014-11-14 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3150#issuecomment-63158911
  
  [Test build #23408 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23408/consoleFull) for PR 3150 at commit [`ba14003`](https://github.com/apache/spark/commit/ba14003fedbc13db8b40b1712070ae1ed44972f8).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4293][SQL] Make Cast be able to handle ...

2014-11-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3150#issuecomment-63160799
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23408/





[GitHub] spark pull request: [SPARK-4293][SQL] Make Cast be able to handle ...

2014-11-14 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3150#issuecomment-63160796
  
  [Test build #23408 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23408/consoleFull) for PR 3150 at commit [`ba14003`](https://github.com/apache/spark/commit/ba14003fedbc13db8b40b1712070ae1ed44972f8).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4293][SQL] Make Cast be able to handle ...

2014-11-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3150#issuecomment-62251997
  
  [Test build #23096 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23096/consoleFull) for PR 3150 at commit [`8999868`](https://github.com/apache/spark/commit/89998684122af72482e8c1c2d22198dfc66aa4d4).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4293][SQL] Make Cast be able to handle ...

2014-11-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3150#issuecomment-62253734
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23096/





[GitHub] spark pull request: [SPARK-4293][SQL] Make Cast be able to handle ...

2014-11-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3150#issuecomment-62253732
  
  [Test build #23096 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23096/consoleFull) for PR 3150 at commit [`8999868`](https://github.com/apache/spark/commit/89998684122af72482e8c1c2d22198dfc66aa4d4).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4293][SQL] Make Cast be able to handle ...

2014-11-07 Thread chenghao-intel
Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/3150#issuecomment-62115143
  
It looks good to me in general, and I like the idea of collecting the
convertible data type checks in one place. At the same time, I am a little
afraid it might be error-prone for future maintenance, or when a new data type
is added. Or can we remove the `resolve` method?





[GitHub] spark pull request: [SPARK-4293][SQL] Make Cast be able to handle ...

2014-11-07 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/3150#discussion_r20001109
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ---
@@ -37,8 +42,62 @@ case class Cast(child: Expression, dataType: DataType) extends UnaryExpression w
     case (BooleanType, DateType)      => true
     case (DateType, _: NumericType)   => true
     case (DateType, BooleanType)      => true
-    case (_, DecimalType.Fixed(_, _)) => true  // TODO: not all upcasts here can really give null
-    case _                            => child.nullable
+    case (_, DecimalType.Fixed(_, _)) => true // TODO: not all upcasts here can really give null
+    case _                            => false
+  }
+
+  private[this] def resolvableNullability(from: Boolean, to: Boolean) = !from || to
+
+  private[this] def resolve(from: DataType, to: DataType): Boolean = {
+    (from, to) match {
+      case (from, to) if from == to         => true
+
+      case (NullType, _)                    => true
+
+      case (_, StringType)                  => true
+
+      case (StringType, BinaryType)         => true
+
+      case (StringType, BooleanType)        => true
+      case (DateType, BooleanType)          => true
+      case (TimestampType, BooleanType)     => true
+      case (_: NumericType, BooleanType)    => true
+
+      case (StringType, TimestampType)      => true
+      case (BooleanType, TimestampType)     => true
+      case (DateType, TimestampType)        => true
+      case (_: NumericType, TimestampType)  => true
+
+      case (_, DateType)                    => true
+
+      case (StringType, _: NumericType)     => true
+      case (BooleanType, _: NumericType)    => true
+      case (DateType, _: NumericType)       => true
+      case (TimestampType, _: NumericType)  => true
+      case (_: NumericType, _: NumericType) => true
+
+      case (ArrayType(from, fn), ArrayType(to, tn)) =>
+        resolve(from, to) &&
+          resolvableNullability(fn || forceNullable(from, to), tn)
+
+      case (MapType(fromKey, fromValue, fn), MapType(toKey, toValue, tn)) =>
+        resolve(fromKey, toKey) &&
+          (!forceNullable(fromKey, toKey)) &&
+          resolve(fromValue, toValue) &&
+          resolvableNullability(fn || forceNullable(fromValue, toValue), tn)
+
+      case (StructType(fromFields), StructType(toFields)) =>
+        fromFields.size == toFields.size &&
+          fromFields.zip(toFields).forall {
+            case (fromField, toField) =>
+              resolve(fromField.dataType, toField.dataType) &&
+                resolvableNullability(
+                  fromField.nullable || forceNullable(fromField.dataType, toField.dataType),
+                  toField.nullable)
+          }
+
+      case _ => false
--- End diff --

Hmm, I think the `resolve` check should happen during logical plan analysis.





[GitHub] spark pull request: [SPARK-4293][SQL] Make Cast be able to handle ...

2014-11-07 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/3150#discussion_r20001249
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ---
@@ -37,8 +42,62 @@ case class Cast(child: Expression, dataType: DataType) extends UnaryExpression w
     case (BooleanType, DateType)      => true
     case (DateType, _: NumericType)   => true
     case (DateType, BooleanType)      => true
-    case (_, DecimalType.Fixed(_, _)) => true  // TODO: not all upcasts here can really give null
-    case _                            => child.nullable
+    case (_, DecimalType.Fixed(_, _)) => true // TODO: not all upcasts here can really give null
+    case _                            => false
+  }
+
+  private[this] def resolvableNullability(from: Boolean, to: Boolean) = !from || to
+
+  private[this] def resolve(from: DataType, to: DataType): Boolean = {
+    (from, to) match {
+      case (from, to) if from == to         => true
+
+      case (NullType, _)                    => true
+
+      case (_, StringType)                  => true
+
+      case (StringType, BinaryType)         => true
+
+      case (StringType, BooleanType)        => true
+      case (DateType, BooleanType)          => true
+      case (TimestampType, BooleanType)     => true
+      case (_: NumericType, BooleanType)    => true
+
+      case (StringType, TimestampType)      => true
+      case (BooleanType, TimestampType)     => true
+      case (DateType, TimestampType)        => true
+      case (_: NumericType, TimestampType)  => true
+
+      case (_, DateType)                    => true
+
+      case (StringType, _: NumericType)     => true
+      case (BooleanType, _: NumericType)    => true
+      case (DateType, _: NumericType)       => true
+      case (TimestampType, _: NumericType)  => true
+      case (_: NumericType, _: NumericType) => true
+
+      case (ArrayType(from, fn), ArrayType(to, tn)) =>
+        resolve(from, to) &&
+          resolvableNullability(fn || forceNullable(from, to), tn)
+
+      case (MapType(fromKey, fromValue, fn), MapType(toKey, toValue, tn)) =>
+        resolve(fromKey, toKey) &&
+          (!forceNullable(fromKey, toKey)) &&
+          resolve(fromValue, toValue) &&
+          resolvableNullability(fn || forceNullable(fromValue, toValue), tn)
+
+      case (StructType(fromFields), StructType(toFields)) =>
+        fromFields.size == toFields.size &&
+          fromFields.zip(toFields).forall {
+            case (fromField, toField) =>
+              resolve(fromField.dataType, toField.dataType) &&
+                resolvableNullability(
+                  fromField.nullable || forceNullable(fromField.dataType, toField.dataType),
+                  toField.nullable)
+          }
+
+      case _ => false
--- End diff --

Some expressions do check `resolved` in their `dataType` method, though.





[GitHub] spark pull request: [SPARK-4293][SQL] Make Cast be able to handle ...

2014-11-07 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/3150#discussion_r20001270
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ---
@@ -323,28 +371,53 @@ case class Cast(child: Expression, dataType: DataType) extends UnaryExpression w
       buildCast[Date](_, d => dateToDouble(d))
     case TimestampType =>
       buildCast[Timestamp](_, t => timestampToDouble(t).toFloat)
-    case DecimalType() =>
-      buildCast[Decimal](_, _.toFloat)
     case x: NumericType =>
       b => x.numeric.asInstanceOf[Numeric[Any]].toFloat(b)
   }
 
-  private[this] lazy val cast: Any => Any = dataType match {
+  private[this] def castArray(from: ArrayType, to: ArrayType): Any => Any = {
+    val elementCast = cast(from.elementType, to.elementType)
+    buildCast[Seq[Any]](_, _.map(v => if (v == null) null else elementCast(v)))
--- End diff --

I don't think we need to handle that case specially, just as other
expressions don't.
Elements of an array whose type has `ArrayType.containsNull == false` are
never `null`, so `elementCast(v)` will always be called.
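
To make the point concrete, here is a small standalone sketch, independent of Spark. `CastArraySketch`, `intToString`, `castArrayNullSafe`, and `castArrayUnguarded` are hypothetical names invented for this illustration; `intToString` merely stands in for whatever `cast(from.elementType, to.elementType)` would return. It suggests that the null-guarded mapping from the diff and an unguarded mapping agree whenever the input holds no nulls.

```scala
object CastArraySketch {
  // Hypothetical element cast, standing in for cast(from.elementType, to.elementType).
  val intToString: Any => Any = v => v.asInstanceOf[Int].toString

  // Same shape as the mapping in the diff: nulls pass through, everything else is cast.
  def castArrayNullSafe(elementCast: Any => Any): Seq[Any] => Seq[Any] =
    _.map(v => if (v == null) null else elementCast(v))

  // Unguarded mapping, only meaningful when the array is known to hold no nulls.
  def castArrayUnguarded(elementCast: Any => Any): Seq[Any] => Seq[Any] =
    _.map(elementCast)
}
```

For an input without nulls the guard never fires, so both functions return the same sequence; only the guarded one preserves `null` elements as `null`.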





[GitHub] spark pull request: [SPARK-4293][SQL] Make Cast be able to handle ...

2014-11-07 Thread ueshin
Github user ueshin commented on the pull request:

https://github.com/apache/spark/pull/3150#issuecomment-62118596
  
@chenghao-intel, thank you for your comments.
If the `resolve` method is removed, the nullability check is removed with it
(e.g. a cast from `ArrayType(IntegerType, containsNull = true)` to
`ArrayType(IntegerType, containsNull = false)` is clearly invalid), and that
will cause unexpected errors. If there is a better way to enforce the
nullability check, we can remove the method.
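
As one possible illustration of such an unexpected error, consider the sketch below. It is not Spark code: `SkippedNullabilityCheckSketch`, `sumElements`, and `nullabilityOk` are hypothetical names, and the consumer is an assumed example of code written against `containsNull = false`.

```scala
object SkippedNullabilityCheckSketch {
  // Hypothetical consumer written for ArrayType(IntegerType, containsNull = false):
  // it unboxes every element without checking for null.
  def sumElements(values: Seq[Any]): Int =
    values.map(v => v.asInstanceOf[java.lang.Integer].intValue).sum

  // The check under discussion: a source that may contain nulls
  // must not be cast to a target that declares it contains none.
  def nullabilityOk(fromContainsNull: Boolean, toContainsNull: Boolean): Boolean =
    !fromContainsNull || toContainsNull
}
```

With the check in place, `nullabilityOk(true, false)` is `false` and the cast is rejected up front. Without it, data containing a `null` can reach `sumElements`, which then fails with a `NullPointerException` far from the cast that caused the problem.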





[GitHub] spark pull request: [SPARK-4293][SQL] Make Cast be able to handle ...

2014-11-07 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3150#issuecomment-62239484
  
  [Test build #23081 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23081/consoleFull) for PR 3150 at commit [`f677c30`](https://github.com/apache/spark/commit/f677c303115a0065589535e1053bd1e803aeb4fc).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4293][SQL] Make Cast be able to handle ...

2014-11-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3150#issuecomment-62242628
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23081/





[GitHub] spark pull request: [SPARK-4293][SQL] Make Cast be able to handle ...

2014-11-07 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3150#issuecomment-62242627
  
  [Test build #23081 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23081/consoleFull) for PR 3150 at commit [`f677c30`](https://github.com/apache/spark/commit/f677c303115a0065589535e1053bd1e803aeb4fc).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4293][SQL] Make Cast be able to handle ...

2014-11-06 Thread ueshin
GitHub user ueshin opened a pull request:

https://github.com/apache/spark/pull/3150

[SPARK-4293][SQL] Make Cast be able to handle complex types.

Inserting data whose type includes `ArrayType.containsNull == false`,
`MapType.valueContainsNull == false`, or
`StructType.fields.exists(_.nullable == false)` into a Hive table fails,
because the `Cast` inserted by the `HiveMetastoreCatalog.PreInsertionCasts`
rule of the `Analyzer` can't handle these types correctly.

Complex type cast rule proposal:

- Casts between non-complex types should behave the same as before.
- A cast between `ArrayType`s can be evaluated if
  - the element type can be cast
  - the nullability rule is not broken
- A cast between `MapType`s can be evaluated if
  - the key type can be cast
  - the cast key type is not nullable (its nullability is `false`)
  - the value type can be cast
  - the nullability rule for the value type is not broken
- A cast between `StructType`s can be evaluated if
  - the number of fields is the same
  - each field can be cast
  - the nullability rule for each field is not broken
- The nested structure should be the same.

Nullability rule:

- If the cast value is nullable (`nullable == true`), the target's
nullability must also be `true`.
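
As a rough illustration of how the rules above compose, here is a minimal, self-contained Scala sketch. It does not use Spark's classes: the toy `DataType` hierarchy, `CastRulesSketch`, and `canCast` are assumptions made for the example, as is the choice that only a `StringType` to `IntegerType` cast can turn a value into `null`. The names `forceNullable` and `resolvableNullability` are borrowed from the diff under review, with simplified bodies.

```scala
// Toy stand-ins for the data types; these are NOT Spark's classes.
sealed trait DataType
case object IntegerType extends DataType
case object StringType extends DataType
final case class ArrayType(elementType: DataType, containsNull: Boolean) extends DataType
final case class MapType(keyType: DataType, valueType: DataType, valueContainsNull: Boolean)
  extends DataType
final case class StructField(dataType: DataType, nullable: Boolean)
final case class StructType(fields: Seq[StructField]) extends DataType

object CastRulesSketch {
  // Assumption for this sketch: only String -> Integer can yield null for a non-null input.
  def forceNullable(from: DataType, to: DataType): Boolean = (from, to) match {
    case (StringType, IntegerType) => true
    case _                         => false
  }

  // Nullability rule: a nullable source may only be cast to a nullable target.
  def resolvableNullability(from: Boolean, to: Boolean): Boolean = !from || to

  def canCast(from: DataType, to: DataType): Boolean = (from, to) match {
    case (f, t) if f == t          => true
    case (IntegerType, StringType) => true
    case (StringType, IntegerType) => true

    // Array: element type castable, nullability rule holds.
    case (ArrayType(fe, fn), ArrayType(te, tn)) =>
      canCast(fe, te) && resolvableNullability(fn || forceNullable(fe, te), tn)

    // Map: key castable and never forced to null; value castable, rule holds.
    case (MapType(fk, fv, fn), MapType(tk, tv, tn)) =>
      canCast(fk, tk) && !forceNullable(fk, tk) &&
        canCast(fv, tv) && resolvableNullability(fn || forceNullable(fv, tv), tn)

    // Struct: same number of fields; each field castable, rule holds per field.
    case (StructType(ff), StructType(tf)) =>
      ff.size == tf.size && ff.zip(tf).forall { case (f, t) =>
        canCast(f.dataType, t.dataType) &&
          resolvableNullability(f.nullable || forceNullable(f.dataType, t.dataType), t.nullable)
      }

    case _ => false
  }
}
```

Under these assumptions, `canCast(ArrayType(IntegerType, false), ArrayType(IntegerType, true))` holds, while the reverse direction does not, because a source that may contain nulls cannot target a non-nullable element type.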

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ueshin/apache-spark issues/SPARK-4293

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/3150.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3150


commit 4f71bb8e4fa83f160f0f131c9b60bca911acef99
Author: Takuya UESHIN ues...@happy-camper.st
Date:   2014-11-07T05:13:11Z

Make Cast be able to handle complex types.

commit 287f410329edf375c8d6142ea2400aa75537da5f
Author: Takuya UESHIN ues...@happy-camper.st
Date:   2014-11-07T05:13:38Z

Add tests to insert data of types ArrayType / MapType / StructType with 
nullability is false into Hive table.







[GitHub] spark pull request: [SPARK-4293][SQL] Make Cast be able to handle ...

2014-11-06 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3150#issuecomment-62103342
  
  [Test build #23041 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23041/consoleFull) for PR 3150 at commit [`287f410`](https://github.com/apache/spark/commit/287f410329edf375c8d6142ea2400aa75537da5f).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4293][SQL] Make Cast be able to handle ...

2014-11-06 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3150#issuecomment-62107846
  
  [Test build #23041 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23041/consoleFull) for PR 3150 at commit [`287f410`](https://github.com/apache/spark/commit/287f410329edf375c8d6142ea2400aa75537da5f).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4293][SQL] Make Cast be able to handle ...

2014-11-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3150#issuecomment-62107850
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23041/





[GitHub] spark pull request: [SPARK-4293][SQL] Make Cast be able to handle ...

2014-11-06 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/3150#discussion_r19997693
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ---
@@ -37,8 +42,62 @@ case class Cast(child: Expression, dataType: DataType) extends UnaryExpression w
     case (BooleanType, DateType)      => true
     case (DateType, _: NumericType)   => true
     case (DateType, BooleanType)      => true
-    case (_, DecimalType.Fixed(_, _)) => true  // TODO: not all upcasts here can really give null
-    case _                            => child.nullable
+    case (_, DecimalType.Fixed(_, _)) => true // TODO: not all upcasts here can really give null
+    case _                            => false
+  }
+
+  private[this] def resolvableNullability(from: Boolean, to: Boolean) = !from || to
+
+  private[this] def resolve(from: DataType, to: DataType): Boolean = {
+    (from, to) match {
+      case (from, to) if from == to         => true
+
+      case (NullType, _)                    => true
+
+      case (_, StringType)                  => true
+
+      case (StringType, BinaryType)         => true
+
+      case (StringType, BooleanType)        => true
+      case (DateType, BooleanType)          => true
+      case (TimestampType, BooleanType)     => true
+      case (_: NumericType, BooleanType)    => true
+
+      case (StringType, TimestampType)      => true
+      case (BooleanType, TimestampType)     => true
+      case (DateType, TimestampType)        => true
+      case (_: NumericType, TimestampType)  => true
+
+      case (_, DateType)                    => true
+
+      case (StringType, _: NumericType)     => true
+      case (BooleanType, _: NumericType)    => true
+      case (DateType, _: NumericType)       => true
+      case (TimestampType, _: NumericType)  => true
+      case (_: NumericType, _: NumericType) => true
+
+      case (ArrayType(from, fn), ArrayType(to, tn)) =>
+        resolve(from, to) &&
+          resolvableNullability(fn || forceNullable(from, to), tn)
+
+      case (MapType(fromKey, fromValue, fn), MapType(toKey, toValue, tn)) =>
+        resolve(fromKey, toKey) &&
+          (!forceNullable(fromKey, toKey)) &&
+          resolve(fromValue, toValue) &&
+          resolvableNullability(fn || forceNullable(fromValue, toValue), tn)
+
+      case (StructType(fromFields), StructType(toFields)) =>
+        fromFields.size == toFields.size &&
+          fromFields.zip(toFields).forall {
+            case (fromField, toField) =>
+              resolve(fromField.dataType, toField.dataType) &&
+                resolvableNullability(
+                  fromField.nullable || forceNullable(fromField.dataType, toField.dataType),
+                  toField.nullable)
+          }
+
+      case _ => false
--- End diff --

I am wondering whether throwing an exception here would be more informative
than the plain `UnresolvedException` thrown during logical plan analysis.
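
One way to picture this suggestion is a check that carries a reason instead of a bare `false`. The sketch below is hypothetical: `InformativeCastErrorSketch`, `checkNullability`, `requireNullability`, the `path` parameter, and the message wording are all invented for illustration and are not part of the patch or of Spark.

```scala
object InformativeCastErrorSketch {
  // Returns Right(()) when the nullability rule holds, otherwise Left with an explanation.
  def checkNullability(
      path: String,
      fromNullable: Boolean,
      toNullable: Boolean): Either[String, Unit] =
    if (!fromNullable || toNullable) Right(())
    else Left(s"Cannot cast $path: source may contain null but target is declared non-nullable")

  // Throwing variant: surfaces the same explanation as an exception message.
  def requireNullability(path: String, fromNullable: Boolean, toNullable: Boolean): Unit =
    checkNullability(path, fromNullable, toNullable) match {
      case Left(message) => throw new IllegalArgumentException(message)
      case Right(_)      => ()
    }
}
```

The trade-off, under these assumptions, is that a Boolean check composes easily with `&&`, while a reason-carrying check tells the user which part of a nested type is at fault.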





[GitHub] spark pull request: [SPARK-4293][SQL] Make Cast be able to handle ...

2014-11-06 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/3150#discussion_r19997794
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ---
@@ -323,28 +371,53 @@ case class Cast(child: Expression, dataType: DataType) extends UnaryExpression w
       buildCast[Date](_, d => dateToDouble(d))
     case TimestampType =>
       buildCast[Timestamp](_, t => timestampToDouble(t).toFloat)
-    case DecimalType() =>
-      buildCast[Decimal](_, _.toFloat)
     case x: NumericType =>
       b => x.numeric.asInstanceOf[Numeric[Any]].toFloat(b)
   }
 
-  private[this] lazy val cast: Any => Any = dataType match {
+  private[this] def castArray(from: ArrayType, to: ArrayType): Any => Any = {
+    val elementCast = cast(from.elementType, to.elementType)
+    buildCast[Seq[Any]](_, _.map(v => if (v == null) null else elementCast(v)))
--- End diff --

Semantically, how do we handle the case where `ArrayType.nullable=false`?

