[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-11-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14124 **[Test build #68477 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68477/consoleFull)** for PR 14124 at commit [`d240c0d`](https://github.com/apache/spark/commit/d

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-11-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14124 **[Test build #68474 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68474/consoleFull)** for PR 14124 at commit [`7306937`](https://github.com/apache/spark/commit/7

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14124 Actually, nvm. I think handling this in `DataFrameReader.schema` will deal with most of the general cases. --- If your project is set up for it, you can reply to this email and have your reply appe

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14124 Oh wait, @cloud-fan, it seems that Parquet files, at least, could be written with non-nullable fields. So, reading them without a user-specified schema might also cause the inconsistency between

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14124 Thanks @cloud-fan, sure, that sounds great.

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-11-09 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/14124 Sorry for the delay. After thinking about it again, I think it doesn't make sense to allow users to specify the nullability when reading a data source. How about we turn the schema to nullable in `DataFrame
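
The idea of forcing a schema to nullable on read can be illustrated with a minimal sketch. It uses no Spark classes: `StructField`, `StructType`, `asNullable`, and `relax` below are simplified stand-ins defined only for this illustration, and the sketch is an assumption about the general approach, not the code in the PR.

```scala
// Simplified stand-ins for a schema model; these are NOT the Spark classes.
object NullabilitySketch {
  sealed trait DataType
  case object IntegerType extends DataType
  final case class StructField(name: String, dataType: DataType, nullable: Boolean = true)
  final case class StructType(fields: List[StructField]) extends DataType

  // Recursively mark the fields of any nested struct type as nullable.
  def asNullable(dt: DataType): DataType = dt match {
    case StructType(fields) =>
      StructType(fields.map(f => f.copy(dataType = asNullable(f.dataType), nullable = true)))
    case other => other
  }

  // Relax a user-supplied schema so that every field, at every level, is nullable.
  def relax(schema: StructType): StructType =
    StructType(schema.fields.map(f => f.copy(dataType = asNullable(f.dataType), nullable = true)))
}
```

With this model, a schema declared with `nullable = false` comes back from `relax` with the same field names and types but `nullable = true` throughout, so a `null` in the data no longer contradicts the declared schema.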

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14124 Merged build finished. Test PASSed.

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14124 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67029/ Test PASSed.

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-10-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14124 **[Test build #67029 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67029/consoleFull)** for PR 14124 at commit [`3f153a3`](https://github.com/apache/spark/commit/

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-10-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14124 **[Test build #67029 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67029/consoleFull)** for PR 14124 at commit [`3f153a3`](https://github.com/apache/spark/commit/3

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14124 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65747/ Test PASSed.

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14124 Merged build finished. Test PASSed.

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-09-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14124 **[Test build #65747 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65747/consoleFull)** for PR 14124 at commit [`0bc06c6`](https://github.com/apache/spark/commit/

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-09-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14124 **[Test build #65747 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65747/consoleFull)** for PR 14124 at commit [`0bc06c6`](https://github.com/apache/spark/commit/0

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-09-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14124 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65360/ Test PASSed.

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-09-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14124 Merged build finished. Test PASSed.

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-09-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14124 **[Test build #65360 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65360/consoleFull)** for PR 14124 at commit [`f6be52b`](https://github.com/apache/spark/commit/

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-09-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14124 **[Test build #65360 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65360/consoleFull)** for PR 14124 at commit [`f6be52b`](https://github.com/apache/spark/commit/f

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-08-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14124 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64631/ Test PASSed.

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-08-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14124 Merged build finished. Test PASSed.

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-08-30 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14124 **[Test build #64631 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64631/consoleFull)** for PR 14124 at commit [`ffacb55`](https://github.com/apache/spark/commit/

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-08-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14124 **[Test build #64631 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64631/consoleFull)** for PR 14124 at commit [`ffacb55`](https://github.com/apache/spark/commit/f

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-08-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14124 Merged build finished. Test PASSed.

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-08-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14124 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64066/ Test PASSed.

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-08-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14124 **[Test build #64066 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64066/consoleFull)** for PR 14124 at commit [`079aae2`](https://github.com/apache/spark/commit/

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-08-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14124 **[Test build #64066 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64066/consoleFull)** for PR 14124 at commit [`079aae2`](https://github.com/apache/spark/commit/0

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-07-27 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14124 @cloud-fan If nullability should not be ignored, then I can change this PR to consistently respect it (and of course I will try to identify the related problems). In this case, I wi

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-07-26 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14124 BTW, actually, this is not only about the user-given schema. Currently, data sources based on `FileFormat` always read data into a DataFrame ignoring the nullability in the schema (for both user

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-07-26 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14124 Thanks for the feedback, @cloud-fan! If the user-given schema is wrong, it is handled differently by each data source. - For JSON and CSV it is kind of permissive gen

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-07-26 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/14124 What will happen if the given schema is wrong? It seems weird that we allow users to provide a schema while reading the data without validating it.

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-07-26 Thread marmbrus
Github user marmbrus commented on the issue: https://github.com/apache/spark/pull/14124 @cloud-fan

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-07-26 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14124 gentle ping @marmbrus

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-07-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14124 Could you take a look, please, @marmbrus?

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-07-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14124 @viirya Thanks for your comment! Actually, that is what I want some feedback on from @marmbrus. It seems that forcing the whole schema to nullable is already happening when you read/write da

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-07-10 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/14124 @HyukjinKwon Your patch solves this inconsistency by forcing the schema to be nullable everywhere. However, it looks like the Parquet case is for compatibility; is this the same for JSON?

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-07-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14124 I am a bit unsure whether we are allowed to read JSON (via the `json(jsonRDD: RDD[String])` API) with a schema whose fields have `nullable` set to `false`. If it is meant to be not allowed, this issue wil

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-07-10 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14124 @HyukjinKwon No matter whether this PR is merged or not, I still think we should fix the above issue. Silent conversion does not look good to me.

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-07-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14124 Oh, I see, before this patch ``` +---+ | a| +---+ | 1| | 0| +---+ ``` after this patch ``` ++ | a| ++ | 1

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-07-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14124 Ah, yes, it seems like a bug to me. I thought it threw an exception in that case. Does this PR introduce the problem? (Just curious and to be sure).

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-07-10 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14124 ``` val rdd = spark.sparkContext.makeRDD(Seq("{\"a\" : 1}", "{\"a\" : null}")) val schema = StructType(StructField("a", IntegerType, nullable = false) :: Nil) val df = spark.read.schem

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-07-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14124 Merged build finished. Test PASSed.

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-07-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14124 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62054/ Test PASSed.

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-07-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14124 **[Test build #62054 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62054/consoleFull)** for PR 14124 at commit [`3980681`](https://github.com/apache/spark/commit/

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-07-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14124 **[Test build #62054 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62054/consoleFull)** for PR 14124 at commit [`3980681`](https://github.com/apache/spark/commit/3

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-07-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14124 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62053/ Test FAILed.

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-07-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14124 Merged build finished. Test FAILed.

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-07-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14124 **[Test build #62053 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62053/consoleFull)** for PR 14124 at commit [`adae8de`](https://github.com/apache/spark/commit/

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-07-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14124 **[Test build #62053 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62053/consoleFull)** for PR 14124 at commit [`adae8de`](https://github.com/apache/spark/commit/a

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-07-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14124 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62049/ Test FAILed.

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-07-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14124 Merged build finished. Test FAILed.

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-07-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14124 **[Test build #62049 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62049/consoleFull)** for PR 14124 at commit [`a917678`](https://github.com/apache/spark/commit/

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-07-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14124 **[Test build #62049 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62049/consoleFull)** for PR 14124 at commit [`a917678`](https://github.com/apache/spark/commit/a

[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-07-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14124 Hi @gatorsmile and @marmbrus, I saw the discussion and found that you are involved with this one. Could you please review this?