[GitHub] spark issue #19471: [SPARK-22245][SQL] partitioned data set should always pu...

2017-10-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19471
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19471: [SPARK-22245][SQL] partitioned data set should always pu...

2017-10-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19471
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83066/





[GitHub] spark issue #19471: [SPARK-22245][SQL] partitioned data set should always pu...

2017-10-25 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/19471
  
closing in favor of https://github.com/apache/spark/pull/19579





[GitHub] spark issue #19471: [SPARK-22245][SQL] partitioned data set should always pu...

2017-10-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19471
  
**[Test build #83066 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83066/testReport)** for PR 19471 at commit [`d21ebaa`](https://github.com/apache/spark/commit/d21ebaab6d6e2d7d6d10933d72360cef49194a90).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #19471: [SPARK-22245][SQL] partitioned data set should always pu...

2017-10-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19471
  
**[Test build #83066 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83066/testReport)** for PR 19471 at commit [`d21ebaa`](https://github.com/apache/spark/commit/d21ebaab6d6e2d7d6d10933d72360cef49194a90).





[GitHub] spark issue #19471: [SPARK-22245][SQL] partitioned data set should always pu...

2017-10-19 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/19471
  
> No behavior change if there are no overlapped columns in the data and partition schemas.

> The schema changes (partition columns go to the end) when reading a file-format data source whose data files contain partition columns.

@cloud-fan Could you check why so many test cases failed? 
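
For readers following the thread, the rule quoted above can be sketched in a few lines of plain Python. This is only an illustrative model, not Spark code: `proposed_schema` is a hypothetical helper, column names are compared exactly (Spark's case-sensitivity setting is ignored), and columns are represented as bare strings rather than typed fields.

```python
from typing import List


def proposed_schema(data_cols: List[str], partition_cols: List[str]) -> List[str]:
    """Return the column order under the rule quoted above.

    Hypothetical helper, not part of Spark: data columns that are not
    partition columns keep their relative order, and all partition
    columns are appended at the end.
    """
    partition_set = set(partition_cols)
    non_partition = [c for c in data_cols if c not in partition_set]
    return non_partition + list(partition_cols)


# Overlapped case: the data files also contain the partition column `p`.
overlapped = proposed_schema(["a", "p", "b"], ["p"])

# No overlap: the result is simply data columns followed by partition columns.
disjoint = proposed_schema(["a", "b"], ["p"])

print(overlapped)
print(disjoint)
```

In this model the overlapped partition column `p` moves from the middle of the data columns to the end, while the non-overlapping input is just the data columns followed by the partition columns, which matches the "no behavior change" case in the quote.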






[GitHub] spark issue #19471: [SPARK-22245][SQL] partitioned data set should always pu...

2017-10-14 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/19471
  
We may need to document this change in the `Migration Guide` section of the SQL programming guide.





[GitHub] spark issue #19471: [SPARK-22245][SQL] partitioned data set should always pu...

2017-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19471
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82671/





[GitHub] spark issue #19471: [SPARK-22245][SQL] partitioned data set should always pu...

2017-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19471
  
Merged build finished. Test FAILed.





[GitHub] spark issue #19471: [SPARK-22245][SQL] partitioned data set should always pu...

2017-10-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19471
  
**[Test build #82671 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82671/testReport)** for PR 19471 at commit [`dea7037`](https://github.com/apache/spark/commit/dea70371e26b3587d905e2521b83f4ecf9356aa0).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #19471: [SPARK-22245][SQL] partitioned data set should always pu...

2017-10-12 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19471
  
**[Test build #82671 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82671/testReport)** for PR 19471 at commit [`dea7037`](https://github.com/apache/spark/commit/dea70371e26b3587d905e2521b83f4ecf9356aa0).





[GitHub] spark issue #19471: [SPARK-22245][SQL] partitioned data set should always pu...

2017-10-11 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/19471
  
+1 for this change. BTW, wow, there are lots of test case failures: 81 
failures.





[GitHub] spark issue #19471: [SPARK-22245][SQL] partitioned data set should always pu...

2017-10-11 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/19471
  
Fair enough to me. To check whether this change is reasonable, we could send an email to the dev/user lists to solicit feedback. I saw marmbrus do so when adding the JSON API:
https://github.com/apache/spark/pull/15274#issuecomment-250092074

http://apache-spark-developers-list.1001551.n3.nabble.com/Spark-SQL-JSON-Column-Support-td19132.html
If we get no response or positive feedback, we could quickly and safely drop the support.





[GitHub] spark issue #19471: [SPARK-22245][SQL] partitioned data set should always pu...

2017-10-11 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/19471
  
Waiting for more feedback before moving forward :)

Another thing I want to point out: `sql("create table t using parquet options(skipHiveMetadata=true) location '/tmp/t'")` works in Spark 2.0, and the created table has a schema with the partition column at the beginning. In Spark 2.1 it also works, and `DESC TABLE` also shows the table schema with the partition column at the beginning. However, if you query the table, the output schema has the partition column at the end.

It's been a long time since Spark 2.1 was released, and no one has reported this behavior change. It seems this is really a corner case, which makes me feel we should not complicate our code too much for it.





[GitHub] spark issue #19471: [SPARK-22245][SQL] partitioned data set should always pu...

2017-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19471
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82633/





[GitHub] spark issue #19471: [SPARK-22245][SQL] partitioned data set should always pu...

2017-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19471
  
Merged build finished. Test FAILed.





[GitHub] spark issue #19471: [SPARK-22245][SQL] partitioned data set should always pu...

2017-10-11 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/19471
  
Does this change affect other tests for the overlapped cases, like [DataStreamReaderWriterSuite](https://github.com/apache/spark/blob/655f6f86f84ff5241d1d20766e1ef83bb32ca5e0/sql/core/src/test/scala/org/apache/spark/sql/streaming/test/DataStreamReaderWriterSuite.scala#L550) and `OrcPartitionDiscoverySuite`? Since we already have a number of these tests in multiple places (though I know you've already considered this aspect), I'm a little worried that making this change in a minor release will confuse users.





[GitHub] spark issue #19471: [SPARK-22245][SQL] partitioned data set should always pu...

2017-10-11 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19471
  
**[Test build #82633 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82633/testReport)** for PR 19471 at commit [`ac7ae6b`](https://github.com/apache/spark/commit/ac7ae6b6149afd630e788a9fb42e6c4b25e84e17).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #19471: [SPARK-22245][SQL] partitioned data set should always pu...

2017-10-11 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/19471
  
cc @rxin @brkyvz @liancheng @gatorsmile @maropu 





[GitHub] spark issue #19471: [SPARK-22245][SQL] partitioned data set should always pu...

2017-10-11 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19471
  
**[Test build #82633 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82633/testReport)** for PR 19471 at commit [`ac7ae6b`](https://github.com/apache/spark/commit/ac7ae6b6149afd630e788a9fb42e6c4b25e84e17).

