[GitHub] spark pull request #16601: [SPARK-19182][DStream] Optimize the lock in Strea...
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/16601#discussion_r96358523

Diff: streaming/src/main/scala/org/apache/spark/streaming/DStreamGraph.scala

```scala
@@ -112,12 +112,10 @@ final private[streaming] class DStreamGraph extends Serializable with Logging {
   def generateJobs(time: Time): Seq[Job] = {
     logDebug("Generating jobs for time " + time)
-    val jobs = this.synchronized {
-      outputStreams.flatMap { outputStream =>
-        val jobOption = outputStream.generateJob(time)
-        jobOption.foreach(_.setCallSite(outputStream.creationSite))
-        jobOption
-      }
+    val jobs = getOutputStreams().flatMap { outputStream =>
```

Yes, I oversimplified the question.

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #16606: [SPARK-19246][SQL]CataLogTable's partitionSchema ...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/16606#discussion_r96358416

Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/PartitionProviderCompatibilitySuite.scala

```scala
@@ -481,4 +481,27 @@ class PartitionProviderCompatibilitySuite
     assert(spark.sql("show partitions test").count() == 5)
   }
 }
+
+  test("saveAsTable with inconsistent columns order" +
```

Could you move it to `PartitionedWriteSuite`?
[GitHub] spark issue #16610: [SPARK-19254][SQL] Support Seq, Map, and Struct in funct...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16610

Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71473/
[GitHub] spark issue #16610: [SPARK-19254][SQL] Support Seq, Map, and Struct in funct...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16610

Merged build finished. Test PASSed.
[GitHub] spark pull request #16606: [SPARK-19246][SQL]CataLogTable's partitionSchema ...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16606#discussion_r96357937

Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala

```scala
@@ -183,9 +183,12 @@ case class CatalogTable(
   import CatalogTable._

-  /** schema of this table's partition columns */
-  def partitionSchema: StructType = StructType(schema.filter {
-    c => partitionColumnNames.contains(c.name)
+  /**
+   * schema of this table's partition columns
+   * keep the schema order with partitionColumnNames
```

Let's keep the previous doc comment; I think it's clear enough.
[GitHub] spark issue #16610: [SPARK-19254][SQL] Support Seq, Map, and Struct in funct...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16610

**[Test build #71473 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71473/testReport)** for PR 16610 at commit [`6a02490`](https://github.com/apache/spark/commit/6a02490745952bd2a5c5b0c84482b5cd874ae820).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request #16599: [SPARK-19239][PySpark] Check the lowerBound and u...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/16599#discussion_r96357852

Diff: python/pyspark/sql/readwriter.py

```python
@@ -431,6 +432,8 @@ def jdbc(self, url, table, column=None, lowerBound=None, upperBound=None, numPar
        if column is not None:
            if numPartitions is None:
                numPartitions = self._spark._sc.defaultParallelism
```

I think we should make the Scala API and Python API consistent. The existing Python API is not following [the document](http://spark.apache.org/docs/2.1.0/sql-programming-guide.html):

```
These options must all be specified if any of them is specified. They describe how to partition the table when reading in parallel from multiple workers. partitionColumn must be a numeric column from the table in question. Notice that lowerBound and upperBound are just used to decide the partition stride, not for filtering the rows in table. So all rows in the table will be partitioned and returned. This option applies only to reading.
```
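The quoted documentation says `lowerBound` and `upperBound` only decide the partition stride, not which rows are returned. A rough, self-contained Python sketch of that behavior (an illustration of the idea, not Spark's actual `columnPartition` code; the function name and predicate strings are hypothetical):

```python
def column_partition(column, lower_bound, upper_bound, num_partitions):
    """Derive per-partition WHERE predicates from the JDBC partitioning options.

    The bounds set the stride of each partition; rows below lower_bound or
    above upper_bound still land in the first or last partition, so no rows
    are ever filtered out.
    """
    if num_partitions <= 1:
        return []  # a single partition reads the whole table, no predicate needed
    stride = upper_bound // num_partitions - lower_bound // num_partitions
    predicates = []
    current = lower_bound + stride
    for i in range(num_partitions):
        if i == 0:
            # first partition is open-ended below, and also picks up NULLs
            predicates.append(f"{column} < {current} OR {column} IS NULL")
        elif i == num_partitions - 1:
            # last partition is open-ended above
            predicates.append(f"{column} >= {current - stride}")
        else:
            predicates.append(f"{column} >= {current - stride} AND {column} < {current}")
        current += stride
    return predicates
```

For example, `column_partition("id", 0, 100, 4)` yields four predicates with stride 25, the first ending `< 25 OR ... IS NULL` and the last starting `>= 75`, so every row matches exactly one partition.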
[GitHub] spark pull request #16606: [SPARK-19246][SQL]CataLogTable's partitionSchema ...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16606#discussion_r96357715

Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala

```scala
@@ -183,9 +183,12 @@ case class CatalogTable(
   import CatalogTable._

-  /** schema of this table's partition columns */
-  def partitionSchema: StructType = StructType(schema.filter {
-    c => partitionColumnNames.contains(c.name)
+  /**
+   * schema of this table's partition columns
+   * keep the schema order with partitionColumnNames
+   */
+  def partitionSchema: StructType = StructType(partitionColumnNames.flatMap {
+    p => schema.filter(_.name == p)
```

nit: code style

```scala
xxx.map { p =>
  xxx
}
```
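The substance of this diff is the iteration order: filtering the table schema keeps schema order, while mapping over `partitionColumnNames` keeps the declared partition order. A minimal Python sketch of the difference, with plain lists of names standing in for `StructType` (function names are illustrative):

```python
def partition_schema_by_filter(schema, partition_column_names):
    # Old behavior: iterate the table schema, so order follows the schema.
    return [c for c in schema if c in partition_column_names]

def partition_schema_by_names(schema, partition_column_names):
    # New behavior: iterate partitionColumnNames, so order follows the
    # declared partition columns.
    return [c for name in partition_column_names for c in schema if c == name]

schema = ["a", "b", "c", "d"]
partition_cols = ["d", "b"]  # declared order differs from schema order
print(partition_schema_by_filter(schema, partition_cols))  # ['b', 'd']
print(partition_schema_by_names(schema, partition_cols))   # ['d', 'b']
```

The two results differ exactly when the declared partition order disagrees with the schema order, which is the inconsistency SPARK-19246 fixes.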
[GitHub] spark issue #16573: [SPARK-19210][DStream] Add log level info into checkpoin...
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/16573

@zsxwing I see what you mean, and that does achieve the right result. IMHO, since we already provide `SparkContext.setLogLevel`, it is odd to call `org.apache.log4j.Logger.getRootLogger().setLevel(l)` rather than `SparkContext.setLogLevel()`. Besides, the new conf is only an internal one, and the actual change is far from complicated. Anyway, it is not a major issue and can be handled your way; if you do not like this PR, I will close it.
[GitHub] spark issue #16597: [SPARK-19240][SQL][TEST] add test for setting location f...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16597

Just FYI, this only tests the behavior of `InMemoryCatalog`. I will port it to `HiveDDLSuite` in https://github.com/apache/spark/pull/16592
[GitHub] spark pull request #16599: [SPARK-19239][PySpark] Check the lowerBound and u...
Github user djvulee commented on a diff in the pull request: https://github.com/apache/spark/pull/16599#discussion_r96357233

Diff: python/pyspark/sql/readwriter.py

```python
@@ -431,6 +432,8 @@ def jdbc(self, url, table, column=None, lowerBound=None, upperBound=None, numPar
        if column is not None:
            if numPartitions is None:
                numPartitions = self._spark._sc.defaultParallelism
```

I am a little worried that this change will break the API. If a user specifies only `column`, `lowerBound`, and `upperBound` under some Spark version, their program will fail after upgrading, even though very few people rely on the default parallelism. Personally, I prefer to make the change and keep the APIs consistent. If your preference is to add the assert on `numPartitions`, I will update the PR soon.
[GitHub] spark issue #14559: [SPARK-16968]Add additional options in jdbc when creatin...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14559

This is a pretty general issue for JDBC users. Could we backport it to Spark 2.0?
[GitHub] spark issue #16564: [SPARK-19065][SQL]Don't inherit expression id in dropDup...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16564

**[Test build #71487 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71487/testReport)** for PR 16564 at commit [`26652a0`](https://github.com/apache/spark/commit/26652a09be891de4a26fe54e4d3755b1cd42094f).
[GitHub] spark pull request #16599: [SPARK-19239][PySpark] Check the lowerBound and u...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/16599#discussion_r96355936

Diff: python/pyspark/sql/readwriter.py

```python
@@ -431,6 +432,8 @@ def jdbc(self, url, table, column=None, lowerBound=None, upperBound=None, numPar
        if column is not None:
            if numPartitions is None:
                numPartitions = self._spark._sc.defaultParallelism
```

This contradicts the Scala version. Could you also change it to the following code?

```python
assert numPartitions is not None, \
    "numPartitions can not be None when ``column`` is specified"
```
[GitHub] spark issue #16599: [SPARK-19239][PySpark] Check the lowerBound and upperBou...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16599

Have you manually tested your code changes?
[GitHub] spark pull request #16564: [SPARK-19065][SQL]Don't inherit expression id in ...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16564#discussion_r9636

Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala

```scala
@@ -898,11 +899,15 @@ class DatasetSuite extends QueryTest with SharedSQLContext {
       (1, 2), (1, 1), (2, 1), (2, 2))
   }

-  test("dropDuplicates should not change child plan output") {
-    val ds = Seq(("a", 1), ("a", 2), ("b", 1), ("a", 1)).toDS()
-    checkDataset(
-      ds.dropDuplicates("_1").select(ds("_1").as[String], ds("_2").as[Int]),
-      ("a", 1), ("b", 1))
+  test("SPARK-19065 dropDuplicates should not create expressions using the same id") {
```

It seems weird to me to add a test verifying that we don't support some feature, so I just added my previous regression test back in order to have a test that catches this issue.
[GitHub] spark issue #16597: [SPARK-19240][SQL][TEST] add test for setting location f...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16597

**[Test build #71484 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71484/testReport)** for PR 16597 at commit [`a5687f8`](https://github.com/apache/spark/commit/a5687f8d99bb0cfdc075c6947898d4a5a65dd57f).
[GitHub] spark issue #16587: [SPARK-19229] [SQL] Disallow Creating Hive Source Tables...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16587

**[Test build #71485 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71485/testReport)** for PR 16587 at commit [`49e6e81`](https://github.com/apache/spark/commit/49e6e815639550a9c597b0752f8aa68ec9cfb496).
[GitHub] spark issue #16473: [SPARK-19069] [CORE] Expose task 'status' and 'duration'...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16473

**[Test build #71486 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71486/testReport)** for PR 16473 at commit [`b2ad3bc`](https://github.com/apache/spark/commit/b2ad3bc2ab02f99bce4498726e11728516ba1be0).
[GitHub] spark issue #16573: [SPARK-19210][DStream] Add log level info into checkpoin...
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/16573

You can just call `org.apache.log4j.Logger.getRootLogger().setLevel(l)` in your main method before `StreamingContext.getOrCreate`. I don't think it's a good idea to add a new Spark conf just for Streaming checkpoints. In addition, it seems weird to me that Streaming also checkpoints the log level.
[GitHub] spark issue #16606: [SPARK-19246][SQL]CataLogTable's partitionSchema order a...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16606

**[Test build #71483 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71483/testReport)** for PR 16606 at commit [`4260f84`](https://github.com/apache/spark/commit/4260f844530c17533d811f0c7f3deed14ed7a307).
[GitHub] spark pull request #16601: [SPARK-19182][DStream] Optimize the lock in Strea...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16601#discussion_r96354491

Diff: streaming/src/main/scala/org/apache/spark/streaming/DStreamGraph.scala

```scala
@@ -112,12 +112,10 @@ final private[streaming] class DStreamGraph extends Serializable with Logging {
   def generateJobs(time: Time): Seq[Job] = {
     logDebug("Generating jobs for time " + time)
-    val jobs = this.synchronized {
-      outputStreams.flatMap { outputStream =>
-        val jobOption = outputStream.generateJob(time)
-        jobOption.foreach(_.setCallSite(outputStream.creationSite))
-        jobOption
-      }
+    val jobs = getOutputStreams().flatMap { outputStream =>
```

`synchronized` is there to make sure `writeObject` never writes out an intermediate state of `DStreamGraph`.
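The concern above is that checkpoint serialization (`writeObject`) must never observe the graph mid-update, so any optimization has to work on a consistent snapshot taken under the same lock. A toy Python sketch of that snapshot-under-lock pattern (class and method names are illustrative, not Spark's):

```python
import threading

class Graph:
    """Toy stand-in for DStreamGraph: mutations, reads, and serialization
    all coordinate through one lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._streams = []

    def add_stream(self, s):
        with self._lock:
            self._streams.append(s)

    def get_streams(self):
        # Copy under the lock, then do the expensive work (job generation)
        # outside it, which is the essence of the proposed optimization.
        with self._lock:
            return list(self._streams)

    def serialize(self):
        # Analogue of writeObject: holding the lock guarantees it never
        # sees a half-applied update.
        with self._lock:
            return tuple(self._streams)

g = Graph()
g.add_stream("a")
g.add_stream("b")
snapshot = g.get_streams()
snapshot.append("c")  # mutating the snapshot does not touch the graph
```

Because `get_streams` returns a copy, later iteration over the snapshot cannot race with `serialize`, while the lock is held only for the cheap copy rather than the whole job-generation loop.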
[GitHub] spark issue #16591: [SPARK-19251] remove unused imports and outdated comment...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16591

**[Test build #71482 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71482/testReport)** for PR 16591 at commit [`f0e0576`](https://github.com/apache/spark/commit/f0e0576e0163bd72ff749a1eef885d9296302925).
[GitHub] spark issue #16591: [SPARK-19251] remove unused imports and outdated comment...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16591

Merged build finished. Test FAILed.
[GitHub] spark issue #16591: [SPARK-19251] remove unused imports and outdated comment...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16591

**[Test build #71481 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71481/testReport)** for PR 16591 at commit [`ab70d6b`](https://github.com/apache/spark/commit/ab70d6ba21aea42991a66dabafe8ab495d2413e7).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #16591: [SPARK-19251] remove unused imports and outdated comment...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16591

Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71481/
[GitHub] spark issue #16573: [SPARK-19210][DStream] Add log level info into checkpoin...
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/16573

also cc @tdas
[GitHub] spark issue #16601: [SPARK-19182][DStream] Optimize the lock in StreamingJob...
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/16601

also cc @tdas
[GitHub] spark issue #16591: [SPARK-19251] remove unused imports and outdated comment...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16591

**[Test build #71481 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71481/testReport)** for PR 16591 at commit [`ab70d6b`](https://github.com/apache/spark/commit/ab70d6ba21aea42991a66dabafe8ab495d2413e7).
[GitHub] spark issue #16591: [SPARK-19251] remove unused imports and outdated comment...
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/16591

retest this please.
[GitHub] spark issue #16591: [SPARK-19251][CORE] remove unused imports and outdated c...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16591

Build finished. Test FAILed.
[GitHub] spark issue #16591: [SPARK-19251][CORE] remove unused imports and outdated c...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16591

Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71480/
[GitHub] spark issue #16591: [SPARK-19251][CORE] remove unused imports and outdated c...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16591

**[Test build #71480 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71480/testReport)** for PR 16591 at commit [`b5244ec`](https://github.com/apache/spark/commit/b5244ecffc2e957ebdf4f6c70e42b507cfda7595).
* This patch **fails Scala style tests**.
* This patch **does not merge cleanly**.
* This patch adds no public classes.
[GitHub] spark issue #16591: [SPARK-19251][CORE] remove unused imports and outdated c...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16591

**[Test build #71480 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71480/testReport)** for PR 16591 at commit [`b5244ec`](https://github.com/apache/spark/commit/b5244ecffc2e957ebdf4f6c70e42b507cfda7595).
[GitHub] spark issue #16583: [SPARK-19129] [SQL] SessionCatalog: Disallow empty part ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16583

**[Test build #71479 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71479/testReport)** for PR 16583 at commit [`f1b6fe0`](https://github.com/apache/spark/commit/f1b6fe0d733ab160531ce261564340491a7840dd).
[GitHub] spark issue #16599: [SPARK-19239][PySpark] Check the lowerBound and upperBou...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16599 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71476/ Test PASSed.
[GitHub] spark issue #16599: [SPARK-19239][PySpark] Check the lowerBound and upperBou...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16599 Merged build finished. Test PASSed.
[GitHub] spark issue #16599: [SPARK-19239][PySpark] Check the lowerBound and upperBou...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16599 **[Test build #71476 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71476/testReport)** for PR 16599 at commit [`43602b5`](https://github.com/apache/spark/commit/43602b56d6099213a103a0c0389ac37ebb2c326b). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #16583: [SPARK-19129] [SQL] SessionCatalog: Disallow empt...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/16583#discussion_r96349730 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala --- @@ -568,7 +569,9 @@ private[hive] class HiveClientImpl( val hiveTable = toHiveTable(table) val parts = spec match { case None => shim.getAllPartitions(client, hiveTable).map(fromHivePartition) - case Some(s) => client.getPartitions(hiveTable, s.asJava).asScala.map(fromHivePartition) + case Some(s) => +assert(s.values.forall(_.nonEmpty), s"partition spec '$s' is invalid") --- End diff -- Yeah, it has the same issue.
[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/16593#discussion_r96348515 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala --- @@ -1343,17 +1343,41 @@ class HiveDDLSuite sql("INSERT INTO t SELECT 2, 'b'") checkAnswer(spark.table("t"), Row(9, "x") :: Row(2, "b") :: Nil) - val e = intercept[AnalysisException] { -Seq(1 -> "a").toDF("i", "j").write.format("hive").partitionBy("i").saveAsTable("t2") - } - assert(e.message.contains("A Create Table As Select (CTAS) statement is not allowed " + -"to create a partitioned table using Hive")) - val e2 = intercept[AnalysisException] { Seq(1 -> "a").toDF("i", "j").write.format("hive").bucketBy(4, "i").saveAsTable("t2") } assert(e2.message.contains("Creating bucketed Hive serde table is not supported yet")) + try { +spark.sql("set hive.exec.dynamic.partition.mode=nonstrict") --- End diff -- I think we can use `withSQLConf` instead of `try .. finally ..`. ```scala withSQLConf("hive.exec.dynamic.partition.mode" -> "nonstrict") { ... } ```
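The `withSQLConf` helper suggested above is essentially a save-and-restore wrapper. A minimal standalone sketch of that pattern, using a plain mutable map in place of the real session conf (the actual helper lives in Spark's `SQLTestUtils`; `ConfDemo` and `withConf` are illustrative names, not Spark APIs):

```scala
// Sketch of the save-and-restore pattern behind `withSQLConf`.
// A mutable Map stands in for the SQL session configuration.
object ConfDemo {
  val conf = scala.collection.mutable.Map[String, String]()

  def withConf[T](pairs: (String, String)*)(body: => T): T = {
    // Remember the previous value of every key we are about to override.
    val saved = pairs.map { case (k, _) => k -> conf.get(k) }
    pairs.foreach { case (k, v) => conf(k) = v }
    try body
    finally saved.foreach {
      case (k, Some(old)) => conf(k) = old   // restore the original value
      case (k, None)      => conf.remove(k)  // the key was previously unset
    }
  }
}
```

The advantage over a hand-written `try .. finally ..` is that the restore logic also handles keys that were unset before the test, instead of forcing them back to a guessed default such as `strict`.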
[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/16593#discussion_r96348791 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala --- @@ -45,6 +46,25 @@ case class CreateHiveTableAsSelectCommand( override def innerChildren: Seq[LogicalPlan] = Seq(query) override def run(sparkSession: SparkSession): Seq[Row] = { + +// relation should move partition columns to the last +val (partOutputs, nonPartOutputs) = query.output.partition { + a => +tableDesc.partitionColumnNames.contains(a.name) +} + +// the CTAS's SELECT partition-outputs order should be consistent with +// tableDesc.partitionColumnNames +val reorderPartOutputs = tableDesc.partitionColumnNames.map { --- End diff -- nit: `reorderPartOutputs` -> `reorderedPartOutputs`. The former sounds like a verb while the latter sounds like a noun.
[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/16593#discussion_r96349044 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala --- @@ -183,9 +183,15 @@ case class CatalogTable( import CatalogTable._ - /** schema of this table's partition columns */ - def partitionSchema: StructType = StructType(schema.filter { -c => partitionColumnNames.contains(c.name) + /** + * schema of this table's partition columns + * keep the schema order with partitionColumnNames --- End diff -- "keep the schema order with partitionColumnNames because we always concatenate the partition columns to the schema when reading the table information from hive metastore."
[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/16593#discussion_r96348696 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala --- @@ -88,7 +108,9 @@ case class CreateHiveTableAsSelectCommand( } else { try { sparkSession.sessionState.executePlan(InsertIntoTable( - metastoreRelation, Map(), query, overwrite = true, ifNotExists = false)).toRdd +metastoreRelation, Map(), reorderOutputQuery, overwrite = true + , ifNotExists = false)) --- End diff -- nit: The comma should be in the line above (after `overwrite = true`). Actually I think we can put all the args to `InsertIntoTable` in the same line.
[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/16593#discussion_r96349144 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala --- @@ -183,9 +183,15 @@ case class CatalogTable( import CatalogTable._ - /** schema of this table's partition columns */ - def partitionSchema: StructType = StructType(schema.filter { -c => partitionColumnNames.contains(c.name) + /** + * schema of this table's partition columns + * keep the schema order with partitionColumnNames + */ + def partitionSchema: StructType = StructType(partitionColumnNames.map { +p => schema.find(_.name == p).getOrElse( + throw new AnalysisException(s"Partition column [$p] " + +s"did not exist in schema ${schema.toString}") --- End diff -- "did not exist" -> "does not exist"
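The ordering point behind this `partitionSchema` change can be shown in isolation. A small sketch (plain `(name, type)` tuples stand in for `StructField`, and the column names are illustrative only): filtering the schema keeps the schema's own column order, while mapping over `partitionColumnNames` keeps the declared partition order and fails loudly on a missing column.

```scala
// Why the PR maps over partitionColumnNames instead of filtering schema:
// the two approaches can return the partition columns in different orders.
object PartitionOrderDemo {
  val schema = Seq("i" -> "int", "j" -> "string", "k" -> "int")
  val partitionColumnNames = Seq("k", "j")

  // Filter keeps the schema's own order: j before k.
  val byFilter = schema.filter { case (name, _) => partitionColumnNames.contains(name) }

  // Map follows the declared partition order: k before j, and rejects a
  // partition column that does not exist in the schema.
  val byMap = partitionColumnNames.map { p =>
    schema.find(_._1 == p).getOrElse(
      sys.error(s"Partition column [$p] does not exist in schema"))
  }
}
```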
[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/16593#discussion_r96348933 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala --- @@ -1343,17 +1343,41 @@ class HiveDDLSuite sql("INSERT INTO t SELECT 2, 'b'") checkAnswer(spark.table("t"), Row(9, "x") :: Row(2, "b") :: Nil) - val e = intercept[AnalysisException] { -Seq(1 -> "a").toDF("i", "j").write.format("hive").partitionBy("i").saveAsTable("t2") - } - assert(e.message.contains("A Create Table As Select (CTAS) statement is not allowed " + -"to create a partitioned table using Hive")) - val e2 = intercept[AnalysisException] { Seq(1 -> "a").toDF("i", "j").write.format("hive").bucketBy(4, "i").saveAsTable("t2") } assert(e2.message.contains("Creating bucketed Hive serde table is not supported yet")) + try { +spark.sql("set hive.exec.dynamic.partition.mode=nonstrict") +Seq(10 -> "y").toDF("i", "j").write.format("hive").partitionBy("i").saveAsTable("t3") +checkAnswer(spark.table("t3"), Row("y", 10) :: Nil) +table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t3")) +var partitionSchema = table.partitionSchema +assert(partitionSchema.size == 1 && partitionSchema.fields(0).name == "i" && + partitionSchema.fields(0).dataType == IntegerType) + +Seq(11 -> "z").toDF("i", "j").write.mode("overwrite").format("hive") + .partitionBy("j").saveAsTable("t3") +checkAnswer(spark.table("t3"), Row(11, "z") :: Nil) +table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t3")) +partitionSchema = table.partitionSchema +assert(partitionSchema.size == 1 && partitionSchema.fields(0).name == "j" && + partitionSchema.fields(0).dataType == StringType) + +Seq((1, 2, 3)).toDF("i", "j", "k").write.mode("overwrite").format("hive") + .partitionBy("k", "j").saveAsTable("t3") +table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t3")) +checkAnswer(spark.table("t3"), Row(1, 3, 2) :: Nil) + +Seq((1, 2, 3)).toDF("i", "j", 
"k").write.mode("overwrite").format("hive") + .partitionBy("j", "k").saveAsTable("t3") +table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t3")) +checkAnswer(spark.table("t3"), Row(1, 2, 3) :: Nil) + } finally { +spark.sql("set hive.exec.dynamic.partition.mode=strict") + } + --- End diff -- I think this test case is a bit fat, maybe we can split it into two or three smaller ones? e.g.: ```scala test("create hive serde table with DataFrameWriter.saveAsTable - basic") ... test("create hive serde table with DataFrameWriter.saveAsTable - overwrite and append") ... test("create hive serde table with DataFrameWriter.saveAsTable - partitioned") ... ```
[GitHub] spark issue #16473: [SPARK-19069] [CORE] Expose task 'status' and 'duration'...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16473 **[Test build #71478 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71478/testReport)** for PR 16473 at commit [`96194df`](https://github.com/apache/spark/commit/96194df0ec6fdead12e18f436ee4ef107518152b).
[GitHub] spark issue #16585: [SPARK-19223][SQL][PySpark] Fix InputFileBlockHolder for...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16585 BTW please add a test case for this. Thanks.
[GitHub] spark issue #15300: [SPARK-17729] [SQL] Enable creating hive bucketed tables
Github user tejasapatil commented on the issue: https://github.com/apache/spark/pull/15300 @cloud-fan : I have linked a proposal in https://issues.apache.org/jira/browse/SPARK-19256.
[GitHub] spark pull request #16476: [SPARK-19084][SQL] Implement expression field
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/16476#discussion_r96348295 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala --- @@ -340,3 +344,96 @@ object CaseKeyWhen { CaseWhen(cases, elseValue) } } + +/** + * A function that returns the index of expr in (expr1, expr2, ...) list or 0 if not found. + * It takes at least 2 parameters, and all parameters should be subtype of AtomicType or NullType. + * It's also acceptable to give parameters of different types. --- End diff -- Good idea, thx!
[GitHub] spark issue #16473: [SPARK-19069] [CORE] Expose task 'status' and 'duration'...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16473 **[Test build #71477 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71477/testReport)** for PR 16473 at commit [`0e25b30`](https://github.com/apache/spark/commit/0e25b301b3c2c7d9fe4f5ab2a4f266133f916960).
[GitHub] spark issue #16599: [SPARK-19239][PySpark] Check the lowerBound and upperBou...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16599 **[Test build #71476 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71476/testReport)** for PR 16599 at commit [`43602b5`](https://github.com/apache/spark/commit/43602b56d6099213a103a0c0389ac37ebb2c326b).
[GitHub] spark pull request #16476: [SPARK-19084][SQL] Implement expression field
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/16476#discussion_r96347397 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala --- @@ -340,3 +344,96 @@ object CaseKeyWhen { CaseWhen(cases, elseValue) } } + +/** + * A function that returns the index of expr in (expr1, expr2, ...) list or 0 if not found. + * It takes at least 2 parameters, and all parameters should be subtype of AtomicType or NullType. + * It's also acceptable to give parameters of different types. + * If the search string is NULL, the return value is 0 because NULL fails equality comparison with any value. + * When the paramters have different types, comparing will be done based on type firstly, + * for example, ''999'' won't be considered equal with 999, no implicit cast will be done here. + */ +@ExpressionDescription( + usage = "_FUNC_(expr, expr1, expr2, ...) - Returns the index of expr in the expr1, expr2, ... or 0 if not found.", + extended = """ +Examples: + > SELECT _FUNC_(10, 9, 3, 10, 4); + 3 + > SELECT _FUNC_('a', 'b', 'c', 'd', 'a'); + 4 + > SELECT _FUNC_('999', 'a', 999, 9.99, '999'); + 4 + """) +case class Field(children: Seq[Expression]) extends Expression { + + /** Even if expr is not found in (expr1, expr2, ...) 
list, the value will be 0, not null */ + override def nullable: Boolean = false + override def foldable: Boolean = children.forall(_.foldable) + + private lazy val ordering = TypeUtils.getInterpretedOrdering(children(0).dataType) + + private val dataTypeMatchIndex: Array[Int] = children.zipWithIndex.tail.filter( +_._1.dataType.sameType(children.head.dataType)).map(_._2).toArray + + override def checkInputDataTypes(): TypeCheckResult = { +if (children.length <= 1) { + TypeCheckResult.TypeCheckFailure(s"FIELD requires at least 2 arguments") +} else if (!children.forall( +e => e.dataType.isInstanceOf[AtomicType] || e.dataType.isInstanceOf[NullType])) { + TypeCheckResult.TypeCheckFailure(s"FIELD requires all arguments to be of AtomicType") --- End diff -- That's for the user's explicit indication of NULL, which is legal in Hive's `field` expression.
[GitHub] spark pull request #16476: [SPARK-19084][SQL] Implement expression field
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/16476#discussion_r96347281 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/ColumnExpressionSuite.scala --- @@ -17,11 +17,13 @@ package org.apache.spark.sql +import java.sql.{Date, Timestamp} --- End diff -- My bad.
[GitHub] spark pull request #16476: [SPARK-19084][SQL] Implement expression field
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/16476#discussion_r96347166 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala --- @@ -340,3 +344,96 @@ object CaseKeyWhen { CaseWhen(cases, elseValue) } } + +/** + * A function that returns the index of expr in (expr1, expr2, ...) list or 0 if not found. + * It takes at least 2 parameters, and all parameters should be subtype of AtomicType or NullType. --- End diff -- Yes, that's right.
[GitHub] spark issue #16610: [SPARK-19254][SQL] Support Seq, Map, and Struct in funct...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/16610 Since Spark may support a comparable `MapType` in the future (#15970), `functions.lit` might also need to support that type. However, since I know that adding new IFs is quite debatable, could anyone give me some insights about this?
[GitHub] spark issue #16585: [SPARK-19223][SQL][PySpark] Fix InputFileBlockHolder for...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16585 **[Test build #71475 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71475/testReport)** for PR 16585 at commit [`1563e03`](https://github.com/apache/spark/commit/1563e03796a1ee557decfa041d39dbd5eee8cf33).
[GitHub] spark issue #16603: [SPARK-19244][Core] Sort MemoryConsumers according to th...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16603 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71471/ Test PASSed.
[GitHub] spark issue #16603: [SPARK-19244][Core] Sort MemoryConsumers according to th...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16603 Merged build finished. Test PASSed.
[GitHub] spark issue #16603: [SPARK-19244][Core] Sort MemoryConsumers according to th...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16603 **[Test build #71471 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71471/testReport)** for PR 16603 at commit [`070ec51`](https://github.com/apache/spark/commit/070ec51f322d3af889c499f60be11fef29068aa5). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #16559: [WIP] Add expression index and test cases
Github user gczsjdy closed the pull request at: https://github.com/apache/spark/pull/16559
[GitHub] spark issue #16559: [WIP] Add expression index and test cases
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/16559 Thanks for the information @rxin @aray @cloud-fan, I will close this PR. Sorry for the late reply.
[GitHub] spark issue #16610: [SPARK-19254][SQL] Support Seq, Map, and Struct in funct...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16610 **[Test build #71473 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71473/testReport)** for PR 16610 at commit [`6a02490`](https://github.com/apache/spark/commit/6a02490745952bd2a5c5b0c84482b5cd874ae820).
[GitHub] spark issue #16585: [SPARK-19223][SQL][PySpark] Fix InputFileBlockHolder for...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16585 **[Test build #71474 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71474/testReport)** for PR 16585 at commit [`e2d872c`](https://github.com/apache/spark/commit/e2d872c2ba706433e9aebe74213c4dbeb9c0754b).
[GitHub] spark pull request #16610: [SPARK-19254][SQL] Support Seq, Map, and Struct i...
GitHub user maropu opened a pull request: https://github.com/apache/spark/pull/16610 [SPARK-19254][SQL] Support Seq, Map, and Struct in functions.lit ## What changes were proposed in this pull request? This pr is to support Seq, Map, and Struct in functions.lit; it adds a new IF named `lit2` with `TypeTag` for avoiding type erasure. ## How was this patch tested? Added tests in `LiteralExpressionSuite` You can merge this pull request into a Git repository by running: $ git pull https://github.com/maropu/spark SPARK-19254 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/16610.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #16610 commit 6a02490745952bd2a5c5b0c84482b5cd874ae820 Author: Takeshi YAMAMURO Date: 2016-11-14T13:21:09Z Add a new create with TypeTag in Literal
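The "`TypeTag` for avoiding type erasure" motivation in this PR description can be illustrated in isolation: a plain method sees a `Seq[Int]` only as `Seq[_]` at run time, while a `TypeTag` context bound carries the element type through. A minimal sketch (`elementType` is a hypothetical stand-in for demonstration, not Spark's actual `lit2` signature):

```scala
import scala.reflect.runtime.universe._

// With erasure, runtime inspection of a Seq cannot recover its element
// type. A TypeTag context bound makes the compiler materialize the full
// static type, so the method can still see Seq[Int] vs Seq[String].
object LitDemo {
  def elementType[T: TypeTag](v: Seq[T]): String = typeOf[T].toString
}
```

This is why overloads keyed on the runtime class alone cannot distinguish `Seq[Int]` from `Seq[String]`, and why the PR introduces a separate `TypeTag`-based entry point rather than another plain overload.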
[GitHub] spark issue #16605: [SPARK-18884][SQL] Support Array[_] in ScalaUDF
Github user maropu commented on the issue: https://github.com/apache/spark/pull/16605 many thanks!
[GitHub] spark issue #16605: [SPARK-18884][SQL] Support Array[_] in ScalaUDF
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/16605 Sure, @maropu. I'll do that tomorrow morning (PST).
[GitHub] spark issue #16591: [SPARK-19251][CORE] remove unused imports and outdated c...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16591 Merged build finished. Test FAILed.
[GitHub] spark issue #16591: [SPARK-19251][CORE] remove unused imports and outdated c...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16591 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71470/ Test FAILed.
[GitHub] spark issue #16585: [SPARK-19223][SQL][PySpark] Fix InputFileBlockHolder for...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16585 `InheritableThreadLocal`
[GitHub] spark issue #16599: [SPARK-19239][PySpark] Check the lowerBound and upperBou...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16599 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71472/ Test FAILed.
[GitHub] spark issue #16599: [SPARK-19239][PySpark] Check the lowerBound and upperBou...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16599 **[Test build #71472 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71472/testReport)** for PR 16599 at commit [`94c44ba`](https://github.com/apache/spark/commit/94c44ba368acb3c7fa648ad66cfd3cac352af911). * This patch **fails Python style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16591: [SPARK-19251][CORE] remove unused imports and outdated c...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16591 **[Test build #71470 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71470/testReport)** for PR 16591 at commit [`958c2fe`](https://github.com/apache/spark/commit/958c2fe8170514e392b080d15d7e78b6568c403c). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16599: [SPARK-19239][PySpark] Check the lowerBound and upperBou...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16599 Merged build finished. Test FAILed.
[GitHub] spark issue #16599: [SPARK-19239][PySpark] Check the lowerBound and upperBou...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16599 **[Test build #71472 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71472/testReport)** for PR 16599 at commit [`94c44ba`](https://github.com/apache/spark/commit/94c44ba368acb3c7fa648ad66cfd3cac352af911).
[GitHub] spark issue #16599: [SPARK-19239][PySpark] Check the lowerBound and upperBou...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16599 ok to test
[GitHub] spark pull request #16528: [SPARK-19148][SQL] do not expose the external tab...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16528
[GitHub] spark issue #16585: [SPARK-19223][SQL][PySpark] Fix InputFileBlockHolder for...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/16585 @cloud-fan Is the SGTM for the current approach or for `InheritableThreadLocal`?
[GitHub] spark issue #16585: [SPARK-19223][SQL][PySpark] Fix InputFileBlockHolder for...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16585 SGTM
[GitHub] spark issue #16528: [SPARK-19148][SQL] do not expose the external table conc...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16528 thanks for the review, merging to master!
[GitHub] spark issue #16344: [SPARK-18929][ML] Add Tweedie distribution in GLM
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/16344 Jenkins, retest this please
[GitHub] spark issue #16528: [SPARK-19148][SQL] do not expose the external table conc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16528 Merged build finished. Test PASSed.
[GitHub] spark issue #16528: [SPARK-19148][SQL] do not expose the external table conc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16528 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71469/ Test PASSed.
[GitHub] spark issue #16528: [SPARK-19148][SQL] do not expose the external table conc...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16528 **[Test build #71469 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71469/testReport)** for PR 16528 at commit [`318dc04`](https://github.com/apache/spark/commit/318dc0459cd1ba487643abff52b5979b4ab0). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16605: [SPARK-18884][SQL] Support Array[_] in ScalaUDF
Github user maropu commented on the issue: https://github.com/apache/spark/pull/16605 @dongjoon-hyun Could you take some time to review this before the committers do? Thanks!
[GitHub] spark issue #16605: [SPARK-18884][SQL] Support Array[_] in ScalaUDF
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16605 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71468/ Test PASSed.
[GitHub] spark issue #16605: [SPARK-18884][SQL] Support Array[_] in ScalaUDF
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16605 Merged build finished. Test PASSed.
[GitHub] spark issue #16605: [SPARK-18884][SQL] Support Array[_] in ScalaUDF
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16605 **[Test build #71468 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71468/testReport)** for PR 16605 at commit [`581c7fa`](https://github.com/apache/spark/commit/581c7fa46e9f3f8b71759eaaf0490f84f56825aa). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #16599: [SPARK-19239][PySpark] Check the lowerBound and u...
Github user djvulee commented on a diff in the pull request: https://github.com/apache/spark/pull/16599#discussion_r96339764 --- Diff: python/pyspark/sql/readwriter.py --- @@ -431,6 +432,8 @@ def jdbc(self, url, table, column=None, lowerBound=None, upperBound=None, numPar if column is not None: if numPartitions is None: numPartitions = self._spark._sc.defaultParallelism +assert lowerBound != None, "lowerBound can not be None when ``column`` is specified" +assert upperBound != None, "upperBound can not be None when ``column`` is specified" --- End diff -- Yes, the Scala code could check this, but the PySpark code fails at `int(lowerBound)` first, which confuses the user.
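The point djvulee makes is general: validating arguments up front produces a clear error instead of a confusing failure deeper in the call chain. A minimal Java sketch of this fail-fast idea, assuming illustrative names (`partitionBounds` is not Spark's actual API):

```java
public class JdbcArgsSketch {
    // Validate the partitioning arguments together, mirroring the idea that
    // lowerBound/upperBound must both be present whenever a partition column
    // is given, so the caller sees one clear message instead of a later
    // conversion error.
    static long[] partitionBounds(String column, Long lowerBound, Long upperBound) {
        if (column == null) {
            return null; // no partitioning requested
        }
        if (lowerBound == null || upperBound == null) {
            throw new IllegalArgumentException(
                "lowerBound and upperBound must be specified when column is specified");
        }
        if (lowerBound >= upperBound) {
            throw new IllegalArgumentException("lowerBound must be less than upperBound");
        }
        return new long[]{lowerBound, upperBound};
    }
}
```
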
[GitHub] spark issue #16429: [SPARK-19019][PYTHON] Fix hijacked `collections.namedtup...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/16429 @davies, could this be merged by any chance?
[GitHub] spark issue #16553: [SPARK-9435][SQL] Reuse function in Java UDF to correctl...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/16553 @marmbrus Can this be merged by any chance?
[GitHub] spark pull request #16599: [SPARK-19239][PySpark] Check the lowerBound and u...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/16599#discussion_r96339264 --- Diff: python/pyspark/sql/readwriter.py --- @@ -431,6 +432,8 @@ def jdbc(self, url, table, column=None, lowerBound=None, upperBound=None, numPar if column is not None: if numPartitions is None: numPartitions = self._spark._sc.defaultParallelism +assert lowerBound != None, "lowerBound can not be None when ``column`` is specified" +assert upperBound != None, "upperBound can not be None when ``column`` is specified" --- End diff -- Should we mirror the condition here - https://github.com/apache/spark/blob/55d528f2ba0ba689dbb881616d9436dc7958e943/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCOptions.scala#L100-L103 ?
[GitHub] spark issue #16585: [SPARK-19223][SQL][PySpark] Fix InputFileBlockHolder for...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/16585 @rxin Thanks for looking at this. I think the simplest way to transfer the info is to use `InheritableThreadLocal` in place of `ThreadLocal` in `InputFileBlockHolder`. It works in my testing. What do you think — is that OK with you?
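A minimal Java sketch of the difference viirya is relying on: a value set in an `InheritableThreadLocal` is copied into threads created afterwards, while a plain `ThreadLocal` value is not visible to child threads (the class and method names here are illustrative, not Spark's code):

```java
public class InheritableThreadLocalDemo {
    // Plain ThreadLocal: child threads do NOT see the parent's value.
    private static final ThreadLocal<String> plain = new ThreadLocal<>();
    // InheritableThreadLocal: child threads inherit the parent's value
    // at thread-creation time.
    private static final InheritableThreadLocal<String> inheritable =
        new InheritableThreadLocal<>();

    // Sets both locals in the current thread, then reads them back from a
    // freshly spawned child thread. Returns {plainSeen, inheritableSeen}.
    static String[] readFromChild(String value) throws InterruptedException {
        plain.set(value);
        inheritable.set(value);
        final String[] seen = new String[2];
        Thread child = new Thread(() -> {
            seen[0] = plain.get();        // null: not propagated
            seen[1] = inheritable.get();  // inherited copy of the parent's value
        });
        child.start();
        child.join();
        return seen;
    }
}
```

This is why swapping the holder's `ThreadLocal` for an `InheritableThreadLocal` lets worker threads spawned by a task see the file-block info set by the parent thread.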
[GitHub] spark issue #16603: [SPARK-19244][Core] Sort MemoryConsumers according to th...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16603 **[Test build #71471 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71471/testReport)** for PR 16603 at commit [`070ec51`](https://github.com/apache/spark/commit/070ec51f322d3af889c499f60be11fef29068aa5).
[GitHub] spark pull request #16603: [SPARK-19244][Core] Sort MemoryConsumers accordin...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16603#discussion_r96337114 --- Diff: core/src/main/java/org/apache/spark/memory/TaskMemoryManager.java --- @@ -144,23 +170,31 @@ public long acquireExecutionMemory(long required, MemoryConsumer consumer) { // spilling, avoid to have too many spilled files. if (got < required) { // Call spill() on other consumers to release memory +// Sort the consumers according their memory usage. So we avoid spilling the same consumer +// which is just spilled in last few times and re-spilling on it will produce many small +// spill files. +List sortedList = new ArrayList<>(); for (MemoryConsumer c: consumers) { if (c != consumer && c.getUsed() > 0 && c.getMode() == mode) { -try { - long released = c.spill(required - got, consumer); - if (released > 0) { -logger.debug("Task {} released {} from {} for {}", taskAttemptId, - Utils.bytesToString(released), c, consumer); -got += memoryManager.acquireExecutionMemory(required - got, taskAttemptId, mode); -if (got >= required) { - break; -} +sortedList.add(c); + } +} +Collections.sort(sortedList, new ConsumerComparator()); +for (MemoryConsumer c: sortedList) { + try { +long released = c.spill(required - got, consumer); +if (released > 0) { + logger.debug("Task {} released {} from {} for {}", taskAttemptId, +Utils.bytesToString(released), c, consumer); + got += memoryManager.acquireExecutionMemory(required - got, taskAttemptId, mode); + if (got >= required) { +break; } -} catch (IOException e) { - logger.error("error while calling spill() on " + c, e); - throw new OutOfMemoryError("error while calling spill() on " + c + " : " -+ e.getMessage()); } + } catch (IOException e) { +logger.error("error while calling spill() on " + c, e); +throw new OutOfMemoryError("error while calling spill() on " + c + " : " + + e.getMessage()); } --- End diff -- As the memory usage of memory consumer is changing over time, not sure if we use TreeSet/TreeMap for 
consumers, whether we would still get a correctly sorted order from them. In other words, is the sorted order of a TreeSet/TreeMap still guaranteed when the elements are mutable and change after insertion? I believe it is not. And if we have to sort here anyway, a TreeMap/TreeSet would be overkill compared with a plain list. Another concern is that the TreeMap/TreeSet API can find a tail set or ceiling element, but it requires an input element to compare against, and we only have the required memory amount, not a memory consumer. Also, a TreeSet/TreeMap could return an empty set when every element holds less memory than the required size; in that case we would have to fall back to iterating all elements to spill, which adds complexity. Totally agreed that it is better to fetch just the required size instead of always going from largest to smallest, and we can still achieve that with the current list-based approach.
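The list-based idea under discussion can be sketched in a few lines of Java. This is a simplified stand-in, not Spark's actual `TaskMemoryManager`/`MemoryConsumer` API: collect the candidate consumers, sort a copy by current usage, and spill from the largest first until enough memory is reclaimed, which avoids repeatedly re-spilling a small consumer and producing many tiny spill files.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative stand-in for Spark's MemoryConsumer; not the real API.
class Consumer {
    final String name;
    long used;
    Consumer(String name, long used) { this.name = name; this.used = used; }
    // Pretend-spill: releases up to `needed` bytes and returns the amount released.
    long spill(long needed) {
        long released = Math.min(used, needed);
        used -= released;
        return released;
    }
}

public class SpillOrderSketch {
    // Spill from the largest consumers first. Sorting a snapshot copy sidesteps
    // the mutable-key problem a TreeSet/TreeMap would have: usage can change
    // after insertion, so a tree's ordering invariant would silently break.
    static long reclaim(List<Consumer> consumers, long required) {
        List<Consumer> sorted = new ArrayList<>(consumers);
        sorted.sort(Comparator.comparingLong((Consumer c) -> c.used).reversed());
        long got = 0;
        for (Consumer c : sorted) {
            if (got >= required) break;
            got += c.spill(required - got);
        }
        return got;
    }
}
```
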
[GitHub] spark pull request #16591: [SPARK-19251][CORE] remove unused imports and out...
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/16591#discussion_r96336953 --- Diff: core/src/main/java/org/apache/spark/api/java/JavaFutureAction.java --- @@ -17,7 +17,6 @@ package org.apache.spark.api.java; - --- End diff -- Got it
[GitHub] spark pull request #16473: [SPARK-19069] [CORE] Expose task 'status' and 'du...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16473#discussion_r96335765 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskInfo.scala --- @@ -20,6 +20,7 @@ package org.apache.spark.scheduler import org.apache.spark.TaskState import org.apache.spark.TaskState.TaskState import org.apache.spark.annotation.DeveloperApi +import org.apache.spark.ui.jobs.UIData.TaskMetricsUIData --- End diff -- nit: unused import.
[GitHub] spark pull request #16473: [SPARK-19069] [CORE] Expose task 'status' and 'du...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16473#discussion_r96335590 --- Diff: core/src/main/scala/org/apache/spark/ui/jobs/UIData.scala --- @@ -127,6 +127,14 @@ private[spark] object UIData { def updateTaskMetrics(metrics: Option[TaskMetrics]): Unit = { _metrics = TaskUIData.toTaskMetricsUIData(metrics) } + +def getTaskDuration(): Long = { --- End diff -- nit: `getTaskDuration()` -> `taskDuration` as this doesn't have side effects.
[GitHub] spark issue #16591: [SPARK-19227][CORE] remove unused imports and outdated c...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16591 **[Test build #71470 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71470/testReport)** for PR 16591 at commit [`958c2fe`](https://github.com/apache/spark/commit/958c2fe8170514e392b080d15d7e78b6568c403c).
[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16593#discussion_r96336300 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala --- @@ -45,6 +46,25 @@ case class CreateHiveTableAsSelectCommand( override def innerChildren: Seq[LogicalPlan] = Seq(query) override def run(sparkSession: SparkSession): Seq[Row] = { + +// relation should move partition columns to the last +val (partOutputs, nonPartOutputs) = query.output.partition { + a => --- End diff -- nit: code style ``` xxx.map { p => xxx } ```
[GitHub] spark issue #16542: [SPARK-18905][STREAMING] Fix the issue of removing a fai...
Github user CodingCat commented on the issue: https://github.com/apache/spark/pull/16542 Thanks