[GitHub] spark issue #19709: [SPARK-22483][CORE]. Exposing java.nio bufferedPool memo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19709 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19709: [SPARK-22483][CORE]. Exposing java.nio bufferedPool memo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19709 **[Test build #83649 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83649/testReport)** for PR 19709 at commit [`2a0b281`](https://github.com/apache/spark/commit/2a0b2816b70f5c1f83a0da3f8dd81913c5e90051). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19709: [SPARK-22483][CORE]. Exposing java.nio bufferedPool memo...
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/19709 ok to test --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19709: [SPARK-22483][CORE]. Exposing java.nio bufferedPool memo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19709 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19709: [SPARK-22483][CORE]. Exposing java.nio bufferedPo...
GitHub user vundela opened a pull request: https://github.com/apache/spark/pull/19709 [SPARK-22483][CORE]. Exposing java.nio bufferedPool memory metrics to Metric System

## What changes were proposed in this pull request?
Adds java.nio bufferedPool memory metrics to the metrics system, covering both direct and mapped memory.

## How was this patch tested?
Manually tested and checked that the direct and mapped memory metrics are available in the metrics system using the Console sink. Here is the sample console output:

application_1509655862825_0016.2.jvm.direct.capacity value = 19497
application_1509655862825_0016.2.jvm.direct.count value = 6
application_1509655862825_0016.2.jvm.direct.used value = 19498
application_1509655862825_0016.2.jvm.mapped.capacity value = 0
application_1509655862825_0016.2.jvm.mapped.count value = 0
application_1509655862825_0016.2.jvm.mapped.used value = 0

You can merge this pull request into a Git repository by running: $ git pull https://github.com/vundela/spark SPARK-22483 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/19709.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #19709

commit 2a0b2816b70f5c1f83a0da3f8dd81913c5e90051 Author: Srinivasa Reddy Vundela Date: 2017-11-09T18:16:26Z [SPARK-22483][CORE]. Exposing java.nio bufferedPool memory metrics to metrics system

--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
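For readers wanting the mechanics behind a change like this, here is a minimal sketch (an editor illustration, not the PR's code; the object and method names are made up) of how the JDK's `BufferPoolMXBean`s can be turned into Dropwizard gauges of the kind Spark's metrics system sinks. The JDK exposes one such bean per pool, named "direct" and "mapped", which is where the metric names in the console output above come from.

```scala
import java.lang.management.{BufferPoolMXBean, ManagementFactory}

import scala.collection.JavaConverters._

import com.codahale.metrics.{Gauge, MetricRegistry}

object BufferPoolMetricsSketch {
  // One gauge per pool ("direct", "mapped") and per quantity, mirroring the
  // capacity/count/used names shown in the console output above.
  def registerBufferPoolMetrics(registry: MetricRegistry): Unit = {
    ManagementFactory.getPlatformMXBeans(classOf[BufferPoolMXBean]).asScala.foreach { pool =>
      registry.register(s"${pool.getName}.capacity", new Gauge[Long] {
        override def getValue: Long = pool.getTotalCapacity
      })
      registry.register(s"${pool.getName}.count", new Gauge[Long] {
        override def getValue: Long = pool.getCount
      })
      registry.register(s"${pool.getName}.used", new Gauge[Long] {
        override def getValue: Long = pool.getMemoryUsed
      })
    }
  }
}
```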
[GitHub] spark pull request #19700: [SPARK-22471][SQL] SQLListener consumes much memo...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/19700#discussion_r150041775 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLListener.scala --- @@ -101,6 +101,8 @@ class SQLListener(conf: SparkConf) extends SparkListener with Logging { private val retainedExecutions = conf.getInt("spark.sql.ui.retainedExecutions", 1000) + private val retainedStages = conf.getInt("spark.ui.retainedStages", 1000) --- End diff -- BTW, the name should be `spark.sql.ui.retainedStages` instead of `spark.ui.retainedStages`. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19700: [SPARK-22471][SQL] SQLListener consumes much memo...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/19700#discussion_r150041314 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLListener.scala --- @@ -101,6 +101,8 @@ class SQLListener(conf: SparkConf) extends SparkListener with Logging { private val retainedExecutions = conf.getInt("spark.sql.ui.retainedExecutions", 1000) + private val retainedStages = conf.getInt("spark.ui.retainedStages", 1000) --- End diff -- @tashoyan . Could you add a doc for this like `spark.sql.ui.retainedExecutions` here? Please refer #9052. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19701: [SPARK-22211][SQL][FOLLOWUP] Fix bad merge for tests
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/19701 thanks! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19699: [MINOR][Core] Fix nits in MetricsSystemSuite
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19699 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19515: [SPARK-22287][MESOS] SPARK_DAEMON_MEMORY not honored by ...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/19515 @pmackles perhaps you could email this to d...@spark.apache.org to get some visibility to this and hopefully someone else on the mesos side can review? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19699: [MINOR][Core] Fix nits in MetricsSystemSuite
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/19699 Merging to master. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19515: [SPARK-22287][MESOS] SPARK_DAEMON_MEMORY not honored by ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19515 **[Test build #83648 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83648/testReport)** for PR 19515 at commit [`33a8e68`](https://github.com/apache/spark/commit/33a8e6880a468335330a7cb6507493de8b125faa). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19515: [SPARK-22287][MESOS] SPARK_DAEMON_MEMORY not honored by ...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/19515 @susanxhuynh or anyone from the mesos side would you please review? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19543: [SPARK-19606][MESOS] Support constraints in spark-dispat...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/19543 @susanxhuynh or anyone from the mesos side would you please review? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19515: [SPARK-22287][MESOS] SPARK_DAEMON_MEMORY not honored by ...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/19515 Jenkins, test this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19657: [SPARK-22344][SPARKR] clean up install dir if run...
GitHub user felixcheung reopened a pull request: https://github.com/apache/spark/pull/19657 [SPARK-22344][SPARKR] clean up install dir if running test as source package

## What changes were proposed in this pull request?
remove spark if spark downloaded & installed

## How was this patch tested?
manually by building package Jenkins, AppVeyor

You can merge this pull request into a Git repository by running: $ git pull https://github.com/felixcheung/spark rinstalldir Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/19657.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #19657

commit d4433e13565e9e3d41928e1d2262696204476341 Author: Felix Cheung Date: 2017-11-04T08:14:33Z add flag to cleanup
commit 0ea7c9b1c26c604296c35bc1588a6a5606a10cb2 Author: Felix Cheung Date: 2017-11-05T03:21:26Z no get0
commit d0064ca24339143aeac9f1ef78b924361f908248 Author: Felix Cheung Date: 2017-11-07T10:27:13Z make into function
commit 31f3bd06cc7d2b7bf482eddfe2f2738244cfbca7 Author: Felix Cheung Date: 2017-11-07T10:50:55Z fix lint
commit ca5349bfc0dae03c2402b104e51c78a841541b09 Author: Felix Cheung Date: 2017-11-07T10:55:27Z comment
commit f2aa5b7e12ed36e7b56610e695615260643f952f Author: Felix Cheung Date: 2017-11-07T17:31:16Z fix windows
commit 90d36c9ee3b0aed60ac9343e05b44366d1d2bf43 Author: Felix Cheung Date: 2017-11-07T17:38:12Z more test
commit f21a90bef2a08c9d4cfdcc6588fb2da64679b4ec Author: Felix Cheung Date: 2017-11-07T17:39:05Z fix
commit 18e238a62d53de5a73283a741c1a9bb8230f4484 Author: Felix Cheung Date: 2017-11-08T04:54:53Z fix 2

--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19657: [SPARK-22344][SPARKR] clean up install dir if run...
Github user felixcheung closed the pull request at: https://github.com/apache/spark/pull/19657 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19459: [SPARK-20791][PYSPARK] Use Arrow to create Spark DataFra...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19459 **[Test build #83647 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83647/testReport)** for PR 19459 at commit [`0ad736b`](https://github.com/apache/spark/commit/0ad736b352eacd394ea6ea684aa851853769e7d1). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19703: [SPARK-22403][SS] Add optional checkpointLocation argume...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19703 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19703: [SPARK-22403][SS] Add optional checkpointLocation argume...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19703 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83646/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19703: [SPARK-22403][SS] Add optional checkpointLocation argume...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19703 **[Test build #83646 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83646/testReport)** for PR 19703 at commit [`171496a`](https://github.com/apache/spark/commit/171496a424ed23ebadafe29ff74de72f3db5a49f). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19701: [SPARK-22211][SQL][FOLLOWUP] Fix bad merge for te...
Github user henryr closed the pull request at: https://github.com/apache/spark/pull/19701 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19704: [SPARK-22417][PYTHON][FOLLOWUP][BRANCH-2.2] Fix for crea...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19704 Thank you, @ueshin ! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19703: [SPARK-22403][SS] Add optional checkpointLocation...
Github user wypoon commented on a diff in the pull request: https://github.com/apache/spark/pull/19703#discussion_r150030572 --- Diff: examples/src/main/scala/org/apache/spark/examples/sql/streaming/StructuredKafkaWordCount.scala --- @@ -46,11 +51,13 @@ object StructuredKafkaWordCount { def main(args: Array[String]): Unit = { if (args.length < 3) { System.err.println("Usage: StructuredKafkaWordCount " + -" ") +" []") System.exit(1) } -val Array(bootstrapServers, subscribeType, topics) = args +val Array(bootstrapServers, subscribeType, topics, _*) = args +val checkpointLocation = + if (args.length > 3) args(3) else "/tmp/temporary-" + UUID.randomUUID.toString --- End diff -- This is what the internal default would be if java.io.tmpdir is "/tmp", but in case of YARN cluster mode, java.io.tmpdir is something else (the underlying problem). Supplying this default here is just to ease the user experience. They would get the same result running in YARN cluster mode or client mode, without supplying an explicit checkpoint location. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19479: [SPARK-17074] [SQL] Generate equi-height histogram in co...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19479 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83645/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19479: [SPARK-17074] [SQL] Generate equi-height histogram in co...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19479 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19479: [SPARK-17074] [SQL] Generate equi-height histogram in co...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19479 **[Test build #83645 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83645/testReport)** for PR 19479 at commit [`8af3868`](https://github.com/apache/spark/commit/8af38687d638ae2d94d9f76955b182df02404cce). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19703: [SPARK-22403][SS] Add optional checkpointLocation argume...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19703 **[Test build #83646 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83646/testReport)** for PR 19703 at commit [`171496a`](https://github.com/apache/spark/commit/171496a424ed23ebadafe29ff74de72f3db5a49f). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19703: [SPARK-22403][SS] Add optional checkpointLocation...
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/19703#discussion_r150029549 --- Diff: examples/src/main/scala/org/apache/spark/examples/sql/streaming/StructuredKafkaWordCount.scala --- @@ -46,11 +51,13 @@ object StructuredKafkaWordCount { def main(args: Array[String]): Unit = { if (args.length < 3) { System.err.println("Usage: StructuredKafkaWordCount " + -" ") +" []") System.exit(1) } -val Array(bootstrapServers, subscribeType, topics) = args +val Array(bootstrapServers, subscribeType, topics, _*) = args +val checkpointLocation = + if (args.length > 3) args(3) else "/tmp/temporary-" + UUID.randomUUID.toString --- End diff -- why bother supplying a default? will this be any better than spark's internal default? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19703: [SPARK-22403][SS] Add optional checkpointLocation argume...
Github user squito commented on the issue: https://github.com/apache/spark/pull/19703 Jenkins, add to whitelist --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19707: [SPARK-22472][SQL] add null check for top-level p...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/19707#discussion_r150027756 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala --- @@ -1408,6 +1409,23 @@ class DatasetSuite extends QueryTest with SharedSQLContext { checkDataset(ds, SpecialCharClass("1", "2")) } } + + test("SPARK-22472: add null check for top-level primitive values") { +// If the primitive values are from Option, we need to do runtime null check. +val ds = Seq(Some(1), None).toDS().as[Int] +intercept[NullPointerException](ds.collect()) +val e = intercept[SparkException](ds.map(_ * 2).collect()) +assert(e.getCause.isInstanceOf[NullPointerException]) + +withTempPath { path => + Seq(new Integer(1), null).toDF("i").write.parquet(path.getCanonicalPath) --- End diff -- Is this PR orthogonal to data source format? Could you test more data source like `JSON`, here? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
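A hedged sketch of what that extra coverage might look like (an editor illustration written in the style of the test quoted above, assuming the DatasetSuite helpers `withTempPath`, the test implicits and a `spark` session; JSON infers integer columns as LongType, hence `as[Long]`):

```scala
withTempPath { path =>
  Seq(new java.lang.Integer(1), null).toDF("i").write.json(path.getCanonicalPath)
  // JSON reads the column back as LongType; Long is still a top-level primitive,
  // so the new null check should fail fast instead of producing a default 0.
  val ds = spark.read.json(path.getCanonicalPath).select("i").as[Long]
  intercept[NullPointerException](ds.collect())
}
```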
[GitHub] spark issue #19703: [SPARK-22403][SS] Add optional checkpointLocation argume...
Github user wypoon commented on the issue: https://github.com/apache/spark/pull/19703 @srowen This change is indeed just a workaround for an underlying problem, as explained in the JIRA. @zsxwing suggested improving the StructuredKafkaWordCount example as a workaround. He did not have a proposal on how best to address the underlying problem. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19707: [SPARK-22472][SQL] add null check for top-level p...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/19707#discussion_r150026840 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala --- @@ -134,7 +134,13 @@ object ScalaReflection extends ScalaReflection { val tpe = localTypeOf[T] val clsName = getClassNameFromType(tpe) val walkedTypePath = s"""- root class: "$clsName :: Nil -deserializerFor(tpe, None, walkedTypePath) +val expr = deserializerFor(tpe, None, walkedTypePath) +val Schema(_, nullable) = schemaFor(tpe) +if (nullable) { + expr +} else { + AssertNotNull(expr, walkedTypePath) +} --- End diff -- Hi, @cloud-fan . It looks great. Can we add a test case in `ScalaReflectionSuite`, too? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19707: [SPARK-22472][SQL] add null check for top-level p...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19707#discussion_r150024246 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala --- @@ -1408,6 +1409,23 @@ class DatasetSuite extends QueryTest with SharedSQLContext { checkDataset(ds, SpecialCharClass("1", "2")) } } + + test("SPARK-22472: add null check for top-level primitive values") { +// If the primitive values are from Option, we need to do runtime null check. +val ds = Seq(Some(1), None).toDS().as[Int] +intercept[NullPointerException](ds.collect()) +val e = intercept[SparkException](ds.map(_ * 2).collect()) +assert(e.getCause.isInstanceOf[NullPointerException]) + +withTempPath { path => + Seq(new Integer(1), null).toDF("i").write.parquet(path.getCanonicalPath) --- End diff -- not a big deal, but `toDF("i")` is more explicit about column name. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19651: [SPARK-20682][SPARK-15474][SPARK-21791] Add new ORCFileF...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19651 Hi, @cloud-fan and @gatorsmile . Could you review this PR? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19697: [SPARK-22222][CORE][TEST][FOLLOW-UP] Remove redundant an...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19697 Thank you, @HyukjinKwon , @srowen , and @jiangxb1987 . --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19250: [SPARK-12297] Table timezone correction for Timestamps
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19250 ok @squito can we send a new PR to do it? basically in parquet read task, get the writer info from the footer. If the writer is impala, and a config is set, we treat the seconds as seconds from epoch of session local time zone, and adjust the seconds to seconds from Unix epoch. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
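To make the described adjustment concrete, here is a rough sketch (editor illustration only; the helper name, the `convertImpala` flag and the use of microseconds are assumptions, not Spark's or Impala's actual code):

```scala
import java.util.TimeZone

// `wallClockMicros` is interpreted as microseconds since the epoch measured in the
// session time zone's wall clock; shift it to microseconds since the Unix epoch (UTC).
def toUnixEpochMicros(wallClockMicros: Long, createdBy: String,
                      convertImpala: Boolean, sessionTz: TimeZone): Long = {
  if (convertImpala && createdBy != null && createdBy.toLowerCase.contains("impala")) {
    // zone offset (including DST) at roughly that instant, converted to microseconds
    val offsetMicros = sessionTz.getOffset(wallClockMicros / 1000L).toLong * 1000L
    wallClockMicros - offsetMicros
  } else {
    wallClockMicros
  }
}
```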
[GitHub] spark issue #19701: [SPARK-22211][SQL][FOLLOWUP] Fix bad merge for tests
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19701 Please close this PR, @henryr . `branch-2.2` PR is not closed automatically. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19701: [SPARK-22211][SQL][FOLLOWUP] Fix bad merge for tests
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19701 Thank you, @gatorsmile and @henryr ! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19707: [SPARK-22472][SQL] add null check for top-level primitiv...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19707 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19707: [SPARK-22472][SQL] add null check for top-level primitiv...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19707 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83644/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19707: [SPARK-22472][SQL] add null check for top-level primitiv...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19707 **[Test build #83644 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83644/testReport)** for PR 19707 at commit [`dad5080`](https://github.com/apache/spark/commit/dad50806b27a40ed1112d8ee29b3bd5c60164170). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19222 ping @cloud-fan for review --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19707: [SPARK-22472][SQL] add null check for top-level primitiv...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19707 LGTM except one minor comment --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19707: [SPARK-22472][SQL] add null check for top-level p...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19707#discussion_r150015018 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala --- @@ -1408,6 +1409,23 @@ class DatasetSuite extends QueryTest with SharedSQLContext { checkDataset(ds, SpecialCharClass("1", "2")) } } + + test("SPARK-22472: add null check for top-level primitive values") { +// If the primitive values are from Option, we need to do runtime null check. +val ds = Seq(Some(1), None).toDS().as[Int] +intercept[NullPointerException](ds.collect()) +val e = intercept[SparkException](ds.map(_ * 2).collect()) +assert(e.getCause.isInstanceOf[NullPointerException]) + +withTempPath { path => + Seq(new Integer(1), null).toDF("i").write.parquet(path.getCanonicalPath) --- End diff -- nit: `toDF()` also works. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19479: [SPARK-17074] [SQL] Generate equi-height histogra...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19479#discussion_r150011624 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -1034,11 +1034,18 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat schema.fields.map(f => (f.name, f.dataType)).toMap stats.colStats.foreach { case (colName, colStat) => colStat.toMap(colName, colNameTypeMap(colName)).foreach { case (k, v) => -statsProperties += (columnStatKeyPropName(colName, k) -> v) +val statKey = columnStatKeyPropName(colName, k) +val threshold = conf.get(SCHEMA_STRING_LENGTH_THRESHOLD) +if (v.length > threshold) { + throw new AnalysisException(s"Cannot persist '$statKey' into hive metastore as " + --- End diff -- what if we don't do it? will hive give us an exception? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19702: [SPARK-10365][SQL] Support Parquet logical type TIMESTAM...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19702 Is it available in parquet 1.8.2? That's the version Spark currently uses. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19702: [SPARK-10365][SQL] Support Parquet logical type TIMESTAM...
Github user squito commented on the issue: https://github.com/apache/spark/pull/19702 hey thanks for doing this @cloud-fan but I have a small request -- can we get another day to review how this works, especially in connection with somewhat recent changes in parquet to include a [`isAdjustedToUTC`](https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift#L271)? Just want to make sure this doesn't cause problems with resolving with / without time zone in parquet data later on. (don't think it should, just want to take a bit closer look) --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19705: [SPARK-22308][test-maven] Support alternative unit testi...
Github user nkronenfeld commented on the issue: https://github.com/apache/spark/pull/19705 ok, now I question my own testing... does maven not run scalastyle tests? Or did I not run the tests properly somehow? I just ran mvn test from root, and it all seemed to work on my machine --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19666: [SPARK-22451][ML] Reduce decision tree aggregate size fo...
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/19666 @facaiy Your idea also looks reasonable. So we can use the condition "exclude the first bin" to do the pruning (filter out the other half of the symmetric splits). This condition looks simpler than `1 <= combNumber <= numSplits`. Good idea! And your code uses another traverse order; my current PR is also backtracking with a different traverse order, but I think both of them work, and both of their complexities will be `O(2^n)`. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
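A minimal sketch of the pruning idea above (editor illustration, names made up, not the PR's code): enumerate only left splits that exclude the first bin, which drops the mirror-image duplicates and leaves 2^(n-1) - 1 candidate splits for n bins.

```scala
// All (left, right) partitions of `bins` whose left side excludes the first bin.
def candidateSplits[T](bins: Seq[T]): Seq[(Seq[T], Seq[T])] = {
  val rest = bins.drop(1)
  (1 until (1 << rest.length)).map { mask =>
    val left = rest.zipWithIndex.collect { case (b, i) if (mask & (1 << i)) != 0 => b }
    (left, bins.diff(left))
  }
}

// candidateSplits(Seq("a", "b", "c")) yields the three distinct splits:
//   (List(b), List(a, c)), (List(c), List(a, b)), (List(b, c), List(a))
```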
[GitHub] spark issue #19708: [SPARK-22479][SQL] Exclude credentials from SaveintoData...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19708 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19702: [SPARK-10365][SQL] Support Parquet logical type TIMESTAM...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19702 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83643/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19702: [SPARK-10365][SQL] Support Parquet logical type TIMESTAM...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19702 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19702: [SPARK-10365][SQL] Support Parquet logical type TIMESTAM...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19702 **[Test build #83643 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83643/testReport)** for PR 19702 at commit [`e10c806`](https://github.com/apache/spark/commit/e10c8062e3df5b5caa784b0c10ccd92cf56099d2). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19708: [SPARK-22479][SQL] Exclude credentials from Savei...
GitHub user onursatici opened a pull request: https://github.com/apache/spark/pull/19708 [SPARK-22479][SQL] Exclude credentials from SaveintoDataSourceCommand.simpleString

## What changes were proposed in this pull request?
Do not include JDBC properties, which may contain credentials, when logging a logical plan that contains a `SaveIntoDataSourceCommand`.

## How was this patch tested?
Building locally and trying to reproduce (per the steps in https://issues.apache.org/jira/browse/SPARK-22479):

```
== Parsed Logical Plan ==
SaveIntoDataSourceCommand org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider@10ffe32f, ErrorIfExists
+- Range (0, 100, step=1, splits=Some(8))

== Analyzed Logical Plan ==
SaveIntoDataSourceCommand org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider@10ffe32f, ErrorIfExists
+- Range (0, 100, step=1, splits=Some(8))

== Optimized Logical Plan ==
SaveIntoDataSourceCommand org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider@10ffe32f, ErrorIfExists
+- Range (0, 100, step=1, splits=Some(8))

== Physical Plan ==
Execute SaveIntoDataSourceCommand
   +- SaveIntoDataSourceCommand org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider@10ffe32f, ErrorIfExists
         +- Range (0, 100, step=1, splits=Some(8))
```

You can merge this pull request into a Git repository by running: $ git pull https://github.com/onursatici/spark os/redact-jdbc-creds Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/19708.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #19708

commit 04aa9f0363f6202a5358e41587415da4fa5f425e Author: osatici Date: 2017-11-09T14:06:05Z do not log properties on SaveintoDataSourceCommand.simpleString

--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
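A hedged sketch of the general redaction idea (editor illustration only; the PR itself simply stops printing the options map in `simpleString`, and the key names below are made up):

```scala
// Mask values of option keys that commonly carry credentials before a
// connection-properties map is rendered into a plan string or a log line.
def redactOptions(options: Map[String, String]): Map[String, String] = {
  val sensitive = Set("password", "user", "url")
  options.map { case (key, value) =>
    if (sensitive.contains(key.toLowerCase)) key -> "*********(redacted)" else key -> value
  }
}

// redactOptions(Map("url" -> "jdbc:postgresql://db/x", "password" -> "secret"))
// returns Map(url -> *********(redacted), password -> *********(redacted))
```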
[GitHub] spark pull request #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer t...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17819 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/17819 Merged to master. Thanks @viirya and all the reviewers! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19479: [SPARK-17074] [SQL] Generate equi-height histogram in co...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19479 **[Test build #83645 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83645/testReport)** for PR 19479 at commit [`8af3868`](https://github.com/apache/spark/commit/8af38687d638ae2d94d9f76955b182df02404cce). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19661: [SPARK-22450][Core][Mllib]safely register class for mlli...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19661 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19661: [SPARK-22450][Core][Mllib]safely register class for mlli...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19661 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83642/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19661: [SPARK-22450][Core][Mllib]safely register class for mlli...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19661 **[Test build #83642 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83642/testReport)** for PR 19661 at commit [`2eb1b62`](https://github.com/apache/spark/commit/2eb1b62c6fb281f89f05aa8a3c0fcd923ed62cf4). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19707: [SPARK-22472][SQL] add null check for top-level primitiv...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19707 **[Test build #83644 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83644/testReport)** for PR 19707 at commit [`dad5080`](https://github.com/apache/spark/commit/dad50806b27a40ed1112d8ee29b3bd5c60164170). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19707: [SPARK-22472][SQL] add null check for top-level primitiv...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19707 cc @gatorsmile @kiszk @srowen --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19707: [SPARK-22472][SQL] add null check for top-level p...
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/19707 [SPARK-22472][SQL] add null check for top-level primitive values

## What changes were proposed in this pull request?
One powerful feature of `Dataset` is that we can easily map SQL rows to Scala/Java objects and do runtime null checks automatically. For example, let's say we have a parquet file with schema ``, and we have a `case class Data(a: Int, b: String)`. Users can easily read this parquet file into `Data` objects, and Spark will throw NPE if column `a` has null values.

However the null checking is left behind for top-level primitive values. For example, let's say we have a parquet file with schema ``, and we read it into a Scala `Int`. If column `a` has null values, we get some weird results:

```
scala> val ds = spark.read.parquet(...).as[Int]

scala> ds.show()
+----+
|   v|
+----+
|null|
|   1|
+----+

scala> ds.collect
res0: Array[Long] = Array(0, 1)

scala> ds.map(_ * 2).show
+-----+
|value|
+-----+
|   -2|
|    2|
+-----+
```

This is because internally Spark uses some special default values for primitive types, but never expects users to see or operate on these default values directly. This PR adds a null check for top-level primitive values.

## How was this patch tested?
new test

You can merge this pull request into a Git repository by running: $ git pull https://github.com/cloud-fan/spark bug Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/19707.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #19707

commit dad50806b27a40ed1112d8ee29b3bd5c60164170 Author: Wenchen Fan Date: 2017-11-09T13:39:10Z add null check for top-level primitive values

--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
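As a usage-side illustration of the weird results described above (editor example, assuming a SparkSession named `spark` with its implicits imported), keeping the value as `Option[Int]` avoids relying on the hidden default values, while `as[Int]` is exactly the case this PR now makes fail fast:

```scala
import spark.implicits._

val ds = Seq(Some(1), None).toDS()      // Dataset[Option[Int]]
ds.map(_.getOrElse(0) * 2).collect()    // Array(2, 0): the null row surfaces as None
// ds.as[Int].collect() is the pattern that now throws a NullPointerException
// instead of silently returning the internal default value.
```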
[GitHub] spark issue #19666: [SPARK-22451][ML] Reduce decision tree aggregate size fo...
Github user facaiy commented on the issue: https://github.com/apache/spark/pull/19666 In fact, I'm not sure whether the idea is right, so don't hesitate to correct me. I assume the algorithm requires O(N^2) complexity. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19702: [SPARK-10365][SQL] Support Parquet logical type TIMESTAM...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19702 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19702: [SPARK-10365][SQL] Support Parquet logical type TIMESTAM...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19702 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83641/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19702: [SPARK-10365][SQL] Support Parquet logical type TIMESTAM...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19702 **[Test build #83641 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83641/testReport)** for PR 19702 at commit [`af62d30`](https://github.com/apache/spark/commit/af62d301ee9d2f3f9ed0a5797110b6388b78f3e6). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19543: [SPARK-19606][MESOS] Support constraints in spark-dispat...
Github user pmackles commented on the issue: https://github.com/apache/spark/pull/19543 @felixcheung - any chance of getting this merged into the upcoming 2.2.1 release? I cleaned up the merge conflict --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19666: [SPARK-22451][ML] Reduce decision tree aggregate size fo...
Github user facaiy commented on the issue: https://github.com/apache/spark/pull/19666 Hi, I wrote a demo in Python. I'll be happy if it could be useful. For N bins, say `[x_1, x_2, ..., x_N]`, since every split either contains `x_1` or not, we can choose the half of the splits which don't contain x_1 as the left splits. If I understand it correctly, the left splits are exactly all combinations of the remaining bins, `[x_2, x_3, ..., x_N]`. The problem can be solved by the [backtracking algorithm](https://en.wikipedia.org/wiki/Backtracking). Please correct me if I'm wrong. Thanks very much.

```python
#!/usr/bin/env python

def gen_splits(bins):
    if len(bins) == 1:
        return bins
    results = []
    partial_res = []
    gen_splits_iter(1, bins, partial_res, results)
    return results

def gen_splits_iter(dep, bins, partial_res, results):
    if partial_res:
        left_splits = partial_res[:]
        right_splits = [x for x in bins if x not in left_splits]
        results.append("left: {:20}, right: {}".format(str(left_splits), right_splits))
    for m in range(dep, len(bins)):
        partial_res.append(bins[m])
        gen_splits_iter(m+1, bins, partial_res, results)
        partial_res.pop()

if __name__ == "__main__":
    print("first example:")
    bins = ["a", "b", "c"]
    print("bins: {}\n-".format(bins))
    splits = gen_splits(bins)
    for s in splits:
        print(s)

    print("\n\n=")
    print("second example:")
    bins = ["a", "b", "c", "d", "e"]
    print("bins: {}\n-".format(bins))
    splits = gen_splits(bins)
    for s in splits:
        print(s)
```

logs:

```bash
~/Downloads ❯❯❯ python test.py
first example:
bins: ['a', 'b', 'c']
-
left: ['b']               , right: ['a', 'c']
left: ['b', 'c']          , right: ['a']
left: ['c']               , right: ['a', 'b']


=
second example:
bins: ['a', 'b', 'c', 'd', 'e']
-
left: ['b']               , right: ['a', 'c', 'd', 'e']
left: ['b', 'c']          , right: ['a', 'd', 'e']
left: ['b', 'c', 'd']     , right: ['a', 'e']
left: ['b', 'c', 'd', 'e'], right: ['a']
left: ['b', 'c', 'e']     , right: ['a', 'd']
left: ['b', 'd']          , right: ['a', 'c', 'e']
left: ['b', 'd', 'e']     , right: ['a', 'c']
left: ['b', 'e']          , right: ['a', 'c', 'd']
left: ['c']               , right: ['a', 'b', 'd', 'e']
left: ['c', 'd']          , right: ['a', 'b', 'e']
left: ['c', 'd', 'e']     , right: ['a', 'b']
left: ['c', 'e']          , right: ['a', 'b', 'd']
left: ['d']               , right: ['a', 'b', 'c', 'e']
left: ['d', 'e']          , right: ['a', 'b', 'c']
left: ['e']               , right: ['a', 'b', 'c', 'd']
```

--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19156: [SPARK-19634][SQL][ML][FOLLOW-UP] Improve interfa...
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/19156#discussion_r149956415 --- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Summarizer.scala --- @@ -527,27 +570,28 @@ private[ml] object SummaryBuilderImpl extends Logging { weightExpr: Expression, mutableAggBufferOffset: Int, inputAggBufferOffset: Int) -extends TypedImperativeAggregate[SummarizerBuffer] { +extends TypedImperativeAggregate[SummarizerBuffer] with ImplicitCastInputTypes { -override def eval(state: SummarizerBuffer): InternalRow = { +override def eval(state: SummarizerBuffer): Any = { --- End diff -- Both of them work, but other similar aggregate functions also use `Any`. Will it cause any issues? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19515: [SPARK-22287][MESOS] SPARK_DAEMON_MEMORY not honored by ...
Github user pmackles commented on the issue: https://github.com/apache/spark/pull/19515 @felixcheung - any chance of getting this tiny change merged and included in the upcoming 2.2.1 release? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19250: [SPARK-12297] Table timezone correction for Timestamps
Github user zivanfi commented on the issue: https://github.com/apache/spark/pull/19250 Yes, that is correct. We introduced the table property to address the 2nd problem I mentioned above: "The adjustment depends on the local timezone." (details in my [previous comment](https://github.com/apache/spark/pull/19250#issuecomment-342787956)). But I think that a simpler workaround similar to what already exists in Hive would already be a big step forward for interoperability of existing data. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19687: [SPARK-19644][SQL]Clean up Scala reflection garbage afte...
Github user ManchesterUnited16 commented on the issue: https://github.com/apache/spark/pull/19687 Can you show me your maven dependency from when you ran the program? Thank you very much! At 2017-11-09 13:37:46, "Shixiong Zhu" wrote: @ManchesterUnited16 I ran your codes and didn't see NotSerializableException. How did you patch Spark with my PR? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19702: [SPARK-10365][SQL] Support Parquet logical type TIMESTAM...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19702 **[Test build #83643 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83643/testReport)** for PR 19702 at commit [`e10c806`](https://github.com/apache/spark/commit/e10c8062e3df5b5caa784b0c10ccd92cf56099d2). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19156: [SPARK-19634][SQL][ML][FOLLOW-UP] Improve interfa...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19156#discussion_r149943555 --- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Summarizer.scala --- @@ -94,46 +98,87 @@ object Summarizer extends Logging { * - min: the minimum for each coefficient. * - normL2: the Euclidian norm for each coefficient. * - normL1: the L1 norm of each coefficient (sum of the absolute values). - * @param firstMetric the metric being provided - * @param metrics additional metrics that can be provided. + * @param metrics metrics that can be provided. * @return a builder. * @throws IllegalArgumentException if one of the metric names is not understood. * * Note: Currently, the performance of this interface is about 2x~3x slower then using the RDD * interface. */ @Since("2.3.0") - def metrics(firstMetric: String, metrics: String*): SummaryBuilder = { -val (typedMetrics, computeMetrics) = getRelevantMetrics(Seq(firstMetric) ++ metrics) + @scala.annotation.varargs + def metrics(metrics: String*): SummaryBuilder = { --- End diff -- ah then it doesn't matter --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19695: [SPARK-22377][BUILD] Use /usr/sbin/lsof if lsof does not...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19695 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83638/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19695: [SPARK-22377][BUILD] Use /usr/sbin/lsof if lsof does not...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19695 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19695: [SPARK-22377][BUILD] Use /usr/sbin/lsof if lsof does not...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19695 **[Test build #83638 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83638/testReport)** for PR 19695 at commit [`a6642fa`](https://github.com/apache/spark/commit/a6642fa41795cff82ec30c38e3c909d8025f358f). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19156: [SPARK-19634][SQL][ML][FOLLOW-UP] Improve interfa...
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/19156#discussion_r149941345 --- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Summarizer.scala --- @@ -94,46 +98,87 @@ object Summarizer extends Logging { * - min: the minimum for each coefficient. * - normL2: the Euclidian norm for each coefficient. * - normL1: the L1 norm of each coefficient (sum of the absolute values). - * @param firstMetric the metric being provided - * @param metrics additional metrics that can be provided. + * @param metrics metrics that can be provided. * @return a builder. * @throws IllegalArgumentException if one of the metric names is not understood. * * Note: Currently, the performance of this interface is about 2x~3x slower then using the RDD * interface. */ @Since("2.3.0") - def metrics(firstMetric: String, metrics: String*): SummaryBuilder = { -val (typedMetrics, computeMetrics) = getRelevantMetrics(Seq(firstMetric) ++ metrics) + @scala.annotation.varargs + def metrics(metrics: String*): SummaryBuilder = { --- End diff -- This class was added after 2.2, does it matter? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19702: [SPARK-10365][SQL] Support Parquet logical type T...
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19702#discussion_r149940418 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaSuite.scala --- @@ -982,7 +941,7 @@ class ParquetSchemaSuite extends ParquetSchemaTest { binaryAsString = true, int96AsTimestamp = false, writeLegacyParquetFormat = true, -int64AsTimestampMillis = true) +outputTimestampType = SQLConf.ParquetOutputTimestampType.TIMESTAMP_MILLIS) --- End diff -- Should we add a test for `TIMESTAMP_MICROS` just in case? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19702: [SPARK-10365][SQL] Support Parquet logical type TIMESTAM...
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/19702 LGTM pending tests. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19649: [SPARK-22405][SQL] Add new alter table and alter ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19649 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19649: [SPARK-22405][SQL] Add new alter table and alter databas...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19649 thanks, merging to master! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19664: [SPARK-22442][SQL] ScalaReflection should produce...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19664 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19664: [SPARK-22442][SQL] ScalaReflection should produce correc...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19664 thanks, merging to master! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19661: [SPARK-22450][Core][Mllib]safely register class for mlli...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19661 **[Test build #83642 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83642/testReport)** for PR 19661 at commit [`2eb1b62`](https://github.com/apache/spark/commit/2eb1b62c6fb281f89f05aa8a3c0fcd923ed62cf4). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19156: [SPARK-19634][SQL][ML][FOLLOW-UP] Improve interfa...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19156#discussion_r149928022 --- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Summarizer.scala --- @@ -94,46 +98,87 @@ object Summarizer extends Logging { * - min: the minimum for each coefficient. * - normL2: the Euclidian norm for each coefficient. * - normL1: the L1 norm of each coefficient (sum of the absolute values). - * @param firstMetric the metric being provided - * @param metrics additional metrics that can be provided. + * @param metrics metrics that can be provided. * @return a builder. * @throws IllegalArgumentException if one of the metric names is not understood. * * Note: Currently, the performance of this interface is about 2x~3x slower then using the RDD * interface. */ @Since("2.3.0") - def metrics(firstMetric: String, metrics: String*): SummaryBuilder = { -val (typedMetrics, computeMetrics) = getRelevantMetrics(Seq(firstMetric) ++ metrics) + @scala.annotation.varargs + def metrics(metrics: String*): SummaryBuilder = { --- End diff -- How about binary compatibility? e.g. spark jobs built with old spark versions, can they run on new Spark without re-compile? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
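For readers unfamiliar with the trade-off being discussed, a small hedged sketch (illustrative names, not the real `Summarizer`): collapsing the two parameters into a single varargs parameter changes the method's erased signature, so jars compiled against the old `metrics(first, rest*)` shape would need a recompile, and `@varargs` is what keeps the method callable from Java:

```scala
import scala.annotation.varargs

object MetricsApiSketch {
  // old shape (for comparison): def metrics(firstMetric: String, metrics: String*): Seq[String]
  @varargs
  def metrics(metrics: String*): Seq[String] = {
    require(metrics.nonEmpty, "at least one metric is required")
    metrics
  }
}

// Scala callers: MetricsApiSketch.metrics("mean", "max")
// Java callers get an extra varargs (String...) overload thanks to @varargs.
```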
[GitHub] spark pull request #19661: [SPARK-22450][Core][Mllib]safely register class f...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19661#discussion_r149927241 --- Diff: core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala --- @@ -178,6 +179,28 @@ class KryoSerializer(conf: SparkConf) kryo.register(Utils.classForName("scala.collection.immutable.Map$EmptyMap$")) kryo.register(classOf[ArrayBuffer[Any]]) +// We can't load those class directly in order to avoid unnecessary jar dependencies. +// We load them safely, ignore it if the class not found. +Seq("org.apache.spark.mllib.linalg.Vector", + "org.apache.spark.mllib.linalg.DenseVector", + "org.apache.spark.mllib.linalg.SparseVector", + "org.apache.spark.mllib.linalg.Matrix", + "org.apache.spark.mllib.linalg.DenseMatrix", + "org.apache.spark.mllib.linalg.SparseMatrix", + "org.apache.spark.ml.linalg.Vector", + "org.apache.spark.ml.linalg.DenseVector", + "org.apache.spark.ml.linalg.SparseVector", + "org.apache.spark.ml.linalg.Matrix", + "org.apache.spark.ml.linalg.DenseMatrix", + "org.apache.spark.ml.linalg.SparseMatrix", + "org.apache.spark.ml.feature.Instance", + "org.apache.spark.ml.feature.OffsetInstance" +).map(name => Try(Utils.classForName(name))).foreach { t => --- End diff -- a bit curious, can't we do

```
Seq(
  ...
).foreach { clsName =>
  try {
    val cls = Utils.classForName(clsName)
    kryo.register(cls)
  } catch {
    case NonFatal(_) => // do nothing
  }
}
```

--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19532: [DOC]update the API doc and modify the stage API ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19532 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19661: [SPARK-22450][Core][Mllib]safely register class for mlli...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19661 ok to test --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15332: [SPARK-10364][SQL] Support Parquet logical type TIMESTAM...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/15332 great, thanks! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19532: [DOC]update the API doc and modify the stage API descrip...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19532 thanks, merging to master! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19706: [SPARK-22476][R] Add dayofweek function to R
GitHub user HyukjinKwon reopened a pull request: https://github.com/apache/spark/pull/19706 [SPARK-22476][R] Add dayofweek function to R

## What changes were proposed in this pull request?
This PR adds `dayofweek` to R API:

```r
data <- list(list(d = as.Date("2012-12-13")),
             list(d = as.Date("2013-12-14")),
             list(d = as.Date("2014-12-15")))
df <- createDataFrame(data)
collect(select(df, dayofweek(df$d)))
```

```
  dayofweek(d)
1            5
2            7
3            2
```

## How was this patch tested?
Manual tests and unit tests in `R/pkg/tests/fulltests/test_sparkSQL.R`

You can merge this pull request into a Git repository by running: $ git pull https://github.com/HyukjinKwon/spark add-dayofweek Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/19706.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #19706

commit d24a89b6a756457c651d0c208ccbe59b979e9ecc Author: hyukjinkwon Date: 2017-11-08T11:31:35Z Add support for dayofweek function in R

--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
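For cross-reference, the same computation through the Scala API (hedged: an editor example, assuming a SparkSession named `spark` and a Spark build that already contains the Scala/SQL `dayofweek` added for the parent JIRA, i.e. 2.3.0+):

```scala
import java.sql.Date

import org.apache.spark.sql.functions.dayofweek
import spark.implicits._

val df = Seq(
  Date.valueOf("2012-12-13"),
  Date.valueOf("2013-12-14"),
  Date.valueOf("2014-12-15")
).toDF("d")

df.select(dayofweek($"d")).show()
// 1 = Sunday ... 7 = Saturday, so the three dates print 5, 7 and 2, matching the R output above.
```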
[GitHub] spark pull request #19706: [SPARK-22476][R] Add dayofweek function to R
Github user HyukjinKwon closed the pull request at: https://github.com/apache/spark/pull/19706 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19702: [SPARK-10365][SQL] Support Parquet logical type TIMESTAM...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19702 **[Test build #83641 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83641/testReport)** for PR 19702 at commit [`af62d30`](https://github.com/apache/spark/commit/af62d301ee9d2f3f9ed0a5797110b6388b78f3e6). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19702: [SPARK-10365][SQL] Support Parquet logical type T...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19702#discussion_r149924096 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala --- @@ -428,15 +417,9 @@ object ParquetFileFormat extends Logging { private[parquet] def readSchema( footers: Seq[Footer], sparkSession: SparkSession): Option[StructType] = { -def parseParquetSchema(schema: MessageType): StructType = { - val converter = new ParquetSchemaConverter( -sparkSession.sessionState.conf.isParquetBinaryAsString, -sparkSession.sessionState.conf.isParquetBinaryAsString, -sparkSession.sessionState.conf.writeLegacyParquetFormat, -sparkSession.sessionState.conf.isParquetINT64AsTimestampMillis) - - converter.convert(schema) -} +val converter = new ParquetToSparkSchemaConverter( + sparkSession.sessionState.conf.isParquetBinaryAsString, + sparkSession.sessionState.conf.isParquetBinaryAsString) --- End diff -- good catch! It's an existing type ... --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19156: [SPARK-19634][SQL][ML][FOLLOW-UP] Improve interface of d...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19156 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83640/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19156: [SPARK-19634][SQL][ML][FOLLOW-UP] Improve interface of d...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19156 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19156: [SPARK-19634][SQL][ML][FOLLOW-UP] Improve interface of d...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19156 **[Test build #83640 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83640/testReport)** for PR 19156 at commit [`2e4b232`](https://github.com/apache/spark/commit/2e4b232adabe45e9dcafad72ca9c1d3ba5b34dce). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org