date:20170301

[GitHub] spark issue #16782: [SPARK-19348][PYTHON] PySpark keyword_only decorator is ...

2017-03-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16782
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73713/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #16782: [SPARK-19348][PYTHON] PySpark keyword_only decorator is ...

2017-03-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16782
  
Merged build finished. Test PASSed.



[GitHub] spark issue #17129: [SPARK-19373][MESOS] Base spark.scheduler.minRegisteredR...

2017-03-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17129
  
**[Test build #73717 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73717/testReport)**
 for PR 17129 at commit 
[`0d84296`](https://github.com/apache/spark/commit/0d84296ca09423121ed8707eb0c083516bb1440c).



[GitHub] spark issue #16782: [SPARK-19348][PYTHON] PySpark keyword_only decorator is ...

2017-03-01 Thread BryanCutler

Github user BryanCutler commented on the issue:

https://github.com/apache/spark/pull/16782
  
I think this is ready for a final review @jkbradley @davies - thanks!



[GitHub] spark pull request #17110: [SPARK-19635][ML] DataFrame-based API for chi squ...

2017-03-01 Thread imatiach-msft

Github user imatiach-msft commented on a diff in the pull request:

https://github.com/apache/spark/pull/17110#discussion_r103813679
  
--- Diff: 
mllib/src/test/scala/org/apache/spark/ml/stat/ChiSquareSuite.scala ---
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.ml.stat
+
+import java.util.Random
+
+import org.apache.spark.{SparkException, SparkFunSuite}
+import org.apache.spark.ml.feature.LabeledPoint
+import org.apache.spark.ml.linalg.{Vector, Vectors}
+import org.apache.spark.ml.util.DefaultReadWriteTest
+import org.apache.spark.ml.util.TestingUtils._
+import org.apache.spark.mllib.util.MLlibTestSparkContext
+
+class ChiSquareSuite
+  extends SparkFunSuite with MLlibTestSparkContext with 
DefaultReadWriteTest {
+
+  import testImplicits._
+
+  test("test DataFrame of labeled points") {
+// labels: 1.0 (2 / 6), 0.0 (4 / 6)
+// feature1: 0.5 (1 / 6), 1.5 (2 / 6), 3.5 (3 / 6)
+// feature2: 10.0 (1 / 6), 20.0 (1 / 6), 30.0 (2 / 6), 40.0 (2 / 6)
+val data = Seq(
+  LabeledPoint(0.0, Vectors.dense(0.5, 10.0)),
+  LabeledPoint(0.0, Vectors.dense(1.5, 20.0)),
+  LabeledPoint(1.0, Vectors.dense(1.5, 30.0)),
+  LabeledPoint(0.0, Vectors.dense(3.5, 30.0)),
+  LabeledPoint(0.0, Vectors.dense(3.5, 40.0)),
+  LabeledPoint(1.0, Vectors.dense(3.5, 40.0)))
+for (numParts <- List(2, 4, 6, 8)) {
+  val df = spark.createDataFrame(sc.parallelize(data, numParts))
+  val chi = ChiSquare.test(df, "features", "label")
+  val (pValues: Vector, degreesOfFreedom: Array[Int], statistics: 
Vector) =
+chi.select("pValues", "degreesOfFreedom", "statistics")
+  .as[(Vector, Array[Int], Vector)].head()
+  assert(pValues ~== Vectors.dense(0.6873, 0.6823) relTol 1e-4)
+  assert(degreesOfFreedom === Array(2, 3))
+  assert(statistics ~== Vectors.dense(0.75, 1.5) relTol 1e-4)
+}
+  }
+
+  test("large number of features (SPARK-3087)") {
+// Test that the right number of results is returned
+val numCols = 1001
+val sparseData = Array(
+  LabeledPoint(0.0, Vectors.sparse(numCols, Seq((100, 2.0)))),
+  LabeledPoint(0.1, Vectors.sparse(numCols, Seq((200, 1.0)))))
+val df = spark.createDataFrame(sparseData)
+val chi = ChiSquare.test(df, "features", "label")
+val (pValues: Vector, degreesOfFreedom: Array[Int], statistics: 
Vector) =
+  chi.select("pValues", "degreesOfFreedom", "statistics")
+.as[(Vector, Array[Int], Vector)].head()
+assert(pValues.size === numCols)
+assert(degreesOfFreedom.length === numCols)
+assert(statistics.size === numCols)
+assert(pValues(1000) !== null)  // SPARK-3087
+  }
+
+  test("fail on continuous features or labels") {
+// Detect continuous features or labels
+val random = new Random(11L)
+val continuousLabel =
+  Seq.fill(10)(LabeledPoint(random.nextDouble(), 
Vectors.dense(random.nextInt(2
--- End diff --

can the special value that is above the max categorical limit of 1 be 
refactored to a constant?



[GitHub] spark issue #14299: [SPARK-16440][MLlib] Ensure broadcasted variables are de...

2017-03-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14299
  
**[Test build #3588 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3588/testReport)**
 for PR 14299 at commit 
[`6b8ae85`](https://github.com/apache/spark/commit/6b8ae85dc362ebef0f8d416a8e35970f57130a9f).



[GitHub] spark issue #17106: [SPARK-19775][SQL] Remove an obsolete `partitionBy().ins...

2017-03-01 Thread srowen

Github user srowen commented on the issue:

https://github.com/apache/spark/pull/17106
  
Merged to master



[GitHub] spark issue #16954: [SPARK-18874][SQL] First phase: Deferring the correlated...

2017-03-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16954
  
Merged build finished. Test PASSed.



[GitHub] spark issue #16954: [SPARK-18874][SQL] First phase: Deferring the correlated...

2017-03-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16954
  
**[Test build #73712 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73712/testReport)**
 for PR 16954 at commit 
[`3b4bb90`](https://github.com/apache/spark/commit/3b4bb90deb34e6c1bb1671c76b66c83741937578).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #16954: [SPARK-18874][SQL] First phase: Deferring the correlated...

2017-03-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16954
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73712/
Test PASSed.



[GitHub] spark issue #16944: [SPARK-19611][SQL] Introduce configurable table schema i...

2017-03-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16944
  
**[Test build #73720 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73720/testReport)**
 for PR 16944 at commit 
[`281bc6d`](https://github.com/apache/spark/commit/281bc6d53fbd0c0b5a99224d700b7d929397f090).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #17067: [SPARK-19602][SQL][TESTS] Add tests for qualified column...

2017-03-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17067
  
**[Test build #73722 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73722/testReport)**
 for PR 17067 at commit 
[`5594eb0`](https://github.com/apache/spark/commit/5594eb0864376bbac617bf744755330f1e7bff49).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #17067: [SPARK-19602][SQL][TESTS] Add tests for qualified column...

2017-03-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17067
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73722/
Test PASSed.



[GitHub] spark issue #17067: [SPARK-19602][SQL][TESTS] Add tests for qualified column...

2017-03-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17067
  
Merged build finished. Test PASSed.



[GitHub] spark issue #16782: [SPARK-19348][PYTHON] PySpark keyword_only decorator is ...

2017-03-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16782
  
**[Test build #73713 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73713/testReport)**
 for PR 16782 at commit 
[`e578320`](https://github.com/apache/spark/commit/e5783209dff55a6010ca17da819542f7a1cdb12c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #17081: [SPARK-18726][SQL][FOLLOW-UP]resolveRelation for FileFor...

2017-03-01 Thread windpiger

Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/17081
  
retest this please



[GitHub] spark issue #17081: [SPARK-18726][SQL][FOLLOW-UP]resolveRelation for FileFor...

2017-03-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17081
  
Merged build finished. Test FAILed.



[GitHub] spark issue #17081: [SPARK-18726][SQL][FOLLOW-UP]resolveRelation for FileFor...

2017-03-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17081
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73716/
Test FAILed.



[GitHub] spark issue #17100: [SPARK-13947][PYTHON][SQL] PySpark DataFrames: The error...

2017-03-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17100
  
Merged build finished. Test PASSed.



[GitHub] spark issue #17100: [SPARK-13947][PYTHON][SQL] PySpark DataFrames: The error...

2017-03-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17100
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73714/
Test PASSed.



[GitHub] spark issue #17100: [SPARK-13947][PYTHON][SQL] PySpark DataFrames: The error...

2017-03-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17100
  
**[Test build #73714 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73714/testReport)**
 for PR 17100 at commit 
[`65b9596`](https://github.com/apache/spark/commit/65b9596c229ac2b62ecdfeb98e541d2ea92e078d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request #17106: [SPARK-19775][SQL] Remove an obsolete `partitionB...

2017-03-01 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/17106



[GitHub] spark issue #14299: [SPARK-16440][MLlib] Ensure broadcasted variables are de...

2017-03-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14299
  
**[Test build #3588 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3588/testReport)**
 for PR 14299 at commit 
[`6b8ae85`](https://github.com/apache/spark/commit/6b8ae85dc362ebef0f8d416a8e35970f57130a9f).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #17124: [SPARK-19779][SS]Delete needless tmp file after restart ...

2017-03-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17124
  
**[Test build #3589 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3589/testReport)**
 for PR 17124 at commit 
[`5600776`](https://github.com/apache/spark/commit/5600776066e083655fe328915b56936775273e15).



[GitHub] spark issue #17130: [SPARK-19791] [ML] Add doc and example for fpgrowth

2017-03-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17130
  
**[Test build #73719 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73719/testReport)**
 for PR 17130 at commit 
[`fdce240`](https://github.com/apache/spark/commit/fdce2404688fee1b22154258de5d85f0cee8aa4b).



[GitHub] spark issue #17081: [SPARK-18726][SQL][FOLLOW-UP]resolveRelation for FileFor...

2017-03-01 Thread windpiger

Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/17081
  
retest this please



[GitHub] spark issue #17081: [SPARK-18726][SQL][FOLLOW-UP]resolveRelation for FileFor...

2017-03-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17081
  
**[Test build #73715 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73715/testReport)**
 for PR 17081 at commit 
[`f1da0a4`](https://github.com/apache/spark/commit/f1da0a4cf457f4efb6128beca3c08ccf95ef37a0).



[GitHub] spark issue #17031: [SPARK-19702][MESOS] Increase default refuse_seconds tim...

2017-03-01 Thread mgummelt

Github user mgummelt commented on the issue:

https://github.com/apache/spark/pull/17031
  
Your understanding is correct.  You must set refuse_seconds for all your 
frameworks to some value N, such that N >= #frameworks.  So for this change, if 
some operator is running >120 frameworks, they may need to configure this 
value.  However, I'm not aware of any Mesos cluster on Earth running that many 
frameworks.
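
The rule of thumb above (refuse_seconds N such that N >= number of frameworks) can be checked with a few lines of Scala. This is only a sketch: the object and method names are hypothetical, and the 120-second default is an assumption inferred from the ">120 frameworks" remark, not taken from the patch itself.

```scala
object RefuseSecondsSketch {
  // Assumption: 120 seconds is the default implied by the ">120 frameworks" remark.
  val AssumedDefaultRefuseSeconds: Int = 120

  // Rule of thumb from the comment: refuse_seconds (N) should satisfy
  // N >= number of frameworks registered with the cluster.
  def isSufficient(
      numFrameworks: Int,
      refuseSeconds: Int = AssumedDefaultRefuseSeconds): Boolean =
    refuseSeconds >= numFrameworks

  def main(args: Array[String]): Unit = {
    println(isSufficient(50))                        // default covers 50 frameworks
    println(isSufficient(200))                       // default is too low for 200
    println(isSufficient(200, refuseSeconds = 200))  // raised value covers 200
  }
}
```

Under these assumptions, only an operator running more than 120 frameworks would need to raise the value.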



[GitHub] spark issue #17124: [SPARK-19779][SS]Delete needless tmp file after restart ...

2017-03-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17124
  
**[Test build #3589 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3589/testReport)**
 for PR 17124 at commit 
[`5600776`](https://github.com/apache/spark/commit/5600776066e083655fe328915b56936775273e15).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request #17122: [SPARK-19786][SQL] Facilitate loop optimizations ...

2017-03-01 Thread viirya

Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/17122#discussion_r103838737
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala
 ---
@@ -77,6 +77,10 @@ trait CodegenSupport extends SparkPlan {
*/
   final def produce(ctx: CodegenContext, parent: CodegenSupport): String = 
executeQuery {
 this.parent = parent
+
+// to track the existence of apply() call in the current produce-consume cycle
+// if apply is not called (e.g. in aggregation), we can skip shouldStop in the inner-most loop
+parent.shouldStopRequired = false
--- End diff --

Do we need this? The default value of `shouldStopRequired` is already false.



[GitHub] spark issue #17129: [SPARK-19373][MESOS] Base spark.scheduler.minRegisteredR...

2017-03-01 Thread srowen

Github user srowen commented on the issue:

https://github.com/apache/spark/pull/17129
  
@mgummelt I merged this to 2.1 (and so you will have to close this PR 
manually) but the cherry-pick to 2.0.x doesn't succeed either, and it's 
non-trivial. If you're willing to evaluate the conflict and resolve it for 2.0 
I can merge that too. 



[GitHub] spark pull request #17130: [SPARK-19791] [ML] Add doc and example for fpgrow...

2017-03-01 Thread srowen

Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/17130#discussion_r103822448
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala ---
@@ -240,12 +240,13 @@ class FPGrowthModel private[ml] (
 val predictUDF = udf((items: Seq[_]) => {
   if (items != null) {
 val itemset = items.toSet
-brRules.value.flatMap(rule =>
-  if (items != null && rule._1.forall(item => 
itemset.contains(item))) {
+brRules.value.flatMap { rule =>
--- End diff --

Nit, while we're here -- why change this bit?
Or if simplifying, what about

```

brRules.value.filter(_._1.forall(itemset.contains)).flatMap(_._2.filter(!itemset.contains(_)))
```
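
For readers who want to try the suggested filter/flatMap shape outside Spark, here is a minimal sketch on plain Scala collections. The `rules` value is a hypothetical stand-in for `brRules.value`, assumed here to hold (antecedent, consequent) pairs; the object name and the sample data are invented for illustration.

```scala
object RuleFilterSketch {
  // Hypothetical stand-in for brRules.value: (antecedent, consequent) pairs.
  val rules: Seq[(Seq[String], Seq[String])] = Seq(
    (Seq("a", "b"), Seq("c")),
    (Seq("a"), Seq("b", "d")),
    (Seq("x"), Seq("y")))

  def predict(items: Seq[String]): Seq[String] = {
    val itemset = items.toSet
    rules
      .filter(_._1.forall(itemset.contains))      // keep rules whose antecedent is fully present
      .flatMap(_._2.filter(!itemset.contains(_))) // emit consequents not already in the input
  }

  def main(args: Array[String]): Unit = {
    println(predict(Seq("a", "b")))
  }
}
```

Splitting the antecedent check into `filter` and the consequent selection into `flatMap` removes the if/else branch from the lambda, which may be the simplification the comment is after.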



[GitHub] spark issue #17059: [SPARK-19733][ML]Removed unnecessary castings and refact...

2017-03-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17059
  
Merged build finished. Test PASSed.



[GitHub] spark issue #17059: [SPARK-19733][ML]Removed unnecessary castings and refact...

2017-03-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17059
  
**[Test build #73718 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73718/testReport)**
 for PR 17059 at commit 
[`3050f6e`](https://github.com/apache/spark/commit/3050f6eeda769127196e8d1ad4b432b92af0ea7c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #17059: [SPARK-19733][ML]Removed unnecessary castings and refact...

2017-03-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17059
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73718/
Test PASSed.



[GitHub] spark issue #17067: [SPARK-19602][SQL][TESTS] Add tests for qualified column...

2017-03-01 Thread skambha

Github user skambha commented on the issue:

https://github.com/apache/spark/pull/17067
  
- Changed the SQLQueryTestSuite framework to mask the exprId so that I can 
add the negative cases using this framework as well.
- Added negative test cases to the SQLQueryTestSuite framework and removed 
the Hive-specific test suite. I will add the Hive table test case as part of 
the PR with the actual code changes.
- I synced up with the codeline; one test output, inner-join.sql.out, needed 
a comment updated, so I have updated that as well.



[GitHub] spark issue #17067: [SPARK-19602][SQL][TESTS] Add tests for qualified column...

2017-03-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17067
  
**[Test build #73722 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73722/testReport)**
 for PR 17067 at commit 
[`5594eb0`](https://github.com/apache/spark/commit/5594eb0864376bbac617bf744755330f1e7bff49).



[GitHub] spark pull request #17129: [SPARK-19373][MESOS] Base spark.scheduler.minRegi...

2017-03-01 Thread mgummelt

Github user mgummelt closed the pull request at:

https://github.com/apache/spark/pull/17129



[GitHub] spark issue #17120: [SPARK-19715][Structured Streaming] Option to Strip Path...

2017-03-01 Thread lw-lin

Github user lw-lin commented on the issue:

https://github.com/apache/spark/pull/17120
  
@steveloughran thanks for the comments.

@marmbrus @zsxwing it'd be great if you could share some thoughts!



[GitHub] spark issue #17059: [SPARK-19733][ML]Removed unnecessary castings and refact...

2017-03-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17059
  
**[Test build #73718 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73718/testReport)**
 for PR 17059 at commit 
[`3050f6e`](https://github.com/apache/spark/commit/3050f6eeda769127196e8d1ad4b432b92af0ea7c).



[GitHub] spark issue #17106: [SPARK-19775][SQL] Remove an obsolete `partitionBy().ins...

2017-03-01 Thread dongjoon-hyun

Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/17106
  
Thank you for merging, @srowen .



[GitHub] spark issue #17081: [SPARK-18726][SQL][FOLLOW-UP]resolveRelation for FileFor...

2017-03-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17081
  
**[Test build #73716 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73716/testReport)**
 for PR 17081 at commit 
[`f79f12c`](https://github.com/apache/spark/commit/f79f12c552ee1721295c347744fc5f92f048c74b).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #16944: [SPARK-19611][SQL] Introduce configurable table schema i...

2017-03-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16944
  
**[Test build #73720 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73720/testReport)**
 for PR 16944 at commit 
[`281bc6d`](https://github.com/apache/spark/commit/281bc6d53fbd0c0b5a99224d700b7d929397f090).



[GitHub] spark issue #17130: [SPARK-19791] [ML] Add doc and example for fpgrowth

2017-03-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17130
  
Merged build finished. Test FAILed.



[GitHub] spark issue #17130: [SPARK-19791] [ML] Add doc and example for fpgrowth

2017-03-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17130
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73719/
Test FAILed.



[GitHub] spark issue #17130: [SPARK-19791] [ML] Add doc and example for fpgrowth

2017-03-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17130
  
**[Test build #73719 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73719/testReport)**
 for PR 17130 at commit 
[`fdce240`](https://github.com/apache/spark/commit/fdce2404688fee1b22154258de5d85f0cee8aa4b).
 * This patch **fails MiMa tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `public class JavaFPGrowthExample `



[GitHub] spark issue #16909: [SPARK-13450] Introduce ExternalAppendOnlyUnsafeRowArray...

2017-03-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16909
  
**[Test build #73723 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73723/testReport)**
 for PR 16909 at commit 
[`173b5d5`](https://github.com/apache/spark/commit/173b5d57d180603133ebebd1c64dad424aa8d61a).



[GitHub] spark issue #16944: [SPARK-19611][SQL] Introduce configurable table schema i...

2017-03-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16944
  
Merged build finished. Test FAILed.



[GitHub] spark issue #16944: [SPARK-19611][SQL] Introduce configurable table schema i...

2017-03-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16944
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73720/
Test FAILed.



[GitHub] spark issue #17081: [SPARK-18726][SQL][FOLLOW-UP]resolveRelation for FileFor...

2017-03-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17081
  
**[Test build #73724 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73724/testReport)**
 for PR 17081 at commit 
[`a8c1dea`](https://github.com/apache/spark/commit/a8c1deab0fc8e59863bf4a3d3b551f77fbebbc6d).



[GitHub] spark pull request #17110: [SPARK-19635][ML] DataFrame-based API for chi squ...

2017-03-01 Thread imatiach-msft

Github user imatiach-msft commented on a diff in the pull request:

https://github.com/apache/spark/pull/17110#discussion_r103813169
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/ChiSquare.scala ---
@@ -0,0 +1,81 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.ml.stat
+
+import org.apache.spark.annotation.{Experimental, Since}
+import org.apache.spark.ml.linalg.{Vector, Vectors, VectorUDT}
+import org.apache.spark.ml.util.SchemaUtils
+import org.apache.spark.mllib.linalg.{Vectors => OldVectors}
+import org.apache.spark.mllib.regression.{LabeledPoint => OldLabeledPoint}
+import org.apache.spark.mllib.stat.{Statistics => OldStatistics}
+import org.apache.spark.sql.DataFrame
+import org.apache.spark.sql.functions.col
+
+
+/**
+ * :: Experimental ::
+ *
+ * Chi-square hypothesis testing for categorical data.
+ *
+ * See <a href="http://en.wikipedia.org/wiki/Chi-squared_test">Wikipedia</a> for more information
+ * on the Chi-squared test.
+ */
+@Experimental
+@Since("2.2.0")
+object ChiSquare {
+
+  /** Used to construct output schema of tests */
+  private case class ChiSquareResult(
+  pValues: Vector,
+  degreesOfFreedom: Array[Int],
+  statistics: Vector)
+
+  /**
+   * Conduct Pearson's independence test for every feature against the label across the input RDD.
+   * For each feature, the (feature, label) pairs are converted into a contingency matrix for which
+   * the Chi-squared statistic is computed. All label and feature values must be categorical.
+   *
+   * The null hypothesis is that the occurrence of the outcomes is statistically independent.
+   *
+   * @param dataset  DataFrame of categorical labels and categorical features.
+   * Real-valued features will be treated as categorical for each distinct value.
+   * @param featuresCol  Name of features column in dataset, of type `Vector` (`VectorUDT`)
+   * @param labelCol  Name of label column in dataset, of any numerical type
+   * @return DataFrame containing the test result for every feature against the label.
+   * This DataFrame will contain a single Row with the following fields:
+   *  - `pValues: Vector`
+   *  - `degreesOfFreedom: Array[Int]`
+   *  - `statistics: Vector`
+   * Each of these fields has one value per feature.
+   */
+  @Since("2.2.0")
+  def test(dataset: DataFrame, featuresCol: String, labelCol: String): DataFrame = {
+    val spark = dataset.sparkSession
+    import spark.implicits._
+
+    SchemaUtils.checkColumnType(dataset.schema, featuresCol, new VectorUDT)
+    SchemaUtils.checkNumericType(dataset.schema, labelCol)
+    val rdd = dataset.select(col(labelCol).cast("double"), col(featuresCol)).as[(Double, Vector)]
+      .rdd.map { case (label, features) => OldLabeledPoint(label, OldVectors.fromML(features)) }
+    val testResults = OldStatistics.chiSqTest(rdd)
--- End diff --

It would be nice to optimize this in the future: since we have the schema, if 
the label and features have already been converted to categorical, we can get 
the unique values right away instead of having to regenerate the maps of 
distinct labels and features.
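
For readers unfamiliar with the computation being discussed, below is a minimal plain-Scala sketch of Pearson's chi-squared statistic for a single feature. It is not the mllib implementation; `ChiSquareSketch` and `chiSquared` are hypothetical names. The two `distinct` calls mark the step that could be skipped if the category values were already available from the schema.

```scala
// Illustrative only: a plain-collections version of the per-feature computation.
object ChiSquareSketch {
  /** Returns (chi-squared statistic, degrees of freedom) for (feature, label) pairs. */
  def chiSquared(pairs: Seq[(Double, Double)]): (Double, Int) = {
    val n = pairs.size.toDouble
    // These scans rebuild the sets of distinct values; known categories would replace them.
    val featureValues = pairs.map(_._1).distinct
    val labelValues = pairs.map(_._2).distinct
    // Observed counts per (feature, label) cell, plus row and column totals.
    val observed = pairs.groupBy(identity).map { case (k, v) => k -> v.size.toDouble }
    val featureTotals = pairs.groupBy(_._1).map { case (k, v) => k -> v.size.toDouble }
    val labelTotals = pairs.groupBy(_._2).map { case (k, v) => k -> v.size.toDouble }
    // Sum of (observed - expected)^2 / expected over every cell of the contingency table.
    val statistic = (for {
      f <- featureValues
      l <- labelValues
    } yield {
      val expected = featureTotals(f) * labelTotals(l) / n
      val diff = observed.getOrElse((f, l), 0.0) - expected
      diff * diff / expected
    }).sum
    val dof = (featureValues.size - 1) * (labelValues.size - 1)
    (statistic, dof)
  }
}
```

For example, `ChiSquareSketch.chiSquared(Seq((0.0, 0.0), (0.0, 0.0), (1.0, 1.0), (1.0, 1.0)))` covers a perfectly dependent 2x2 case with one degree of freedom.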



[GitHub] spark issue #17129: [SPARK-19373][MESOS] Base spark.scheduler.minRegisteredR...

2017-03-01 Thread mgummelt

Github user mgummelt commented on the issue:

https://github.com/apache/spark/pull/17129
  
2.1 is sufficient.  Thanks for the merge.



[GitHub] spark issue #17123: [SPARK-19781][ML] Handle NULLs as well as NaNs in Bucket...

2017-03-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17123
  
**[Test build #3590 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3590/testReport)**
 for PR 17123 at commit 
[`b3f98b6`](https://github.com/apache/spark/commit/b3f98b66e63c9c61c69a1429819feb236fad56c7).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request #17122: [SPARK-19786][SQL] Facilitate loop optimizations ...

2017-03-01 Thread viirya

Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/17122#discussion_r103837174
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysicalOperators.scala ---
@@ -434,6 +434,17 @@ case class RangeExec(range: org.apache.spark.sql.catalyst.plans.logical.Range)
 val input = ctx.freshName("input")
 // Right now, Range is only used when there is one upstream.
 ctx.addMutableState("scala.collection.Iterator", input, s"$input = inputs[0];")
+
+val localIdx = ctx.freshName("localIdx")
+val localEnd = ctx.freshName("localEnd")
+val range = ctx.freshName("range")
+// we need to place consume() before calling isShouldStopRequired
+val body = consume(ctx, Seq(ev))
+val shouldStop = if (isShouldStopRequired) {
--- End diff --

`isShouldStopRequired` complicates the logic. Is it necessary? How much 
improvement does it bring?



[GitHub] spark issue #17100: [SPARK-13947][PYTHON][SQL] PySpark DataFrames: The error...

2017-03-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17100
  
**[Test build #73714 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73714/testReport)**
 for PR 17100 at commit 
[`65b9596`](https://github.com/apache/spark/commit/65b9596c229ac2b62ecdfeb98e541d2ea92e078d).



[GitHub] spark issue #17081: [SPARK-18726][SQL][FOLLOW-UP]resolveRelation for FileFor...

2017-03-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17081
  
**[Test build #73716 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73716/testReport)**
 for PR 17081 at commit 
[`f79f12c`](https://github.com/apache/spark/commit/f79f12c552ee1721295c347744fc5f92f048c74b).



[GitHub] spark issue #16782: [SPARK-19348][PYTHON] PySpark keyword_only decorator is ...

2017-03-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16782
  
**[Test build #73709 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73709/testReport)**
 for PR 16782 at commit 
[`8dafc20`](https://github.com/apache/spark/commit/8dafc20fd2bbbe9678fa44f7216982fdd0955c14).
 * This patch passes all tests.
 * This patch **does not merge cleanly**.
 * This patch adds the following public classes _(experimental)_:
  * `class KeywordOnlyTests(unittest.TestCase):`
  * `class Wrapped(object):`
  * `class Setter(object):`



[GitHub] spark issue #16782: [SPARK-19348][PYTHON] PySpark keyword_only decorator is ...

2017-03-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16782
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73709/
Test PASSed.



[GitHub] spark issue #16782: [SPARK-19348][PYTHON] PySpark keyword_only decorator is ...

2017-03-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16782
  
Build finished. Test PASSed.



[GitHub] spark issue #15505: [SPARK-18890][CORE] Move task serialization from the Tas...

2017-03-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15505
  
**[Test build #73721 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73721/testReport)**
 for PR 15505 at commit 
[`b2b1eec`](https://github.com/apache/spark/commit/b2b1eec3c41873eb217cf041f3cf6d71d4cfa265).



[GitHub] spark issue #17125: [SPARK-19211][SQL] Explicitly prevent Insert into View o...

2017-03-01 Thread jiangxb1987

Github user jiangxb1987 commented on the issue:

https://github.com/apache/spark/pull/17125
  
cc @gatorsmile @cloud-fan Please have a look at this when you have time, 
thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #17099: [SPARK-19766][SQL] Constant alias columns in INNER JOIN ...

2017-03-01 Thread stanzhai

Github user stanzhai commented on the issue:

https://github.com/apache/spark/pull/17099
  
ok



[GitHub] spark pull request #17129: [SPARK-19373][MESOS] Base spark.scheduler.minRegi...

2017-03-01 Thread mgummelt

GitHub user mgummelt opened a pull request:

https://github.com/apache/spark/pull/17129

[SPARK-19373][MESOS] Base spark.scheduler.minRegisteredResourceRatio …

…on registered cores rather than accepted cores

See JIRA

Unit tests, Mesos/Spark integration tests

cc skonto susanxhuynh

Author: Michael Gummelt 

Closes #17045 from mgummelt/SPARK-19373-registered-resources.

## What changes were proposed in this pull request?

(Please fill in changes proposed in this fix)

## How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration 
tests, manual tests)
(If this patch involves UI changes, please attach a screenshot; otherwise, 
remove this)

Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mesosphere/spark 
SPARK-19373-registered-resources-2.1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17129.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17129


commit 0d84296ca09423121ed8707eb0c083516bb1440c
Author: Michael Gummelt 
Date:   2017-02-28T23:10:55Z

[SPARK-19373][MESOS] Base spark.scheduler.minRegisteredResourceRatio on 
registered cores rather than accepted cores

See JIRA

Unit tests, Mesos/Spark integration tests

cc skonto susanxhuynh

Author: Michael Gummelt 

Closes #17045 from mgummelt/SPARK-19373-registered-resources.





[GitHub] spark issue #17129: [SPARK-19373][MESOS] Base spark.scheduler.minRegisteredR...

2017-03-01 Thread mgummelt

Github user mgummelt commented on the issue:

https://github.com/apache/spark/pull/17129
  
@srowen As discussed here 
https://github.com/apache/spark/pull/17045#issuecomment-283192230, this is the 
backport of SPARK-19373 into branch-2.1



[GitHub] spark issue #17129: [SPARK-19373][MESOS] Base spark.scheduler.minRegisteredR...

2017-03-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17129
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73717/
Test PASSed.



[GitHub] spark issue #17129: [SPARK-19373][MESOS] Base spark.scheduler.minRegisteredR...

2017-03-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17129
  
Merged build finished. Test PASSed.



[GitHub] spark issue #17129: [SPARK-19373][MESOS] Base spark.scheduler.minRegisteredR...

2017-03-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17129
  
**[Test build #73717 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73717/testReport)**
 for PR 17129 at commit 
[`0d84296`](https://github.com/apache/spark/commit/0d84296ca09423121ed8707eb0c083516bb1440c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #17123: [SPARK-19781][ML] Handle NULLs as well as NaNs in Bucket...

2017-03-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17123
  
**[Test build #3590 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3590/testReport)**
 for PR 17123 at commit 
[`b3f98b6`](https://github.com/apache/spark/commit/b3f98b66e63c9c61c69a1429819feb236fad56c7).



[GitHub] spark issue #17081: [SPARK-18726][SQL][FOLLOW-UP]resolveRelation for FileFor...

2017-03-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17081
  
**[Test build #73715 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73715/testReport)**
 for PR 17081 at commit 
[`f1da0a4`](https://github.com/apache/spark/commit/f1da0a4cf457f4efb6128beca3c08ccf95ef37a0).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request #17130: [SPARK-19791] [ML] Add doc and example for fpgrow...

2017-03-01 Thread hhbyyh

GitHub user hhbyyh opened a pull request:

https://github.com/apache/spark/pull/17130

[SPARK-19791] [ML] Add doc and example for fpgrowth

## What changes were proposed in this pull request?

Add a new section for fpm
Add Example for FPGrowth in scala and Java

## How was this patch tested?

local doc generation. 


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hhbyyh/spark fpmdoc

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17130.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17130


commit fdce2404688fee1b22154258de5d85f0cee8aa4b
Author: Yuhao Yang 
Date:   2017-03-01T23:47:53Z

fpm doc





[GitHub] spark issue #17081: [SPARK-18726][SQL][FOLLOW-UP]resolveRelation for FileFor...

2017-03-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17081
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73715/
Test FAILed.



[GitHub] spark issue #17081: [SPARK-18726][SQL][FOLLOW-UP]resolveRelation for FileFor...

2017-03-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17081
  
Merged build finished. Test FAILed.



[GitHub] spark issue #17113: [SPARK-13669][Core] Improve the blacklist mechanism to h...

2017-03-01 Thread jerryshao

Github user jerryshao commented on the issue:

https://github.com/apache/spark/pull/17113
  
@tgravescs, the main scenario is the external shuffle service becoming 
unavailable, which can happen in a work-preserving + NM failure situation. 
Mesos with an external standalone shuffle service could also hit this issue. 
In scenarios like a rolling upgrade, I agree that NM unavailability is short 
and the issue can recover by itself. One scenario I'm simulating is NM 
failure. In my test, when the NM fails, the RM detects the failure only after 
10 minutes by default; before that, executors on that NM can still serve 
tasks, and Spark doesn't blacklist these containers, so re-issued tasks can 
still fail. 

`FetchFailed` immediately aborts the running stage and re-issues the parent 
stage, so configurations like the number of failed tasks per stage may not be 
very useful. My thinking is therefore to blacklist these executors/nodes 
immediately after a fetch failure. 

This proposal may have problems in different scenarios, which is why I opened 
it here for comments. If you don't think it is necessary to fix, I can close 
it.

@markhamstra this patch targets the master branch, and all the investigation 
and changes are based on the master branch.
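
The idea of blacklisting immediately on a fetch failure can be illustrated 
with a toy model. This is only a sketch in Python under stated assumptions: 
`ToyScheduler`, `on_fetch_failed`, `schedulable` and the host/executor names 
are hypothetical and are not Spark APIs, and the real scheduler has far more 
state than this.

```python
from dataclasses import dataclass, field


@dataclass
class ToyScheduler:
    # Hypothetical toy model; these names are illustrative, not Spark APIs.
    executors_by_host: dict  # host name -> list of executor ids
    blacklisted: set = field(default_factory=set)

    def on_fetch_failed(self, host: str, executor_id: str,
                        whole_node: bool = True) -> None:
        # Blacklist on the first fetch failure instead of waiting for a
        # per-stage failed-task count to be reached.
        if whole_node:
            self.blacklisted.update(self.executors_by_host.get(host, []))
        else:
            self.blacklisted.add(executor_id)

    def schedulable(self) -> list:
        # Executors that re-issued tasks may still be placed on.
        return sorted(
            e
            for execs in self.executors_by_host.values()
            for e in execs
            if e not in self.blacklisted
        )


scheduler = ToyScheduler({"nm1": ["exec-1", "exec-2"], "nm2": ["exec-3"]})
scheduler.on_fetch_failed("nm1", "exec-1")
print(scheduler.schedulable())
```

With node-level blacklisting, a single fetch failure on `nm1` removes every 
executor on that host from scheduling, so re-issued tasks are not sent back 
to a node whose shuffle service may be down.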





[GitHub] spark issue #17052: [SPARK-19690][SS] Join a streaming DataFrame with a batc...

2017-03-01 Thread uncleGen

Github user uncleGen commented on the issue:

https://github.com/apache/spark/pull/17052
  
\cc @zsxwing 



[GitHub] spark issue #17080: [SPARK-19739][CORE] propagate S3 session token to cluser

2017-03-01 Thread uncleGen

Github user uncleGen commented on the issue:

https://github.com/apache/spark/pull/17080
  
\cc @srowen 



[GitHub] spark pull request #17099: [SPARK-19766][SQL] Constant alias columns in INNE...

2017-03-01 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/17099



[GitHub] spark issue #17099: [SPARK-19766][SQL] Constant alias columns in INNER JOIN ...

2017-03-01 Thread gatorsmile

Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/17099
  
Thanks! Merging to master/2.1



[GitHub] spark issue #17099: [SPARK-19766][SQL] Constant alias columns in INNER JOIN ...

2017-03-01 Thread gatorsmile

Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/17099
  
@stanzhai Could you submit another PR to backport it to Spark 2.0?



[GitHub] spark issue #17093: [SPARK-19761][SQL]create InMemoryFileIndex with an empty...

2017-03-01 Thread gatorsmile

Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/17093
  
Thanks! Merging to master.



[GitHub] spark issue #17121: [SPARK-19787][ML] Changing the default parameter of regP...

2017-03-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17121
  
Can one of the admins verify this patch?



[GitHub] spark issue #17123: [SPARK-19781][ML] Handle NULLs as well as NaNs in Bucket...

2017-03-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17123
  
Can one of the admins verify this patch?



[GitHub] spark issue #17122: [SPARK-19786][SQL] Facilitate loop optimizations in a JI...

2017-03-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17122
  
**[Test build #73689 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73689/testReport)**
 for PR 17122 at commit 
[`47f405c`](https://github.com/apache/spark/commit/47f405c32ffac9b0356050c0d6bbb8c0ea5e0f51).



[GitHub] spark issue #17119: [SPARK-19784][SQL][WIP]refresh table after alter the loc...

2017-03-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17119
  
**[Test build #73690 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73690/testReport)**
 for PR 17119 at commit 
[`dccac9a`](https://github.com/apache/spark/commit/dccac9a02e6191d09782d8a97d7d9a4ab0edc92e).



[GitHub] spark issue #17120: [SPARK-19715][Structured Streaming] Option to Strip Path...

2017-03-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17120
  
**[Test build #73691 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73691/testReport)**
 for PR 17120 at commit 
[`aeb10d1`](https://github.com/apache/spark/commit/aeb10d100a24ca644745fb8b26985b584fd5118e).



[GitHub] spark issue #9571: [SPARK-11373] [CORE] Add metrics to the History Server an...

2017-03-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/9571
  
**[Test build #73695 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73695/testReport)**
 for PR 9571 at commit 
[`d8ae876`](https://github.com/apache/spark/commit/d8ae876505de9599480929905f88612dcbc3905b).



[GitHub] spark issue #17059: [SPARK-19733][ML]Removed unnecessary castings and refact...

2017-03-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17059
  
**[Test build #73693 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73693/testReport)**
 for PR 17059 at commit 
[`a1e32aa`](https://github.com/apache/spark/commit/a1e32aa3b600841118060cdb3a299b6569438816).



[GitHub] spark issue #17116: [SPARK-18890][CORE](try 2) Move task serialization from ...

2017-03-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17116
  
**[Test build #73692 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73692/testReport)**
 for PR 17116 at commit 
[`1c26e8c`](https://github.com/apache/spark/commit/1c26e8c98317ec6f97c00da3262050959e1d6910).



[GitHub] spark issue #15505: [SPARK-18890][CORE] Move task serialization from the Tas...

2017-03-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15505
  
**[Test build #73694 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73694/testReport)**
 for PR 15505 at commit 
[`335b7b9`](https://github.com/apache/spark/commit/335b7b937a7aaa355a6810b9e8d8080732f19078).



[GitHub] spark issue #9571: [SPARK-11373] [CORE] Add metrics to the History Server an...

2017-03-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/9571
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73695/
Test FAILed.



[GitHub] spark issue #9571: [SPARK-11373] [CORE] Add metrics to the History Server an...

2017-03-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/9571
  
**[Test build #73695 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73695/testReport)**
 for PR 9571 at commit 
[`d8ae876`](https://github.com/apache/spark/commit/d8ae876505de9599480929905f88612dcbc3905b).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #9571: [SPARK-11373] [CORE] Add metrics to the History Server an...

2017-03-01 Thread steveloughran

Github user steveloughran commented on the issue:

https://github.com/apache/spark/pull/9571
  
Style police. FWIW I think the lines that failed were already >100 chars, 
it was just they got indented slightly more.
```
Scalastyle checks failed at following occurrences:
[error] /home/jenkins/workspace/SparkPullRequestBuilder/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala:275: File line length exceeds 100 characters
[error] /home/jenkins/workspace/SparkPullRequestBuilder/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala:283: File line length exceeds 100 characters
[error] /home/jenkins/workspace/SparkPullRequestBuilder/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala:284: File line length exceeds 100 characters
[error] /home/jenkins/workspace/SparkPullRequestBuilder/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala:498: File line length exceeds 100 characters
```
will fix



[GitHub] spark pull request #17093: [SPARK-19761][SQL]create InMemoryFileIndex with a...

2017-03-01 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/17093



[GitHub] spark issue #17120: [SPARK-19715][Structured Streaming] Option to Strip Path...

2017-03-01 Thread steveloughran

Github user steveloughran commented on the issue:

https://github.com/apache/spark/pull/17120
  
-1, non binding

I understand the rationale for this, to aid migration from s3/s3n to s3a, 
but given the need is scheme independence, you should be using the full path 
name from `Path.toUri().getPath()` instead of `getName()`, which means only 
the filename is checked.

Match only on name, and the two files
```
s3a://bucket/incoming/dataset.avro
s3a://bucket/2015/12/dataset.avro
```
will be mistaken for the same file, even though they aren't. If this scenario 
arises then someone will end up fielding support calls about missing data, or 
worse, incorrect query results.

If you use the full path, that problem goes away and the only parts ignored 
in the comparison are the scheme and the filesystem/bucket name.
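
To make the difference concrete, here is a minimal sketch in Python that 
uses only the standard library rather than Hadoop's `Path`. The helpers 
`file_name` and `full_path` are hypothetical stand-ins that only approximate 
what the filename and the URI path would give for these URIs; the `s3n` 
variant of the first file is an assumed example.

```python
import posixpath
from urllib.parse import urlsplit


def file_name(uri: str) -> str:
    # Last path component only (a rough stand-in for a filename-only match).
    return posixpath.basename(urlsplit(uri).path)


def full_path(uri: str) -> str:
    # Path with the scheme and bucket stripped (a rough stand-in for the
    # full URI path).
    return urlsplit(uri).path


a = "s3a://bucket/incoming/dataset.avro"
b = "s3a://bucket/2015/12/dataset.avro"
a_via_s3n = "s3n://bucket/incoming/dataset.avro"

# Filename-only matching cannot tell the two different files apart.
print(file_name(a) == file_name(b))

# Full-path matching keeps them distinct...
print(full_path(a) == full_path(b))

# ...while still treating the same object as equal across schemes.
print(full_path(a) == full_path(a_via_s3n))
```

The first comparison is the collision described above; the other two show 
that comparing full paths avoids it while staying independent of the scheme.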




[GitHub] spark issue #9571: [SPARK-11373] [CORE] Add metrics to the History Server an...

2017-03-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/9571
  
Merged build finished. Test FAILed.




[GitHub] spark issue #16909: [SPARK-13450] Introduce ExternalAppendOnlyUnsafeRowArray...

2017-03-01 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16909
  
**[Test build #73723 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73723/testReport)**
 for PR 16909 at commit 
[`173b5d5`](https://github.com/apache/spark/commit/173b5d57d180603133ebebd1c64dad424aa8d61a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request #17099: [SPARK-19766][SQL] Constant alias columns in INNE...

2017-03-01 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/17099#discussion_r103844650
  
--- Diff: sql/core/src/test/resources/sql-tests/results/inner-join.sql.out ---
@@ -0,0 +1,68 @@
+-- Automatically generated by SQLQueryTestSuite
+-- Number of queries: 13
--- End diff --

Actually, this number is wrong. Next time, please do not manually change 
this file. You should run the command to generate the file. @stanzhai 



[GitHub] spark issue #15505: [SPARK-18890][CORE] Move task serialization from the Tas...

2017-03-01 Thread witgo

Github user witgo commented on the issue:

https://github.com/apache/spark/pull/15505
  
@kayousterhout  It takes some time to update the test report.



[GitHub] spark pull request #16954: [SPARK-18874][SQL] First phase: Deferring the cor...

2017-03-01 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/16954#discussion_r103850385
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
@@ -365,17 +385,66 @@ object TypeCoercion {
   }
 
   /**
-   * Convert the value and in list expressions to the common operator type
-   * by looking at all the argument types and finding the closest one that
-   * all the arguments can be cast to. When no common operator type is found
-   * the original expression will be returned and an Analysis Exception will
-   * be raised at type checking phase.
+   * Handles type coercion for both IN expression with subquery and IN
+   * expressions without subquery.
+   * 1. In the first case, find the common type by comparing the left hand side
+   *    expression types against corresponding right hand side expression derived
+   *    from the subquery expression's plan output. Inject appropriate casts in the
+   *    LHS and RHS side of IN expression.
+   *
+   * 2. In the second case, convert the value and in list expressions to the
+   *    common operator type by looking at all the argument types and finding
+   *    the closest one that all the arguments can be cast to. When no common
+   *    operator type is found the original expression will be returned and an
+   *    Analysis Exception will be raised at the type checking phase.
    */
   object InConversion extends Rule[LogicalPlan] {
     def apply(plan: LogicalPlan): LogicalPlan = plan resolveExpressions {
       // Skip nodes who's children have not been resolved yet.
       case e if !e.childrenResolved => e
 
+      // Handle type casting required between value expression and subquery output
+      // in IN subquery.
+      case i @ In(a, Seq(ListQuery(sub, children, exprId))) if !i.resolved =>
+        // lhs is the value expression of IN subquery.
--- End diff --

`lhs` -> `LHS`. Please correct all the similar cases in comments.



[GitHub] spark pull request #16954: [SPARK-18874][SQL] First phase: Deferring the cor...

2017-03-01 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/16954#discussion_r103852025
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/subquery.scala ---
@@ -83,29 +95,150 @@ object RewritePredicateSubquery extends Rule[LogicalPlan] with PredicateHelper {
   }
 
   /**
-   * Given a predicate expression and an input plan, it rewrites
-   * any embedded existential sub-query into an existential join.
-   * It returns the rewritten expression together with the updated plan.
-   * Currently, it does not support null-aware joins. Embedded NOT IN predicates
-   * are blocked in the Analyzer.
+   * Given a predicate expression and an input plan, it rewrites any embedded existential sub-query
+   * into an existential join. It returns the rewritten expression together with the updated plan.
+   * Currently, it does not support NOT IN nested inside a NOT expression. This case is blocked in
+   * the Analyzer.
    */
   private def rewriteExistentialExpr(
       exprs: Seq[Expression],
       plan: LogicalPlan): (Option[Expression], LogicalPlan) = {
     var newPlan = plan
     val newExprs = exprs.map { e =>
       e transformUp {
-        case PredicateSubquery(sub, conditions, nullAware, _) =>
-          // TODO: support null-aware join
+        case Exists(sub, conditions, exprId) =>
--- End diff --

`case Exists(sub, conditions, _)`


