[GitHub] spark issue #19528: [SPARK-20393][WEBU UI][1.6] Strengthen Spark to prevent ...

2018-01-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19528
  
**[Test build #86410 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86410/consoleFull)**
 for PR 19528 at commit 
[`76ad8c5`](https://github.com/apache/spark/commit/76ad8c5e62a7233c16399043716139b52ee1c97d).


---




[GitHub] spark issue #19528: [SPARK-20393][WEBU UI][1.6] Strengthen Spark to prevent ...

2018-01-19 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/19528
  
Jenkins test this please


---




[GitHub] spark pull request #20226: [SPARK-23034][SQL] Override `nodeName` for all *S...

2018-01-19 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20226#discussion_r162776148
  
--- Diff: sql/core/src/test/resources/sql-tests/results/operators.sql.out 
---
@@ -233,7 +233,7 @@ struct
 -- !query 28 output
 == Physical Plan ==
 *Project [null AS (CAST(concat(a, CAST(1 AS STRING)) AS DOUBLE) + CAST(2 
AS DOUBLE))#x]
-+- Scan OneRowRelation[]
++- Scan Scan RDD OneRowRelation [][]
--- End diff --

?


---




[GitHub] spark pull request #20087: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...

2018-01-19 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20087#discussion_r162776118
  
--- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/CompressionCodecSuite.scala 
---
@@ -0,0 +1,354 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.hive
+
+import java.io.File
+
+import scala.collection.JavaConverters._
+
+import org.apache.hadoop.fs.Path
+import org.apache.orc.OrcConf.COMPRESS
+import org.apache.parquet.hadoop.ParquetOutputFormat
+import org.scalatest.BeforeAndAfterAll
+
+import org.apache.spark.sql.execution.datasources.orc.OrcOptions
+import org.apache.spark.sql.execution.datasources.parquet.{ParquetOptions, ParquetTest}
+import org.apache.spark.sql.hive.orc.OrcFileOperator
+import org.apache.spark.sql.hive.test.TestHiveSingleton
+import org.apache.spark.sql.internal.SQLConf
+
+class CompressionCodecSuite extends TestHiveSingleton with ParquetTest with BeforeAndAfterAll {
+  import spark.implicits._
+
+  override def beforeAll(): Unit = {
+    super.beforeAll()
+    (0 until maxRecordNum).toDF("a").createOrReplaceTempView("table_source")
+  }
+
+  override def afterAll(): Unit = {
+    try {
+      spark.catalog.dropTempView("table_source")
+    } finally {
+      super.afterAll()
+    }
+  }
+
+  private val maxRecordNum = 50
+
+  private def getConvertMetastoreConfName(format: String): String = format.toLowerCase match {
+    case "parquet" => HiveUtils.CONVERT_METASTORE_PARQUET.key
+    case "orc" => HiveUtils.CONVERT_METASTORE_ORC.key
+  }
+
+  private def getSparkCompressionConfName(format: String): String = format.toLowerCase match {
+    case "parquet" => SQLConf.PARQUET_COMPRESSION.key
+    case "orc" => SQLConf.ORC_COMPRESSION.key
+  }
+
+  private def getHiveCompressPropName(format: String): String = format.toLowerCase match {
+    case "parquet" => ParquetOutputFormat.COMPRESSION
+    case "orc" => COMPRESS.getAttribute
+  }
+
+  private def normalizeCodecName(format: String, name: String): String = {
+    format.toLowerCase match {
+      case "parquet" => ParquetOptions.getParquetCompressionCodecName(name)
+      case "orc" => OrcOptions.getORCCompressionCodecName(name)
+    }
+  }
+
+  private def getTableCompressionCodec(path: String, format: String): Seq[String] = {
+    val hadoopConf = spark.sessionState.newHadoopConf()
+    val codecs = format.toLowerCase match {
+      case "parquet" => for {
+        footer <- readAllFootersWithoutSummaryFiles(new Path(path), hadoopConf)
+        block <- footer.getParquetMetadata.getBlocks.asScala
+        column <- block.getColumns.asScala
+      } yield column.getCodec.name()
+      case "orc" => new File(path).listFiles().filter { file =>
+        file.isFile && !file.getName.endsWith(".crc") && file.getName != "_SUCCESS"
+      }.map { orcFile =>
+        OrcFileOperator.getFileReader(orcFile.toPath.toString).get.getCompression.toString
+      }.toSeq
+    }
+    codecs.distinct
+  }
+
+  private def createTable(
+      rootDir: File,
+      tableName: String,
+      isPartitioned: Boolean,
+      format: String,
+      compressionCodec: Option[String]): Unit = {
+    val tblProperties = compressionCodec match {
+      case Some(prop) => s"TBLPROPERTIES('${getHiveCompressPropName(format)}'='$prop')"
+      case _ => ""
+    }
+    val partitionCreate = if (isPartitioned) "PARTITIONED BY (p string)" else ""
+    sql(
+      s"""
+        |CREATE TABLE $tableName(a int)
+        |$partitionCreate
+        |STORED AS $format
+        |LOCATION '${rootDir.toURI.toString.stripSuffix("/")}/$tableName'
+        |$tblProperties
+      """.stripMargin)
+  }
+
+  private def writeDataToTable(
+      tableName: String,

[GitHub] spark pull request #20324: [SPARK-23091][ML] Incorrect unit test for approxQ...

2018-01-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/20324


---




[GitHub] spark issue #20324: [SPARK-23091][ML] Incorrect unit test for approxQuantile

2018-01-19 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/20324
  
Thanks! Merged to master/2.3


---




[GitHub] spark issue #20324: [SPARK-23091][ML] Incorrect unit test for approxQuantile

2018-01-19 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/20324
  
LGTM


---




[GitHub] spark issue #20330: [SPARK-23121][core] Fix for ui becoming unaccessible for...

2018-01-19 Thread jiangxb1987
Github user jiangxb1987 commented on the issue:

https://github.com/apache/spark/pull/20330
  
cc @gengliangwang 


---




[GitHub] spark issue #19993: [SPARK-22799][ML] Bucketizer should throw exception if s...

2018-01-19 Thread jkbradley
Github user jkbradley commented on the issue:

https://github.com/apache/spark/pull/19993
  
OK, sent: https://github.com/mgaido91/spark/pull/1


---




[GitHub] spark issue #20287: [SPARK-23121][WEB-UI] When the Spark Streaming app is ru...

2018-01-19 Thread guoxiaolongzte
Github user guoxiaolongzte commented on the issue:

https://github.com/apache/spark/pull/20287
  
@smurakozi @vanzin @srowen 
Thanks, I will close the PR.




---




[GitHub] spark issue #20087: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2018-01-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20087
  
**[Test build #86409 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86409/testReport)**
 for PR 20087 at commit 
[`5b5e1df`](https://github.com/apache/spark/commit/5b5e1df983af6ff03ec6ef6c83208c8b25af93e2).


---




[GitHub] spark issue #20297: [SPARK-23020][CORE] Fix races in launcher code, test.

2018-01-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20297
  
**[Test build #4068 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4068/testReport)**
 for PR 20297 at commit 
[`95bac27`](https://github.com/apache/spark/commit/95bac2773ee7adab9f57aa4377ff2e998353f02f).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #19054: [SPARK-18067] Avoid shuffling child if join keys are sup...

2018-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19054
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20297: [SPARK-23020][CORE] Fix races in launcher code, test.

2018-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20297
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86405/
Test PASSed.


---




[GitHub] spark issue #19054: [SPARK-18067] Avoid shuffling child if join keys are sup...

2018-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19054
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86406/
Test PASSed.


---




[GitHub] spark issue #20297: [SPARK-23020][CORE] Fix races in launcher code, test.

2018-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20297
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #19054: [SPARK-18067] Avoid shuffling child if join keys are sup...

2018-01-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19054
  
**[Test build #86406 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86406/testReport)**
 for PR 19054 at commit 
[`00bb14b`](https://github.com/apache/spark/commit/00bb14b0145a2bd42c8b4c8a9d4f108322804f71).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20297: [SPARK-23020][CORE] Fix races in launcher code, test.

2018-01-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20297
  
**[Test build #86405 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86405/testReport)**
 for PR 20297 at commit 
[`95bac27`](https://github.com/apache/spark/commit/95bac2773ee7adab9f57aa4377ff2e998353f02f).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20226: [SPARK-23034][SQL] Override `nodeName` for all *ScanExec...

2018-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20226
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86407/
Test FAILed.


---




[GitHub] spark issue #20226: [SPARK-23034][SQL] Override `nodeName` for all *ScanExec...

2018-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20226
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20226: [SPARK-23034][SQL] Override `nodeName` for all *ScanExec...

2018-01-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20226
  
**[Test build #86407 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86407/testReport)**
 for PR 20226 at commit 
[`bf90ac7`](https://github.com/apache/spark/commit/bf90ac713f1ea909572486b136c44f9e4badc50c).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20203: [SPARK-22577] [core] executor page blacklist status shou...

2018-01-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20203
  
**[Test build #86408 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86408/testReport)**
 for PR 20203 at commit 
[`f388c45`](https://github.com/apache/spark/commit/f388c45ee56c17f48d393240f29901f73865bb74).


---




[GitHub] spark issue #20330: [SPARK-23121][core] Fix for ui becoming unaccessible for...

2018-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20330
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86404/
Test FAILed.


---




[GitHub] spark issue #20330: [SPARK-23121][core] Fix for ui becoming unaccessible for...

2018-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20330
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20330: [SPARK-23121][core] Fix for ui becoming unaccessible for...

2018-01-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20330
  
**[Test build #86404 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86404/testReport)**
 for PR 20330 at commit 
[`f19d3a1`](https://github.com/apache/spark/commit/f19d3a1dce67cb8af682c1de9bd41411be1d8b0d).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20331: [SPARK-23158] [SQL] Move HadoopFsRelationTest test suite...

2018-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20331
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86403/
Test FAILed.


---




[GitHub] spark issue #20331: [SPARK-23158] [SQL] Move HadoopFsRelationTest test suite...

2018-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20331
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20226: [SPARK-23034][SQL] Override `nodeName` for all *ScanExec...

2018-01-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20226
  
**[Test build #86407 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86407/testReport)**
 for PR 20226 at commit 
[`bf90ac7`](https://github.com/apache/spark/commit/bf90ac713f1ea909572486b136c44f9e4badc50c).


---




[GitHub] spark issue #20226: [SPARK-23034][SQL] Override `nodeName` for all *ScanExec...

2018-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20226
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20331: [SPARK-23158] [SQL] Move HadoopFsRelationTest test suite...

2018-01-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20331
  
**[Test build #86403 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86403/testReport)**
 for PR 20331 at commit 
[`b83f859`](https://github.com/apache/spark/commit/b83f859137ca9ed33c3c7e4295c433b7bbca6eee).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `abstract class HadoopFsRelationTest extends QueryTest with 
SQLTestUtils `
  * `class JsonHadoopFsRelationSuite extends HadoopFsRelationTest with 
SharedSQLContext `
  * `abstract class OrcHadoopFsRelationBase extends HadoopFsRelationTest `
  * `class ParquetHadoopFsRelationSuite extends HadoopFsRelationTest with 
SharedSQLContext `
  * `class HiveOrcHadoopFsRelationSuite extends OrcHadoopFsRelationBase 
with TestHiveSingleton `


---




[GitHub] spark issue #20226: [SPARK-23034][SQL] Override `nodeName` for all *ScanExec...

2018-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20226
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/54/
Test PASSed.


---




[GitHub] spark pull request #20226: [SPARK-23034][SQL] Override `nodeName` for all *S...

2018-01-19 Thread tejasapatil
Github user tejasapatil commented on a diff in the pull request:

https://github.com/apache/spark/pull/20226#discussion_r162769562
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/LocalTableScanExec.scala 
---
@@ -30,6 +30,8 @@ case class LocalTableScanExec(
 output: Seq[Attribute],
 @transient rows: Seq[InternalRow]) extends LeafExecNode {
 
+  override val nodeName: String = s"Scan LocalTable ${output.map(_.name).mkString("[", ",", "]")}"
--- End diff --

I believe you are referring to the duplication at : 

https://github.com/apache/spark/blob/3f958a99921d149fb9fdf7ba7e78957afdad1405/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala#L466

```
def simpleString: String = s"$nodeName $argString".trim
```

I am changing this line to just have `Scan LocalTable`.
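
For context, a toy illustration of the duplication (these are not the Spark classes, just a sketch of how `simpleString` concatenates `nodeName` and `argString`):

```
// Toy stand-in: simpleString prints nodeName followed by argString, so any
// columns baked into nodeName show up twice in the plan string.
case class ToyNode(nodeName: String, argString: String) {
  def simpleString: String = s"$nodeName $argString".trim
}

val columns = Seq("a", "b").mkString("[", ",", "]")
ToyNode(s"Scan LocalTable $columns", columns).simpleString  // "Scan LocalTable [a,b] [a,b]"
ToyNode("Scan LocalTable", columns).simpleString            // "Scan LocalTable [a,b]"
```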


---




[GitHub] spark pull request #19054: [SPARK-18067] Avoid shuffling child if join keys ...

2018-01-19 Thread tejasapatil
Github user tejasapatil commented on a diff in the pull request:

https://github.com/apache/spark/pull/19054#discussion_r162768714
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala
 ---
@@ -271,23 +325,24 @@ case class EnsureRequirements(conf: SQLConf) extends 
Rule[SparkPlan] {
*/
   private def reorderJoinPredicates(plan: SparkPlan): SparkPlan = {
 plan.transformUp {
-  case BroadcastHashJoinExec(leftKeys, rightKeys, joinType, buildSide, 
condition, left,
--- End diff --

Removal of `BroadcastHashJoinExec` is intentional. Its children are 
expected to have `BroadcastDistribution` or `UnspecifiedDistribution`, so this 
method won't help here (this optimization only helps in the case of shuffle-based 
joins).


---




[GitHub] spark pull request #19054: [SPARK-18067] Avoid shuffling child if join keys ...

2018-01-19 Thread tejasapatil
Github user tejasapatil commented on a diff in the pull request:

https://github.com/apache/spark/pull/19054#discussion_r162768516
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala
 ---
@@ -220,45 +220,76 @@ case class EnsureRequirements(conf: SQLConf) extends 
Rule[SparkPlan] {
 operator.withNewChildren(children)
   }
 
+  private def isSubset(biggerSet: Seq[Expression], smallerSet: 
Seq[Expression]): Boolean =
+smallerSet.length <= biggerSet.length &&
+  smallerSet.forall(x => biggerSet.exists(_.semanticEquals(x)))
+
   private def reorder(
   leftKeys: Seq[Expression],
   rightKeys: Seq[Expression],
-  expectedOrderOfKeys: Seq[Expression],
-  currentOrderOfKeys: Seq[Expression]): (Seq[Expression], 
Seq[Expression]) = {
-val leftKeysBuffer = ArrayBuffer[Expression]()
-val rightKeysBuffer = ArrayBuffer[Expression]()
+  expectedOrderOfKeys: Seq[Expression], // comes from child's output 
partitioning
+  currentOrderOfKeys: Seq[Expression]): // comes from join predicate
+  (Seq[Expression], Seq[Expression], Seq[Expression], Seq[Expression]) = {
+
+assert(leftKeys.length == rightKeys.length)
+
+val allLeftKeys = ArrayBuffer[Expression]()
+val allRightKeys = ArrayBuffer[Expression]()
+val reorderedLeftKeys = ArrayBuffer[Expression]()
+val reorderedRightKeys = ArrayBuffer[Expression]()
+val processedIndicies = mutable.Set[Int]()
 
 expectedOrderOfKeys.foreach(expression => {
-  val index = currentOrderOfKeys.indexWhere(e => 
e.semanticEquals(expression))
-  leftKeysBuffer.append(leftKeys(index))
-  rightKeysBuffer.append(rightKeys(index))
+  val index = currentOrderOfKeys.zipWithIndex.find { case (currKey, i) 
=>
+!processedIndicies.contains(i) && 
currKey.semanticEquals(expression)
+  }.get._2
+  processedIndicies.add(index)
+
+  reorderedLeftKeys.append(leftKeys(index))
+  allLeftKeys.append(leftKeys(index))
+
+  reorderedRightKeys.append(rightKeys(index))
+  allRightKeys.append(rightKeys(index))
 })
-(leftKeysBuffer, rightKeysBuffer)
+
+// If len(currentOrderOfKeys) > len(expectedOrderOfKeys), then the 
re-ordering won't have
+// all the keys. Append the remaining keys to the end so that we are 
covering all the keys
+for (i <- leftKeys.indices) {
+  if (!processedIndicies.contains(i)) {
+allLeftKeys.append(leftKeys(i))
+allRightKeys.append(rightKeys(i))
+  }
+}
+
+assert(allLeftKeys.length == leftKeys.length)
+assert(allRightKeys.length == rightKeys.length)
+assert(reorderedLeftKeys.length == reorderedRightKeys.length)
+
+(allLeftKeys, reorderedLeftKeys, allRightKeys, reorderedRightKeys)
   }
 
   private def reorderJoinKeys(
   leftKeys: Seq[Expression],
   rightKeys: Seq[Expression],
   leftPartitioning: Partitioning,
-  rightPartitioning: Partitioning): (Seq[Expression], Seq[Expression]) 
= {
+  rightPartitioning: Partitioning):
+  (Seq[Expression], Seq[Expression], Seq[Expression], Seq[Expression]) = {
+
 if (leftKeys.forall(_.deterministic) && 
rightKeys.forall(_.deterministic)) {
   leftPartitioning match {
-case HashPartitioning(leftExpressions, _)
-  if leftExpressions.length == leftKeys.length &&
-leftKeys.forall(x => 
leftExpressions.exists(_.semanticEquals(x))) =>
+case HashPartitioning(leftExpressions, _) if isSubset(leftKeys, 
leftExpressions) =>
   reorder(leftKeys, rightKeys, leftExpressions, leftKeys)
--- End diff --

Given that this was only done over `SortMergeJoinExec` and 
`ShuffledHashJoinExec`, where both partitionings are `HashPartitioning`, 
things worked fine. I have changed this to use a stricter check.
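
To make the subset case concrete, here is a simplified, self-contained stand-in for the check (plain equality and strings in place of `semanticEquals` and `Expression`s):

```
// The child's hash-partitioning expressions only need to be a subset of the
// join keys for the shuffle on that side to be avoidable.
def isSubset[T](biggerSet: Seq[T], smallerSet: Seq[T]): Boolean =
  smallerSet.length <= biggerSet.length && smallerSet.forall(biggerSet.contains)

val leftKeys = Seq("a", "b", "c")      // from the join predicate
val childPartitioning = Seq("b", "a")  // from the child's HashPartitioning
isSubset(leftKeys, childPartitioning)  // true: reorder the keys, keep the child as-is
isSubset(Seq("a"), Seq("a", "b"))      // false: partitioning uses a non-join key
```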


---




[GitHub] spark issue #19054: [SPARK-18067] Avoid shuffling child if join keys are sup...

2018-01-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19054
  
**[Test build #86406 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86406/testReport)**
 for PR 19054 at commit 
[`00bb14b`](https://github.com/apache/spark/commit/00bb14b0145a2bd42c8b4c8a9d4f108322804f71).


---




[GitHub] spark issue #19054: [SPARK-18067] Avoid shuffling child if join keys are sup...

2018-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19054
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #19054: [SPARK-18067] Avoid shuffling child if join keys are sup...

2018-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19054
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/53/
Test PASSed.


---




[GitHub] spark pull request #19054: [SPARK-18067] Avoid shuffling child if join keys ...

2018-01-19 Thread tejasapatil
Github user tejasapatil commented on a diff in the pull request:

https://github.com/apache/spark/pull/19054#discussion_r162768446
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala
 ---
@@ -220,45 +220,76 @@ case class EnsureRequirements(conf: SQLConf) extends 
Rule[SparkPlan] {
 operator.withNewChildren(children)
   }
 
+  private def isSubset(biggerSet: Seq[Expression], smallerSet: 
Seq[Expression]): Boolean =
+smallerSet.length <= biggerSet.length &&
+  smallerSet.forall(x => biggerSet.exists(_.semanticEquals(x)))
+
   private def reorder(
   leftKeys: Seq[Expression],
   rightKeys: Seq[Expression],
-  expectedOrderOfKeys: Seq[Expression],
-  currentOrderOfKeys: Seq[Expression]): (Seq[Expression], 
Seq[Expression]) = {
-val leftKeysBuffer = ArrayBuffer[Expression]()
-val rightKeysBuffer = ArrayBuffer[Expression]()
+  expectedOrderOfKeys: Seq[Expression], // comes from child's output 
partitioning
+  currentOrderOfKeys: Seq[Expression]): // comes from join predicate
+  (Seq[Expression], Seq[Expression], Seq[Expression], Seq[Expression]) = {
+
+assert(leftKeys.length == rightKeys.length)
+
+val allLeftKeys = ArrayBuffer[Expression]()
+val allRightKeys = ArrayBuffer[Expression]()
+val reorderedLeftKeys = ArrayBuffer[Expression]()
+val reorderedRightKeys = ArrayBuffer[Expression]()
+val processedIndicies = mutable.Set[Int]()
 
 expectedOrderOfKeys.foreach(expression => {
-  val index = currentOrderOfKeys.indexWhere(e => 
e.semanticEquals(expression))
-  leftKeysBuffer.append(leftKeys(index))
-  rightKeysBuffer.append(rightKeys(index))
+  val index = currentOrderOfKeys.zipWithIndex.find { case (currKey, i) 
=>
+!processedIndicies.contains(i) && 
currKey.semanticEquals(expression)
+  }.get._2
+  processedIndicies.add(index)
+
+  reorderedLeftKeys.append(leftKeys(index))
+  allLeftKeys.append(leftKeys(index))
+
+  reorderedRightKeys.append(rightKeys(index))
+  allRightKeys.append(rightKeys(index))
 })
-(leftKeysBuffer, rightKeysBuffer)
+
+// If len(currentOrderOfKeys) > len(expectedOrderOfKeys), then the 
re-ordering won't have
+// all the keys. Append the remaining keys to the end so that we are 
covering all the keys
+for (i <- leftKeys.indices) {
+  if (!processedIndicies.contains(i)) {
+allLeftKeys.append(leftKeys(i))
+allRightKeys.append(rightKeys(i))
+  }
+}
+
+assert(allLeftKeys.length == leftKeys.length)
+assert(allRightKeys.length == rightKeys.length)
+assert(reorderedLeftKeys.length == reorderedRightKeys.length)
+
+(allLeftKeys, reorderedLeftKeys, allRightKeys, reorderedRightKeys)
   }
 
   private def reorderJoinKeys(
   leftKeys: Seq[Expression],
   rightKeys: Seq[Expression],
   leftPartitioning: Partitioning,
-  rightPartitioning: Partitioning): (Seq[Expression], Seq[Expression]) 
= {
+  rightPartitioning: Partitioning):
+  (Seq[Expression], Seq[Expression], Seq[Expression], Seq[Expression]) = {
--- End diff --

Added more doc. I wasn't sure how to make it easier to understand; hopefully 
the example helps with that.
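
For readers of the thread, a small worked example of what the four returned sequences are meant to contain (illustrative attribute names, not taken from the patch):

```
// Join predicate: l.a = r.x AND l.b = r.y AND l.c = r.z
val leftKeys  = Seq("a", "b", "c")   // currentOrderOfKeys on the left side
val rightKeys = Seq("x", "y", "z")
// Child output partitioning: HashPartitioning(b, a) -> expectedOrderOfKeys
val expectedOrderOfKeys = Seq("b", "a")
// reorder(...) would then return:
//   reorderedLeftKeys  = Seq("b", "a")       // follows the child's partitioning
//   reorderedRightKeys = Seq("y", "x")
//   allLeftKeys        = Seq("b", "a", "c")  // reordered prefix + leftover key c
//   allRightKeys       = Seq("y", "x", "z")
```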


---




[GitHub] spark issue #20297: [SPARK-23020][CORE] Fix races in launcher code, test.

2018-01-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20297
  
**[Test build #86405 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86405/testReport)**
 for PR 20297 at commit 
[`95bac27`](https://github.com/apache/spark/commit/95bac2773ee7adab9f57aa4377ff2e998353f02f).


---




[GitHub] spark issue #20297: [SPARK-23020][CORE] Fix races in launcher code, test.

2018-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20297
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/52/
Test PASSed.


---




[GitHub] spark issue #20297: [SPARK-23020][CORE] Fix races in launcher code, test.

2018-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20297
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20297: [SPARK-23020][CORE] Fix races in launcher code, test.

2018-01-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20297
  
**[Test build #4068 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4068/testReport)**
 for PR 20297 at commit 
[`95bac27`](https://github.com/apache/spark/commit/95bac2773ee7adab9f57aa4377ff2e998353f02f).


---




[GitHub] spark issue #20297: [SPARK-23020][CORE] Fix races in launcher code, test.

2018-01-19 Thread vanzin
Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/20297
  
retest this please


---




[GitHub] spark pull request #20330: [SPARK-23121][core] Fix for ui becoming unaccessi...

2018-01-19 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/20330#discussion_r162764494
  
--- Diff: core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala ---
@@ -1002,4 +1000,12 @@ private object ApiHelper {
 }
   }
 
+  def lastStageNameAndDescription(store: AppStatusStore, job: JobData): (String, String) = {
+    store.asOption(store.lastStageAttempt(job.stageIds.max)) match {
+      case Some(lastStageAttempt) =>
+        (lastStageAttempt.name, lastStageAttempt.description.getOrElse(job.name))
+      case None => ("", "")
--- End diff --

This would probably be simpler:

```
val stage = store.asOption(...)
(stage.map(_.name).getOrElse(""), stage.map(_.description.getOrElse(job.name)))
```


---




[GitHub] spark issue #20330: [SPARK-23121][core] Fix for ui becoming unaccessible for...

2018-01-19 Thread vanzin
Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/20330
  
You could also add 'Closes #20287' to the PR description to close the other 
PR for the same bug automatically.


---




[GitHub] spark pull request #20330: [SPARK-23121][core] Fix for ui becoming unaccessi...

2018-01-19 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/20330#discussion_r162762976
  
--- Diff: core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala ---
@@ -1002,4 +1000,12 @@ private object ApiHelper {
 }
   }
 
+  def lastStageNameAndDescription(store: AppStatusStore, job: JobData): (String, String) = {
+    store.asOption(store.lastStageAttempt(job.stageIds.max)) match {
+      case Some(lastStageAttempt) =>
+        (lastStageAttempt.name, lastStageAttempt.description.getOrElse(job.name))
+      case None => ("", "")
--- End diff --

Before, you were doing `if (lastStageDescription.isEmpty) job.name else 
blah` at the call site.

Now, when the last stage is not in the store, the call site is getting an 
empty string as the description, instead of using the job name.
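
Putting the two comments together, one possible shape for the helper inside `ApiHelper` that keeps the old fallback (a sketch, not the final patch):

```
def lastStageNameAndDescription(store: AppStatusStore, job: JobData): (String, String) = {
  val stage = store.asOption(store.lastStageAttempt(job.stageIds.max))
  // Fall back to the job name when the stage (or its description) is missing,
  // matching the previous call-site behaviour instead of returning "".
  (stage.map(_.name).getOrElse(""), stage.flatMap(_.description).getOrElse(job.name))
}
```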


---




[GitHub] spark issue #20324: [SPARK-23091][ML] Incorrect unit test for approxQuantile

2018-01-19 Thread WeichenXu123
Github user WeichenXu123 commented on the issue:

https://github.com/apache/spark/pull/20324
  
LGTM. Thanks! 👍 


---




[GitHub] spark issue #20203: [SPARK-22577] [core] executor page blacklist status shou...

2018-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20203
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86400/
Test FAILed.


---




[GitHub] spark issue #20203: [SPARK-22577] [core] executor page blacklist status shou...

2018-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20203
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20203: [SPARK-22577] [core] executor page blacklist status shou...

2018-01-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20203
  
**[Test build #86400 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86400/testReport)**
 for PR 20203 at commit 
[`cf6e0c9`](https://github.com/apache/spark/commit/cf6e0c919e151c26772ec78a10abc6d2454f7dd5).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20330: [SPARK-23121][core] Fix for ui becoming unaccessible for...

2018-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20330
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86399/
Test PASSed.


---




[GitHub] spark issue #20330: [SPARK-23121][core] Fix for ui becoming unaccessible for...

2018-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20330
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20330: [SPARK-23121][core] Fix for ui becoming unaccessible for...

2018-01-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20330
  
**[Test build #86399 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86399/testReport)**
 for PR 20330 at commit 
[`d5fdabb`](https://github.com/apache/spark/commit/d5fdabb678f4df7c101d8660cb7c37086e35489a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20324: [SPARK-23091][ML] Incorrect unit test for approxQuantile

2018-01-19 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/20324
  
cc @WeichenXu123 


---




[GitHub] spark issue #20330: [SPARK-23121][core] Fix for ui becoming unaccessible for...

2018-01-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20330
  
**[Test build #86404 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86404/testReport)**
 for PR 20330 at commit 
[`f19d3a1`](https://github.com/apache/spark/commit/f19d3a1dce67cb8af682c1de9bd41411be1d8b0d).


---




[GitHub] spark pull request #20325: [SPARK-22808][DOCS] add insertInto when save hive...

2018-01-19 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20325#discussion_r162760318
  
--- Diff: docs/sql-programming-guide.md ---
@@ -580,6 +580,9 @@ default local Hive metastore (using Derby) for you. 
Unlike the `createOrReplaceT
 Hive metastore. Persistent tables will still exist even after your Spark 
program has restarted, as
 long as you maintain your connection to the same metastore. A DataFrame 
for a persistent table can
 be created by calling the `table` method on a `SparkSession` with the name 
of the table.
+Notice that for `DataFrames` is built on Hive table, `insertInto` should 
be used instead of `saveAsTable`.
--- End diff --

Let us get rid of `Notice that for DataFrames is built on Hive table,`.  
`insertInto` can work for any existing table. More importantly, `DataFrames` 
might be created from scratch. 
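
For anyone following along, a small illustration of the distinction being documented (table names are made up; assumes `import spark.implicits._` is in scope, e.g. in spark-shell):

```
val df = Seq((1, "a"), (2, "b")).toDF("id", "value")

// insertInto writes into a table that must already exist and resolves columns
// by position, so it works for any existing table, Hive-backed or not.
df.write.insertInto("existing_table")

// saveAsTable creates the table from the DataFrame's own schema if it does not exist.
df.write.saveAsTable("new_table")
```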


---




[GitHub] spark issue #20046: [SPARK-22362][SQL] Add unit test for Window Aggregate Fu...

2018-01-19 Thread jiangxb1987
Github user jiangxb1987 commented on the issue:

https://github.com/apache/spark/pull/20046
  
We shall also cover the SQL interface; you can find some examples in 
`sql/core/src/test/resources/sql-tests/inputs/udaf.sql`.
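
For example, a SQL-interface counterpart could look roughly like this (illustrative query and view name, not taken from udaf.sql):

```
// Hypothetical sketch of exercising a window aggregate through the SQL interface.
Seq((1, "1"), (2, "1"), (2, "2")).toDF("key", "value").createOrReplaceTempView("window_data")
spark.sql(
  """
    |SELECT key, value,
    |       sum(key) OVER (PARTITION BY value ORDER BY key
    |                      ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_sum
    |FROM window_data
  """.stripMargin).show()
```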


---




[GitHub] spark pull request #20330: [SPARK-23121][core] Fix for ui becoming unaccessi...

2018-01-19 Thread smurakozi
Github user smurakozi commented on a diff in the pull request:

https://github.com/apache/spark/pull/20330#discussion_r162759866
  
--- Diff: core/src/main/scala/org/apache/spark/ui/jobs/AllJobsPage.scala ---
@@ -427,23 +435,21 @@ private[ui] class JobDataSource(
 val formattedDuration = duration.map(d => 
UIUtils.formatDuration(d)).getOrElse("Unknown")
 val submissionTime = jobData.submissionTime
 val formattedSubmissionTime = 
submissionTime.map(UIUtils.formatDate).getOrElse("Unknown")
-val lastStageAttempt = store.lastStageAttempt(jobData.stageIds.max)
-val lastStageDescription = lastStageAttempt.description.getOrElse("")
+val (lastStageName, lastStageDescription) = 
lastStageNameAndDescription(store, jobData)
 
-val formattedJobDescription =
-  UIUtils.makeDescription(lastStageDescription, basePath, plainText = 
false)
+val jobDescription = UIUtils.makeDescription(lastStageDescription, 
basePath, plainText = false)
--- End diff --

I've moved this logic to `lastStageNameAndDescription`, so it's uniform.


---




[GitHub] spark pull request #18983: [SPARK-21771][SQL]remove useless hive client in S...

2018-01-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/18983


---




[GitHub] spark issue #18983: [SPARK-21771][SQL]remove useless hive client in SparkSQL...

2018-01-19 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/18983
  
Thanks! Merged to master/2.3


---




[GitHub] spark pull request #20046: [SPARK-22362][SQL] Add unit test for Window Aggre...

2018-01-19 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/20046#discussion_r162759216
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/DataFrameWindowFunctionsSuite.scala
 ---
@@ -86,6 +93,429 @@ class DataFrameWindowFunctionsSuite extends QueryTest 
with SharedSQLContext {
 assert(e.message.contains("requires window to be ordered"))
   }
 
+  test("aggregation and rows between") {
+val df = Seq((1, "1"), (2, "1"), (2, "2"), (1, "1"), (2, 
"2")).toDF("key", "value")
--- End diff --

We shall also include null data.
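
One possible way to do that (an illustrative tweak, not the committed change):

```
// Include rows with null values so the window functions are exercised on nulls too.
val df = Seq((1, "1"), (2, "1"), (2, "2"), (1, "1"), (2, "2"), (1, null), (3, null))
  .toDF("key", "value")
```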


---




[GitHub] spark pull request #20333: [SPARK-23087][SQL] CheckCartesianProduct too rest...

2018-01-19 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20333#discussion_r162759088
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
 ---
@@ -1108,15 +1108,19 @@ object CheckCartesianProducts extends 
Rule[LogicalPlan] with PredicateHelper {
*/
   def isCartesianProduct(join: Join): Boolean = {
 val conditions = 
join.condition.map(splitConjunctivePredicates).getOrElse(Nil)
-!conditions.map(_.references).exists(refs => 
refs.exists(join.left.outputSet.contains)
-&& refs.exists(join.right.outputSet.contains))
+
+conditions match {
+  case Seq(Literal.FalseLiteral) | Seq(Literal(null, BooleanType)) => 
false
+  case _ => !conditions.map(_.references).exists(refs =>
+refs.exists(join.left.outputSet.contains) && 
refs.exists(join.right.outputSet.contains))
+}
   }
 
   def apply(plan: LogicalPlan): LogicalPlan =
 if (SQLConf.get.crossJoinEnabled) {
   plan
 } else plan transform {
-  case j @ Join(left, right, Inner | LeftOuter | RightOuter | 
FullOuter, condition)
+  case j @ Join(left, right, Inner | LeftOuter | RightOuter | 
FullOuter, _)
--- End diff --

For inner joins, we will not hit this, because the join is already optimized to 
an empty relation. For the other (outer) join types, we face exactly the same 
issue as when the condition is true; that is, the size of the join result set is 
still the same.


---




[GitHub] spark issue #20333: [SPARK-23087][SQL] CheckCartesianProduct too restrictive...

2018-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20333
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20333: [SPARK-23087][SQL] CheckCartesianProduct too restrictive...

2018-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20333
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86401/
Test FAILed.


---




[GitHub] spark issue #20333: [SPARK-23087][SQL] CheckCartesianProduct too restrictive...

2018-01-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20333
  
**[Test build #86401 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86401/testReport)**
 for PR 20333 at commit 
[`9c88781`](https://github.com/apache/spark/commit/9c88781dcd4cd301373927bfbe7f3530c80f4f05).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #20333: [SPARK-23087][SQL] CheckCartesianProduct too rest...

2018-01-19 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20333#discussion_r162758553
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/DataFrameJoinSuite.scala ---
@@ -274,4 +274,18 @@ class DataFrameJoinSuite extends QueryTest with 
SharedSQLContext {
 checkAnswer(innerJoin, Row(1) :: Nil)
   }
 
+  test("SPARK-23087: don't throw Analysis Exception in 
CheckCartesianProduct when join condition " +
+"is false or null") {
+val df = spark.range(10)
+val dfNull = spark.range(10).select(lit(null).as("b"))
+val planNull = df.join(dfNull, $"id" === $"b", 
"left").queryExecution.analyzed
+
+spark.sessionState.executePlan(planNull).optimizedPlan
+
+val dfOne = df.select(lit(1).as("a"))
+val dfTwo = spark.range(10).select(lit(2).as("a"))
--- End diff --

`a` -> `b`


---




[GitHub] spark issue #20331: [SPARK-23158] [SQL] Move HadoopFsRelationTest test suite...

2018-01-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20331
  
**[Test build #86403 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86403/testReport)**
 for PR 20331 at commit 
[`b83f859`](https://github.com/apache/spark/commit/b83f859137ca9ed33c3c7e4295c433b7bbca6eee).


---




[GitHub] spark issue #20331: [SPARK-23158] [SQL] Move HadoopFsRelationTest test suite...

2018-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20331
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20331: [SPARK-23158] [SQL] Move HadoopFsRelationTest test suite...

2018-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20331
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/51/
Test PASSed.


---




[GitHub] spark issue #20331: [SPARK-23158] [SQL] Move HadoopFsRelationTest test suite...

2018-01-19 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/20331
  
retest this please


---




[GitHub] spark issue #20331: [SPARK-23158] [SQL] Move HadoopFsRelationTest test suite...

2018-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20331
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86402/
Test FAILed.


---




[GitHub] spark issue #20331: [SPARK-23158] [SQL] Move HadoopFsRelationTest test suite...

2018-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20331
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20331: [SPARK-23158] [SQL] Move HadoopFsRelationTest test suite...

2018-01-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20331
  
**[Test build #86402 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86402/testReport)**
 for PR 20331 at commit 
[`b83f859`](https://github.com/apache/spark/commit/b83f859137ca9ed33c3c7e4295c433b7bbca6eee).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `abstract class HadoopFsRelationTest extends QueryTest with 
SQLTestUtils `
  * `class JsonHadoopFsRelationSuite extends HadoopFsRelationTest with 
SharedSQLContext `
  * `abstract class OrcHadoopFsRelationBase extends HadoopFsRelationTest `
  * `class ParquetHadoopFsRelationSuite extends HadoopFsRelationTest with 
SharedSQLContext `
  * `class HiveOrcHadoopFsRelationSuite extends OrcHadoopFsRelationBase 
with TestHiveSingleton `


---




[GitHub] spark pull request #20316: [SPARK-23149][SQL] polish ColumnarBatch

2018-01-19 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/20316#discussion_r162753323
  
--- Diff: 
sql/core/src/main/java/org/apache/spark/sql/vectorized/ColumnarBatch.java ---
@@ -96,16 +90,6 @@ public void setNumRows(int numRows) {
*/
   public int numRows() { return numRows; }
 
-  /**
-   * Returns the schema that makes up this batch.
-   */
-  public StructType schema() { return schema; }
-
-  /**
-   * Returns the max capacity (in number of rows) for this batch.
-   */
-  public int capacity() { return capacity; }
--- End diff --

I agree with removing the `schema` and `capacity` fields from `ColumnarBatch`.
Would it be better to provide APIs that derive `schema` and `capacity` from a set of 
`ColumnVector`s?
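
A rough sketch of what such a helper might look like (hypothetical; note that field names are not recoverable from the vectors, only their data types):

```
import org.apache.spark.sql.types.{StructField, StructType}
import org.apache.spark.sql.vectorized.ColumnVector

// Hypothetical helper: rebuild a schema from the vectors' data types. Column
// names have to be synthesized, which is one argument for still passing a
// schema in from the caller.
def schemaOf(vectors: Seq[ColumnVector]): StructType =
  StructType(vectors.zipWithIndex.map { case (v, i) =>
    StructField(s"_col$i", v.dataType(), nullable = true)
  })
```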


---




[GitHub] spark issue #20203: [SPARK-22577] [core] executor page blacklist status shou...

2018-01-19 Thread squito
Github user squito commented on the issue:

https://github.com/apache/spark/pull/20203
  
@attilapiros test failures look real (you probably just need to regenerate 
some of those expectations).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20203: [SPARK-22577] [core] executor page blacklist status shou...

2018-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20203
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86398/
Test FAILed.


---




[GitHub] spark issue #20203: [SPARK-22577] [core] executor page blacklist status shou...

2018-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20203
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20203: [SPARK-22577] [core] executor page blacklist status shou...

2018-01-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20203
  
**[Test build #86398 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86398/testReport)**
 for PR 20203 at commit 
[`41dd7bb`](https://github.com/apache/spark/commit/41dd7bbc1f62e093738e730bf3f5bfeb3dff16fb).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class SparkListenerNodeBlacklistedForStage(`


---




[GitHub] spark issue #20331: [SPARK-23158] [SQL] Move HadoopFsRelationTest test suite...

2018-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20331
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20331: [SPARK-23158] [SQL] Move HadoopFsRelationTest test suite...

2018-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20331
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/50/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20177: [SPARK-22954][SQL] Fix the exception thrown by Analyze c...

2018-01-19 Thread jiangxb1987
Github user jiangxb1987 commented on the issue:

https://github.com/apache/spark/pull/20177
  
We should also update `AnalyzePartitionCommand` and `AnalyzeColumnCommand` in the same way.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20331: [SPARK-23158] [SQL] Move HadoopFsRelationTest test suite...

2018-01-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20331
  
**[Test build #86402 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86402/testReport)**
 for PR 20331 at commit 
[`b83f859`](https://github.com/apache/spark/commit/b83f859137ca9ed33c3c7e4295c433b7bbca6eee).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20177: [SPARK-22954][SQL] Fix the exception thrown by An...

2018-01-19 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/20177#discussion_r162749089
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/SQLViewSuite.scala ---
@@ -154,11 +155,17 @@ abstract class SQLViewSuite extends QueryTest with 
SQLTestUtils {
   assertNoSuchTable(s"TRUNCATE TABLE $viewName")
   assertNoSuchTable(s"SHOW CREATE TABLE $viewName")
   assertNoSuchTable(s"SHOW PARTITIONS $viewName")
-  assertNoSuchTable(s"ANALYZE TABLE $viewName COMPUTE STATISTICS")
-  assertNoSuchTable(s"ANALYZE TABLE $viewName COMPUTE STATISTICS FOR COLUMNS id")
+  assertAnalysisException(s"ANALYZE TABLE $viewName COMPUTE STATISTICS")
--- End diff --

We should also check the error message to ensure the `AnalysisException` is 
not thrown from elsewhere.

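A small variant of the helper along these lines would do it, e.g. (a sketch only; the name and signature are hypothetical):

```scala
import org.apache.spark.sql.{AnalysisException, SparkSession}
import org.scalatest.Assertions.intercept

// Hypothetical test helper: fail unless the AnalysisException message
// contains the expected fragment, so an exception raised from some other
// code path does not make the test pass by accident.
def assertAnalysisExceptionWithMessage(
    spark: SparkSession, sqlText: String, fragment: String): Unit = {
  val e = intercept[AnalysisException] { spark.sql(sqlText) }
  assert(e.getMessage.contains(fragment), s"Unexpected message: ${e.getMessage}")
}
```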

---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20277: [SPARK-23090][SQL] polish ColumnVector

2018-01-19 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/20277#discussion_r162748352
  
--- Diff: 
sql/core/src/main/java/org/apache/spark/sql/vectorized/ArrowColumnVector.java 
---
@@ -55,164 +43,82 @@ public void close() {
 if (childColumns != null) {
   for (int i = 0; i < childColumns.length; i++) {
 childColumns[i].close();
+childColumns[i] = null;
--- End diff --

Is it OK not to call `close()` here, given that `ColumnVector.close()` is provided?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20277: [SPARK-23090][SQL] polish ColumnVector

2018-01-19 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/20277#discussion_r162747998
  
--- Diff: 
sql/core/src/main/java/org/apache/spark/sql/vectorized/ArrowColumnVector.java 
---
@@ -33,18 +33,6 @@
   private final ArrowVectorAccessor accessor;
   private ArrowColumnVector[] childColumns;
 
-  private void ensureAccessible(int index) {
-ensureAccessible(index, 1);
-  }
-
-  private void ensureAccessible(int index, int count) {
--- End diff --

I agree with this for the non-debug version. Can we add an assert for this check at each call site, for debugging?

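Roughly this kind of thing at the call sites, for example (a Scala sketch of the idea only; the helper name is made up and the real code in `ArrowColumnVector` is Java):

```scala
import org.apache.spark.sql.vectorized.ColumnVector

// Sketch of the suggestion: keep the bounds check as an assertion at the
// call site, so it runs in debug/test builds and can be elided elsewhere.
def getIntChecked(vector: ColumnVector, rowId: Int, numRows: Int): Int = {
  assert(rowId >= 0 && rowId < numRows, s"rowId $rowId out of range [0, $numRows)")
  vector.getInt(rowId)
}
```
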
P.S. Sorry for the slow reviews; I am on vacation this week.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20297: [SPARK-23020][CORE] Fix races in launcher code, test.

2018-01-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20297
  
**[Test build #4067 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4067/testReport)**
 for PR 20297 at commit 
[`95bac27`](https://github.com/apache/spark/commit/95bac2773ee7adab9f57aa4377ff2e998353f02f).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19993: [SPARK-22799][ML] Bucketizer should throw excepti...

2018-01-19 Thread jkbradley
Github user jkbradley commented on a diff in the pull request:

https://github.com/apache/spark/pull/19993#discussion_r162747183
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Bucketizer.scala 
---
@@ -201,9 +184,13 @@ final class Bucketizer @Since("1.4.0") 
(@Since("1.4.0") override val uid: String
 
   @Since("1.4.0")
   override def transformSchema(schema: StructType): StructType = {
-if (isBucketizeMultipleColumns()) {
+ParamValidators.checkExclusiveParams(this, "inputCol", "inputCols")
--- End diff --

I see. I'll try to come up with something generic that also handles these other checks.

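One possible shape for such a generic check, as a sketch (the name mirrors the one in the diff, but this is only an illustration of the idea, not the PR's implementation):

```scala
import org.apache.spark.ml.param.Params

// Hypothetical generic validator: require that exactly one of the named
// params (e.g. inputCol vs. inputCols) is set on the instance.
def checkExclusiveParams(instance: Params, paramNames: String*): Unit = {
  val set = paramNames.filter(name => instance.isSet(instance.getParam(name)))
  require(set.size == 1,
    s"Exactly one of ${paramNames.mkString(", ")} must be set, " +
      s"but found: ${if (set.isEmpty) "none" else set.mkString(", ")}")
}
```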

---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20297: [SPARK-23020][CORE] Fix races in launcher code, test.

2018-01-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20297
  
**[Test build #4066 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4066/testReport)**
 for PR 20297 at commit 
[`95bac27`](https://github.com/apache/spark/commit/95bac2773ee7adab9f57aa4377ff2e998353f02f).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20177: [SPARK-22954][SQL] Fix the exception thrown by An...

2018-01-19 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/20177#discussion_r162746808
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeTableCommand.scala
 ---
@@ -31,9 +31,9 @@ case class AnalyzeTableCommand(
 
   override def run(sparkSession: SparkSession): Seq[Row] = {
 val sessionState = sparkSession.sessionState
-val db = tableIdent.database.getOrElse(sessionState.catalog.getCurrentDatabase)
-val tableIdentWithDB = TableIdentifier(tableIdent.table, Some(db))
-val tableMeta = sessionState.catalog.getTableMetadata(tableIdentWithDB)
+val db = tableIdent.database
+val tableIdentWithDB = TableIdentifier(tableIdent.table, db)
+val tableMeta = sessionState.catalog.getTempViewOrPermanentTableMetadata(tableIdentWithDB)
--- End diff --

Wouldn't this fail if we have a table whose `tableIdent` omits the database and relies on the current database?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20297: [SPARK-23020][CORE] Fix races in launcher code, test.

2018-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20297
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86395/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20297: [SPARK-23020][CORE] Fix races in launcher code, test.

2018-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20297
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20297: [SPARK-23020][CORE] Fix races in launcher code, test.

2018-01-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20297
  
**[Test build #86395 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86395/testReport)**
 for PR 20297 at commit 
[`95bac27`](https://github.com/apache/spark/commit/95bac2773ee7adab9f57aa4377ff2e998353f02f).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20277: [SPARK-23090][SQL] polish ColumnVector

2018-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20277
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20277: [SPARK-23090][SQL] polish ColumnVector

2018-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20277
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86394/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20277: [SPARK-23090][SQL] polish ColumnVector

2018-01-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20277
  
**[Test build #86394 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86394/testReport)**
 for PR 20277 at commit 
[`3972093`](https://github.com/apache/spark/commit/397209342646a253a56650df8a00dfb6d66c834e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20333: [SPARK-23087][SQL] CheckCartesianProduct too restrictive...

2018-01-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20333
  
**[Test build #86401 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86401/testReport)**
 for PR 20333 at commit 
[`9c88781`](https://github.com/apache/spark/commit/9c88781dcd4cd301373927bfbe7f3530c80f4f05).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20333: [SPARK-23087][SQL] CheckCartesianProduct too restrictive...

2018-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20333
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20333: [SPARK-23087][SQL] CheckCartesianProduct too restrictive...

2018-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20333
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/49/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20333: [SPARK-23087][SQL] CheckCartesianProduct too rest...

2018-01-19 Thread mgaido91
GitHub user mgaido91 opened a pull request:

https://github.com/apache/spark/pull/20333

[SPARK-23087][SQL] CheckCartesianProduct too restrictive when condition is 
false/null

## What changes were proposed in this pull request?

CheckCartesianProduct also raises an AnalysisException when the join condition is always false/null. In that case we shouldn't raise it, since the result will not be a cartesian product.

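The gist of the new check, as a rough sketch (an assumed helper for illustration, not the exact code in this PR):

```scala
import org.apache.spark.sql.catalyst.expressions.{Expression, Literal}

// A join whose condition folds to false or null can never produce a
// cartesian product, so the cartesian-product check should not reject it.
def conditionNeverMatches(condition: Option[Expression]): Boolean =
  condition.exists {
    case Literal(false, _) => true
    case Literal(null, _)  => true
    case _                 => false
  }
```
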
## How was this patch tested?

Added a unit test.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mgaido91/spark SPARK-23087

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20333.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20333


commit 9c88781dcd4cd301373927bfbe7f3530c80f4f05
Author: Marco Gaido 
Date:   2018-01-19T20:45:29Z

[SPARK-23087][SQL] CheckCartesianProduct too restrictive when condition is 
false/null




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20299: [SPARK-23135][ui] Fix rendering of accumulators i...

2018-01-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/20299


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org


