[GitHub] spark issue #20937: [SPARK-23094][SPARK-23723][SPARK-23724][SQL] Support cus...

2018-04-23 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20937
  
Seems fine, but please allow me to take another look, which I will do this weekend.


---




[GitHub] spark issue #20146: [SPARK-11215][ML] Add multiple columns support to String...

2018-04-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20146
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2609/
Test PASSed.


---




[GitHub] spark issue #20146: [SPARK-11215][ML] Add multiple columns support to String...

2018-04-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20146
  
Merged build finished. Test PASSed.


---




[GitHub] spark pull request #21100: [SPARK-24012][SQL] Union of map and other compati...

2018-04-23 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21100#discussion_r183611758
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
@@ -896,6 +896,25 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
     }
   }
 
+  test("SPARK-24012 Union of map and other compatible columns") {
--- End diff --

OK, please remove this test and it's ready to go.


---




[GitHub] spark issue #20146: [SPARK-11215][ML] Add multiple columns support to String...

2018-04-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20146
  
**[Test build #89761 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89761/testReport)** for PR 20146 at commit [`ed35d87`](https://github.com/apache/spark/commit/ed35d875414ba3cf8751a77463f61665e9c373b0).


---




[GitHub] spark pull request #21123: [SPARK-24045][SQL]Create base class for file data...

2018-04-23 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request:

https://github.com/apache/spark/pull/21123#discussion_r183611332
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/sources/v2/FileDataSourceV2Suite.scala ---
@@ -0,0 +1,179 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql.sources.v2
+
+import java.util.{List => JList, Optional}
+
+import org.apache.spark.sql.{AnalysisException, QueryTest, Row, SaveMode}
+import org.apache.spark.sql.execution.datasources.FileFormat
+import org.apache.spark.sql.execution.datasources.parquet.{ParquetFileFormat, ParquetTest}
+import org.apache.spark.sql.execution.datasources.v2.FileDataSourceV2
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.sources.v2.reader.{DataReaderFactory, DataSourceReader}
+import org.apache.spark.sql.sources.v2.writer.{DataSourceWriter, DataWriterFactory, WriterCommitMessage}
+import org.apache.spark.sql.test.SharedSQLContext
+import org.apache.spark.sql.types.StructType
+
+class DummyReadOnlyFileDataSourceV2 extends FileDataSourceV2 with ReadSupport {
+  class DummyFileReader extends DataSourceReader {
+    override def readSchema(): StructType = {
+      throw new AnalysisException("hehe")
+    }
+
+    override def createDataReaderFactories(): JList[DataReaderFactory[Row]] =
+      java.util.Arrays.asList()
+  }
+
+  override def createReader(options: DataSourceOptions): DataSourceReader = {
+    throw new AnalysisException("Dummy file reader")
+  }
+
+  override def fallBackFileFormat: Class[_ <: FileFormat] = classOf[ParquetFileFormat]
+
+  override def shortName(): String = "parquet"
+}
+
+class DummyWriteOnlyFileDataSourceV2 extends FileDataSourceV2 with WriteSupport {
+  override def fallBackFileFormat: Class[_ <: FileFormat] = classOf[ParquetFileFormat]
+
+  override def shortName(): String = "parquet"
+
+  override def createWriter(
+      jobId: String,
+      schema: StructType,
+      mode: SaveMode,
+      options: DataSourceOptions): Optional[DataSourceWriter] = {
+    throw new AnalysisException("Dummy file writer")
+  }
+}
+
+class SimpleFileDataSourceV2 extends SimpleDataSourceV2 with FileDataSourceV2 {
+  override def fallBackFileFormat: Class[_ <: FileFormat] = classOf[ParquetFileFormat]
+
+  override def shortName(): String = "parquet"
+}
+
+class FileDataSourceV2Suite extends QueryTest with ParquetTest with SharedSQLContext {
+  class DummyFileWriter extends DataSourceWriter {
+    override def createWriterFactory(): DataWriterFactory[Row] = {
+      throw new AnalysisException("hehe")
+    }
+
+    override def commit(messages: Array[WriterCommitMessage]): Unit = {}
+
+    override def abort(messages: Array[WriterCommitMessage]): Unit = {}
+  }
+
+  private val dummyParquetReaderV2 = classOf[DummyReadOnlyFileDataSourceV2].getName
+  private val dummyParquetWriterV2 = classOf[DummyWriteOnlyFileDataSourceV2].getName
+  private val simpleFileDataSourceV2 = classOf[SimpleFileDataSourceV2].getName
+  private val parquetV1 = classOf[ParquetFileFormat].getCanonicalName
+
+  test("Fall back to v1 when writing to file with read only FileDataSourceV2") {
+    val df = spark.range(1, 10).toDF()
+    withTempPath { file =>
+      val path = file.getCanonicalPath
+      // Writing file should fall back to v1 and succeed.
+      df.write.format(dummyParquetReaderV2).save(path)
+
+      // Validate write result with [[ParquetFileFormat]].
+      checkAnswer(spark.read.format(parquetV1).load(path), df)
+
+      // Dummy File reader should fail as expected.
+      val exception = intercept[AnalysisException] {
+        spark.read.format(dummyParquetReaderV2).load(path)
+      }
+      assert(exception.message.equals("Dummy file reader"))
+    }
+  }
+
+  test("Fall back 

[GitHub] spark pull request #21123: [SPARK-24045][SQL]Create base class for file data...

2018-04-23 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request:

https://github.com/apache/spark/pull/21123#discussion_r183611071
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/sources/v2/FileDataSourceV2Suite.scala ---
(same quoted hunk as in the previous comment)

[GitHub] spark issue #20146: [SPARK-11215][ML] Add multiple columns support to String...

2018-04-23 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/20146
  
retest this please.


---




[GitHub] spark issue #21137: [SPARK-23589][SQL][FOLLOW-UP] Reuse InternalRow in Exter...

2018-04-23 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/21137
  
@hvanhovell 


---




[GitHub] spark pull request #21100: [SPARK-24012][SQL] Union of map and other compati...

2018-04-23 Thread liutang123
Github user liutang123 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21100#discussion_r183608838
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
@@ -896,6 +896,25 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
     }
   }
 
+  test("SPARK-24012 Union of map and other compatible columns") {
--- End diff --

@cloud-fan Yes, I am not familiar with `TypeCoercionSuite`. To save time, I think this PR can be merged first. Thanks a lot.


---




[GitHub] spark issue #20998: [SPARK-23888][CORE] correct the comment of hasAttemptOnH...

2018-04-23 Thread Ngone51
Github user Ngone51 commented on the issue:

https://github.com/apache/spark/pull/20998
  
Agree, and thank you @squito.

And thanks to all of you. @felixcheung @mridulm @jiangxb1987 @srowen


---




[GitHub] spark issue #21135: [SPARK-24060][TEST] StreamingSymmetricHashJoinHelperSuit...

2018-04-23 Thread jose-torres
Github user jose-torres commented on the issue:

https://github.com/apache/spark/pull/21135
  
LGTM. I think it's broadly correct for query nodes to assume the session has been initialized.


---




[GitHub] spark issue #21137: [SPARK-23589][SQL][FOLLOW-UP] Reuse InternalRow in Exter...

2018-04-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21137
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2608/
Test PASSed.


---




[GitHub] spark issue #21137: [SPARK-23589][SQL][FOLLOW-UP] Reuse InternalRow in Exter...

2018-04-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21137
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21136: [SPARK-24061][SS]Add TypedFilter support for continuous ...

2018-04-23 Thread jose-torres
Github user jose-torres commented on the issue:

https://github.com/apache/spark/pull/21136
  
LGTM. TypedFilter and Filter share the FilterExec execution node, so this 
should just work.

Ideally we would add a test to ContinuousSuite to ensure that TypedFilter 
does execute properly, but I understand this may not be sustainable. Eventually 
we'll just remove this whitelist and find some way to broadly execute all the 
streaming tests for continuous processing.
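To make the FilterExec point concrete, here is a minimal sketch (not from the PR; the session setup is illustrative) showing that a Column-based filter and a function-based filter differ only in their logical node, `Filter` vs `TypedFilter`, while both plan to the same `FilterExec` physical node:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[2]").appName("typed-filter-demo").getOrCreate()
import spark.implicits._

val ds = Seq(1, 2, 3).toDS()

val columnFilter = ds.filter($"value" > 1)  // logical Filter node
val typedFilter = ds.filter(_ > 1)          // logical TypedFilter node

// Both physical plans should show a FilterExec:
columnFilter.explain()
typedFilter.explain()
```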


---




[GitHub] spark issue #21137: [SPARK-23589][SQL][FOLLOW-UP] Reuse InternalRow in Exter...

2018-04-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21137
  
**[Test build #89760 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89760/testReport)** for PR 21137 at commit [`b39b6cd`](https://github.com/apache/spark/commit/b39b6cd5d1bfb47f3a796e57bd421f62aab507e5).


---




[GitHub] spark pull request #21126: [SPARK-24050][SS] Calculate input / processing ra...

2018-04-23 Thread jose-torres
Github user jose-torres commented on a diff in the pull request:

https://github.com/apache/spark/pull/21126#discussion_r183605577
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingQuerySuite.scala ---
@@ -492,6 +492,77 @@ class StreamingQuerySuite extends StreamTest with BeforeAndAfter with Logging wi
     assert(progress.sources(0).numInputRows === 10)
   }
 
+
+  test("input row calculation with trigger having data for one of two V2 sources") {
+    val streamInput1 = MemoryStream[Int]
+    val streamInput2 = MemoryStream[Int]
+
+    testStream(streamInput1.toDF().union(streamInput2.toDF()), useV2Sink = true)(
+      AddData(streamInput1, 1, 2, 3),
+      CheckAnswer(1, 2, 3),
+      AssertOnQuery { q =>
+        val lastProgress = getLastProgressWithData(q)
+        assert(lastProgress.nonEmpty)
+        assert(lastProgress.get.numInputRows == 3)
+        assert(lastProgress.get.sources.length == 2)
+        assert(lastProgress.get.sources(0).numInputRows == 3)
+        assert(lastProgress.get.sources(1).numInputRows == 0)
+        true
+      }
--- End diff --

nit: I'd suggest doing an AddData() for the other stream after, to make sure there's not some weird order dependence.
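For concreteness, a follow-up step along those lines might look like this (an illustrative sketch only; the added values and per-source expectations are assumptions, not part of the PR):

```scala
// Appended to the testStream(...) actions above:
AddData(streamInput2, 4, 5),
CheckAnswer(1, 2, 3, 4, 5),
AssertOnQuery { q =>
  val lastProgress = getLastProgressWithData(q)
  assert(lastProgress.nonEmpty)
  assert(lastProgress.get.sources.length == 2)
  assert(lastProgress.get.sources(1).numInputRows == 2)
  true
}
```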


---




[GitHub] spark pull request #21137: [SPARK-23589][SQL][FOLLOW-UP] Reuse InternalRow i...

2018-04-23 Thread maropu
GitHub user maropu opened a pull request:

https://github.com/apache/spark/pull/21137

[SPARK-23589][SQL][FOLLOW-UP] Reuse InternalRow in ExternalMapToCatalyst eval

## What changes were proposed in this pull request?
This PR is a follow-up of #20980 and fixes code to reuse `InternalRow` for converting input keys/values in `ExternalMapToCatalyst` eval.

## How was this patch tested?
Existing tests.
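For context, the reuse pattern the description refers to is roughly the following (a sketch under assumptions, not the exact committed change; `convertEntries` is a made-up name): allocate one single-field row per side once, and overwrite its field for each map entry, instead of allocating fresh rows per entry.

```scala
import org.apache.spark.sql.catalyst.expressions.GenericInternalRow
import org.apache.spark.sql.catalyst.util.MapData
import org.apache.spark.sql.types.DataType

def convertEntries(map: MapData, keyType: DataType, valueType: DataType): Unit = {
  val keyRow = new GenericInternalRow(1)    // allocated once, reused for every entry
  val valueRow = new GenericInternalRow(1)
  map.foreach(keyType, valueType, (k, v) => {
    keyRow.update(0, k)                     // overwrite in place instead of re-allocating
    valueRow.update(0, v)
    // ... run the key/value converters against keyRow / valueRow ...
  })
}
```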



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/maropu/spark SPARK-23589-FOLLOWUP

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21137.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21137


commit b39b6cd5d1bfb47f3a796e57bd421f62aab507e5
Author: Takeshi Yamamuro 
Date:   2018-04-24T04:57:59Z

Fix




---




[GitHub] spark issue #21070: [SPARK-23972][BUILD][SQL] Update Parquet to 1.10.0.

2018-04-23 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/21070
  
Can we run TPC-DS and show that this upgrade doesn't cause a performance regression in Spark? I can see that this new version doesn't have a perf regression on the Parquet side; I just want to be sure the Spark-Parquet integration is also OK.


---




[GitHub] spark issue #21136: [SPARK-24061][SS]Add TypedFilter support for continuous ...

2018-04-23 Thread xuanyuanking
Github user xuanyuanking commented on the issue:

https://github.com/apache/spark/pull/21136
  
+1 for this.
We found this when a continuous-processing app used a filter with a function; it can be supported by the current implementation.
cc @jose-torres @zsxwing @tdas


---




[GitHub] spark pull request #21136: [SPARK-24061][SS]Add TypedFilter support for cont...

2018-04-23 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request:

https://github.com/apache/spark/pull/21136#discussion_r183604217
  
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/UnsupportedOperationsSuite.scala ---
@@ -771,7 +778,16 @@ class UnsupportedOperationsSuite extends SparkFunSuite {
     }
   }
 
-  /**
+  /** Assert that the logical plan is supported for continuous processing mode */
+  def assertSupportedForContinuousProcessing(
+      name: String,
+      plan: LogicalPlan,
+      outputMode: OutputMode): Unit = {
+    test(s"continuous processing - $name: supported") {
+      UnsupportedOperationChecker.checkForContinuous(plan, outputMode)
+    }
+  }
+  /**
--- End diff --

nit: indent here


---




[GitHub] spark pull request #21126: [SPARK-24050][SS] Calculate input / processing ra...

2018-04-23 Thread jose-torres
Github user jose-torres commented on a diff in the pull request:

https://github.com/apache/spark/pull/21126#discussion_r183604136
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/ProgressReporter.scala ---
@@ -207,62 +209,92 @@ trait ProgressReporter extends Logging {
       return ExecutionStats(Map.empty, stateOperators, watermarkTimestamp)
     }
 
-    // We want to associate execution plan leaves to sources that generate them, so that we match
-    // their metrics (e.g. numOutputRows) to the sources. To do this we do the following.
-    //
-    //  streaming logical plan (with sources) <==> trigger's logical plan <==> executed plan
-    //
-    // 1. We keep track of streaming sources associated with each leaf in the trigger's logical plan
-    //    - Each logical plan leaf will be associated with a single streaming source.
-    //    - There can be multiple logical plan leaves associated with a streaming source.
-    //    - There can be leaves not associated with any streaming source, because they were
-    //      generated from a batch source (e.g. stream-batch joins)
-    //
-    // 2. Assuming that the executed plan has same number of leaves in the same order as that of
-    //    the trigger logical plan, we associate executed plan leaves with corresponding
-    //    streaming sources.
-    //
-    // 3. For each source, we sum the metrics of the associated execution plan leaves.
-    //
-    val logicalPlanLeafToSource = newData.flatMap { case (source, logicalPlan) =>
-      logicalPlan.collectLeaves().map { leaf => leaf -> source }
+    val numInputRows = extractSourceToNumInputRows()
+
+    val eventTimeStats = lastExecution.executedPlan.collect {
+      case e: EventTimeWatermarkExec if e.eventTimeStats.value.count > 0 =>
+        val stats = e.eventTimeStats.value
+        Map(
+          "max" -> stats.max,
+          "min" -> stats.min,
+          "avg" -> stats.avg.toLong).mapValues(formatTimestamp)
+    }.headOption.getOrElse(Map.empty) ++ watermarkTimestamp
+
+    ExecutionStats(numInputRows, stateOperators, eventTimeStats)
+  }
+
+  /** Extract number of input rows for each streaming source in plan */
+  private def extractSourceToNumInputRows(): Map[BaseStreamingSource, Long] = {
+
+    def sumRows(tuples: Seq[(BaseStreamingSource, Long)]): Map[BaseStreamingSource, Long] = {
+      tuples.groupBy(_._1).mapValues(_.map(_._2).sum) // sum up rows for each source
     }
-    val allLogicalPlanLeaves = lastExecution.logical.collectLeaves() // includes non-streaming
-    val allExecPlanLeaves = lastExecution.executedPlan.collectLeaves()
-    val numInputRows: Map[BaseStreamingSource, Long] =
+
+    val onlyDataSourceV2Sources = {
+      // Check whether the streaming query's logical plan has only V2 data sources
+      val allStreamingLeaves =
+        logicalPlan.collect { case s: StreamingExecutionRelation => s }
+      allStreamingLeaves.forall { _.source.isInstanceOf[MicroBatchReader] }
--- End diff --

A point fix here won't be sufficient - right now the metrics don't make it 
to the driver at all in continuous processing.


---




[GitHub] spark issue #21136: [SPARK-24061][SS]Add TypedFilter support for continuous ...

2018-04-23 Thread yanlin-Lynn
Github user yanlin-Lynn commented on the issue:

https://github.com/apache/spark/pull/21136
  
@xuanyuanking, please help review this patch. Thank you!


---




[GitHub] spark issue #21136: [SPARK-24061][SS]Add TypedFilter support for continuous ...

2018-04-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21136
  
Can one of the admins verify this patch?


---




[GitHub] spark issue #21136: [SPARK-24061][SS]Add TypedFilter support for continuous ...

2018-04-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21136
  
Can one of the admins verify this patch?


---




[GitHub] spark pull request #21136: [SPARK-24061][SS]Add TypedFilter support for cont...

2018-04-23 Thread yanlin-Lynn
GitHub user yanlin-Lynn opened a pull request:

https://github.com/apache/spark/pull/21136

[SPARK-24061][SS]Add TypedFilter support for continuous processing

## What changes were proposed in this pull request?

Add TypedFilter support for continuous processing applications.

## How was this patch tested?

unit tests
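For reference, the whitelist in question lives in `UnsupportedOperationChecker.checkForContinuous`; the change is essentially adding `TypedFilter` to the allowed logical nodes. A rough sketch from memory of the 2.3-era checker (simplified, with a plain exception standing in for the checker's `throwError` helper):

```scala
import org.apache.spark.sql.catalyst.plans.logical._

def checkForContinuous(plan: LogicalPlan): Unit = {
  plan.foreachUp {
    // TypedFilter is now allowed alongside Filter, Project, and the map nodes:
    case (_: Project | _: Filter | _: TypedFilter | _: MapElements |
        _: MapPartitions | _: DeserializeToObject | _: SerializeFromObject) =>
    case node if node.nodeName == "StreamingRelationV2" =>
    case node =>
      throw new UnsupportedOperationException(
        s"Continuous processing does not support ${node.nodeName} operations.")
  }
}
```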

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/yanlin-Lynn/spark SPARK-24061

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21136.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21136


commit 87a169767ea2d495d3f51d5decf398946560d6d0
Author: wangyanlin01 
Date:   2018-04-24T04:35:59Z

[SPARK-24061][SS]Add TypedFilter support for continuous processing




---




[GitHub] spark issue #20937: [SPARK-23094][SPARK-23723][SPARK-23724][SQL] Support cus...

2018-04-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20937
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20937: [SPARK-23094][SPARK-23723][SPARK-23724][SQL] Support cus...

2018-04-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20937
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89751/
Test PASSed.


---




[GitHub] spark issue #20937: [SPARK-23094][SPARK-23723][SPARK-23724][SQL] Support cus...

2018-04-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20937
  
**[Test build #89751 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89751/testReport)** for PR 20937 at commit [`a7be182`](https://github.com/apache/spark/commit/a7be1821886cd1699981588c82af19c9031be399).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #21100: [SPARK-24012][SQL] Union of map and other compati...

2018-04-23 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21100#discussion_r183602450
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
@@ -896,6 +896,25 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
     }
   }
 
+  test("SPARK-24012 Union of map and other compatible columns") {
--- End diff --

Discussed with @gatorsmile: we should put end-to-end tests in a single place, and currently we encourage people to put SQL-related end-to-end tests in the SQL golden files. That is to say, we should remove this test from `SQLQuerySuite`.

In the meantime, a bug fix should also have a unit test. For this case, we should add a test case in `TypeCoercionSuite`. @liutang123, if you are not familiar with that test suite, please let us know; we can merge your PR first and add the UT in `TypeCoercionSuite` in a follow-up.


---




[GitHub] spark pull request #21100: [SPARK-24012][SQL] Union of map and other compati...

2018-04-23 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/21100#discussion_r183601150
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
@@ -896,6 +896,25 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
     }
   }
 
+  test("SPARK-24012 Union of map and other compatible columns") {
--- End diff --

Yes, please add them to `SQLQueryTestSuite`.


---




[GitHub] spark pull request #21100: [SPARK-24012][SQL] Union of map and other compati...

2018-04-23 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21100#discussion_r183601024
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
@@ -896,6 +896,25 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
     }
   }
 
+  test("SPARK-24012 Union of map and other compatible columns") {
--- End diff --

cc @gatorsmile, what's the policy for end-to-end tests? Shall we add it in both the SQL golden files and `SQLQuerySuite`?


---




[GitHub] spark issue #21078: [SPARK-23990][ML] Instruments logging improvements - ML ...

2018-04-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21078
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89754/
Test PASSed.


---




[GitHub] spark issue #21078: [SPARK-23990][ML] Instruments logging improvements - ML ...

2018-04-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21078
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21078: [SPARK-23990][ML] Instruments logging improvements - ML ...

2018-04-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21078
  
**[Test build #89754 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89754/testReport)** for PR 21078 at commit [`492dc46`](https://github.com/apache/spark/commit/492dc4605b0779e5b337d0db41691e372adec45e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #21073: [SPARK-23936][SQL] Implement map_concat

2018-04-23 Thread bersprockets
Github user bersprockets commented on a diff in the pull request:

https://github.com/apache/spark/pull/21073#discussion_r183600236
  
--- Diff: python/pyspark/sql/functions.py ---
@@ -2186,6 +2186,29 @@ def map_values(col):
     return Column(sc._jvm.functions.map_values(_to_java_column(col)))
 
 
+@since(2.4)
+def map_concat(*cols):
+    """Returns the union of all the given maps. If a key is found in multiple given maps,
+    that key's value in the resulting map comes from the last one of those maps.
+
+    :param cols: list of column names (string) or list of :class:`Column` expressions
+
+    >>> from pyspark.sql.functions import map_concat
+    >>> df = spark.sql("SELECT map(1, 'a', 2, 'b') as map1, map(3, 'c', 1, 'd') as map2")
+    >>> df.select(map_concat("map1", "map2").alias("map3")).show(truncate=False)
+    +------------------------+
+    |map3                    |
+    +------------------------+
+    |[1 -> d, 2 -> b, 3 -> c]|
+    +------------------------+
+    """
+    sc = SparkContext._active_spark_context
+    if len(cols) == 1 and isinstance(cols[0], (list, set)):
+        cols = cols[0]
--- End diff --

> what's this for?

Excellent question. I don't know, except that it seems sometimes the first argument is a list of columns. I used other functions as a template.


---




[GitHub] spark pull request #21073: [SPARK-23936][SQL] Implement map_concat

2018-04-23 Thread bersprockets
Github user bersprockets commented on a diff in the pull request:

https://github.com/apache/spark/pull/21073#discussion_r18365
  
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollectionExpressionsSuite.scala ---
@@ -56,6 +58,26 @@ class CollectionExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper
     checkEvaluation(MapValues(m2), null)
   }
 
+  test("Map Concat") {
+    val m0 = Literal.create(Map("a" -> "1", "b" -> "2"), MapType(StringType, StringType))
+    val m1 = Literal.create(Map("c" -> "3", "a" -> "4"), MapType(StringType, StringType))
+    val m2 = Literal.create(Map("d" -> "4", "e" -> "5"), MapType(StringType, StringType))
+    val mNull = Literal.create(null, MapType(StringType, StringType))
+
+    // overlapping maps
+    checkEvaluation(MapConcat(Seq(m0, m1)),
+      mutable.LinkedHashMap("a" -> "4", "b" -> "2", "c" -> "3"))
+    // maps with no overlap
+    checkEvaluation(MapConcat(Seq(m0, m2)),
+      mutable.LinkedHashMap("a" -> "1", "b" -> "2", "d" -> "4", "e" -> "5"))
+    // 3 maps
+    checkEvaluation(MapConcat(Seq(m0, m1, m2)),
+      mutable.LinkedHashMap("a" -> "4", "b" -> "2", "c" -> "3", "d" -> "4", "e" -> "5"))
+    // null map
+    checkEvaluation(MapConcat(Seq(m0, mNull)),
--- End diff --

Done!


---




[GitHub] spark issue #21073: [SPARK-23936][SQL] Implement map_concat

2018-04-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21073
  
**[Test build #89759 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89759/testReport)** for PR 21073 at commit [`13baf96`](https://github.com/apache/spark/commit/13baf96aa0a6087f288d45a7d57881744e187826).


---




[GitHub] spark pull request #21127: [SPARK-24052][CORE][UI] Add spark version informa...

2018-04-23 Thread caneGuy
Github user caneGuy closed the pull request at:

https://github.com/apache/spark/pull/21127


---




[GitHub] spark issue #21127: [SPARK-24052][CORE][UI] Add spark version information on...

2018-04-23 Thread caneGuy
Github user caneGuy commented on the issue:

https://github.com/apache/spark/pull/21127
  
OK, I will close this PR.
Thanks for your time @srowen @vanzin


---




[GitHub] spark issue #20923: [SPARK-23807][BUILD] Add Hadoop 3.1 profile with relevan...

2018-04-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20923
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20923: [SPARK-23807][BUILD] Add Hadoop 3.1 profile with relevan...

2018-04-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20923
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89748/
Test PASSed.


---




[GitHub] spark issue #20923: [SPARK-23807][BUILD] Add Hadoop 3.1 profile with relevan...

2018-04-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20923
  
**[Test build #89748 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89748/testReport)** for PR 20923 at commit [`f6b9dc8`](https://github.com/apache/spark/commit/f6b9dc83d56c20d887166ddba7a7b876a57d65cb).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21135: [SPARK-24060][TEST] StreamingSymmetricHashJoinHelperSuit...

2018-04-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21135
  
Can one of the admins verify this patch?


---




[GitHub] spark issue #21135: [SPARK-24060][TEST] StreamingSymmetricHashJoinHelperSuit...

2018-04-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21135
  
Can one of the admins verify this patch?


---




[GitHub] spark issue #21135: [SPARK-24060][TEST] StreamingSymmetricHashJoinHelperSuit...

2018-04-23 Thread pwoody
Github user pwoody commented on the issue:

https://github.com/apache/spark/pull/21135
  
@ericl @gatorsmile @jose-torres 


---




[GitHub] spark pull request #21135: [SPARK-24060][TEST] StreamingSymmetricHashJoinHel...

2018-04-23 Thread pwoody
GitHub user pwoody opened a pull request:

https://github.com/apache/spark/pull/21135

[SPARK-24060][TEST] StreamingSymmetricHashJoinHelperSuite should initialize after SparkSession creation

## What changes were proposed in this pull request?
We should ensure that the SparkSession for this test suite has been initialized before creating the LocalTableScanExecs.

## How was this patch tested?
Existing tests 
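A plausible shape of the fix, for readers skimming the thread (names illustrative; `leftAttributes` is assumed to be the suite's attribute list):

```scala
// Before: built eagerly at suite-construction time, before the session exists.
//   val leftExec = LocalTableScanExec(leftAttributes, Seq.empty)
// After: deferred until first use inside a test body, when the session is up.
lazy val leftExec = LocalTableScanExec(leftAttributes, Seq.empty)
```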


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/pwoody/spark pw/initfix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21135.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21135


commit dbaa00ca99161cd2820b9a74aebac9665c626581
Author: Patrick Woody 
Date:   2018-04-24T03:40:10Z

Initialize StreamingSymmetricHashJoinHelperSuite LocalTableScanExec after SparkSession initialization




---




[GitHub] spark issue #21100: [SPARK-24012][SQL] Union of map and other compatible col...

2018-04-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21100
  
**[Test build #89757 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89757/testReport)** for PR 21100 at commit [`8cb240f`](https://github.com/apache/spark/commit/8cb240fbcab257c4151246c36725b6f1ee873d46).


---




[GitHub] spark issue #20940: [SPARK-23429][CORE] Add executor memory metrics to heart...

2018-04-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20940
  
**[Test build #89758 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89758/testReport)** for PR 20940 at commit [`8ae0126`](https://github.com/apache/spark/commit/8ae012654b81040cd65ea9b4b5f6d9bec7acd718).


---




[GitHub] spark issue #20940: [SPARK-23429][CORE] Add executor memory metrics to heart...

2018-04-23 Thread squito
Github user squito commented on the issue:

https://github.com/apache/spark/pull/20940
  
Jenkins, retest this please


---




[GitHub] spark issue #21129: [SPARK-7132][ML] Add fit with validation set to spark.ml...

2018-04-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21129
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2607/
Test PASSed.


---




[GitHub] spark issue #21129: [SPARK-7132][ML] Add fit with validation set to spark.ml...

2018-04-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21129
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21129: [SPARK-7132][ML] Add fit with validation set to spark.ml...

2018-04-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21129
  
**[Test build #89756 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89756/testReport)** for PR 21129 at commit [`170e08f`](https://github.com/apache/spark/commit/170e08f38ba4853087a3a3be19f8a1a69656de32).


---




[GitHub] spark issue #21127: [SPARK-24052][CORE][UI] Add spark version information on...

2018-04-23 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/21127
  
Version info is already available, in the code and in the UI. "Compiled by" 
info has never struck me as useful. The rest is from the env. I don't think 
this adds anything.
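For reference, the version info is already one call away on both entry points:

```scala
sc.version     // on SparkContext
spark.version  // on SparkSession
```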


---




[GitHub] spark issue #21127: [SPARK-24052][CORE][UI] Add spark version information on...

2018-04-23 Thread caneGuy
Github user caneGuy commented on the issue:

https://github.com/apache/spark/pull/21127
  
How about the other information?
As mentioned, the build info. @vanzin


---




[GitHub] spark issue #20633: [SPARK-23455][ML] Default Params in ML should be saved s...

2018-04-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20633
  
**[Test build #4156 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4156/testReport)** for PR 20633 at commit [`80b668a`](https://github.com/apache/spark/commit/80b668afb0303b67ead8aed8d4d1f1996fa02658).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #21128: [SPARK-24053][CORE] Support add subdirectory name...

2018-04-23 Thread caneGuy
Github user caneGuy closed the pull request at:

https://github.com/apache/spark/pull/21128


---




[GitHub] spark issue #21128: [SPARK-24053][CORE] Support add subdirectory named as us...

2018-04-23 Thread caneGuy
Github user caneGuy commented on the issue:

https://github.com/apache/spark/pull/21128
  
Got it, thanks @vanzin


---




[GitHub] spark issue #21116: [SPARK-24038][SS] Refactor continuous writing to its own...

2018-04-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21116
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21116: [SPARK-24038][SS] Refactor continuous writing to its own...

2018-04-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21116
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89747/
Test PASSed.


---




[GitHub] spark issue #21116: [SPARK-24038][SS] Refactor continuous writing to its own...

2018-04-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21116
  
**[Test build #89747 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89747/testReport)** for PR 21116 at commit [`b676dc8`](https://github.com/apache/spark/commit/b676dc85d5ab74b9d3457d45a97c062fd5a51dd3).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21100: [SPARK-24012][SQL] Union of map and other compatible col...

2018-04-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21100
  
**[Test build #89755 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89755/testReport)** for PR 21100 at commit [`670824f`](https://github.com/apache/spark/commit/670824fa5fc1f8aa72ca4047893104c3786a0295).


---




[GitHub] spark issue #21078: [SPARK-23990][ML] Instruments logging improvements - ML ...

2018-04-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21078
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2606/
Test PASSed.


---




[GitHub] spark issue #21078: [SPARK-23990][ML] Instruments logging improvements - ML ...

2018-04-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21078
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21078: [SPARK-23990][ML] Instruments logging improvements - ML ...

2018-04-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21078
  
**[Test build #89754 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89754/testReport)** for PR 21078 at commit [`492dc46`](https://github.com/apache/spark/commit/492dc4605b0779e5b337d0db41691e372adec45e).


---




[GitHub] spark issue #20940: [SPARK-23429][CORE] Add executor memory metrics to heart...

2018-04-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20940
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89746/
Test FAILed.


---




[GitHub] spark issue #20940: [SPARK-23429][CORE] Add executor memory metrics to heart...

2018-04-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20940
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20940: [SPARK-23429][CORE] Add executor memory metrics to heart...

2018-04-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20940
  
**[Test build #89746 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89746/testReport)** for PR 20940 at commit [`8ae0126`](https://github.com/apache/spark/commit/8ae012654b81040cd65ea9b4b5f6d9bec7acd718).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21113: [MINOR][DOCS] Fix comments of SQLExecution#withExecution...

2018-04-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21113
  
**[Test build #89753 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89753/testReport)** for PR 21113 at commit [`c8adec6`](https://github.com/apache/spark/commit/c8adec619f4337e29509a760e8ca84ddc3f851ec).


---




[GitHub] spark pull request #21123: [SPARK-24045][SQL]Create base class for file data...

2018-04-23 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21123#discussion_r183589976
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/sources/v2/FileDataSourceV2Suite.scala ---
(same quoted hunk as in the earlier #21123 review comments above)

[GitHub] spark pull request #21123: [SPARK-24045][SQL]Create base class for file data...

2018-04-23 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21123#discussion_r183589841
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/sources/v2/FileDataSourceV2Suite.scala
 ---
[quoted diff elided: identical to the excerpt above; the remainder of this entry was truncated in the archive]

[GitHub] spark pull request #21123: [SPARK-24045][SQL]Create base class for file data...

2018-04-23 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21123#discussion_r183589906
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/sources/v2/FileDataSourceV2Suite.scala
 ---
[quoted diff elided: identical to the excerpt above; the remainder of this entry was truncated in the archive]

[GitHub] spark pull request #21123: [SPARK-24045][SQL]Create base class for file data...

2018-04-23 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21123#discussion_r183589673
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/sources/v2/FileDataSourceV2Suite.scala
 ---
[quoted diff elided: identical to the excerpt above; the remainder of this entry was truncated in the archive]

[GitHub] spark pull request #21123: [SPARK-24045][SQL]Create base class for file data...

2018-04-23 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21123#discussion_r183589528
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/sources/v2/FileDataSourceV2Suite.scala
 ---
[quoted diff elided: identical to the excerpt above; only the lines under review are kept below]
+  test("Fall back to v1 when writing to file with read only 
FileDataSourceV2") {
+val df = spark.range(1, 10).toDF()
+withTempPath { file =>
+  val path = file.getCanonicalPath
+  // Writing file should fall back to v1 and succeed.
+  df.write.format(dummyParquetReaderV2).save(path)
+
+  // Validate write result with [[ParquetFileFormat]].
+  checkAnswer(spark.read.format(parquetV1).load(path), df)
+
+  // Dummy File reader should fail as expected.
--- End diff --

`Dummy File reader should be picked and fail as expected`
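Applied to the quoted line, the suggested wording is:

```
-      // Dummy File reader should fail as expected.
+      // Dummy File reader should be picked and fail as expected.
```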


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21123: [SPARK-24045][SQL]Create base class for file data...

2018-04-23 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21123#discussion_r183589427
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/sources/v2/FileDataSourceV2Suite.scala
 ---
[quoted diff elided: identical to the excerpt above; only the line under review is kept below]
+  test("Fall back to v1 when writing to file with read only 
FileDataSourceV2") {
+val df = spark.range(1, 10).toDF()
--- End diff --

nit: `val df = spark.range(10)`
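Applied to the quoted line, the nit is:

```
-    val df = spark.range(1, 10).toDF()
+    val df = spark.range(10).toDF()
```

(As an aside: `spark.range(10)` yields ids 0 through 9 while `range(1, 10)` yields 1 through 9; the exact values don't matter for this fallback test.)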


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21123: [SPARK-24045][SQL]Create base class for file data...

2018-04-23 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21123#discussion_r183589226
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/sources/v2/FileDataSourceV2Suite.scala
 ---
[quoted diff elided: identical to the excerpt above; only the line under review is kept below]
+class FileDataSourceV2Suite extends QueryTest with ParquetTest with SharedSQLContext {
--- End diff --

`FileDataSourceV2FallbackSuite `
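Applied to the quoted declaration, the suggested rename is:

```
-class FileDataSourceV2Suite extends QueryTest with ParquetTest with SharedSQLContext {
+class FileDataSourceV2FallbackSuite extends QueryTest with ParquetTest with SharedSQLContext {
```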


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21123: [SPARK-24045][SQL]Create base class for file data...

2018-04-23 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21123#discussion_r183589146
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/internal/BaseSessionStateBuilder.scala
 ---
@@ -158,6 +158,7 @@ abstract class BaseSessionStateBuilder(
     override val extendedResolutionRules: Seq[Rule[LogicalPlan]] =
       new FindDataSourceTable(session) +:
         new ResolveSQLOnFile(session) +:
+        new FallBackFileDataSourceToV1(session) +:
--- End diff --

We also need to add this rule to `HiveSessionStateBuilder.analyzer`
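For reference, a sketch of the shape that change would take in `HiveSessionStateBuilder.analyzer` (the surrounding rules are quoted from memory and partly elided, so treat this as illustrative rather than the actual diff):

```scala
// HiveSessionStateBuilder.analyzer (sketch; remaining rules and overrides elided)
override protected def analyzer: Analyzer = new Analyzer(catalog, conf) {
  override val extendedResolutionRules: Seq[Rule[LogicalPlan]] =
    new ResolveHiveSerdeTable(session) +:
      new FindDataSourceTable(session) +:
      new ResolveSQLOnFile(session) +:
      new FallBackFileDataSourceToV1(session) +:  // mirror the BaseSessionStateBuilder change
      customResolutionRules
}
```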


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21072: [SPARK-23973][SQL] Remove consecutive Sorts

2018-04-23 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/21072


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21072: [SPARK-23973][SQL] Remove consecutive Sorts

2018-04-23 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/21072
  
thanks, merging to master!


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21127: [SPARK-24052][CORE][UI] Add spark version information on...

2018-04-23 Thread vanzin
Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/21127
  
The SHS shows the application's Spark version in the title (or the SHS 
version in the listing page).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21128: [SPARK-24053][CORE] Support add subdirectory named as us...

2018-04-23 Thread vanzin
Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/21128
  
See my last comment. You don't need code in Spark to achieve what you want 
to do.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21131: [SPARK-23433][CORE] Late zombie task completions update ...

2018-04-23 Thread squito
Github user squito commented on the issue:

https://github.com/apache/spark/pull/21131
  
@markhamstra @zsxwing @jiangxb1987 @Ngone51 would appreciate a review, 
thanks


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20923: [SPARK-23807][BUILD] Add Hadoop 3.1 profile with relevan...

2018-04-23 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/20923
  
+1 for @jerryshao's comment. Some of the Hive UTs will fail with the Hadoop 3 profile.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21113: [MINOR][DOCS] Fix comments of SQLExecution#withExecution...

2018-04-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21113
  
**[Test build #4155 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4155/testReport)**
 for PR 21113 at commit 
[`6aceb43`](https://github.com/apache/spark/commit/6aceb43c56641a38b97b5e090379ef5143f17dd0).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21127: [SPARK-24052][CORE][UI] Add spark version information on...

2018-04-23 Thread caneGuy
Github user caneGuy commented on the issue:

https://github.com/apache/spark/pull/21127
  
It is the version of the history server, @vanzin.
Maybe I can reuse SparkBuildInfo? @srowen @vanzin 
Thanks!


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21113: [MINOR][DOCS] Fix comments of SQLExecution#withEx...

2018-04-23 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21113#discussion_r183582806
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/SQLExecution.scala ---
@@ -88,7 +88,7 @@ object SQLExecution {
   /**
    * Wrap an action with a known executionId. When running a different action in a different
    * thread from the original one, this method can be used to connect the Spark jobs in this action
-   * with the known executionId, e.g., `BroadcastHashJoin.broadcastFuture`.
+   * with the known executionId, e.g., `BroadcastExchange.relationFuture`.
--- End diff --

`BroadcastExchange` -> `BroadcastExchangeExec`.
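As an aside, the pattern that doc comment describes looks roughly like this (a sketch against the Spark 2.3-era internal API; the exact `withExecutionId` signature has changed across versions, so treat it as an assumption):

```scala
import scala.concurrent.{ExecutionContext, Future}
import org.apache.spark.SparkContext
import org.apache.spark.sql.execution.SQLExecution

// Run `body` on another thread while tying its Spark jobs back to a known
// executionId, as BroadcastExchangeExec.relationFuture does.
def runWithKnownExecutionId[T](sc: SparkContext, executionId: String)(body: => T)
    (implicit ec: ExecutionContext): Future[T] =
  Future {
    SQLExecution.withExecutionId(sc, executionId)(body)
  }
```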


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21081: [SPARK-23975][ML]Allow Clustering to take Arrays of Doub...

2018-04-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21081
  
**[Test build #4157 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4157/testReport)**
 for PR 21081 at commit 
[`c4e1a51`](https://github.com/apache/spark/commit/c4e1a51551993008d7b082b112d2296cbc4eb97b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20923: [SPARK-23807][BUILD] Add Hadoop 3.1 profile with relevan...

2018-04-23 Thread jerryshao
Github user jerryshao commented on the issue:

https://github.com/apache/spark/pull/20923
  
I would guess the tests here don't actually run with the Hadoop 3 profile, so we aren't actually testing anything.

Also, we still cannot use Hadoop 3 even if we merge this, because of the Hive issue, unless we use the tricks mentioned above. So I'm not sure whether we should address the Hive issue first.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21128: [SPARK-24053][CORE] Support add subdirectory named as us...

2018-04-23 Thread caneGuy
Github user caneGuy commented on the issue:

https://github.com/apache/spark/pull/21128
  
@vanzin Thanks!
Since we set the common staging dir as 
```
hdfs://xxx/spark/xxx/staging
```
This patch will give us a directory for a given user like:
```
hdfs://xxx/spark/xxx/staging/u_zhoukang
```
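A minimal sketch of that construction (hypothetical helper name; the actual change lives in Spark's YARN client code):

```scala
import org.apache.hadoop.fs.Path

// Append a per-user "u_<shortUserName>" subdirectory under a shared staging
// root so that different users' staging files don't collide.
def userStagingDir(stagingRoot: String, shortUserName: String): Path =
  new Path(stagingRoot, s"u_$shortUserName")

// userStagingDir("hdfs://xxx/spark/xxx/staging", "zhoukang")
//   returns hdfs://xxx/spark/xxx/staging/u_zhoukang
```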


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21078: [SPARK-23990][ML] Instruments logging improvements - ML ...

2018-04-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21078
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89752/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21078: [SPARK-23990][ML] Instruments logging improvements - ML ...

2018-04-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21078
  
**[Test build #89752 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89752/testReport)**
 for PR 21078 at commit 
[`5ee7e23`](https://github.com/apache/spark/commit/5ee7e233bcaf1e78afbbe6f9b877f48cecb03da6).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21078: [SPARK-23990][ML] Instruments logging improvements - ML ...

2018-04-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21078
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20535: [SPARK-23341][SQL] define some standard options f...

2018-04-23 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/20535#discussion_r183580907
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala ---
@@ -193,10 +196,13 @@ class DataFrameReader private[sql](sparkSession: SparkSession) extends Logging {
     if (ds.isInstanceOf[ReadSupport] || ds.isInstanceOf[ReadSupportWithSchema]) {
       val sessionOptions = DataSourceV2Utils.extractSessionConfigs(
         ds = ds, conf = sparkSession.sessionState.conf)
+      val pathsOption = {
+        val objectMapper = new ObjectMapper()
+        DataSourceOptions.PATHS_KEY -> objectMapper.writeValueAsString(paths.toArray)
+      }
       Dataset.ofRows(sparkSession, DataSourceV2Relation.create(
-        ds, extraOptions.toMap ++ sessionOptions,
+        ds, extraOptions.toMap ++ sessionOptions + pathsOption,
--- End diff --

Basically we may have duplicated entries in session configs and `DataFrameReader/Writer` options, not only the path. The rule is that `DataFrameReader/Writer` options should overwrite session configs.

cc @jiangxb1987 can you submit a PR to explicitly document this in `SessionConfigSupport`?
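A tiny standalone illustration of that rule (hypothetical option keys; this relies only on Scala's `Map ++` semantics, where the right-hand operand wins on duplicate keys):

```scala
val sessionOptions = Map("path" -> "/from/session/conf", "mode" -> "append")
val readerOptions  = Map("path" -> "/from/reader/option")

// Reader/writer options go on the right, so they overwrite session configs.
val effective = sessionOptions ++ readerOptions
assert(effective("path") == "/from/reader/option")  // reader option wins
assert(effective("mode") == "append")               // non-conflicting session config survives
```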


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21132: [SPARK-24029][core] Follow up: set SO_REUSEADDR o...

2018-04-23 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/21132


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21078: [SPARK-23990][ML] Instruments logging improvements - ML ...

2018-04-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21078
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2605/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21078: [SPARK-23990][ML] Instruments logging improvements - ML ...

2018-04-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21078
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21132: [SPARK-24029][core] Follow up: set SO_REUSEADDR on the s...

2018-04-23 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/21132
  
Merged to master


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21081: [SPARK-23975][ML]Allow Clustering to take Arrays of Doub...

2018-04-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21081
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21081: [SPARK-23975][ML]Allow Clustering to take Arrays of Doub...

2018-04-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21081
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89749/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21081: [SPARK-23975][ML]Allow Clustering to take Arrays of Doub...

2018-04-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21081
  
**[Test build #89749 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89749/testReport)**
 for PR 21081 at commit 
[`c4e1a51`](https://github.com/apache/spark/commit/c4e1a51551993008d7b082b112d2296cbc4eb97b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org


