date:20181022

[GitHub] spark issue #22790: [SPARK-25793][ML]call SaveLoadV2_0.load for classNameV2_...

2018-10-22 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22790
  
**[Test build #97904 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97904/testReport)**
 for PR 22790 at commit 
[`77902bc`](https://github.com/apache/spark/commit/77902bc3a08d1397c1b69b68ad7aecaf1038defb).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #22754: [SPARK-25776][CORE][MINOR]The disk write buffer size mus...

2018-10-22 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22754
  
**[Test build #97903 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97903/testReport)**
 for PR 22754 at commit 
[`6f8404b`](https://github.com/apache/spark/commit/6f8404b474539a989e08459949f54395bcd7ed10).


---


[GitHub] spark issue #22754: [SPARK-25776][CORE][MINOR]The disk write buffer size mus...

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22754
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4393/
Test PASSed.


---


[GitHub] spark issue #22754: [SPARK-25776][CORE][MINOR]The disk write buffer size mus...

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22754
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22754: [SPARK-25776][CORE][MINOR]The disk write buffer size mus...

2018-10-22 Thread 10110346

Github user 10110346 commented on the issue:

https://github.com/apache/spark/pull/22754
  
retest this please


---


[GitHub] spark issue #20433: [SPARK-23264][SQL] Make INTERVAL keyword optional in INT...

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20433
  
Merged build finished. Test PASSed.


---


[GitHub] spark pull request #22787: [SPARK-25040][SQL] Empty string for non string ty...

2018-10-22 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22787


---


[GitHub] spark issue #20433: [SPARK-23264][SQL] Make INTERVAL keyword optional in INT...

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20433
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97890/
Test PASSed.


---


[GitHub] spark issue #20433: [SPARK-23264][SQL] Make INTERVAL keyword optional in INT...

2018-10-22 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20433
  
**[Test build #97890 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97890/testReport)**
 for PR 20433 at commit 
[`554c122`](https://github.com/apache/spark/commit/554c122a4cf7a09ccf3202911560dc1c5d2f8b7a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #22799: [SPARK-25805][SQL][TEST] Fix test for SPARK-25159

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22799
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97892/
Test PASSed.


---


[GitHub] spark issue #22799: [SPARK-25805][SQL][TEST] Fix test for SPARK-25159

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22799
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22799: [SPARK-25805][SQL][TEST] Fix test for SPARK-25159

2018-10-22 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22799
  
**[Test build #97892 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97892/testReport)**
 for PR 22799 at commit 
[`65be98b`](https://github.com/apache/spark/commit/65be98bf4572ca57a2f747582ec796935b4ede54).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #22787: [SPARK-25040][SQL] Empty string for non string types sho...

2018-10-22 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22787
  
Merged to master.


---


[GitHub] spark issue #22801: [SPARK-25656][SQL][DOC][EXAMPLE] Add a doc and examples ...

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22801
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97900/
Test PASSed.


---


[GitHub] spark issue #22801: [SPARK-25656][SQL][DOC][EXAMPLE] Add a doc and examples ...

2018-10-22 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22801
  
**[Test build #97900 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97900/testReport)**
 for PR 22801 at commit 
[`1bd23a4`](https://github.com/apache/spark/commit/1bd23a41b3d6dbe4a2eff2565c79d0b2379f894b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #22801: [SPARK-25656][SQL][DOC][EXAMPLE] Add a doc and examples ...

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22801
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22512: [SPARK-25498][SQL] InterpretedMutableProjection should h...

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22512
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22512: [SPARK-25498][SQL] InterpretedMutableProjection should h...

2018-10-22 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22512
  
**[Test build #97902 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97902/testReport)**
 for PR 22512 at commit 
[`45e65e5`](https://github.com/apache/spark/commit/45e65e5b3e61f702ebcb1203e95434051adbe437).


---


[GitHub] spark pull request #22790: [SPARK-25793][ML]call SaveLoadV2_0.load for class...

2018-10-22 Thread huaxingao

Github user huaxingao commented on a diff in the pull request:

https://github.com/apache/spark/pull/22790#discussion_r227229331
  
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/BisectingKMeansModel.scala ---
@@ -126,7 +126,7 @@ object BisectingKMeansModel extends Loader[BisectingKMeansModel] {
 val model = SaveLoadV1_0.load(sc, path)
 model
   case (SaveLoadV2_0.thisClassName, SaveLoadV2_0.thisFormatVersion) =>
-val model = SaveLoadV1_0.load(sc, path)
+val model = SaveLoadV2_0.load(sc, path)
--- End diff --

@viirya @mgaido91 
I will change the ```BisectingKMeansModel.save``` to
```
BisectingKMeansModel.SaveLoadV2_0.save(sc, this, path)
```
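
For readers following the fix: the loader picks a deserializer by matching on the (class name, format version) pair stored in the model metadata, and the bug was that the 2.0 branch still called the 1.0 loader. Below is a minimal sketch of that dispatch in plain Scala. `ToyModel`, `Metadata`, and `ToyModelLoader` are hypothetical stand-ins, not Spark classes, and the real `load` methods also take a `SparkContext`.

```scala
// Hypothetical stand-ins for a versioned model format; none of these are Spark classes.
final case class Metadata(className: String, formatVersion: String)
final case class ToyModel(loadedBy: String)

object ToyModelLoader {
  object SaveLoadV1_0 {
    val thisClassName = "example.ToyModel"
    val thisFormatVersion = "1.0"
    def load(path: String): ToyModel = ToyModel(loadedBy = "V1_0")
  }

  object SaveLoadV2_0 {
    val thisClassName = "example.ToyModel"
    val thisFormatVersion = "2.0"
    def load(path: String): ToyModel = ToyModel(loadedBy = "V2_0")
  }

  // Route each (class name, format version) pair to the loader of the same version.
  def load(meta: Metadata, path: String): ToyModel =
    (meta.className, meta.formatVersion) match {
      case (SaveLoadV1_0.thisClassName, SaveLoadV1_0.thisFormatVersion) =>
        SaveLoadV1_0.load(path)
      case (SaveLoadV2_0.thisClassName, SaveLoadV2_0.thisFormatVersion) =>
        SaveLoadV2_0.load(path)
      case (name, version) =>
        throw new IllegalArgumentException(
          s"Cannot load model: unsupported class name / version ($name, $version)")
    }
}
```

With this shape, metadata written as version 2.0 is routed to `SaveLoadV2_0.load`, which is the behaviour the one-line change in the diff restores.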


---


[GitHub] spark issue #22512: [SPARK-25498][SQL] InterpretedMutableProjection should h...

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22512
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4392/
Test PASSed.


---


[GitHub] spark issue #22512: [SPARK-25498][SQL] InterpretedMutableProjection should h...

2018-10-22 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22512
  
**[Test build #97901 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97901/testReport)**
 for PR 22512 at commit 
[`b8c5a17`](https://github.com/apache/spark/commit/b8c5a177b2ac72b35ad89d7076a063117568c768).


---


[GitHub] spark issue #22512: [SPARK-25498][SQL] InterpretedMutableProjection should h...

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22512
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22512: [SPARK-25498][SQL] InterpretedMutableProjection should h...

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22512
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4391/
Test PASSed.


---


[GitHub] spark issue #22801: [SPARK-25656][SQL][DOC][EXAMPLE] Add a doc and examples ...

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22801
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22801: [SPARK-25656][SQL][DOC][EXAMPLE] Add a doc and examples ...

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22801
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4390/
Test PASSed.


---


[GitHub] spark issue #22723: [SPARK-25729][CORE]It is better to replace `minPartition...

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22723
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97888/
Test FAILed.


---


[GitHub] spark issue #22723: [SPARK-25729][CORE]It is better to replace `minPartition...

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22723
  
Merged build finished. Test FAILed.


---


[GitHub] spark issue #22723: [SPARK-25729][CORE]It is better to replace `minPartition...

2018-10-22 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22723
  
**[Test build #97888 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97888/testReport)**
 for PR 22723 at commit 
[`b80bf66`](https://github.com/apache/spark/commit/b80bf66a8109faa7f58d45b92417a981666866a0).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark pull request #22512: [SPARK-25498][SQL] InterpretedMutableProjection s...

2018-10-22 Thread maropu

Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/22512#discussion_r227226902
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala ---
@@ -140,6 +141,14 @@ class SQLQueryTestSuite extends QueryTest with SharedSQLContext {
 val input = fileToString(new File(testCase.inputFile))
 
 val (comments, code) = input.split("\n").partition(_.startsWith("--"))
+
+// Runs all the tests on both codegen-only and interpreter modes. Since explain results differ
+// when `WHOLESTAGE_CODEGEN_ENABLED` disabled, we don't run these tests now.
+val codegenConfigSets = Array(("false", "NO_CODEGEN"), ("true", "CODEGEN_ONLY")).map {
+  case (wholeStageCodegenEnabled, codegenFactoryMode) =>
+Array( // SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> wholeStageCodegenEnabled,
--- End diff --

ok
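
The comment in the diff describes running each test case once per codegen configuration. A rough sketch of that idea in plain Scala follows. It assumes the key strings `spark.sql.codegen.wholeStage` and `spark.sql.codegen.factoryMode` for the two `SQLConf` entries, uses a hypothetical in-memory `conf` map and `withConf` helper instead of a real session, and, unlike the diff (which leaves the whole-stage key commented out), sets both keys.

```scala
import scala.collection.mutable

object CodegenConfigSketch {
  // Assumed key strings for SQLConf.WHOLESTAGE_CODEGEN_ENABLED and
  // SQLConf.CODEGEN_FACTORY_MODE; check them against your Spark version.
  val WholeStageKey = "spark.sql.codegen.wholeStage"
  val FactoryModeKey = "spark.sql.codegen.factoryMode"

  // One configuration set per mode: interpreter-only and codegen-only.
  val codegenConfigSets: Seq[Seq[(String, String)]] =
    Seq(("false", "NO_CODEGEN"), ("true", "CODEGEN_ONLY")).map {
      case (wholeStageCodegenEnabled, codegenFactoryMode) =>
        Seq(
          WholeStageKey -> wholeStageCodegenEnabled,
          FactoryModeKey -> codegenFactoryMode)
    }

  // Hypothetical stand-in for a session configuration.
  val conf: mutable.Map[String, String] = mutable.Map.empty

  // Apply the given settings, run the body, then restore the previous values.
  def withConf[T](pairs: Seq[(String, String)])(body: => T): T = {
    val previous = pairs.map { case (k, _) => k -> conf.get(k) }
    pairs.foreach { case (k, v) => conf(k) = v }
    try body
    finally previous.foreach {
      case (k, Some(old)) => conf(k) = old
      case (k, None) => conf.remove(k)
    }
  }

  // Run the same test body once under every configuration set.
  def runInAllModes[T](testBody: => T): Seq[T] =
    codegenConfigSets.map(configSet => withConf(configSet)(testBody))
}
```

Restoring the previous values in `finally` keeps one mode's settings from leaking into the next run, even when a test body throws.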


---


[GitHub] spark issue #22801: [SPARK-25656][DOC][EXAMPLE] Add a doc and examples about...

2018-10-22 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22801
  
**[Test build #97900 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97900/testReport)**
 for PR 22801 at commit 
[`1bd23a4`](https://github.com/apache/spark/commit/1bd23a41b3d6dbe4a2eff2565c79d0b2379f894b).


---


[GitHub] spark pull request #22801: [SPARK-25656][DOC][EXAMPLE] Add a doc and example...

2018-10-22 Thread dongjoon-hyun

Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/22801#discussion_r227226704
  
--- Diff: docs/sql-data-sources-load-save-functions.md ---
@@ -82,6 +82,50 @@ To load a CSV file you can use:
 
 
 
+The extra options are also used during write operation.
+For example, you can control bloom filters and dictionary encodings for ORC data sources.
+The following ORC example will create bloom filter and use dictionary encoding only for `favorite_color`.
+For Parquet, there exists `parquet.enable.dictionary`, too.
+To find more detailed information about the extra ORC/Parquet options,
+visit the official Apache ORC/Parquet websites.
+
+
+
+
+{% include_example manual_save_options_orc scala/org/apache/spark/examples/sql/SQLDataSourceExample.scala %}
+
+
+
+{% include_example manual_save_options_orc java/org/apache/spark/examples/sql/JavaSQLDataSourceExample.java %}
+
+
+
+{% include_example manual_save_options_orc python/sql/datasource.py %}
+
+
+
+{% include_example manual_save_options_orc r/RSparkSQLExample.R %}
+
+
+
+
+{% highlight sql %}
+CREATE TABLE users_with_options (
+  name STRING,
+  favorite_color STRING,
+  favorite_numbers array
+) USING ORC
+OPTIONS (
+  orc.bloom.filter.columns 'favorite_color',
+  orc.dictionary.key.threshold '1.0',
+  orc.column.encoding.direct 'name'
--- End diff --

Could you review this, @gatorsmile ? This is the example we discussed 
previously. 
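
As an illustration of the write-side options described in the doc change, here is a minimal sketch of what a Scala variant might look like. The three `orc.*` option names and values are taken from the SQL example in the diff; the sample rows, the object and method names, and the output path handling are assumptions made for this sketch, and it needs Spark (with ORC support) on the classpath.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

object OrcSaveOptionsSketch {
  // Writes a tiny, made-up users table as ORC, passing ORC-specific options
  // through DataFrameWriter.option so they reach the underlying data source.
  def writeUsersWithOrcOptions(spark: SparkSession, outputPath: String): Unit = {
    import spark.implicits._

    // Sample rows invented for this sketch.
    val usersDF = Seq(
      ("Alyssa", "blue", Seq(3, 9, 15, 20)),
      ("Ben", "red", Seq.empty[Int])
    ).toDF("name", "favorite_color", "favorite_numbers")

    usersDF.write
      .format("orc")
      .option("orc.bloom.filter.columns", "favorite_color")
      .option("orc.dictionary.key.threshold", "1.0")
      .option("orc.column.encoding.direct", "name")
      .mode(SaveMode.Overwrite)
      .save(outputPath)
  }
}
```

The options are plain string key/value pairs, so the same keys should carry over to the `OPTIONS (...)` clause of the SQL form shown in the diff.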


---


[GitHub] spark pull request #22801: [SPARK-25656][DOC][EXAMPLE] Add a doc and example...

2018-10-22 Thread dongjoon-hyun

GitHub user dongjoon-hyun opened a pull request:

https://github.com/apache/spark/pull/22801

[SPARK-25656][DOC][EXAMPLE] Add a doc and examples about extra data source 
options

## What changes were proposed in this pull request?

The current documentation does not explain how data source-specific options 
are passed to the underlying data source. Following [the review 
comment](https://github.com/apache/spark/pull/22622#discussion_r222911529), 
this PR adds more detailed information and examples.

## How was this patch tested?

Manual.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dongjoon-hyun/spark SPARK-25656

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22801.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #22801


commit 1bd23a41b3d6dbe4a2eff2565c79d0b2379f894b
Author: Dongjoon Hyun 
Date:   2018-10-23T04:57:15Z

[SPARK-25656][DOC][EXAMPLE] Add a doc and examples about extra data source 
save options




---


[GitHub] spark issue #22512: [SPARK-25498][SQL] InterpretedMutableProjection should h...

2018-10-22 Thread maropu

Github user maropu commented on the issue:

https://github.com/apache/spark/pull/22512
  
ok, I'll add tests.


---


[GitHub] spark pull request #22512: [SPARK-25498][SQL] InterpretedMutableProjection s...

2018-10-22 Thread maropu

Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/22512#discussion_r227224209
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/InterpretedMutableProjection.scala ---
@@ -49,10 +51,54 @@ class InterpretedMutableProjection(expressions: Seq[Expression]) extends Mutable
   def currentValue: InternalRow = mutableRow
 
   override def target(row: InternalRow): MutableProjection = {
+// If `mutableRow` is `UnsafeRow`, `MutableProjection` accepts fixed-length types only
+assert(!row.isInstanceOf[UnsafeRow] ||
+  validExprs.forall { case (e, _) => UnsafeRow.isFixedLength(e.dataType) })
 mutableRow = row
 this
   }
 
+  private[this] val fieldWriters = validExprs.map { case (e, i) =>
+val writer = generateRowWriter(i, e.dataType)
+if (!e.nullable) {
+  (v: Any) => writer(v)
+} else {
+  (v: Any) => {
+if (v == null) {
+  mutableRow.setNullAt(i)
+} else {
+  writer(v)
+}
+  }
+}
+  }
+
+  private def generateRowWriter(ordinal: Int, dt: DataType): Any => Unit = dt match {
--- End diff --

oh, yes! yea, I will.


---


[GitHub] spark pull request #22512: [SPARK-25498][SQL] InterpretedMutableProjection s...

2018-10-22 Thread maropu

Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/22512#discussion_r227224030
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/InterpretedMutableProjection.scala ---
@@ -49,10 +51,54 @@ class InterpretedMutableProjection(expressions: Seq[Expression]) extends Mutable
   def currentValue: InternalRow = mutableRow
 
   override def target(row: InternalRow): MutableProjection = {
+// If `mutableRow` is `UnsafeRow`, `MutableProjection` accepts fixed-length types only
+assert(!row.isInstanceOf[UnsafeRow] ||
+  validExprs.forall { case (e, _) => UnsafeRow.isFixedLength(e.dataType) })
 mutableRow = row
 this
   }
 
+  private[this] val fieldWriters = validExprs.map { case (e, i) =>
+val writer = generateRowWriter(i, e.dataType)
+if (!e.nullable) {
+  (v: Any) => writer(v)
+} else {
+  (v: Any) => {
+if (v == null) {
+  mutableRow.setNullAt(i)
+} else {
+  writer(v)
+}
+  }
+}
+  }
+
+  private def generateRowWriter(ordinal: Int, dt: DataType): Any => Unit = dt match {
+case BooleanType =>
+  v => mutableRow.setBoolean(ordinal, v.asInstanceOf[Boolean])
+case ByteType =>
+  v => mutableRow.setByte(ordinal, v.asInstanceOf[Byte])
+case ShortType =>
+  v => mutableRow.setShort(ordinal, v.asInstanceOf[Short])
+case IntegerType | DateType =>
+  v => mutableRow.setInt(ordinal, v.asInstanceOf[Int])
+case LongType | TimestampType =>
+  v => mutableRow.setLong(ordinal, v.asInstanceOf[Long])
+case FloatType =>
+  v => mutableRow.setFloat(ordinal, v.asInstanceOf[Float])
+case DoubleType =>
+  v => mutableRow.setDouble(ordinal, v.asInstanceOf[Double])
+case DecimalType.Fixed(precision, _) =>
+  v => mutableRow.setDecimal(ordinal, v.asInstanceOf[Decimal], precision)
+case CalendarIntervalType | BinaryType | _: ArrayType | StringType | _: StructType |
+ _: MapType | _: UserDefinedType[_] =>
+  v => mutableRow.update(ordinal, v)
+case NullType =>
+  v => {}
--- End diff --

Do we need to handle the `e.nullable && e.dataType == NullType` case here?
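
To make the question concrete, here is a self-contained sketch of the writer-generation pattern from the diff. `ToyRow`, `Field`, and the three toy data types are hypothetical simplifications, not Spark's `InternalRow` or `DataType`. It suggests that, for a nullable column, the null check in the wrapper runs before the type-specific writer, so a nullable `NullType` column would end up marked null even though its writer does nothing; whether the real code needs extra handling is for the reviewers to decide.

```scala
object RowWriterSketch {
  // Toy type system standing in for Spark's DataType hierarchy.
  sealed trait DataType
  case object IntType extends DataType
  case object StringType extends DataType
  case object NullType extends DataType

  final case class Field(dataType: DataType, nullable: Boolean)

  // Toy mutable row: boxed values plus a per-slot null flag.
  final class ToyRow(numFields: Int) {
    private val values = new Array[Any](numFields)
    private val nulls = Array.fill(numFields)(false)
    def setInt(i: Int, v: Int): Unit = { values(i) = v; nulls(i) = false }
    def update(i: Int, v: Any): Unit = { values(i) = v; nulls(i) = false }
    def setNullAt(i: Int): Unit = { values(i) = null; nulls(i) = true }
    def isNullAt(i: Int): Boolean = nulls(i)
    def get(i: Int): Any = values(i)
  }

  // Pick the type-specific writer once, so no type dispatch happens per value.
  def generateRowWriter(row: ToyRow, ordinal: Int, dt: DataType): Any => Unit = dt match {
    case IntType => v => row.setInt(ordinal, v.asInstanceOf[Int])
    case StringType => v => row.update(ordinal, v)
    case NullType => _ => ()
  }

  // Wrap nullable fields with a null check that runs before the writer.
  def fieldWriters(row: ToyRow, fields: Seq[Field]): Seq[Any => Unit] =
    fields.zipWithIndex.map { case (f, i) =>
      val writer = generateRowWriter(row, i, f.dataType)
      if (!f.nullable) {
        (v: Any) => writer(v)
      } else {
        (v: Any) => {
          if (v == null) row.setNullAt(i) else writer(v)
        }
      }
    }
}
```

In this toy version only a non-nullable `NullType` field would leave its slot untouched, because that is the one case where the no-op writer is called directly.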


---


[GitHub] spark issue #22754: [SPARK-25776][CORE][MINOR]The disk write buffer size mus...

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22754
  
Merged build finished. Test FAILed.


---


[GitHub] spark issue #22754: [SPARK-25776][CORE][MINOR]The disk write buffer size mus...

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22754
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97887/
Test FAILed.


---


[GitHub] spark issue #22754: [SPARK-25776][CORE][MINOR]The disk write buffer size mus...

2018-10-22 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22754
  
**[Test build #97887 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97887/testReport)**
 for PR 22754 at commit 
[`6f8404b`](https://github.com/apache/spark/commit/6f8404b474539a989e08459949f54395bcd7ed10).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #22612: [SPARK-24958] Add executors' process tree total memory i...

2018-10-22 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22612
  
**[Test build #97899 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97899/testReport)**
 for PR 22612 at commit 
[`7f7ed2b`](https://github.com/apache/spark/commit/7f7ed2bdf5740bd2c4ae8cf2090ba7f016ffb023).


---


[GitHub] spark pull request #22612: [SPARK-24958] Add executors' process tree total m...

2018-10-22 Thread rezasafi

Github user rezasafi commented on a diff in the pull request:

https://github.com/apache/spark/pull/22612#discussion_r22741
  
--- Diff: core/src/main/scala/org/apache/spark/metrics/ExecutorMetricType.scala ---
@@ -95,10 +135,29 @@ private[spark] object ExecutorMetricType {
 OnHeapUnifiedMemory,
 OffHeapUnifiedMemory,
 DirectPoolMemory,
-MappedPoolMemory
+MappedPoolMemory,
+ProcessTreeMetrics
+  )
+ // List of defined metrics
+  val definedMetrics = IndexedSeq(
+"JVMHeapMemory",
+"JVMOffHeapMemory",
+"OnHeapExecutionMemory",
+"OffHeapExecutionMemory",
+"OnHeapStorageMemory",
+"OffHeapStorageMemory",
+"OnHeapUnifiedMemory",
+"OffHeapUnifiedMemory",
+"DirectPoolMemory",
+"MappedPoolMemory",
+"ProcessTreeJVMVMemory",
+"ProcessTreeJVMRSSMemory",
+"ProcessTreePythonVMemory",
+"ProcessTreePythonRSSMemory",
+"ProcessTreeOtherVMemory",
+"ProcessTreeOtherRSSMemory"
--- End diff --

I changed this along the lines you suggested: it no longer keeps separate 
names, and it uses arrays instead of maps.
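
One possible reading of this change, sketched in plain Scala: each metric type declares the names it reports, the flat list of defined metrics is derived from those declarations rather than maintained separately, and values live in an array indexed by position. All names below other than the metric name strings from the diff are hypothetical, and the actual patch may differ.

```scala
object MetricNamesSketch {
  // Each metric type owns the names of the values it reports.
  sealed trait MetricType { def names: Seq[String] }

  case object JVMHeapMemory extends MetricType {
    val names: Seq[String] = Seq("JVMHeapMemory")
  }

  case object ProcessTreeMetrics extends MetricType {
    val names: Seq[String] = Seq("ProcessTreeJVMVMemory", "ProcessTreeJVMRSSMemory")
  }

  val metricGetters: IndexedSeq[MetricType] = IndexedSeq(JVMHeapMemory, ProcessTreeMetrics)

  // Derived from the declarations above, so there is no second hand-written list.
  val definedMetrics: IndexedSeq[String] = metricGetters.flatMap(_.names)

  // Values are kept in a flat array; a metric's slot is its index in definedMetrics.
  def newValues(): Array[Long] = new Array[Long](definedMetrics.length)

  def setMetric(values: Array[Long], name: String, value: Long): Unit = {
    val offset = definedMetrics.indexOf(name)
    require(offset >= 0, s"unknown metric: $name")
    values(offset) = value
  }
}
```

Deriving `definedMetrics` with `flatMap` means that adding a name to a metric type extends the array layout automatically.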


---


[GitHub] spark issue #22612: [SPARK-24958] Add executors' process tree total memory i...

2018-10-22 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22612
  
**[Test build #97898 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97898/testReport)**
 for PR 22612 at commit 
[`a3f2c9b`](https://github.com/apache/spark/commit/a3f2c9bbf8897f2ebb68b8e4607beff5f0cc29fe).


---


[GitHub] spark issue #22204: [SPARK-25196][SQL] Extends Analyze commands for cached t...

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22204
  
Merged build finished. Test FAILed.


---


[GitHub] spark issue #22204: [SPARK-25196][SQL] Extends Analyze commands for cached t...

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22204
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97894/
Test FAILed.


---


[GitHub] spark issue #22204: [SPARK-25196][SQL] Extends Analyze commands for cached t...

2018-10-22 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22204
  
**[Test build #97894 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97894/testReport)**
 for PR 22204 at commit 
[`d5bff88`](https://github.com/apache/spark/commit/d5bff888a17439a62f8b4f0762a2488cdd57e817).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark pull request #22800: [SPARK-24499][SQL][DOC][follow-up] Fix spelling i...

2018-10-22 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22800


---


[GitHub] spark issue #22800: [SPARK-24499][SQL][DOC][follow-up] Fix spelling in doc

2018-10-22 Thread cloud-fan

Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/22800
  
thanks, merging to master/2.4!


---


[GitHub] spark issue #22787: [SPARK-25040][SQL] Empty string for non string types sho...

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22787
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97884/
Test PASSed.


---


[GitHub] spark issue #22787: [SPARK-25040][SQL] Empty string for non string types sho...

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22787
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22787: [SPARK-25040][SQL] Empty string for non string types sho...

2018-10-22 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22787
  
**[Test build #97884 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97884/testReport)**
 for PR 22787 at commit 
[`c04ea64`](https://github.com/apache/spark/commit/c04ea649a9087f9cb5ffbb7361fbb711d74d9266).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #22749: [SPARK-25746][SQL] Refactoring ExpressionEncoder to get ...

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22749
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97885/
Test PASSed.


---


[GitHub] spark issue #22749: [SPARK-25746][SQL] Refactoring ExpressionEncoder to get ...

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22749
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22749: [SPARK-25746][SQL] Refactoring ExpressionEncoder to get ...

2018-10-22 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22749
  
**[Test build #97885 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97885/testReport)**
 for PR 22749 at commit 
[`400f878`](https://github.com/apache/spark/commit/400f87817183640006140e2db1839f8d78a13856).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #22797: [SPARK-19851][SQL] Add support for EVERY and ANY (SOME) ...

2018-10-22 Thread dilipbiswal

Github user dilipbiswal commented on the issue:

https://github.com/apache/spark/pull/22797
  
cc @cloud-fan @gatorsmile 


---


[GitHub] spark issue #22755: [SPARK-25755][SQL][Test] Supplementation of non-CodeGen ...

2018-10-22 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22755
  
**[Test build #97897 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97897/testReport)**
 for PR 22755 at commit 
[`0d75328`](https://github.com/apache/spark/commit/0d753280828cba5e7658edafdf66b1ebbde527b0).


---


[GitHub] spark pull request #22755: [SPARK-25755][SQL][Test] Supplementation of non-C...

2018-10-22 Thread maropu

Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/22755#discussion_r227215773
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/joins/ExistenceJoinSuite.scala
 ---
@@ -122,19 +122,22 @@ class ExistenceJoinSuite extends SparkPlanTest with SharedSQLContext {
 
 test(s"$testName using BroadcastHashJoin") {
   extractJoinParts().foreach { case (_, leftKeys, rightKeys, boundCondition, _, _) =>
-withSQLConf(SQLConf.SHUFFLE_PARTITIONS.key -> "1") {
-  checkAnswer2(leftRows, rightRows, (left: SparkPlan, right: SparkPlan) =>
-EnsureRequirements(left.sqlContext.sessionState.conf).apply(
-  BroadcastHashJoinExec(
-leftKeys, rightKeys, joinType, BuildRight, boundCondition, left, right)),
-expectedAnswer,
-sortAnswers = true)
-  checkAnswer2(leftRows, rightRows, (left: SparkPlan, right: SparkPlan) =>
-EnsureRequirements(left.sqlContext.sessionState.conf).apply(
-  createLeftSemiPlusJoin(BroadcastHashJoinExec(
-leftKeys, rightKeys, leftSemiPlus, BuildRight, boundCondition, left, right))),
-expectedAnswer,
-sortAnswers = true)
+Seq("false", "true").foreach { v =>
--- End diff --

nit: v -> codegenEnabled
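
As a rough illustration of the suggested rename, the loop might read as below. This is only a sketch: `withConf` is a simplified, hypothetical stand-in for Spark's `withSQLConf` test helper, backed by a plain mutable map so the block runs on its own, and the key `spark.sql.codegen.wholeStage` is assumed to be the setting the test toggles (the diff excerpt above does not show it).

```scala
import scala.collection.mutable

object CodegenToggleSketch {
  // Toy configuration store standing in for the session's SQL configuration.
  val conf: mutable.Map[String, String] = mutable.Map.empty

  // Hypothetical helper: set the given pairs, run the body, then restore the old values.
  def withConf[T](pairs: (String, String)*)(body: => T): T = {
    val previous = pairs.map { case (k, _) => k -> conf.get(k) }
    pairs.foreach { case (k, v) => conf(k) = v }
    try body
    finally previous.foreach {
      case (k, Some(old)) => conf(k) = old
      case (k, None)      => conf.remove(k)
    }
  }

  // Runs the body once per codegen setting and returns the value seen inside each run.
  def observedSettings(): Seq[String] =
    Seq("false", "true").map { codegenEnabled =>
      withConf("spark.sql.codegen.wholeStage" -> codegenEnabled) {
        conf("spark.sql.codegen.wholeStage")
      }
    }
}
```

With the descriptive name, a reader can tell what the string controls without looking up where `v` is used.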


---


[GitHub] spark issue #22755: [SPARK-25755][SQL][Test] Supplementation of non-CodeGen ...

2018-10-22 Thread maropu

Github user maropu commented on the issue:

https://github.com/apache/spark/pull/22755
  
ok to test


---


[GitHub] spark issue #22784: [SPARK-25790][MLLIB] PCA: Support more than 65535 column...

2018-10-22 Thread kiszk

Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/22784
  
Sorry for my mistake. The '4' key on my keyboard is sometimes unreliable.
> I think, INT_MAX is 2147483647, so n ~= sqrt(2*2147483647) = 65536.

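The arithmetic in the quoted line can be checked directly. The sketch below assumes the limit comes from storing the packed upper triangle of an n x n symmetric matrix, n(n+1)/2 entries, in a single array indexed by a 32-bit `Int`; that assumption and the names `GramLimit`, `packedSize` and `maxColumns` are illustrative and not taken from the PR.

```scala
object GramLimit {
  // Number of entries in the packed upper triangle of an n x n symmetric matrix.
  // Computed in Long so the product cannot overflow a 32-bit Int.
  def packedSize(n: Long): Long = n * (n + 1) / 2

  // Largest n whose packed upper triangle still fits in an Int-indexed array.
  val maxColumns: Long = {
    // Start from the estimate sqrt(2 * INT_MAX) and step down until it fits.
    var n = math.ceil(math.sqrt(2.0 * Int.MaxValue)).toLong
    while (packedSize(n) > Int.MaxValue) n -= 1
    n
  }

  def main(args: Array[String]): Unit = {
    println(s"sqrt(2 * INT_MAX) ~= ${math.sqrt(2.0 * Int.MaxValue)}")
    println(s"largest n with n(n+1)/2 <= INT_MAX: $maxColumns")
  }
}
```

Under that assumption the estimate rounds up to 65536 while the exact bound is 65535, which agrees with the column count named in the PR title.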

---


[GitHub] spark pull request #22778: [SPARK-25784][SQL] Infer filters from constraints...

2018-10-22 Thread maropu

Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/22778#discussion_r227215135
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/RewriteSubquerySuite.scala
 ---
@@ -33,23 +34,44 @@ class RewriteSubquerySuite extends PlanTest {
   Batch("Rewrite Subquery", FixedPoint(1),
 RewritePredicateSubquery,
 ColumnPruning,
+InferFiltersFromConstraints,
+PushDownPredicate,
 CollapseProject,
 RemoveRedundantProject) :: Nil
   }
 
   test("Column pruning after rewriting predicate subquery") {
-val relation = LocalRelation('a.int, 'b.int)
-val relInSubquery = LocalRelation('x.int, 'y.int, 'z.int)
+withSQLConf(SQLConf.CONSTRAINT_PROPAGATION_ENABLED.key -> "false") {
+  val relation = LocalRelation('a.int, 'b.int)
+  val relInSubquery = LocalRelation('x.int, 'y.int, 'z.int)
 
-val query = relation.where('a.in(ListQuery(relInSubquery.select('x)))).select('a)
+  val query = relation.where('a.in(ListQuery(relInSubquery.select('x)))).select('a)
 
-val optimized = Optimize.execute(query.analyze)
-val correctAnswer = relation
-  .select('a)
-  .join(relInSubquery.select('x), LeftSemi, Some('a === 'x))
-  .analyze
+  val optimized = Optimize.execute(query.analyze)
+  val correctAnswer = relation
+.select('a)
+.join(relInSubquery.select('x), LeftSemi, Some('a === 'x))
+.analyze
 
-comparePlans(optimized, correctAnswer)
+  comparePlans(optimized, correctAnswer)
+}
+  }
+
+  test("Infer filters and push down predicate after rewriting predicate subquery") {
--- End diff --

How about simplifying the test title and adding comments here that clearly 
explain what is tested?


---


[GitHub] spark pull request #22778: [SPARK-25784][SQL] Infer filters from constraints...

2018-10-22 Thread maropu

Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/22778#discussion_r227214593
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/RewriteSubquerySuite.scala
 ---
@@ -33,23 +34,44 @@ class RewriteSubquerySuite extends PlanTest {
   Batch("Rewrite Subquery", FixedPoint(1),
 RewritePredicateSubquery,
 ColumnPruning,
+InferFiltersFromConstraints,
+PushDownPredicate,
 CollapseProject,
 RemoveRedundantProject) :: Nil
   }
 
   test("Column pruning after rewriting predicate subquery") {
-val relation = LocalRelation('a.int, 'b.int)
-val relInSubquery = LocalRelation('x.int, 'y.int, 'z.int)
+withSQLConf(SQLConf.CONSTRAINT_PROPAGATION_ENABLED.key -> "false") {
+  val relation = LocalRelation('a.int, 'b.int)
+  val relInSubquery = LocalRelation('x.int, 'y.int, 'z.int)
 
-val query = relation.where('a.in(ListQuery(relInSubquery.select('x)))).select('a)
+  val query = relation.where('a.in(ListQuery(relInSubquery.select('x)))).select('a)
 
-val optimized = Optimize.execute(query.analyze)
-val correctAnswer = relation
-  .select('a)
-  .join(relInSubquery.select('x), LeftSemi, Some('a === 'x))
-  .analyze
+  val optimized = Optimize.execute(query.analyze)
+  val correctAnswer = relation
+.select('a)
+.join(relInSubquery.select('x), LeftSemi, Some('a === 'x))
+.analyze
 
-comparePlans(optimized, correctAnswer)
+  comparePlans(optimized, correctAnswer)
+}
+  }
+
+  test("Infer filters and push down predicate after rewriting predicate subquery") {
--- End diff --

Do we need to mention column pruning in the test title?


---


[GitHub] spark pull request #22778: [SPARK-25784][SQL] Infer filters from constraints...

2018-10-22 Thread maropu

Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/22778#discussion_r227214404
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/RewriteSubquerySuite.scala
 ---
@@ -33,23 +34,44 @@ class RewriteSubquerySuite extends PlanTest {
   Batch("Rewrite Subquery", FixedPoint(1),
 RewritePredicateSubquery,
 ColumnPruning,
+InferFiltersFromConstraints,
+PushDownPredicate,
 CollapseProject,
 RemoveRedundantProject) :: Nil
   }
 
   test("Column pruning after rewriting predicate subquery") {
-val relation = LocalRelation('a.int, 'b.int)
-val relInSubquery = LocalRelation('x.int, 'y.int, 'z.int)
+withSQLConf(SQLConf.CONSTRAINT_PROPAGATION_ENABLED.key -> "false") {
--- End diff --

Ah, I see. Thanks.


---


[GitHub] spark issue #22675: [SPARK-25347][ML][DOC] Spark datasource for image/libsvm...

2018-10-22 Thread imatiach-msft

Github user imatiach-msft commented on the issue:

https://github.com/apache/spark/pull/22675
  
LGTM, this is great to have!


---


[GitHub] spark pull request #22675: [SPARK-25347][ML][DOC] Spark datasource for image...

2018-10-22 Thread imatiach-msft

Github user imatiach-msft commented on a diff in the pull request:

https://github.com/apache/spark/pull/22675#discussion_r227214199
  
--- Diff: docs/ml-datasource.md ---
@@ -0,0 +1,113 @@
+---
+layout: global
+title: Data sources
+displayTitle: Data sources
+---
+
+In this section, we introduce how to use data source in ML to load data.
+Beside some general data sources such as Parquet, CSV, JSON and JDBC, we also provide some specific data source for ML.
+
+**Table of Contents**
+
+* This will become a table of contents (this text will be scraped).
--- End diff --

ah, ok, great


---


[GitHub] spark issue #21632: [SPARK-19591][ML][MLlib] Add sample weights to decision ...

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21632
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4389/
Test PASSed.


---


[GitHub] spark issue #21632: [SPARK-19591][ML][MLlib] Add sample weights to decision ...

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21632
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #21632: [SPARK-19591][ML][MLlib] Add sample weights to decision ...

2018-10-22 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21632
  
**[Test build #97896 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97896/testReport)**
 for PR 21632 at commit 
[`bca2eaf`](https://github.com/apache/spark/commit/bca2eaf0de57180268494e21d93ddf3e66213321).


---


[GitHub] spark issue #21632: [SPARK-19591][ML][MLlib] Add sample weights to decision ...

2018-10-22 Thread imatiach-msft

Github user imatiach-msft commented on the issue:

https://github.com/apache/spark/pull/21632
  
jenkins retest this please


---


[GitHub] spark issue #22482: WIP - [SPARK-10816][SS] Support session window natively

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22482
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97883/
Test PASSed.


---


[GitHub] spark issue #22482: WIP - [SPARK-10816][SS] Support session window natively

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22482
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22482: WIP - [SPARK-10816][SS] Support session window natively

2018-10-22 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22482
  
**[Test build #97883 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97883/testReport)**
 for PR 22482 at commit 
[`1f6e496`](https://github.com/apache/spark/commit/1f6e496c9b3474d37e735fcede4ac3587136de35).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #22800: [SPARK-24499][SQL][DOC][follow-up] Fix spelling in doc

2018-10-22 Thread cloud-fan

Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/22800
  
LGTM


---


[GitHub] spark issue #22788: [SPARK-25769][SQL]escape nested columns by backtick each...

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22788
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22788: [SPARK-25769][SQL]escape nested columns by backtick each...

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22788
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97881/
Test PASSed.


---


[GitHub] spark issue #22675: [SPARK-25347][ML][DOC] Spark datasource for image/libsvm...

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22675
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97889/
Test PASSed.


---


[GitHub] spark issue #22675: [SPARK-25347][ML][DOC] Spark datasource for image/libsvm...

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22675
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22788: [SPARK-25769][SQL]escape nested columns by backtick each...

2018-10-22 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22788
  
**[Test build #97881 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97881/testReport)**
 for PR 22788 at commit 
[`99bfd00`](https://github.com/apache/spark/commit/99bfd0099378200e606293368c2914d5628adc44).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #22675: [SPARK-25347][ML][DOC] Spark datasource for image/libsvm...

2018-10-22 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22675
  
**[Test build #97889 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97889/testReport)**
 for PR 22675 at commit 
[`8231cb2`](https://github.com/apache/spark/commit/8231cb25c30fe17bd076145b3ebed020265ac173).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #22800: [SPARK-24499][SQL][DOC][follow-up] Fix spelling in doc

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22800
  
Merged build finished. Test PASSed.


---


[GitHub] spark pull request #22790: [SPARK-25793][ML]call SaveLoadV2_0.load for class...

2018-10-22 Thread viirya

Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/22790#discussion_r227210362
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/mllib/clustering/BisectingKMeansModel.scala
 ---
@@ -126,7 +126,7 @@ object BisectingKMeansModel extends Loader[BisectingKMeansModel] {
 val model = SaveLoadV1_0.load(sc, path)
 model
   case (SaveLoadV2_0.thisClassName, SaveLoadV2_0.thisFormatVersion) =>
-val model = SaveLoadV1_0.load(sc, path)
+val model = SaveLoadV2_0.load(sc, path)
--- End diff --

Do we ever use `SaveLoadV2_0` to save a model at the moment? It looks like 
`BisectingKMeansModel.save` simply calls `SaveLoadV1_0.save` to save models.
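
To make the question concrete, here is a minimal, self-contained sketch of the dispatch pattern shown in the diff. The loader objects, the class-name string `example.Model`, the version strings and the `Model` case class are stand-ins invented for illustration; they are not the real `BisectingKMeansModel` code.

```scala
object LoaderDispatchSketch {
  // Stand-in for a loaded model; records which loader produced it.
  final case class Model(loadedBy: String)

  // Hypothetical loaders mirroring the SaveLoadV1_0 / SaveLoadV2_0 naming in the diff.
  object SaveLoadV1_0 {
    val thisClassName = "example.Model" // assumed value, for illustration only
    val thisFormatVersion = "1.0"
    def load(path: String): Model = Model("V1_0")
  }

  object SaveLoadV2_0 {
    val thisClassName = "example.Model" // assumed value, for illustration only
    val thisFormatVersion = "2.0"
    def load(path: String): Model = Model("V2_0")
  }

  // Dispatch on the (className, version) pair read from the saved metadata.
  def load(className: String, version: String, path: String): Model =
    (className, version) match {
      case (SaveLoadV1_0.thisClassName, SaveLoadV1_0.thisFormatVersion) =>
        SaveLoadV1_0.load(path)
      case (SaveLoadV2_0.thisClassName, SaveLoadV2_0.thisFormatVersion) =>
        // Each branch calls the loader that matches its own pattern.
        SaveLoadV2_0.load(path)
      case _ =>
        throw new IllegalArgumentException(s"Unsupported format: ($className, $version)")
    }
}
```

In a layout like this, if nothing ever writes version `2.0` metadata, the second branch is never reached in practice, so the severity of the bug depends on the answer to the question above.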


---


[GitHub] spark issue #22800: [SPARK-24499][SQL][DOC][follow-up] Fix spelling in doc

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22800
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97895/
Test PASSed.


---


[GitHub] spark issue #22800: [SPARK-24499][SQL][DOC][follow-up] Fix spelling in doc

2018-10-22 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22800
  
**[Test build #97895 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97895/testReport)**
 for PR 22800 at commit 
[`a13493b`](https://github.com/apache/spark/commit/a13493b2e3ede38129de6d32ea19d6886cc13b80).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark pull request #22778: [SPARK-25784][SQL] Infer filters from constraints...

2018-10-22 Thread wangyum

Github user wangyum commented on a diff in the pull request:

https://github.com/apache/spark/pull/22778#discussion_r227208719
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/RewriteSubquerySuite.scala
 ---
@@ -33,23 +34,44 @@ class RewriteSubquerySuite extends PlanTest {
   Batch("Rewrite Subquery", FixedPoint(1),
 RewritePredicateSubquery,
 ColumnPruning,
+InferFiltersFromConstraints,
+PushDownPredicate,
 CollapseProject,
 RemoveRedundantProject) :: Nil
   }
 
   test("Column pruning after rewriting predicate subquery") {
-val relation = LocalRelation('a.int, 'b.int)
-val relInSubquery = LocalRelation('x.int, 'y.int, 'z.int)
+withSQLConf(SQLConf.CONSTRAINT_PROPAGATION_ENABLED.key -> "false") {
--- End diff --

Yes. `spark.sql.constraintPropagation.enabled=false` is set to test 
`ColumnPruning` on its own, and 
`spark.sql.constraintPropagation.enabled=true` is set to test `ColumnPruning`, 
`InferFiltersFromConstraints` and `PushDownPredicate` together.


---


[GitHub] spark issue #22800: [SPARK-24499][SQL][DOC][follow-up] Fix spelling in doc

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22800
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4388/
Test PASSed.


---


[GitHub] spark issue #22800: [SPARK-24499][SQL][DOC][follow-up] Fix spelling in doc

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22800
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22800: [SPARK-24499][SQL][DOC][follow-up] Fix spelling in doc

2018-10-22 Thread kiszk

Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/22800
  
cc @cloud-fan @gatorsmile @HyukjinKwon @xuanyuanking


---


[GitHub] spark issue #22800: [SPARK-24499][SQL][DOC][follow-up] Fix spelling in doc

2018-10-22 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22800
  
**[Test build #97895 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97895/testReport)**
 for PR 22800 at commit 
[`a13493b`](https://github.com/apache/spark/commit/a13493b2e3ede38129de6d32ea19d6886cc13b80).


---


[GitHub] spark pull request #22800: [SPARK-24499][SQL][DOC][follow-up] Fix spelling i...

2018-10-22 Thread kiszk

GitHub user kiszk opened a pull request:

https://github.com/apache/spark/pull/22800

[SPARK-24499][SQL][DOC][follow-up] Fix spelling in doc

## What changes were proposed in this pull request?

This PR replaces `turing` with `tuning` in files and a file name. 
Currently, in the left side menu, `Turing` is shown.

![image](https://user-images.githubusercontent.com/1315079/47332714-20a96180-d6bb-11e8-9a5a-0a8dad292626.png)

## How was this patch tested?

`grep -rin turing docs` && `find docs -name "*turing*"`


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kiszk/spark SPARK-24499-follow

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22800.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #22800


commit a13493b2e3ede38129de6d32ea19d6886cc13b80
Author: Kazuaki Ishizaki 
Date:   2018-10-23T02:56:18Z

turing -> tuning




---


[GitHub] spark issue #22790: [SPARK-25793][ML]call SaveLoadV2_0.load for classNameV2_...

2018-10-22 Thread cloud-fan

Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/22790
  
cc @mengxr @WeichenXu123: how serious is this? Should we treat it as a blocker?


---


[GitHub] spark pull request #22778: [SPARK-25784][SQL] Infer filters from constraints...

2018-10-22 Thread maropu

Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/22778#discussion_r227205054
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/RewriteSubquerySuite.scala
 ---
@@ -33,23 +34,44 @@ class RewriteSubquerySuite extends PlanTest {
   Batch("Rewrite Subquery", FixedPoint(1),
 RewritePredicateSubquery,
 ColumnPruning,
+InferFiltersFromConstraints,
+PushDownPredicate,
 CollapseProject,
 RemoveRedundantProject) :: Nil
   }
 
   test("Column pruning after rewriting predicate subquery") {
-val relation = LocalRelation('a.int, 'b.int)
-val relInSubquery = LocalRelation('x.int, 'y.int, 'z.int)
+withSQLConf(SQLConf.CONSTRAINT_PROPAGATION_ENABLED.key -> "false") {
--- End diff --

Do we need to modify this existing test?


---


[GitHub] spark issue #22729: [SPARK-25737][CORE] Remove JavaSparkContextVarargsWorkar...

2018-10-22 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22729
  
**[Test build #4385 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4385/testReport)**
 for PR 22729 at commit 
[`0860d27`](https://github.com/apache/spark/commit/0860d27a205d3dd3d94e6bbe2c9db49b7e432ef4).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #22799: [SPARK-25805][SQL][TEST] Fix test for SPARK-25159

2018-10-22 Thread cloud-fan

Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/22799
  
LGTM


---


[GitHub] spark issue #22778: [SPARK-25784][SQL] Infer filters from constraints after ...

2018-10-22 Thread maropu

Github user maropu commented on the issue:

https://github.com/apache/spark/pull/22778
  
Also, to make sure there is no performance regression in the optimizer, can 
you check the optimizer statistics for TPCDS by running `TPCDSQuerySuite`, too?


---


[GitHub] spark issue #22204: [SPARK-25196][SQL] Extends Analyze commands for cached t...

2018-10-22 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22204
  
**[Test build #97894 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97894/testReport)**
 for PR 22204 at commit 
[`d5bff88`](https://github.com/apache/spark/commit/d5bff888a17439a62f8b4f0762a2488cdd57e817).


---


[GitHub] spark issue #22204: [SPARK-25196][SQL] Extends Analyze commands for cached t...

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22204
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4387/
Test PASSed.


---


[GitHub] spark issue #22204: [SPARK-25196][SQL] Extends Analyze commands for cached t...

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22204
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22730: [SPARK-16775][CORE] Remove deprecated accumulator v1 API...

2018-10-22 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22730
  
**[Test build #4384 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4384/testReport)**
 for PR 22730 at commit 
[`41f02f4`](https://github.com/apache/spark/commit/41f02f461d0f632606adb68a36d03a7ed9f044c4).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #22778: [SPARK-25784][SQL] Infer filters from constraints after ...

2018-10-22 Thread maropu

Github user maropu commented on the issue:

https://github.com/apache/spark/pull/22778
  
Can you put a concrete example of the missing case you described in the 
PR description?


---


[GitHub] spark issue #22797: [SPARK-19851][SQL] Add support for EVERY and ANY (SOME) ...

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22797
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97876/
Test PASSed.


---


[GitHub] spark issue #22797: [SPARK-19851][SQL] Add support for EVERY and ANY (SOME) ...

2018-10-22 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22797
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22797: [SPARK-19851][SQL] Add support for EVERY and ANY (SOME) ...

2018-10-22 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22797
  
**[Test build #97876 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97876/testReport)**
 for PR 22797 at commit 
[`905652a`](https://github.com/apache/spark/commit/905652a55018433c4e9a7ec1a849c39cf04d8920).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark pull request #22788: [SPARK-25769][SQL]escape nested columns by backti...

2018-10-22 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22788#discussion_r227203199
  
--- Diff: 
sql/core/src/test/resources/sql-tests/results/columnresolution-negative.sql.out 
---
@@ -81,7 +81,7 @@ SELECT t1.i1 FROM t1, mydb1.t1
 struct<>
 -- !query 9 output
 org.apache.spark.sql.AnalysisException
-Reference 't1.i1' is ambiguous, could be: mydb1.t1.i1, mydb1.t1.i1.; line 1 pos 7
+Reference '`t1`.`i1`' is ambiguous, could be: mydb1.t1.i1, mydb1.t1.i1.; line 1 pos 7
--- End diff --

These examples only make sense when we have the outer backticks. e.g. 
`'t1.i1'` is good.


---

