[GitHub] spark issue #22790: [SPARK-25793][ML]call SaveLoadV2_0.load for classNameV2_...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22790 **[Test build #97904 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97904/testReport)** for PR 22790 at commit [`77902bc`](https://github.com/apache/spark/commit/77902bc3a08d1397c1b69b68ad7aecaf1038defb). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22754: [SPARK-25776][CORE][MINOR]The disk write buffer size mus...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22754 **[Test build #97903 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97903/testReport)** for PR 22754 at commit [`6f8404b`](https://github.com/apache/spark/commit/6f8404b474539a989e08459949f54395bcd7ed10).
[GitHub] spark issue #22754: [SPARK-25776][CORE][MINOR]The disk write buffer size mus...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22754 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4393/ Test PASSed.
[GitHub] spark issue #22754: [SPARK-25776][CORE][MINOR]The disk write buffer size mus...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22754 Merged build finished. Test PASSed.
[GitHub] spark issue #22754: [SPARK-25776][CORE][MINOR]The disk write buffer size mus...
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/22754 retest this please
[GitHub] spark issue #20433: [SPARK-23264][SQL] Make INTERVAL keyword optional in INT...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20433 Merged build finished. Test PASSed.
[GitHub] spark pull request #22787: [SPARK-25040][SQL] Empty string for non string ty...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/22787
[GitHub] spark issue #20433: [SPARK-23264][SQL] Make INTERVAL keyword optional in INT...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20433 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97890/ Test PASSed.
[GitHub] spark issue #20433: [SPARK-23264][SQL] Make INTERVAL keyword optional in INT...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20433 **[Test build #97890 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97890/testReport)** for PR 20433 at commit [`554c122`](https://github.com/apache/spark/commit/554c122a4cf7a09ccf3202911560dc1c5d2f8b7a). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22799: [SPARK-25805][SQL][TEST] Fix test for SPARK-25159
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22799 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97892/ Test PASSed.
[GitHub] spark issue #22799: [SPARK-25805][SQL][TEST] Fix test for SPARK-25159
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22799 Merged build finished. Test PASSed.
[GitHub] spark issue #22799: [SPARK-25805][SQL][TEST] Fix test for SPARK-25159
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22799 **[Test build #97892 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97892/testReport)** for PR 22799 at commit [`65be98b`](https://github.com/apache/spark/commit/65be98bf4572ca57a2f747582ec796935b4ede54). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22787: [SPARK-25040][SQL] Empty string for non string types sho...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22787 Merged to master.
[GitHub] spark issue #22801: [SPARK-25656][SQL][DOC][EXAMPLE] Add a doc and examples ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22801 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97900/ Test PASSed.
[GitHub] spark issue #22801: [SPARK-25656][SQL][DOC][EXAMPLE] Add a doc and examples ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22801 **[Test build #97900 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97900/testReport)** for PR 22801 at commit [`1bd23a4`](https://github.com/apache/spark/commit/1bd23a41b3d6dbe4a2eff2565c79d0b2379f894b). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22801: [SPARK-25656][SQL][DOC][EXAMPLE] Add a doc and examples ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22801 Merged build finished. Test PASSed.
[GitHub] spark issue #22512: [SPARK-25498][SQL] InterpretedMutableProjection should h...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22512 Merged build finished. Test PASSed.
[GitHub] spark issue #22512: [SPARK-25498][SQL] InterpretedMutableProjection should h...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22512 **[Test build #97902 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97902/testReport)** for PR 22512 at commit [`45e65e5`](https://github.com/apache/spark/commit/45e65e5b3e61f702ebcb1203e95434051adbe437).
[GitHub] spark pull request #22790: [SPARK-25793][ML]call SaveLoadV2_0.load for class...
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/22790#discussion_r227229331

    --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/BisectingKMeansModel.scala ---
    @@ -126,7 +126,7 @@ object BisectingKMeansModel extends Loader[BisectingKMeansModel] {
               val model = SaveLoadV1_0.load(sc, path)
               model
             case (SaveLoadV2_0.thisClassName, SaveLoadV2_0.thisFormatVersion) =>
    -          val model = SaveLoadV1_0.load(sc, path)
    +          val model = SaveLoadV2_0.load(sc, path)
    --- End diff --

@viirya @mgaido91 I will change the ```BisectingKMeansModel.save``` to
```
BisectingKMeansModel.SaveLoadV2_0.save(sc, this, path)
```
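The fix in the diff above is a dispatch bug: the branch that matched the 2.0 format called the 1.0 loader. A minimal, Spark-free sketch of that version-dispatch pattern follows. Every name in it (`VersionedLoadSketch`, `Model`, `LoaderV1`, `LoaderV2`, the metadata keys, the default value) is hypothetical and chosen only for illustration; it is not the code in `BisectingKMeansModel`.

```scala
// Hypothetical stand-ins; none of these names come from Spark.
object VersionedLoadSketch {
  final case class Model(k: Int, distanceMeasure: String)

  // Each format version owns its version string and knows how to read its own layout.
  object LoaderV1 {
    val formatVersion = "1.0"
    // Assumption for this sketch: the 1.0 layout lacks the extra field, so a default is used.
    def load(metadata: Map[String, String]): Model =
      Model(metadata("k").toInt, "euclidean")
  }

  object LoaderV2 {
    val formatVersion = "2.0"
    def load(metadata: Map[String, String]): Model =
      Model(metadata("k").toInt, metadata("distanceMeasure"))
  }

  // The branch that matches a version must call the loader of that same version.
  def load(metadata: Map[String, String]): Model =
    metadata.get("version") match {
      case Some(LoaderV1.formatVersion) => LoaderV1.load(metadata)
      case Some(LoaderV2.formatVersion) => LoaderV2.load(metadata)
      case other =>
        throw new IllegalArgumentException(s"Unsupported format version: $other")
    }
}
```

If the 2.0 branch called `LoaderV1.load` instead, the extra field would be silently replaced by the default, which is the kind of error a save/load round-trip test can catch.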
[GitHub] spark issue #22512: [SPARK-25498][SQL] InterpretedMutableProjection should h...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22512 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4392/ Test PASSed.
[GitHub] spark issue #22512: [SPARK-25498][SQL] InterpretedMutableProjection should h...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22512 **[Test build #97901 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97901/testReport)** for PR 22512 at commit [`b8c5a17`](https://github.com/apache/spark/commit/b8c5a177b2ac72b35ad89d7076a063117568c768).
[GitHub] spark issue #22512: [SPARK-25498][SQL] InterpretedMutableProjection should h...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22512 Merged build finished. Test PASSed.
[GitHub] spark issue #22512: [SPARK-25498][SQL] InterpretedMutableProjection should h...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22512 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4391/ Test PASSed.
[GitHub] spark issue #22801: [SPARK-25656][SQL][DOC][EXAMPLE] Add a doc and examples ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22801 Merged build finished. Test PASSed.
[GitHub] spark issue #22801: [SPARK-25656][SQL][DOC][EXAMPLE] Add a doc and examples ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22801 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4390/ Test PASSed.
[GitHub] spark issue #22723: [SPARK-25729][CORE]It is better to replace `minPartition...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22723 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97888/ Test FAILed.
[GitHub] spark issue #22723: [SPARK-25729][CORE]It is better to replace `minPartition...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22723 Merged build finished. Test FAILed.
[GitHub] spark issue #22723: [SPARK-25729][CORE]It is better to replace `minPartition...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22723 **[Test build #97888 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97888/testReport)** for PR 22723 at commit [`b80bf66`](https://github.com/apache/spark/commit/b80bf66a8109faa7f58d45b92417a981666866a0). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #22512: [SPARK-25498][SQL] InterpretedMutableProjection s...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22512#discussion_r227226902

    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala ---
    @@ -140,6 +141,14 @@ class SQLQueryTestSuite extends QueryTest with SharedSQLContext {
         val input = fileToString(new File(testCase.inputFile))
         val (comments, code) = input.split("\n").partition(_.startsWith("--"))
    +
    +    // Runs all the tests on both codegen-only and interpreter modes. Since explain results differ
    +    // when `WHOLESTAGE_CODEGEN_ENABLED` disabled, we don't run these tests now.
    +    val codegenConfigSets = Array(("false", "NO_CODEGEN"), ("true", "CODEGEN_ONLY")).map {
    +      case (wholeStageCodegenEnabled, codegenFactoryMode) =>
    +        Array( // SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> wholeStageCodegenEnabled,
    --- End diff --

ok
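The diff above builds one configuration set per codegen mode so the same test body can run under each. The sketch below shows that pattern without a Spark dependency. The `conf` map, `withConf`, and `runUnderAll` are hypothetical helpers (a stand-in for a session configuration and for a `withSQLConf`-style wrapper), and the key string is assumed to be the value behind `SQLConf.CODEGEN_FACTORY_MODE.key`. As in the quoted diff, the whole-stage flag is left out.

```scala
import scala.collection.mutable

object CodegenConfigSketch {
  // Stand-in for a session configuration; this is not Spark's SQLConf.
  val conf: mutable.Map[String, String] = mutable.Map.empty

  // Sets the given pairs, runs `body`, then restores whatever was there before.
  def withConf[T](pairs: (String, String)*)(body: => T): T = {
    val previous = pairs.map { case (k, _) => k -> conf.get(k) }
    pairs.foreach { case (k, v) => conf(k) = v }
    try body
    finally previous.foreach {
      case (k, Some(old)) => conf(k) = old
      case (k, None) => conf.remove(k)
    }
  }

  // One config set per mode; only the factory mode is varied here.
  val codegenConfigSets: Seq[Seq[(String, String)]] =
    Seq("NO_CODEGEN", "CODEGEN_ONLY").map { factoryMode =>
      Seq("spark.sql.codegen.factoryMode" -> factoryMode)
    }

  // Evaluates `check` once under each config set and collects the results.
  def runUnderAll[T](check: => T): Seq[T] =
    codegenConfigSets.map(pairs => withConf(pairs: _*)(check))
}
```

Restoring the previous values in `finally` keeps one mode's settings from leaking into the next run, even when the check throws.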
[GitHub] spark issue #22801: [SPARK-25656][DOC][EXAMPLE] Add a doc and examples about...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22801 **[Test build #97900 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97900/testReport)** for PR 22801 at commit [`1bd23a4`](https://github.com/apache/spark/commit/1bd23a41b3d6dbe4a2eff2565c79d0b2379f894b).
[GitHub] spark pull request #22801: [SPARK-25656][DOC][EXAMPLE] Add a doc and example...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/22801#discussion_r227226704

    --- Diff: docs/sql-data-sources-load-save-functions.md ---
    @@ -82,6 +82,50 @@ To load a CSV file you can use:
    +The extra options are also used during write operation.
    +For example, you can control bloom filters and dictionary encodings for ORC data sources.
    +The following ORC example will create bloom filter and use dictionary encoding only for `favorite_color`.
    +For Parquet, there exists `parquet.enable.dictionary`, too.
    +To find more detailed information about the extra ORC/Parquet options,
    +visit the official Apache ORC/Parquet websites.
    +
    +
    +
    +{% include_example manual_save_options_orc scala/org/apache/spark/examples/sql/SQLDataSourceExample.scala %}
    +
    +
    +
    +{% include_example manual_save_options_orc java/org/apache/spark/examples/sql/JavaSQLDataSourceExample.java %}
    +
    +
    +
    +{% include_example manual_save_options_orc python/sql/datasource.py %}
    +
    +
    +
    +{% include_example manual_save_options_orc r/RSparkSQLExample.R %}
    +
    +
    +
    +
    +{% highlight sql %}
    +CREATE TABLE users_with_options (
    +  name STRING,
    +  favorite_color STRING,
    +  favorite_numbers array<integer>
    +) USING ORC
    +OPTIONS (
    +  orc.bloom.filter.columns 'favorite_color',
    +  orc.dictionary.key.threshold '1.0',
    +  orc.column.encoding.direct 'name'
    --- End diff --

Could you review this, @gatorsmile ? This is the example we discussed previously.
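For readers who want to see the same three ORC options applied through the DataFrame writer rather than SQL, here is a sketch. It is not taken from the pull request: the sample rows, the local master setting, and the output path `users_with_options.orc` are assumptions made for illustration. It needs Spark on the classpath; only the option keys and values come from the SQL in the diff above.

```scala
import org.apache.spark.sql.SparkSession

object OrcWriteOptionsSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("OrcWriteOptionsSketch")
      .master("local[*]") // assumption: a local run for illustration
      .getOrCreate()
    import spark.implicits._

    // Made-up rows with the same three columns as the SQL example.
    val users = Seq(
      ("Alyssa", "red", Seq(3, 9, 15, 20)),
      ("Ben", "blue", Seq.empty[Int])
    ).toDF("name", "favorite_color", "favorite_numbers")

    // Extra options are passed through to the ORC writer.
    users.write
      .format("orc")
      .mode("overwrite")
      .option("orc.bloom.filter.columns", "favorite_color")
      .option("orc.dictionary.key.threshold", "1.0")
      .option("orc.column.encoding.direct", "name")
      .save("users_with_options.orc") // hypothetical output path

    spark.stop()
  }
}
```

The writer does not interpret the `orc.*` keys itself; under this sketch's assumptions they are handed to the underlying ORC library, which is the pass-through behavior the added doc paragraph describes.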
[GitHub] spark pull request #22801: [SPARK-25656][DOC][EXAMPLE] Add a doc and example...
GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/22801

[SPARK-25656][DOC][EXAMPLE] Add a doc and examples about extra data source options

## What changes were proposed in this pull request?

Our current doc does not explain how we are passing the data source specific options to the underlying data source. According to [the review comment](https://github.com/apache/spark/pull/22622#discussion_r222911529), this PR aims to add more detailed information and examples.

## How was this patch tested?

Manual.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dongjoon-hyun/spark SPARK-25656

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22801.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #22801

commit 1bd23a41b3d6dbe4a2eff2565c79d0b2379f894b
Author: Dongjoon Hyun
Date: 2018-10-23T04:57:15Z

    [SPARK-25656][DOC][EXAMPLE] Add a doc and examples about extra data source save options
[GitHub] spark issue #22512: [SPARK-25498][SQL] InterpretedMutableProjection should h...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/22512 ok, I'll add tests.
[GitHub] spark pull request #22512: [SPARK-25498][SQL] InterpretedMutableProjection s...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22512#discussion_r227224209

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/InterpretedMutableProjection.scala ---
    @@ -49,10 +51,54 @@ class InterpretedMutableProjection(expressions: Seq[Expression]) extends Mutable
       def currentValue: InternalRow = mutableRow
       override def target(row: InternalRow): MutableProjection = {
    +    // If `mutableRow` is `UnsafeRow`, `MutableProjection` accepts fixed-length types only
    +    assert(!row.isInstanceOf[UnsafeRow] ||
    +      validExprs.forall { case (e, _) => UnsafeRow.isFixedLength(e.dataType) })
         mutableRow = row
         this
       }
    +  private[this] val fieldWriters = validExprs.map { case (e, i) =>
    +    val writer = generateRowWriter(i, e.dataType)
    +    if (!e.nullable) {
    +      (v: Any) => writer(v)
    +    } else {
    +      (v: Any) => {
    +        if (v == null) {
    +          mutableRow.setNullAt(i)
    +        } else {
    +          writer(v)
    +        }
    +      }
    +    }
    +  }
    +
    +  private def generateRowWriter(ordinal: Int, dt: DataType): Any => Unit = dt match {
    --- End diff --

oh, yes! yea, I will.
[GitHub] spark pull request #22512: [SPARK-25498][SQL] InterpretedMutableProjection s...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22512#discussion_r227224030

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/InterpretedMutableProjection.scala ---
    @@ -49,10 +51,54 @@ class InterpretedMutableProjection(expressions: Seq[Expression]) extends Mutable
       def currentValue: InternalRow = mutableRow
       override def target(row: InternalRow): MutableProjection = {
    +    // If `mutableRow` is `UnsafeRow`, `MutableProjection` accepts fixed-length types only
    +    assert(!row.isInstanceOf[UnsafeRow] ||
    +      validExprs.forall { case (e, _) => UnsafeRow.isFixedLength(e.dataType) })
         mutableRow = row
         this
       }
    +  private[this] val fieldWriters = validExprs.map { case (e, i) =>
    +    val writer = generateRowWriter(i, e.dataType)
    +    if (!e.nullable) {
    +      (v: Any) => writer(v)
    +    } else {
    +      (v: Any) => {
    +        if (v == null) {
    +          mutableRow.setNullAt(i)
    +        } else {
    +          writer(v)
    +        }
    +      }
    +    }
    +  }
    +
    +  private def generateRowWriter(ordinal: Int, dt: DataType): Any => Unit = dt match {
    +    case BooleanType =>
    +      v => mutableRow.setBoolean(ordinal, v.asInstanceOf[Boolean])
    +    case ByteType =>
    +      v => mutableRow.setByte(ordinal, v.asInstanceOf[Byte])
    +    case ShortType =>
    +      v => mutableRow.setShort(ordinal, v.asInstanceOf[Short])
    +    case IntegerType | DateType =>
    +      v => mutableRow.setInt(ordinal, v.asInstanceOf[Int])
    +    case LongType | TimestampType =>
    +      v => mutableRow.setLong(ordinal, v.asInstanceOf[Long])
    +    case FloatType =>
    +      v => mutableRow.setFloat(ordinal, v.asInstanceOf[Float])
    +    case DoubleType =>
    +      v => mutableRow.setDouble(ordinal, v.asInstanceOf[Double])
    +    case DecimalType.Fixed(precision, _) =>
    +      v => mutableRow.setDecimal(ordinal, v.asInstanceOf[Decimal], precision)
    +    case CalendarIntervalType | BinaryType | _: ArrayType | StringType | _: StructType |
    +         _: MapType | _: UserDefinedType[_] =>
    +      v => mutableRow.update(ordinal, v)
    +    case NullType =>
    +      v => {}
    --- End diff --

We need to take care of `e.nullable && e.dataType == NullType` here?
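The question above is about how the null check and the no-op `NullType` writer interact. A small Spark-free sketch can make the two paths concrete. `Row`, `FieldType`, `rawWriter`, and `fieldWriter` below are hypothetical stand-ins that mirror the shape of the quoted diff; they are not Spark classes, and the `"stale"` initial value is only there to make an untouched slot visible.

```scala
object NullableWriterSketch {
  sealed trait FieldType
  case object IntField extends FieldType
  case object NullField extends FieldType // stands in for Spark's NullType

  // Minimal mutable row with one slot per ordinal; not Spark's InternalRow.
  final class Row(n: Int) {
    private val slots = Array.fill[Option[Any]](n)(Some("stale"))
    def setInt(i: Int, v: Int): Unit = slots(i) = Some(v)
    def setNullAt(i: Int): Unit = slots(i) = None
    def get(i: Int): Option[Any] = slots(i)
  }

  // Type-specific writer; the null-typed case does nothing, as in the diff.
  def rawWriter(row: Row, ordinal: Int, t: FieldType): Any => Unit = t match {
    case IntField => v => row.setInt(ordinal, v.asInstanceOf[Int])
    case NullField => _ => ()
  }

  // Wraps the raw writer with a null check only when the field is nullable.
  def fieldWriter(row: Row, ordinal: Int, t: FieldType, nullable: Boolean): Any => Unit = {
    val writer = rawWriter(row, ordinal, t)
    if (!nullable) writer
    else v => if (v == null) row.setNullAt(ordinal) else writer(v)
  }
}
```

In this sketch a nullable null-typed field ends up with its slot set to null through the wrapper, while a non-nullable one leaves the slot untouched because only the no-op writer runs. Whether the second path can occur in Spark depends on how `NullType` expressions report `nullable`, which the thread does not settle.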
[GitHub] spark issue #22754: [SPARK-25776][CORE][MINOR]The disk write buffer size mus...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22754 Merged build finished. Test FAILed.
[GitHub] spark issue #22754: [SPARK-25776][CORE][MINOR]The disk write buffer size mus...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22754 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97887/ Test FAILed.
[GitHub] spark issue #22754: [SPARK-25776][CORE][MINOR]The disk write buffer size mus...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22754 **[Test build #97887 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97887/testReport)** for PR 22754 at commit [`6f8404b`](https://github.com/apache/spark/commit/6f8404b474539a989e08459949f54395bcd7ed10). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22612: [SPARK-24958] Add executors' process tree total memory i...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22612 **[Test build #97899 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97899/testReport)** for PR 22612 at commit [`7f7ed2b`](https://github.com/apache/spark/commit/7f7ed2bdf5740bd2c4ae8cf2090ba7f016ffb023).
[GitHub] spark pull request #22612: [SPARK-24958] Add executors' process tree total m...
Github user rezasafi commented on a diff in the pull request: https://github.com/apache/spark/pull/22612#discussion_r22741

    --- Diff: core/src/main/scala/org/apache/spark/metrics/ExecutorMetricType.scala ---
    @@ -95,10 +135,29 @@ private[spark] object ExecutorMetricType {
         OnHeapUnifiedMemory,
         OffHeapUnifiedMemory,
         DirectPoolMemory,
    -    MappedPoolMemory
    +    MappedPoolMemory,
    +    ProcessTreeMetrics
    +  )
    +  // List of defined metrics
    +  val definedMetrics = IndexedSeq(
    +    "JVMHeapMemory",
    +    "JVMOffHeapMemory",
    +    "OnHeapExecutionMemory",
    +    "OffHeapExecutionMemory",
    +    "OnHeapStorageMemory",
    +    "OffHeapStorageMemory",
    +    "OnHeapUnifiedMemory",
    +    "OffHeapUnifiedMemory",
    +    "DirectPoolMemory",
    +    "MappedPoolMemory",
    +    "ProcessTreeJVMVMemory",
    +    "ProcessTreeJVMRSSMemory",
    +    "ProcessTreePythonVMemory",
    +    "ProcessTreePythonRSSMemory",
    +    "ProcessTreeOtherVMemory",
    +    "ProcessTreeOtherRSSMemory"
    --- End diff --

I changed this in a way similar to what you suggested, to avoid having separate names and to use arrays instead of maps.
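The comment above describes avoiding a separately maintained name list and using arrays instead of maps. One way such a design could look is sketched below. The trait and the object names are hypothetical and only two metric types are shown; this is an illustration of the idea, not the code that ended up in `ExecutorMetricType`.

```scala
object MetricNamesSketch {
  // Hypothetical trait: each metric type declares the names it reports.
  trait MetricType { def names: Seq[String] }

  case object JvmHeap extends MetricType {
    val names = Seq("JVMHeapMemory")
  }
  case object ProcessTree extends MetricType {
    val names = Seq("ProcessTreeJVMVMemory", "ProcessTreeJVMRSSMemory")
  }

  val metricTypes: IndexedSeq[MetricType] = IndexedSeq(JvmHeap, ProcessTree)

  // The flat name list is derived from the types, so the two cannot drift apart.
  val definedMetrics: IndexedSeq[String] = metricTypes.flatMap(_.names)

  // Name -> position in a flat Array[Long] that holds one value per metric.
  val offsets: Map[String, Int] = definedMetrics.zipWithIndex.toMap
}
```

With the offsets computed once, per-executor values can live in a plain `Array[Long]` indexed by position, and the name lookup is only needed at the reporting boundary.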
[GitHub] spark issue #22612: [SPARK-24958] Add executors' process tree total memory i...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22612 **[Test build #97898 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97898/testReport)** for PR 22612 at commit [`a3f2c9b`](https://github.com/apache/spark/commit/a3f2c9bbf8897f2ebb68b8e4607beff5f0cc29fe).
[GitHub] spark issue #22204: [SPARK-25196][SQL] Extends Analyze commands for cached t...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22204 Merged build finished. Test FAILed.
[GitHub] spark issue #22204: [SPARK-25196][SQL] Extends Analyze commands for cached t...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22204 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97894/ Test FAILed.
[GitHub] spark issue #22204: [SPARK-25196][SQL] Extends Analyze commands for cached t...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22204 **[Test build #97894 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97894/testReport)** for PR 22204 at commit [`d5bff88`](https://github.com/apache/spark/commit/d5bff888a17439a62f8b4f0762a2488cdd57e817). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #22800: [SPARK-24499][SQL][DOC][follow-up] Fix spelling i...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/22800
[GitHub] spark issue #22800: [SPARK-24499][SQL][DOC][follow-up] Fix spelling in doc
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22800 thanks, merging to master/2.4!
[GitHub] spark issue #22787: [SPARK-25040][SQL] Empty string for non string types sho...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22787 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97884/ Test PASSed.
[GitHub] spark issue #22787: [SPARK-25040][SQL] Empty string for non string types sho...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22787 Merged build finished. Test PASSed.
[GitHub] spark issue #22787: [SPARK-25040][SQL] Empty string for non string types sho...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22787 **[Test build #97884 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97884/testReport)** for PR 22787 at commit [`c04ea64`](https://github.com/apache/spark/commit/c04ea649a9087f9cb5ffbb7361fbb711d74d9266). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22749: [SPARK-25746][SQL] Refactoring ExpressionEncoder to get ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22749 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97885/ Test PASSed.
[GitHub] spark issue #22749: [SPARK-25746][SQL] Refactoring ExpressionEncoder to get ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22749 Merged build finished. Test PASSed.
[GitHub] spark issue #22749: [SPARK-25746][SQL] Refactoring ExpressionEncoder to get ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22749 **[Test build #97885 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97885/testReport)** for PR 22749 at commit [`400f878`](https://github.com/apache/spark/commit/400f87817183640006140e2db1839f8d78a13856). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22797: [SPARK-19851][SQL] Add support for EVERY and ANY (SOME) ...
Github user dilipbiswal commented on the issue: https://github.com/apache/spark/pull/22797 cc @cloud-fan @gatorsmile
[GitHub] spark issue #22755: [SPARK-25755][SQL][Test] Supplementation of non-CodeGen ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22755 **[Test build #97897 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97897/testReport)** for PR 22755 at commit [`0d75328`](https://github.com/apache/spark/commit/0d753280828cba5e7658edafdf66b1ebbde527b0).
[GitHub] spark pull request #22755: [SPARK-25755][SQL][Test] Supplementation of non-C...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22755#discussion_r227215773

    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/joins/ExistenceJoinSuite.scala ---
    @@ -122,19 +122,22 @@ class ExistenceJoinSuite extends SparkPlanTest with SharedSQLContext {
         test(s"$testName using BroadcastHashJoin") {
           extractJoinParts().foreach { case (_, leftKeys, rightKeys, boundCondition, _, _) =>
    -        withSQLConf(SQLConf.SHUFFLE_PARTITIONS.key -> "1") {
    -          checkAnswer2(leftRows, rightRows, (left: SparkPlan, right: SparkPlan) =>
    -            EnsureRequirements(left.sqlContext.sessionState.conf).apply(
    -              BroadcastHashJoinExec(
    -                leftKeys, rightKeys, joinType, BuildRight, boundCondition, left, right)),
    -            expectedAnswer,
    -            sortAnswers = true)
    -          checkAnswer2(leftRows, rightRows, (left: SparkPlan, right: SparkPlan) =>
    -            EnsureRequirements(left.sqlContext.sessionState.conf).apply(
    -              createLeftSemiPlusJoin(BroadcastHashJoinExec(
    -                leftKeys, rightKeys, leftSemiPlus, BuildRight, boundCondition, left, right))),
    -            expectedAnswer,
    -            sortAnswers = true)
    +        Seq("false", "true").foreach { v =>
    --- End diff --

nit: v -> codegenEnabled
[GitHub] spark issue #22755: [SPARK-25755][SQL][Test] Supplementation of non-CodeGen ...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/22755 ok to test
[GitHub] spark issue #22784: [SPARK-25790][MLLIB] PCA: Support more than 65535 column...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/22784 Sorry for my mistake. The '4' key on my keyboard sometimes misbehaves. > I think INT_MAX is 2147483647, so n ~= sqrt(2*2147483647) = 65536.
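The estimate in this comment can be checked with a few lines of arithmetic. The sketch below assumes the bound comes from storing the packed upper triangle of an n x n symmetric matrix, n(n+1)/2 entries, in a single array indexed by `Int`. That assumption is inferred from the sqrt(2 * INT_MAX) estimate and is not stated in the thread. The object and method names are illustrative only.

```scala
object PcaColumnLimit {
  // Number of entries in the packed upper triangle of an n x n symmetric matrix.
  def packedSize(n: Long): Long = n * (n + 1) / 2

  // Largest n whose packed triangle still fits in an Int-indexed array.
  def maxColumns: Long = {
    var n = math.sqrt(2.0 * Int.MaxValue).toLong // start near the estimate
    while (packedSize(n) > Int.MaxValue) n -= 1
    while (packedSize(n + 1) <= Int.MaxValue) n += 1
    n
  }

  def main(args: Array[String]): Unit = {
    println(math.sqrt(2.0 * Int.MaxValue)) // just under 65536
    println(maxColumns)                    // 65535
  }
}
```

Under that assumption, 65535 columns need 2147450880 entries, which fits, while 65536 columns need 2147516416, which exceeds `Int.MaxValue`.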
[GitHub] spark pull request #22778: [SPARK-25784][SQL] Infer filters from constraints...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22778#discussion_r227215135 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/RewriteSubquerySuite.scala --- @@ -33,23 +34,44 @@ class RewriteSubquerySuite extends PlanTest { Batch("Rewrite Subquery", FixedPoint(1), RewritePredicateSubquery, ColumnPruning, +InferFiltersFromConstraints, +PushDownPredicate, CollapseProject, RemoveRedundantProject) :: Nil } test("Column pruning after rewriting predicate subquery") { -val relation = LocalRelation('a.int, 'b.int) -val relInSubquery = LocalRelation('x.int, 'y.int, 'z.int) +withSQLConf(SQLConf.CONSTRAINT_PROPAGATION_ENABLED.key -> "false") { + val relation = LocalRelation('a.int, 'b.int) + val relInSubquery = LocalRelation('x.int, 'y.int, 'z.int) -val query = relation.where('a.in(ListQuery(relInSubquery.select('x.select('a) + val query = relation.where('a.in(ListQuery(relInSubquery.select('x.select('a) -val optimized = Optimize.execute(query.analyze) -val correctAnswer = relation - .select('a) - .join(relInSubquery.select('x), LeftSemi, Some('a === 'x)) - .analyze + val optimized = Optimize.execute(query.analyze) + val correctAnswer = relation +.select('a) +.join(relInSubquery.select('x), LeftSemi, Some('a === 'x)) +.analyze -comparePlans(optimized, correctAnswer) + comparePlans(optimized, correctAnswer) +} + } + + test("Infer filters and push down predicate after rewriting predicate subquery") { --- End diff -- How about simplifying the test title and leaving a comment here that clearly states what is tested?
[GitHub] spark pull request #22778: [SPARK-25784][SQL] Infer filters from constraints...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22778#discussion_r227214593 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/RewriteSubquerySuite.scala --- @@ -33,23 +34,44 @@ class RewriteSubquerySuite extends PlanTest { Batch("Rewrite Subquery", FixedPoint(1), RewritePredicateSubquery, ColumnPruning, +InferFiltersFromConstraints, +PushDownPredicate, CollapseProject, RemoveRedundantProject) :: Nil } test("Column pruning after rewriting predicate subquery") { -val relation = LocalRelation('a.int, 'b.int) -val relInSubquery = LocalRelation('x.int, 'y.int, 'z.int) +withSQLConf(SQLConf.CONSTRAINT_PROPAGATION_ENABLED.key -> "false") { + val relation = LocalRelation('a.int, 'b.int) + val relInSubquery = LocalRelation('x.int, 'y.int, 'z.int) -val query = relation.where('a.in(ListQuery(relInSubquery.select('x.select('a) + val query = relation.where('a.in(ListQuery(relInSubquery.select('x.select('a) -val optimized = Optimize.execute(query.analyze) -val correctAnswer = relation - .select('a) - .join(relInSubquery.select('x), LeftSemi, Some('a === 'x)) - .analyze + val optimized = Optimize.execute(query.analyze) + val correctAnswer = relation +.select('a) +.join(relInSubquery.select('x), LeftSemi, Some('a === 'x)) +.analyze -comparePlans(optimized, correctAnswer) + comparePlans(optimized, correctAnswer) +} + } + + test("Infer filters and push down predicate after rewriting predicate subquery") { --- End diff -- Do we need to mention column pruning in the test title?
[GitHub] spark pull request #22778: [SPARK-25784][SQL] Infer filters from constraints...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22778#discussion_r227214404 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/RewriteSubquerySuite.scala --- @@ -33,23 +34,44 @@ class RewriteSubquerySuite extends PlanTest { Batch("Rewrite Subquery", FixedPoint(1), RewritePredicateSubquery, ColumnPruning, +InferFiltersFromConstraints, +PushDownPredicate, CollapseProject, RemoveRedundantProject) :: Nil } test("Column pruning after rewriting predicate subquery") { -val relation = LocalRelation('a.int, 'b.int) -val relInSubquery = LocalRelation('x.int, 'y.int, 'z.int) +withSQLConf(SQLConf.CONSTRAINT_PROPAGATION_ENABLED.key -> "false") { --- End diff -- Ah, I see. Thanks.
[GitHub] spark issue #22675: [SPARK-25347][ML][DOC] Spark datasource for image/libsvm...
Github user imatiach-msft commented on the issue: https://github.com/apache/spark/pull/22675 LGTM, this is great to have!
[GitHub] spark pull request #22675: [SPARK-25347][ML][DOC] Spark datasource for image...
Github user imatiach-msft commented on a diff in the pull request: https://github.com/apache/spark/pull/22675#discussion_r227214199 --- Diff: docs/ml-datasource.md --- @@ -0,0 +1,113 @@ +--- +layout: global +title: Data sources +displayTitle: Data sources +--- + +In this section, we introduce how to use data source in ML to load data. +Beside some general data sources such as Parquet, CSV, JSON and JDBC, we also provide some specific data source for ML. + +**Table of Contents** + +* This will become a table of contents (this text will be scraped). --- End diff -- ah, ok, great
[GitHub] spark issue #21632: [SPARK-19591][ML][MLlib] Add sample weights to decision ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21632 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4389/ Test PASSed.
[GitHub] spark issue #21632: [SPARK-19591][ML][MLlib] Add sample weights to decision ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21632 Merged build finished. Test PASSed.
[GitHub] spark issue #21632: [SPARK-19591][ML][MLlib] Add sample weights to decision ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21632 **[Test build #97896 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97896/testReport)** for PR 21632 at commit [`bca2eaf`](https://github.com/apache/spark/commit/bca2eaf0de57180268494e21d93ddf3e66213321).
[GitHub] spark issue #21632: [SPARK-19591][ML][MLlib] Add sample weights to decision ...
Github user imatiach-msft commented on the issue: https://github.com/apache/spark/pull/21632 jenkins retest this please
[GitHub] spark issue #22482: WIP - [SPARK-10816][SS] Support session window natively
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22482 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97883/ Test PASSed.
[GitHub] spark issue #22482: WIP - [SPARK-10816][SS] Support session window natively
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22482 Merged build finished. Test PASSed.
[GitHub] spark issue #22482: WIP - [SPARK-10816][SS] Support session window natively
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22482 **[Test build #97883 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97883/testReport)** for PR 22482 at commit [`1f6e496`](https://github.com/apache/spark/commit/1f6e496c9b3474d37e735fcede4ac3587136de35). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22800: [SPARK-24499][SQL][DOC][follow-up] Fix spelling in doc
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22800 LGTM
[GitHub] spark issue #22788: [SPARK-25769][SQL]escape nested columns by backtick each...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22788 Merged build finished. Test PASSed.
[GitHub] spark issue #22788: [SPARK-25769][SQL]escape nested columns by backtick each...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22788 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97881/ Test PASSed.
[GitHub] spark issue #22675: [SPARK-25347][ML][DOC] Spark datasource for image/libsvm...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22675 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97889/ Test PASSed.
[GitHub] spark issue #22675: [SPARK-25347][ML][DOC] Spark datasource for image/libsvm...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22675 Merged build finished. Test PASSed.
[GitHub] spark issue #22788: [SPARK-25769][SQL]escape nested columns by backtick each...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22788 **[Test build #97881 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97881/testReport)** for PR 22788 at commit [`99bfd00`](https://github.com/apache/spark/commit/99bfd0099378200e606293368c2914d5628adc44). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22675: [SPARK-25347][ML][DOC] Spark datasource for image/libsvm...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22675 **[Test build #97889 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97889/testReport)** for PR 22675 at commit [`8231cb2`](https://github.com/apache/spark/commit/8231cb25c30fe17bd076145b3ebed020265ac173). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22800: [SPARK-24499][SQL][DOC][follow-up] Fix spelling in doc
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22800 Merged build finished. Test PASSed.
[GitHub] spark pull request #22790: [SPARK-25793][ML]call SaveLoadV2_0.load for class...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22790#discussion_r227210362 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/BisectingKMeansModel.scala --- @@ -126,7 +126,7 @@ object BisectingKMeansModel extends Loader[BisectingKMeansModel] { val model = SaveLoadV1_0.load(sc, path) model case (SaveLoadV2_0.thisClassName, SaveLoadV2_0.thisFormatVersion) => -val model = SaveLoadV1_0.load(sc, path) +val model = SaveLoadV2_0.load(sc, path) --- End diff -- Do we ever use `SaveLoadV2_0` to save models at the moment? It looks like `BisectingKMeansModel.save` simply calls `SaveLoadV1_0.save`.
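The dispatch that this diff corrects can be sketched in isolation. The code below is a simplified, hypothetical model and not Spark's implementation: `Model`, the class name string, the version strings and the `load` signatures are all stand-ins chosen so that the example runs without Spark.

```scala
object LoaderDispatchSketch {
  // Hypothetical stand-in for a loaded model; records which loader produced it.
  final case class Model(loadedBy: String)

  object SaveLoadV1_0 {
    val thisClassName = "example.Model" // hypothetical class name
    val thisFormatVersion = "1.0"
    def load(path: String): Model = Model("V1_0")
  }

  object SaveLoadV2_0 {
    val thisClassName = "example.Model" // hypothetical class name
    val thisFormatVersion = "2.0"
    def load(path: String): Model = Model("V2_0")
  }

  // Each (className, formatVersion) pair is routed to the loader that owns that format.
  def load(className: String, formatVersion: String, path: String): Model =
    (className, formatVersion) match {
      case (SaveLoadV1_0.thisClassName, SaveLoadV1_0.thisFormatVersion) =>
        SaveLoadV1_0.load(path)
      case (SaveLoadV2_0.thisClassName, SaveLoadV2_0.thisFormatVersion) =>
        SaveLoadV2_0.load(path) // the diff changes this branch from V1_0 to V2_0
      case _ =>
        throw new IllegalArgumentException(
          s"Unsupported format: ($className, $formatVersion)")
    }
}
```

In this sketch a wrong loader in the second branch would only matter for files written in the 2.0 format, which is why the review asks whether any save path produces that format yet.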
[GitHub] spark issue #22800: [SPARK-24499][SQL][DOC][follow-up] Fix spelling in doc
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22800 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97895/ Test PASSed.
[GitHub] spark issue #22800: [SPARK-24499][SQL][DOC][follow-up] Fix spelling in doc
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22800 **[Test build #97895 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97895/testReport)** for PR 22800 at commit [`a13493b`](https://github.com/apache/spark/commit/a13493b2e3ede38129de6d32ea19d6886cc13b80). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #22778: [SPARK-25784][SQL] Infer filters from constraints...
Github user wangyum commented on a diff in the pull request: https://github.com/apache/spark/pull/22778#discussion_r227208719 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/RewriteSubquerySuite.scala --- @@ -33,23 +34,44 @@ class RewriteSubquerySuite extends PlanTest { Batch("Rewrite Subquery", FixedPoint(1), RewritePredicateSubquery, ColumnPruning, +InferFiltersFromConstraints, +PushDownPredicate, CollapseProject, RemoveRedundantProject) :: Nil } test("Column pruning after rewriting predicate subquery") { -val relation = LocalRelation('a.int, 'b.int) -val relInSubquery = LocalRelation('x.int, 'y.int, 'z.int) +withSQLConf(SQLConf.CONSTRAINT_PROPAGATION_ENABLED.key -> "false") { --- End diff -- Yes. Setting `spark.sql.constraintPropagation.enabled=false` tests `ColumnPruning`; setting `spark.sql.constraintPropagation.enabled=true` tests `ColumnPruning`, `InferFiltersFromConstraints` and `PushDownPredicate`.
[GitHub] spark issue #22800: [SPARK-24499][SQL][DOC][follow-up] Fix spelling in doc
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22800 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4388/ Test PASSed.
[GitHub] spark issue #22800: [SPARK-24499][SQL][DOC][follow-up] Fix spelling in doc
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22800 Merged build finished. Test PASSed.
[GitHub] spark issue #22800: [SPARK-24499][SQL][DOC][follow-up] Fix spelling in doc
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/22800 cc @cloud-fan @gatorsmile @HyukjinKwon @xuanyuanking
[GitHub] spark issue #22800: [SPARK-24499][SQL][DOC][follow-up] Fix spelling in doc
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22800 **[Test build #97895 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97895/testReport)** for PR 22800 at commit [`a13493b`](https://github.com/apache/spark/commit/a13493b2e3ede38129de6d32ea19d6886cc13b80).
[GitHub] spark pull request #22800: [SPARK-24499][SQL][DOC][follow-up] Fix spelling i...
GitHub user kiszk opened a pull request: https://github.com/apache/spark/pull/22800 [SPARK-24499][SQL][DOC][follow-up] Fix spelling in doc ## What changes were proposed in this pull request? This PR replaces `turing` with `tuning` in files and a file name. Currently, in the left side menu, `Turing` is shown. ![image](https://user-images.githubusercontent.com/1315079/47332714-20a96180-d6bb-11e8-9a5a-0a8dad292626.png) ## How was this patch tested? `grep -rin turing docs` && `find docs -name "*turing*"` You can merge this pull request into a Git repository by running: $ git pull https://github.com/kiszk/spark SPARK-24499-follow Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/22800.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #22800 commit a13493b2e3ede38129de6d32ea19d6886cc13b80 Author: Kazuaki Ishizaki Date: 2018-10-23T02:56:18Z turing -> tuning
[GitHub] spark issue #22790: [SPARK-25793][ML]call SaveLoadV2_0.load for classNameV2_...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22790 cc @mengxr @WeichenXu123 How serious is it? Shall we treat it as a blocker?
[GitHub] spark pull request #22778: [SPARK-25784][SQL] Infer filters from constraints...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22778#discussion_r227205054 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/RewriteSubquerySuite.scala --- @@ -33,23 +34,44 @@ class RewriteSubquerySuite extends PlanTest { Batch("Rewrite Subquery", FixedPoint(1), RewritePredicateSubquery, ColumnPruning, +InferFiltersFromConstraints, +PushDownPredicate, CollapseProject, RemoveRedundantProject) :: Nil } test("Column pruning after rewriting predicate subquery") { -val relation = LocalRelation('a.int, 'b.int) -val relInSubquery = LocalRelation('x.int, 'y.int, 'z.int) +withSQLConf(SQLConf.CONSTRAINT_PROPAGATION_ENABLED.key -> "false") { --- End diff -- Do we need to modify this existing test?
[GitHub] spark issue #22729: [SPARK-25737][CORE] Remove JavaSparkContextVarargsWorkar...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22729 **[Test build #4385 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4385/testReport)** for PR 22729 at commit [`0860d27`](https://github.com/apache/spark/commit/0860d27a205d3dd3d94e6bbe2c9db49b7e432ef4). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22799: [SPARK-25805][SQL][TEST] Fix test for SPARK-25159
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22799 LGTM
[GitHub] spark issue #22778: [SPARK-25784][SQL] Infer filters from constraints after ...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/22778 Also, to make sure there is no performance regression in the optimizer, can you check the optimizer statistics for TPCDS by running `TPCDSQuerySuite`, too?
[GitHub] spark issue #22204: [SPARK-25196][SQL] Extends Analyze commands for cached t...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22204 **[Test build #97894 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97894/testReport)** for PR 22204 at commit [`d5bff88`](https://github.com/apache/spark/commit/d5bff888a17439a62f8b4f0762a2488cdd57e817).
[GitHub] spark issue #22204: [SPARK-25196][SQL] Extends Analyze commands for cached t...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22204 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4387/ Test PASSed.
[GitHub] spark issue #22204: [SPARK-25196][SQL] Extends Analyze commands for cached t...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22204 Merged build finished. Test PASSed.
[GitHub] spark issue #22730: [SPARK-16775][CORE] Remove deprecated accumulator v1 API...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22730 **[Test build #4384 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4384/testReport)** for PR 22730 at commit [`41f02f4`](https://github.com/apache/spark/commit/41f02f461d0f632606adb68a36d03a7ed9f044c4). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22778: [SPARK-25784][SQL] Infer filters from constraints after ...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/22778 Can you put a concrete example of the missing case you described in the PR description?
[GitHub] spark issue #22797: [SPARK-19851][SQL] Add support for EVERY and ANY (SOME) ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22797 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97876/ Test PASSed.
[GitHub] spark issue #22797: [SPARK-19851][SQL] Add support for EVERY and ANY (SOME) ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22797 Merged build finished. Test PASSed.
[GitHub] spark issue #22797: [SPARK-19851][SQL] Add support for EVERY and ANY (SOME) ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22797 **[Test build #97876 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97876/testReport)** for PR 22797 at commit [`905652a`](https://github.com/apache/spark/commit/905652a55018433c4e9a7ec1a849c39cf04d8920). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #22788: [SPARK-25769][SQL]escape nested columns by backti...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22788#discussion_r227203199 --- Diff: sql/core/src/test/resources/sql-tests/results/columnresolution-negative.sql.out --- @@ -81,7 +81,7 @@ SELECT t1.i1 FROM t1, mydb1.t1 struct<> -- !query 9 output org.apache.spark.sql.AnalysisException -Reference 't1.i1' is ambiguous, could be: mydb1.t1.i1, mydb1.t1.i1.; line 1 pos 7 +Reference '`t1`.`i1`' is ambiguous, could be: mydb1.t1.i1, mydb1.t1.i1.; line 1 pos 7 --- End diff -- These examples only make sense when we have the outer backticks. e.g. `'t1.i1'` is good.
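The two renderings shown in this diff differ in whether each name part is quoted separately. The sketch below is a hypothetical helper, not code from the pull request. It assumes that an embedded backtick is escaped by doubling it, and it only illustrates why per-part quoting distinguishes a two-part name from a single part that contains a dot.

```scala
object BacktickQuoting {
  // Quote one name part; doubling an embedded backtick is an assumed escaping rule.
  def quote(part: String): String = "`" + part.replace("`", "``") + "`"

  // Quote every part of a multi-part name and join the parts with dots.
  def quoteParts(parts: Seq[String]): String = parts.map(quote).mkString(".")

  def main(args: Array[String]): Unit = {
    println(quoteParts(Seq("t1", "i1"))) // `t1`.`i1`  (column i1 of t1)
    println(quoteParts(Seq("t1.i1")))    // `t1.i1`    (one part containing a dot)
  }
}
```

Without quoting, both inputs would print as `t1.i1`, so the message could not tell the two cases apart.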