svn commit: r28683 - in /dev/spark/2.4.0-SNAPSHOT-2018_08_13_00_02-a992827-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell Date: Mon Aug 13 07:16:40 2018 New Revision: 28683 Log: Apache Spark 2.4.0-SNAPSHOT-2018_08_13_00_02-a992827 docs [This commit notification would consist of 1476 parts, which exceeds the 50-part limit, so it was shortened to this summary.]
spark git commit: [SPARK-25096][SQL] Loosen nullability if the cast is force-nullable.
Repository: spark Updated Branches: refs/heads/master a9928277d -> b270bccff

[SPARK-25096][SQL] Loosen nullability if the cast is force-nullable.

## What changes were proposed in this pull request?

In type coercion for complex types, if casting to the found common type would force nullability, we should loosen the nullability of the result type so that the cast remains valid. For map key types, which must not contain nulls, such a force-nullable type cannot be used at all.

## How was this patch tested?

Added some tests.

Closes #22086 from ueshin/issues/SPARK-25096/fix_type_coercion.

Authored-by: Takuya UESHIN Signed-off-by: hyukjinkwon

Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/b270bccf Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/b270bccf Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/b270bccf Branch: refs/heads/master Commit: b270bccb21b814e77ae55c1b74bc25d7 Parents: a992827 Author: Takuya UESHIN Authored: Mon Aug 13 19:27:17 2018 +0800 Committer: hyukjinkwon Committed: Mon Aug 13 19:27:17 2018 +0800 -- .../sql/catalyst/analysis/TypeCoercion.scala| 21 +--- .../catalyst/analysis/TypeCoercionSuite.scala | 16 +++ 2 files changed, 30 insertions(+), 7 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/b270bccf/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala -- diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala index 27839d7..10d9ee5 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala @@ -153,19 +153,26 @@ object TypeCoercion { t2: DataType, findTypeFunc: (DataType, DataType) => Option[DataType]): Option[DataType] = (t1, t2) match { case (ArrayType(et1, containsNull1), ArrayType(et2, containsNull2)) => - findTypeFunc(et1, et2).map(ArrayType(_, containsNull1 || containsNull2)) + findTypeFunc(et1, et2).map { et => +ArrayType(et, containsNull1 || containsNull2 || + Cast.forceNullable(et1, et) || Cast.forceNullable(et2, et)) + } case (MapType(kt1, vt1, valueContainsNull1), MapType(kt2, vt2, valueContainsNull2)) => - findTypeFunc(kt1, kt2).flatMap { kt => -findTypeFunc(vt1, vt2).map { vt => - MapType(kt, vt, valueContainsNull1 || valueContainsNull2) -} + findTypeFunc(kt1, kt2) +.filter { kt => !Cast.forceNullable(kt1, kt) && !Cast.forceNullable(kt2, kt) } +.flatMap { kt => + findTypeFunc(vt1, vt2).map { vt => +MapType(kt, vt, valueContainsNull1 || valueContainsNull2 || + Cast.forceNullable(vt1, vt) || Cast.forceNullable(vt2, vt)) + } } case (StructType(fields1), StructType(fields2)) if fields1.length == fields2.length => val resolver = SQLConf.get.resolver fields1.zip(fields2).foldLeft(Option(new StructType())) { case (Some(struct), (field1, field2)) if resolver(field1.name, field2.name) => - findTypeFunc(field1.dataType, field2.dataType).map { -dt => struct.add(field1.name, dt, field1.nullable || field2.nullable) + findTypeFunc(field1.dataType, field2.dataType).map { dt => +struct.add(field1.name, dt, field1.nullable || field2.nullable || + Cast.forceNullable(field1.dataType, dt) || Cast.forceNullable(field2.dataType, dt)) } case _ => None } http://git-wip-us.apache.org/repos/asf/spark/blob/b270bccf/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionSuite.scala -- diff --git
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionSuite.scala index d71bbb3..2c6cb3a 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionSuite.scala @@ -499,6 +499,10 @@ class TypeCoercionSuite extends AnalysisTest { ArrayType(new StructType().add("num", ShortType), containsNull = false), ArrayType(new StructType().add("num", LongType), containsNull = false), Some(ArrayType(new StructType().add("num", LongType), containsNull = false))) +widenTestWithStringPromotion( + ArrayType(IntegerType, containsNull = false), + ArrayType(DecimalType.IntDecimal, containsNull = false), + Some(ArrayType(Dec
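The following is a minimal, self-contained sketch of the rule this patch implements, using toy stand-ins for Spark's `org.apache.spark.sql.types` classes; the `forceNullable` predicate below is a simplified assumption standing in for `Cast.forceNullable`, not Spark's actual logic. It shows why the widened array type must allow nulls whenever casting either element type to the common type can itself produce nulls:

```scala
// Toy model of the widening rule; every type here is a simplified stand-in.
sealed trait DataType
case object IntType extends DataType
final case class DecimalType(precision: Int, scale: Int) extends DataType
final case class ArrayType(elementType: DataType, containsNull: Boolean) extends DataType

object WidenSketch {
  // Assumption: a cast "forces" nullability when it can turn a non-null value
  // into null, e.g. a decimal cast that may overflow. Spark's real predicate
  // (Cast.forceNullable) covers many more cases.
  def forceNullable(from: DataType, to: DataType): Boolean = (from, to) match {
    case (DecimalType(p1, s1), DecimalType(p2, s2)) => p2 - s2 < p1 - s1 || s2 < s1
    case (IntType, DecimalType(p, s)) => p - s < 10 // an Int may need 10 integral digits
    case _ => false
  }

  def widenArrays(
      t1: ArrayType,
      t2: ArrayType,
      findWiderType: (DataType, DataType) => Option[DataType]): Option[ArrayType] =
    findWiderType(t1.elementType, t2.elementType).map { et =>
      // Loosen nullability: either side already allowed nulls, or either
      // element cast to the widened type is force-nullable.
      ArrayType(et, t1.containsNull || t2.containsNull ||
        forceNullable(t1.elementType, et) || forceNullable(t2.elementType, et))
    }
}
```

The map-key branch in the real patch adds a `.filter` instead: since map keys can never be null, a force-nullable key cast disqualifies the candidate type entirely.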
spark git commit: [SPARK-24391][SQL] Support arrays of any types by from_json
Repository: spark Updated Branches: refs/heads/master b270bccff -> ab06c2535

[SPARK-24391][SQL] Support arrays of any types by from_json

## What changes were proposed in this pull request?

The PR removes the restriction on element types of the root-level array type that exists in `from_json`. Currently, the function can handle only arrays of structs; even arrays of primitive types are disallowed. The PR allows arrays of any type currently supported by the JSON datasource. Here is an example of an array of a primitive type:

```
scala> import org.apache.spark.sql.functions._
scala> val df = Seq("[1, 2, 3]").toDF("a")
scala> val schema = new ArrayType(IntegerType, false)
scala> val arr = df.select(from_json($"a", schema))
scala> arr.printSchema
root
 |-- jsontostructs(a): array (nullable = true)
 |    |-- element: integer (containsNull = true)
```

and the result of converting the JSON string to the `ArrayType`:

```
scala> arr.show
+----------------+
|jsontostructs(a)|
+----------------+
|       [1, 2, 3]|
+----------------+
```

## How was this patch tested?

I added a few positive and negative tests:
- array of primitive types
- array of arrays
- array of structs
- array of maps

Closes #21439 from MaxGekk/from_json-array.

Lead-authored-by: Maxim Gekk Co-authored-by: Maxim Gekk Signed-off-by: hyukjinkwon

Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/ab06c253 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/ab06c253 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/ab06c253 Branch: refs/heads/master Commit: ab06c25350f8a997bef0c3dd8aa82b709e7dfb3f Parents: b270bcc Author: Maxim Gekk Authored: Mon Aug 13 20:13:09 2018 +0800 Committer: hyukjinkwon Committed: Mon Aug 13 20:13:09 2018 +0800 -- python/pyspark/sql/functions.py | 7 +- .../catalyst/expressions/jsonExpressions.scala | 19 ++--- .../spark/sql/catalyst/json/JacksonParser.scala | 30 .../scala/org/apache/spark/sql/functions.scala | 10 +-- .../sql-tests/inputs/json-functions.sql | 12 .../sql-tests/results/json-functions.sql.out| 66 - .../apache/spark/sql/JsonFunctionsSuite.scala | 76 ++-- 7 files changed, 194 insertions(+), 26 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/ab06c253/python/pyspark/sql/functions.py -- diff --git a/python/pyspark/sql/functions.py b/python/pyspark/sql/functions.py index eaecf28..f583373 100644 --- a/python/pyspark/sql/functions.py +++ b/python/pyspark/sql/functions.py @@ -2241,7 +2241,7 @@ def json_tuple(col, *fields): def from_json(col, schema, options={}): """ Parses a column containing a JSON string into a :class:`MapType` with :class:`StringType` -as keys type, :class:`StructType` or :class:`ArrayType` of :class:`StructType`\\s with +as keys type, :class:`StructType` or :class:`ArrayType` with the specified schema. Returns `null`, in the case of an unparseable string.
:param col: string column in json format @@ -2269,6 +2269,11 @@ def from_json(col, schema, options={}): >>> schema = schema_of_json(lit('''{"a": 0}''')) >>> df.select(from_json(df.value, schema).alias("json")).collect() [Row(json=Row(a=1))] +>>> data = [(1, '''[1, 2, 3]''')] +>>> schema = ArrayType(IntegerType()) +>>> df = spark.createDataFrame(data, ("key", "value")) +>>> df.select(from_json(df.value, schema).alias("json")).collect() +[Row(json=[1, 2, 3])] """ sc = SparkContext._active_spark_context http://git-wip-us.apache.org/repos/asf/spark/blob/ab06c253/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala -- diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala index abe8875..ca99100 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala @@ -495,7 +495,7 @@ case class JsonTuple(children: Seq[Expression]) } /** - * Converts an json input string to a [[StructType]] or [[ArrayType]] of [[StructType]]s + * Converts an json input string to a [[StructType]], [[ArrayType]] or [[MapType]] * with the specified schema. */ // scalastyle:off line.size.limit @@ -544,17 +544,10 @@ case class JsonToStructs( timeZoneId = None) override def checkInputDataTypes(): TypeCheckResult = nullableSchema match { -case _: StructType | ArrayType(_: StructType, _) | _: Map
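In the same spirit as the examples in the PR description above, here is a hedged spark-shell sketch (not taken from the PR itself; the expected result is an assumption based on its description) of another newly allowed root type, an array of maps:

```scala
// Assumes a spark-shell session, where `spark` and its implicits are in scope.
import org.apache.spark.sql.functions.from_json
import org.apache.spark.sql.types.{ArrayType, IntegerType, MapType, StringType}

val df = Seq("""[{"a": 1}, {"b": 2}]""").toDF("json")
val schema = ArrayType(MapType(StringType, IntegerType))
// Expected output: a single row holding Array(Map("a" -> 1), Map("b" -> 2)).
df.select(from_json($"json", schema)).show(false)
```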
spark git commit: [SPARK-25099][SQL][TEST] Generate Avro Binary files in test suite
Repository: spark Updated Branches: refs/heads/master ab06c2535 -> 26775e3c8

[SPARK-25099][SQL][TEST] Generate Avro Binary files in test suite

## What changes were proposed in this pull request?

In PR https://github.com/apache/spark/pull/21984 and https://github.com/apache/spark/pull/21935, the related test cases use binary files created by Python scripts. This PR generates the binary files in the test suite instead, to make them more transparent. We can also move the related test cases to a new file, `AvroLogicalTypeSuite.scala`.

## How was this patch tested?

Unit test.

Closes #22091 from gengliangwang/logicalType_suite.

Authored-by: Gengliang Wang Signed-off-by: Wenchen Fan

Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/26775e3c Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/26775e3c Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/26775e3c Branch: refs/heads/master Commit: 26775e3c8ed5bf9028253280b57da64678363f8a Parents: ab06c25 Author: Gengliang Wang Authored: Mon Aug 13 20:50:28 2018 +0800 Committer: Wenchen Fan Committed: Mon Aug 13 20:50:28 2018 +0800 -- external/avro/src/test/resources/date.avro | Bin 209 -> 0 bytes external/avro/src/test/resources/timestamp.avro | Bin 375 -> 0 bytes .../spark/sql/avro/AvroLogicalTypeSuite.scala | 298 +++ .../org/apache/spark/sql/avro/AvroSuite.scala | 242 +-- 4 files changed, 299 insertions(+), 241 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/26775e3c/external/avro/src/test/resources/date.avro -- diff --git a/external/avro/src/test/resources/date.avro b/external/avro/src/test/resources/date.avro deleted file mode 100644 index 3a67617..000 Binary files a/external/avro/src/test/resources/date.avro and /dev/null differ http://git-wip-us.apache.org/repos/asf/spark/blob/26775e3c/external/avro/src/test/resources/timestamp.avro -- diff --git a/external/avro/src/test/resources/timestamp.avro b/external/avro/src/test/resources/timestamp.avro deleted file mode 100644 index daef50b..000 Binary files a/external/avro/src/test/resources/timestamp.avro and /dev/null differ http://git-wip-us.apache.org/repos/asf/spark/blob/26775e3c/external/avro/src/test/scala/org/apache/spark/sql/avro/AvroLogicalTypeSuite.scala -- diff --git a/external/avro/src/test/scala/org/apache/spark/sql/avro/AvroLogicalTypeSuite.scala b/external/avro/src/test/scala/org/apache/spark/sql/avro/AvroLogicalTypeSuite.scala new file mode 100644 index 000..24d8c53 --- /dev/null +++ b/external/avro/src/test/scala/org/apache/spark/sql/avro/AvroLogicalTypeSuite.scala @@ -0,0 +1,298 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License.
+ */ +package org.apache.spark.sql.avro + +import java.io.File +import java.sql.Timestamp + +import org.apache.avro.{LogicalTypes, Schema} +import org.apache.avro.Conversions.DecimalConversion +import org.apache.avro.file.DataFileWriter +import org.apache.avro.generic.{GenericData, GenericDatumWriter, GenericRecord} + +import org.apache.spark.SparkException +import org.apache.spark.sql.{QueryTest, Row} +import org.apache.spark.sql.catalyst.util.DateTimeUtils +import org.apache.spark.sql.internal.SQLConf +import org.apache.spark.sql.test.{SharedSQLContext, SQLTestUtils} +import org.apache.spark.sql.types.{StructField, StructType, TimestampType} + +class AvroLogicalTypeSuite extends QueryTest with SharedSQLContext with SQLTestUtils { + import testImplicits._ + + val dateSchema = s""" + { +"namespace": "logical", +"type": "record", +"name": "test", +"fields": [ + {"name": "date", "type": {"type": "int", "logicalType": "date"}} +] + } +""" + + val dateInputData = Seq(7, 365, 0) + + def dateFile(path: String): String = { +val schema = new Schema.Parser().pars
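The suite's generator methods are truncated above; the following is a hedged reconstruction of the approach (standard Avro Java API calls; the helper name `writeDateFile` is illustrative), which writes a small Avro file with a `date` logical type instead of checking in an opaque binary fixture:

```scala
import java.io.File

import org.apache.avro.Schema
import org.apache.avro.file.DataFileWriter
import org.apache.avro.generic.{GenericData, GenericDatumWriter, GenericRecord}

// Writes a tiny Avro file whose single field uses the `date` logical type,
// mirroring the schema and input data (7, 365, 0) shown in the suite above.
def writeDateFile(path: String): Unit = {
  val schemaJson =
    """{"namespace": "logical", "type": "record", "name": "test",
      |  "fields": [{"name": "date", "type": {"type": "int", "logicalType": "date"}}]}
      |""".stripMargin
  val schema = new Schema.Parser().parse(schemaJson)
  val writer = new DataFileWriter[GenericRecord](new GenericDatumWriter[GenericRecord](schema))
  writer.create(schema, new File(path))
  Seq(7, 365, 0).foreach { daysSinceEpoch =>
    val record = new GenericData.Record(schema)
    record.put("date", daysSinceEpoch) // days since the Unix epoch
    writer.append(record)
  }
  writer.close()
}
```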
spark git commit: [SPARK-22713][CORE] ExternalAppendOnlyMap leaks when spilled during iteration
Repository: spark Updated Branches: refs/heads/master 26775e3c8 -> 2e3abdff2

[SPARK-22713][CORE] ExternalAppendOnlyMap leaks when spilled during iteration

## What changes were proposed in this pull request?

This PR solves [SPARK-22713](https://issues.apache.org/jira/browse/SPARK-22713), which describes a memory leak that occurs when an ExternalAppendOnlyMap is spilled during iteration (as opposed to during insertion). ExternalAppendOnlyMap's iterator supports spilling, but it kept a reference to the internal map (via an internal iterator) after spilling. The original code was apparently supposed to drop this reference on the next iteration, but according to the elaborate investigation described in the JIRA, this never happened. The fix is simply to replace the internal iterator immediately after spilling.

## How was this patch tested?

I've introduced a new test in ExternalAppendOnlyMapSuite; this test asserts that neither the external map itself nor its iterator hold any reference to the internal map after a spill. This approach required relaxing the access of some member variables and nested classes of ExternalAppendOnlyMap; these members are now package-private and annotated with VisibleForTesting.

Closes #21369 from eyalfa/SPARK-22713__ExternalAppendOnlyMap_effective_spill.

Authored-by: Eyal Farago Signed-off-by: Wenchen Fan

Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/2e3abdff Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/2e3abdff Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/2e3abdff Branch: refs/heads/master Commit: 2e3abdff23a0725b80992cc30dba2ecf9c2e7fd3 Parents: 26775e3 Author: Eyal Farago Authored: Mon Aug 13 20:55:46 2018 +0800 Committer: Wenchen Fan Committed: Mon Aug 13 20:55:46 2018 +0800 -- .../util/collection/ExternalAppendOnlyMap.scala | 35 +++--- .../collection/ExternalAppendOnlyMapSuite.scala | 119 ++- 2 files changed, 138 insertions(+), 16 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/2e3abdff/core/src/main/scala/org/apache/spark/util/collection/ExternalAppendOnlyMap.scala -- diff --git a/core/src/main/scala/org/apache/spark/util/collection/ExternalAppendOnlyMap.scala b/core/src/main/scala/org/apache/spark/util/collection/ExternalAppendOnlyMap.scala index d83da0d..19ff109 100644 --- a/core/src/main/scala/org/apache/spark/util/collection/ExternalAppendOnlyMap.scala +++ b/core/src/main/scala/org/apache/spark/util/collection/ExternalAppendOnlyMap.scala @@ -80,7 +80,10 @@ class ExternalAppendOnlyMap[K, V, C]( this(createCombiner, mergeValue, mergeCombiners, serializer, blockManager, TaskContext.get()) } - @volatile private var currentMap = new SizeTrackingAppendOnlyMap[K, C] + /** + * Exposed for testing + */ + @volatile private[collection] var currentMap = new SizeTrackingAppendOnlyMap[K, C] private val spilledMaps = new ArrayBuffer[DiskMapIterator] private val sparkConf = SparkEnv.get.conf private val diskBlockManager = blockManager.diskBlockManager @@ -267,7 +270,7 @@ class ExternalAppendOnlyMap[K, V, C]( */ def destructiveIterator(inMemoryIterator: Iterator[(K, C)]): Iterator[(K, C)] = { readingIterator = new SpillableIterator(inMemoryIterator) -readingIterator +readingIterator.toCompletionIterator } /** @@ -280,8 +283,7 @@ class ExternalAppendOnlyMap[K, V, C]( "ExternalAppendOnlyMap.iterator is destructive and should only be called once.") } if (spilledMaps.isEmpty) { -
CompletionIterator[(K, C), Iterator[(K, C)]]( -destructiveIterator(currentMap.iterator), freeCurrentMap()) + destructiveIterator(currentMap.iterator) } else { new ExternalIterator() } @@ -305,8 +307,8 @@ class ExternalAppendOnlyMap[K, V, C]( // Input streams are derived both from the in-memory map and spilled maps on disk // The in-memory map is sorted in place, while the spilled maps are already in sorted order -private val sortedMap = CompletionIterator[(K, C), Iterator[(K, C)]](destructiveIterator( - currentMap.destructiveSortedIterator(keyComparator)), freeCurrentMap()) +private val sortedMap = destructiveIterator( + currentMap.destructiveSortedIterator(keyComparator)) private val inputStreams = (Seq(sortedMap) ++ spilledMaps).map(it => it.buffered) inputStreams.foreach { it => @@ -568,13 +570,11 @@ class ExternalAppendOnlyMap[K, V, C]( context.addTaskCompletionListener[Unit](context => cleanup()) } - private[this] class SpillableIterator(var upstream: Iterator[(K, C)]) + private class SpillableIterator(var upstream:
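A much-simplified, hedged sketch of the pattern the fix relies on (this is not the real `ExternalAppendOnlyMap` code; names loosely follow the diff): once the in-memory contents are spilled, `upstream` is immediately repointed at the on-disk iterator, so the last reference to the in-memory map is dropped and it becomes garbage-collectable mid-iteration.

```scala
class SpillableIteratorSketch[A](initial: Iterator[A]) extends Iterator[A] {
  @volatile private var upstream: Iterator[A] = initial
  private var hasSpilled = false

  // `spillTo` stands in for writing the remaining elements to disk and
  // returning a reader over the spilled file.
  def spill(spillTo: Iterator[A] => Iterator[A]): Unit = synchronized {
    if (!hasSpilled) {
      upstream = spillTo(upstream) // replace the reference right away
      hasSpilled = true
    }
  }

  override def hasNext: Boolean = synchronized(upstream.hasNext)
  override def next(): A = synchronized(upstream.next())
}
```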
spark git commit: [SPARK-23908][SQL][FOLLOW-UP] Rename inputs to arguments, and add argument type check.
Repository: spark Updated Branches: refs/heads/master 2e3abdff2 -> b804ca577

[SPARK-23908][SQL][FOLLOW-UP] Rename inputs to arguments, and add argument type check.

## What changes were proposed in this pull request?

This is a follow-up PR of #21954 to address review comments.

- Rename the ambiguous name `inputs` to `arguments`.
- Add an argument type check and remove the hacky workaround.
- Address other small comments.

## How was this patch tested?

Existing tests and some additional tests.

Closes #22075 from ueshin/issues/SPARK-23908/fup1.

Authored-by: Takuya UESHIN Signed-off-by: Wenchen Fan

Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/b804ca57 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/b804ca57 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/b804ca57 Branch: refs/heads/master Commit: b804ca57718ad1568458d8185c8c30118be8275f Parents: 2e3abdf Author: Takuya UESHIN Authored: Mon Aug 13 20:58:29 2018 +0800 Committer: Wenchen Fan Committed: Mon Aug 13 20:58:29 2018 +0800 -- .../sql/catalyst/analysis/CheckAnalysis.scala | 14 ++ .../analysis/higherOrderFunctions.scala | 12 +- .../expressions/ExpectsInputTypes.scala | 16 +- .../expressions/higherOrderFunctions.scala | 181 ++- .../spark/sql/catalyst/plans/PlanTest.scala | 2 +- .../spark/sql/DataFrameFunctionsSuite.scala | 25 +++ 6 files changed, 152 insertions(+), 98 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/b804ca57/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala -- diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala index 4addc83..6a91d55 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala @@ -90,6 +90,20 @@ trait CheckAnalysis extends PredicateHelper { u.failAnalysis(s"Table or view not found: ${u.tableIdentifier}") case operator: LogicalPlan => +// Check argument data types of higher-order functions downwards first. +// If the arguments of the higher-order functions are resolved but the type check fails, +// the argument functions will not get resolved, but we should report the argument type +// check failure instead of claiming the argument functions are unresolved.
+operator transformExpressionsDown { + case hof: HigherOrderFunction + if hof.argumentsResolved && hof.checkArgumentDataTypes().isFailure => +hof.checkArgumentDataTypes() match { + case TypeCheckResult.TypeCheckFailure(message) => +hof.failAnalysis( + s"cannot resolve '${hof.sql}' due to argument data type mismatch: $message") +} +} + operator transformExpressionsUp { case a: Attribute if !a.resolved => val from = operator.inputSet.map(_.qualifiedName).mkString(", ") http://git-wip-us.apache.org/repos/asf/spark/blob/b804ca57/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/higherOrderFunctions.scala -- diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/higherOrderFunctions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/higherOrderFunctions.scala index 5e2029c..dd08190 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/higherOrderFunctions.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/higherOrderFunctions.scala @@ -95,15 +95,15 @@ case class ResolveLambdaVariables(conf: SQLConf) extends Rule[LogicalPlan] { */ private def createLambda( e: Expression, - partialArguments: Seq[(DataType, Boolean)]): LambdaFunction = e match { + argInfo: Seq[(DataType, Boolean)]): LambdaFunction = e match { case f: LambdaFunction if f.bound => f case LambdaFunction(function, names, _) => - if (names.size != partialArguments.size) { + if (names.size != argInfo.size) { e.failAnalysis( s"The number of lambda function arguments '${names.size}' does not " + "match the number of arguments expected by the higher order function " + -s"'${partialArguments.size}'.") +s"'${argInfo.size}'.") } if (names.map(a => canonicalizer(a.name)).distinct.size < names.size) { @@ -111,7 +111,7 @@ case class ResolveLambdaVariable
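A hedged illustration of the behavior this downward check targets (the exact error text below is an assumption): passing a non-array first argument to a higher-order function such as `transform` should now surface an argument data type mismatch rather than a confusing unresolved-lambda error.

```scala
import org.apache.spark.sql.{AnalysisException, SparkSession}

val spark = SparkSession.builder().master("local[*]").appName("hof-arg-check").getOrCreate()

// `transform` expects an array as its first argument; an integer should fail
// the argument type check at analysis time.
try {
  spark.sql("SELECT transform(1, x -> x + 1)").show()
} catch {
  case e: AnalysisException => println(e.getMessage) // "... due to argument data type mismatch ..."
}
```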
spark-website git commit: Add CVE-2018-11770
Repository: spark-website Updated Branches: refs/heads/asf-site a63b5f427 -> e33a4bb7d Add CVE-2018-11770 Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo Commit: http://git-wip-us.apache.org/repos/asf/spark-website/commit/e33a4bb7 Tree: http://git-wip-us.apache.org/repos/asf/spark-website/tree/e33a4bb7 Diff: http://git-wip-us.apache.org/repos/asf/spark-website/diff/e33a4bb7 Branch: refs/heads/asf-site Commit: e33a4bb7d8bbc25bb6a7d96c8bd6c13e3b05e77b Parents: a63b5f4 Author: Sean Owen Authored: Mon Aug 13 09:25:05 2018 -0500 Committer: Sean Owen Committed: Mon Aug 13 09:25:05 2018 -0500 -- security.md| 62 +-- site/security.html | 99 +++-- 2 files changed, 138 insertions(+), 23 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark-website/blob/e33a4bb7/security.md -- diff --git a/security.md b/security.md index f99b9bd..19231f6 100644 --- a/security.md +++ b/security.md @@ -10,15 +10,55 @@ navigation: Reporting Security Issues Apache Spark uses the standard process outlined by the [Apache Security Team](https://www.apache.org/security/) -for reporting vulnerabilities. +for reporting vulnerabilities. Note that vulnerabilities should not be publicly disclosed until the project has +responded. To report a possible security vulnerability, please email `secur...@apache.org`. This is a non-public list that will reach the Apache Security team, as well as the Spark PMC. Known Security Issues +CVE-2018-11770: Apache Spark standalone master, Mesos REST APIs not controlled by authentication + +Severity: Medium + +Vendor: The Apache Software Foundation + +Versions Affected: + +- Spark versions from 1.3.0, running standalone master with REST API enabled, or running Mesos master with cluster mode enabled + +Description: + +From version 1.3.0 onward, Spark's standalone master exposes a REST API for job submission, in addition +to the submission mechanism used by `spark-submit`. In standalone, the config property +`spark.authenticate.secret` establishes a shared secret for authenticating requests to submit jobs via +`spark-submit`. However, the REST API does not use this or any other authentication mechanism, and this is +not adequately documented. In this case, a user would be able to run a driver program without authenticating, +but not launch executors, using the REST API. This REST API is also used by Mesos, when set up to run in +cluster mode (i.e., when also running `MesosClusterDispatcher`), for job submission. Future versions of Spark +will improve documentation on these points, and prohibit setting `spark.authenticate.secret` when running +the REST APIs, to make this clear. Future versions will also disable the REST API by default in the +standalone master by changing the default value of `spark.master.rest.enabled` to `false`. + +Mitigation: + +For standalone masters, disable the REST API by setting `spark.master.rest.enabled` to `false` if it is unused, +and/or ensure that all network access to the REST API (port 6066 by default) is restricted to hosts that are +trusted to submit jobs. Mesos users can stop the `MesosClusterDispatcher`, though that will prevent them +from running jobs in cluster mode. Alternatively, they can ensure access to the `MesosRestSubmissionServer` +(port 7077 by default) is restricted to trusted hosts. 
+ +Credit: + +- Imran Rashid, Cloudera +- Fengwei Zhang, Alibaba Cloud Security Team + + CVE-2018-8024: Apache Spark XSS vulnerability in UI +Severity: Medium + Versions Affected: - Spark versions through 2.1.2 @@ -26,6 +66,7 @@ Versions Affected: - Spark 2.3.0 Description: + In Apache Spark up to and including 2.1.2, 2.2.0 to 2.2.1, and 2.3.0, it's possible for a malicious user to construct a URL pointing to a Spark cluster's UI's job and stage info pages, and if a user can be tricked into accessing the URL, can be used to cause script to execute and expose information from @@ -55,6 +96,7 @@ Versions affected: - Spark 2.3.0 Description: + In Apache Spark up to and including 2.1.2, 2.2.0 to 2.2.1, and 2.3.0, when using PySpark or SparkR, it's possible for a different local user to connect to the Spark application and impersonate the user running the Spark application. @@ -79,9 +121,11 @@ Severity: Medium Vendor: The Apache Software Foundation Versions Affected: -Versions of Apache Spark from 1.6.0 until 2.1.1 + +- Versions of Apache Spark from 1.6.0 until 2.1.1 Description: + In Apache Spark 1.6.0 until 2.1.1, the launcher API performs unsafe deserialization of data received by its socket. This makes applications launched programmatically using the launcher API potentially @@ -92,6 +13
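As a concrete example of the standalone-master mitigation described above (the property key comes from the advisory text; the file location is the usual convention):

```
# conf/spark-defaults.conf on the standalone master
# Disable the unauthenticated REST submission API (port 6066 by default)
spark.master.rest.enabled  false
```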
svn commit: r28694 - in /dev/spark/2.4.0-SNAPSHOT-2018_08_13_08_02-b804ca5-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell Date: Mon Aug 13 15:16:09 2018 New Revision: 28694 Log: Apache Spark 2.4.0-SNAPSHOT-2018_08_13_08_02-b804ca5 docs [This commit notification would consist of 1476 parts, which exceeds the 50-part limit, so it was shortened to this summary.]
spark git commit: [SPARK-25028][SQL] Avoid NPE when analyzing partition with NULL values
Repository: spark Updated Branches: refs/heads/master b804ca577 -> c220cc42a

[SPARK-25028][SQL] Avoid NPE when analyzing partition with NULL values

## What changes were proposed in this pull request?

`ANALYZE TABLE ... PARTITION(...) COMPUTE STATISTICS` can fail with an NPE if a partition column contains a NULL value. The PR avoids the NPE by replacing the `NULL` values with the default partition placeholder.

## How was this patch tested?

Added a UT.

Closes #22036 from mgaido91/SPARK-25028.

Authored-by: Marco Gaido Signed-off-by: Wenchen Fan

Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c220cc42 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/c220cc42 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/c220cc42 Branch: refs/heads/master Commit: c220cc42abebbc98a6110b50f787eb6d338c2d97 Parents: b804ca5 Author: Marco Gaido Authored: Tue Aug 14 00:59:18 2018 +0800 Committer: Wenchen Fan Committed: Tue Aug 14 00:59:18 2018 +0800 -- .../command/AnalyzePartitionCommand.scala | 10 -- .../spark/sql/StatisticsCollectionSuite.scala | 18 ++ 2 files changed, 26 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/c220cc42/sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzePartitionCommand.scala -- diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzePartitionCommand.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzePartitionCommand.scala index 5b54b22..18fefa0 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzePartitionCommand.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzePartitionCommand.scala @@ -20,7 +20,7 @@ package org.apache.spark.sql.execution.command import org.apache.spark.sql.{AnalysisException, Column, Row, SparkSession} import org.apache.spark.sql.catalyst.TableIdentifier import org.apache.spark.sql.catalyst.analysis.{NoSuchPartitionException, UnresolvedAttribute} -import org.apache.spark.sql.catalyst.catalog.{CatalogTable, CatalogTableType} +import org.apache.spark.sql.catalyst.catalog.{CatalogTable, CatalogTableType, ExternalCatalogUtils} import org.apache.spark.sql.catalyst.catalog.CatalogTypes.TablePartitionSpec import org.apache.spark.sql.catalyst.expressions.{And, EqualTo, Literal} import org.apache.spark.sql.execution.datasources.PartitioningUtils @@ -140,7 +140,13 @@ case class AnalyzePartitionCommand( val df = tableDf.filter(Column(filter)).groupBy(partitionColumns: _*).count() df.collect().map { r => - val partitionColumnValues = partitionColumns.indices.map(r.get(_).toString) + val partitionColumnValues = partitionColumns.indices.map { i => +if (r.isNullAt(i)) { + ExternalCatalogUtils.DEFAULT_PARTITION_NAME +} else { + r.get(i).toString +} + } val spec = tableMeta.partitionColumnNames.zip(partitionColumnValues).toMap val count = BigInt(r.getLong(partitionColumns.size)) (spec, count) http://git-wip-us.apache.org/repos/asf/spark/blob/c220cc42/sql/core/src/test/scala/org/apache/spark/sql/StatisticsCollectionSuite.scala -- diff --git a/sql/core/src/test/scala/org/apache/spark/sql/StatisticsCollectionSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/StatisticsCollectionSuite.scala index 60fa951..cb562d6 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/StatisticsCollectionSuite.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/StatisticsCollectionSuite.scala @@ -204,6 +204,24 @@ class StatisticsCollectionSuite
extends StatisticsCollectionTestBase with Shared } } + test("SPARK-25028: column stats collection for null partitioning columns") { +val table = "analyze_partition_with_null" +withTempDir { dir => + withTable(table) { +sql(s""" + |CREATE TABLE $table (value string, name string) + |USING PARQUET + |PARTITIONED BY (name) + |LOCATION '${dir.toURI}'""".stripMargin) +val df = Seq(("a", null), ("b", null)).toDF("value", "name") +df.write.mode("overwrite").insertInto(table) +sql(s"ANALYZE TABLE $table PARTITION (name) COMPUTE STATISTICS") +val partitions = spark.sessionState.catalog.listPartitions(TableIdentifier(table)) +assert(partitions.head.stats.get.rowCount.get == 2) + } +} + } + test("number format in statistics") { val numbers = Seq( BigInt(0) -> (("0.0 B", "0")),
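A hedged spark-shell repro of the scenario, closely following the new test above (the table name `analyze_npe` is illustrative; assumes `spark` and its implicits are in scope). Before this fix, the final `ANALYZE` statement threw a `NullPointerException`; afterwards the null partition value maps to the default partition placeholder and statistics are computed.

```scala
spark.sql(
  """CREATE TABLE analyze_npe (value STRING, name STRING)
    |USING PARQUET
    |PARTITIONED BY (name)""".stripMargin)
Seq(("a", null: String), ("b", null: String))
  .toDF("value", "name")
  .write.mode("overwrite").insertInto("analyze_npe")
spark.sql("ANALYZE TABLE analyze_npe PARTITION (name) COMPUTE STATISTICS")
```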
spark git commit: [SPARK-25028][SQL] Avoid NPE when analyzing partition with NULL values
Repository: spark Updated Branches: refs/heads/branch-2.3 b9b35b959 -> 787790b3c

[SPARK-25028][SQL] Avoid NPE when analyzing partition with NULL values

## What changes were proposed in this pull request?

`ANALYZE TABLE ... PARTITION(...) COMPUTE STATISTICS` can fail with an NPE if a partition column contains a NULL value. The PR avoids the NPE by replacing the `NULL` values with the default partition placeholder.

## How was this patch tested?

Added a UT.

Closes #22036 from mgaido91/SPARK-25028.

Authored-by: Marco Gaido Signed-off-by: Wenchen Fan (cherry picked from commit c220cc42abebbc98a6110b50f787eb6d338c2d97) Signed-off-by: Wenchen Fan

Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/787790b3 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/787790b3 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/787790b3 Branch: refs/heads/branch-2.3 Commit: 787790b3c733085b8b5e95cf832dedd481ab3b9a Parents: b9b35b9 Author: Marco Gaido Authored: Tue Aug 14 00:59:18 2018 +0800 Committer: Wenchen Fan Committed: Tue Aug 14 00:59:54 2018 +0800 -- .../command/AnalyzePartitionCommand.scala | 10 -- .../spark/sql/StatisticsCollectionSuite.scala | 18 ++ 2 files changed, 26 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/787790b3/sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzePartitionCommand.scala -- diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzePartitionCommand.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzePartitionCommand.scala index 5b54b22..18fefa0 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzePartitionCommand.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzePartitionCommand.scala @@ -20,7 +20,7 @@ package org.apache.spark.sql.execution.command import org.apache.spark.sql.{AnalysisException, Column, Row, SparkSession} import org.apache.spark.sql.catalyst.TableIdentifier import org.apache.spark.sql.catalyst.analysis.{NoSuchPartitionException, UnresolvedAttribute} -import org.apache.spark.sql.catalyst.catalog.{CatalogTable, CatalogTableType} +import org.apache.spark.sql.catalyst.catalog.{CatalogTable, CatalogTableType, ExternalCatalogUtils} import org.apache.spark.sql.catalyst.catalog.CatalogTypes.TablePartitionSpec import org.apache.spark.sql.catalyst.expressions.{And, EqualTo, Literal} import org.apache.spark.sql.execution.datasources.PartitioningUtils @@ -140,7 +140,13 @@ case class AnalyzePartitionCommand( val df = tableDf.filter(Column(filter)).groupBy(partitionColumns: _*).count() df.collect().map { r => - val partitionColumnValues = partitionColumns.indices.map(r.get(_).toString) + val partitionColumnValues = partitionColumns.indices.map { i => +if (r.isNullAt(i)) { + ExternalCatalogUtils.DEFAULT_PARTITION_NAME +} else { + r.get(i).toString +} + } val spec = tableMeta.partitionColumnNames.zip(partitionColumnValues).toMap val count = BigInt(r.getLong(partitionColumns.size)) (spec, count) http://git-wip-us.apache.org/repos/asf/spark/blob/787790b3/sql/core/src/test/scala/org/apache/spark/sql/StatisticsCollectionSuite.scala -- diff --git a/sql/core/src/test/scala/org/apache/spark/sql/StatisticsCollectionSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/StatisticsCollectionSuite.scala index b11e798..0e7209a 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/StatisticsCollectionSuite.scala +++ 
b/sql/core/src/test/scala/org/apache/spark/sql/StatisticsCollectionSuite.scala @@ -198,6 +198,24 @@ class StatisticsCollectionSuite extends StatisticsCollectionTestBase with Shared } } + test("SPARK-25028: column stats collection for null partitioning columns") { +val table = "analyze_partition_with_null" +withTempDir { dir => + withTable(table) { +sql(s""" + |CREATE TABLE $table (value string, name string) + |USING PARQUET + |PARTITIONED BY (name) + |LOCATION '${dir.toURI}'""".stripMargin) +val df = Seq(("a", null), ("b", null)).toDF("value", "name") +df.write.mode("overwrite").insertInto(table) +sql(s"ANALYZE TABLE $table PARTITION (name) COMPUTE STATISTICS") +val partitions = spark.sessionState.catalog.listPartitions(TableIdentifier(table)) +assert(partitions.head.stats.get.rowCount.get == 2) + } +} + } + test("number format in statistics") { val numbers = Seq( Big
svn commit: r28695 - in /dev/spark/2.3.3-SNAPSHOT-2018_08_13_10_01-787790b-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell Date: Mon Aug 13 17:15:24 2018 New Revision: 28695 Log: Apache Spark 2.3.3-SNAPSHOT-2018_08_13_10_01-787790b docs [This commit notification would consist of 1443 parts, which exceeds the 50-part limit, so it was shortened to this summary.]
svn commit: r28697 - in /dev/spark/2.4.0-SNAPSHOT-2018_08_13_12_01-c220cc4-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell Date: Mon Aug 13 19:16:02 2018 New Revision: 28697 Log: Apache Spark 2.4.0-SNAPSHOT-2018_08_13_12_01-c220cc4 docs [This commit notification would consist of 1476 parts, which exceeds the 50-part limit, so it was shortened to this summary.]
spark-website git commit: Stash pride logo for next year
Repository: spark-website Updated Branches: refs/heads/asf-site e33a4bb7d -> 8eb764260 Stash pride logo for next year Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo Commit: http://git-wip-us.apache.org/repos/asf/spark-website/commit/8eb76426 Tree: http://git-wip-us.apache.org/repos/asf/spark-website/tree/8eb76426 Diff: http://git-wip-us.apache.org/repos/asf/spark-website/diff/8eb76426 Branch: refs/heads/asf-site Commit: 8eb764260f5308960c69c212c642cd19ededf3ed Parents: e33a4bb Author: Sean Owen Authored: Sat Aug 11 21:35:01 2018 -0500 Committer: Sean Owen Committed: Mon Aug 13 20:12:03 2018 -0500 -- images/spark-logo-trademark.png | Bin 49720 -> 26999 bytes images/spark-logo.png| Bin 49720 -> 26999 bytes site/images/spark-logo-trademark.png | Bin 49720 -> 26999 bytes site/images/spark-logo.png | Bin 49720 -> 26999 bytes 4 files changed, 0 insertions(+), 0 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark-website/blob/8eb76426/images/spark-logo-trademark.png -- diff --git a/images/spark-logo-trademark.png b/images/spark-logo-trademark.png index eab639f..16702a9 100644 Binary files a/images/spark-logo-trademark.png and b/images/spark-logo-trademark.png differ http://git-wip-us.apache.org/repos/asf/spark-website/blob/8eb76426/images/spark-logo.png -- diff --git a/images/spark-logo.png b/images/spark-logo.png index eab639f..16702a9 100644 Binary files a/images/spark-logo.png and b/images/spark-logo.png differ http://git-wip-us.apache.org/repos/asf/spark-website/blob/8eb76426/site/images/spark-logo-trademark.png -- diff --git a/site/images/spark-logo-trademark.png b/site/images/spark-logo-trademark.png index eab639f..16702a9 100644 Binary files a/site/images/spark-logo-trademark.png and b/site/images/spark-logo-trademark.png differ http://git-wip-us.apache.org/repos/asf/spark-website/blob/8eb76426/site/images/spark-logo.png -- diff --git a/site/images/spark-logo.png b/site/images/spark-logo.png index eab639f..16702a9 100644 Binary files a/site/images/spark-logo.png and b/site/images/spark-logo.png differ
[1/2] spark git commit: Preparing Spark release v2.3.2-rc5
Repository: spark Updated Branches: refs/heads/branch-2.3 787790b3c -> 29a040361 Preparing Spark release v2.3.2-rc5 Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/4dc82259 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/4dc82259 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/4dc82259 Branch: refs/heads/branch-2.3 Commit: 4dc82259d81102e0cb48f4cb2e8075f80d899ac4 Parents: 787790b Author: Saisai Shao Authored: Tue Aug 14 02:55:09 2018 + Committer: Saisai Shao Committed: Tue Aug 14 02:55:09 2018 + -- R/pkg/DESCRIPTION | 2 +- assembly/pom.xml | 2 +- common/kvstore/pom.xml| 2 +- common/network-common/pom.xml | 2 +- common/network-shuffle/pom.xml| 2 +- common/network-yarn/pom.xml | 2 +- common/sketch/pom.xml | 2 +- common/tags/pom.xml | 2 +- common/unsafe/pom.xml | 2 +- core/pom.xml | 2 +- docs/_config.yml | 4 ++-- examples/pom.xml | 2 +- external/docker-integration-tests/pom.xml | 2 +- external/flume-assembly/pom.xml | 2 +- external/flume-sink/pom.xml | 2 +- external/flume/pom.xml| 2 +- external/kafka-0-10-assembly/pom.xml | 2 +- external/kafka-0-10-sql/pom.xml | 2 +- external/kafka-0-10/pom.xml | 2 +- external/kafka-0-8-assembly/pom.xml | 2 +- external/kafka-0-8/pom.xml| 2 +- external/kinesis-asl-assembly/pom.xml | 2 +- external/kinesis-asl/pom.xml | 2 +- external/spark-ganglia-lgpl/pom.xml | 2 +- graphx/pom.xml| 2 +- hadoop-cloud/pom.xml | 2 +- launcher/pom.xml | 2 +- mllib-local/pom.xml | 2 +- mllib/pom.xml | 2 +- pom.xml | 2 +- python/pyspark/version.py | 2 +- repl/pom.xml | 2 +- resource-managers/kubernetes/core/pom.xml | 2 +- resource-managers/mesos/pom.xml | 2 +- resource-managers/yarn/pom.xml| 2 +- sql/catalyst/pom.xml | 2 +- sql/core/pom.xml | 2 +- sql/hive-thriftserver/pom.xml | 2 +- sql/hive/pom.xml | 2 +- streaming/pom.xml | 2 +- tools/pom.xml | 2 +- 41 files changed, 42 insertions(+), 42 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/4dc82259/R/pkg/DESCRIPTION -- diff --git a/R/pkg/DESCRIPTION b/R/pkg/DESCRIPTION index 6ec4966..8df2635 100644 --- a/R/pkg/DESCRIPTION +++ b/R/pkg/DESCRIPTION @@ -1,6 +1,6 @@ Package: SparkR Type: Package -Version: 2.3.3 +Version: 2.3.2 Title: R Frontend for Apache Spark Description: Provides an R Frontend for Apache Spark. 
Authors@R: c(person("Shivaram", "Venkataraman", role = c("aut", "cre"), http://git-wip-us.apache.org/repos/asf/spark/blob/4dc82259/assembly/pom.xml -- diff --git a/assembly/pom.xml b/assembly/pom.xml index f8b15cc..57485fc 100644 --- a/assembly/pom.xml +++ b/assembly/pom.xml @@ -21,7 +21,7 @@ org.apache.spark spark-parent_2.11 -2.3.3-SNAPSHOT +2.3.2 ../pom.xml http://git-wip-us.apache.org/repos/asf/spark/blob/4dc82259/common/kvstore/pom.xml -- diff --git a/common/kvstore/pom.xml b/common/kvstore/pom.xml index e412a47..53e58c2 100644 --- a/common/kvstore/pom.xml +++ b/common/kvstore/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.11 -2.3.3-SNAPSHOT +2.3.2 ../../pom.xml http://git-wip-us.apache.org/repos/asf/spark/blob/4dc82259/common/network-common/pom.xml -- diff --git a/common/network-common/pom.xml b/common/network-common/pom.xml index d8f9a3d..d05647c 100644 --- a/common/network-common/pom.xml +++ b/common/network-common/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.11 -2.3.3-SNAPSHOT +2.3.2 ../../pom.xml http://git-wip-us.apache.org/repos/asf/spark/blob/4dc82259/common/network-shuffle/pom.xml -- diff --git a/common/network-shuffle/pom.xml b/common/network-shuffle/pom.xml index a1a4f87..8d46761 100644 --- a/common/network-shuffle/pom.xml +++ b/common/network-shuffle/pom.xml
[spark] Git Push Summary
Repository: spark Updated Tags: refs/tags/v2.3.2-rc5 [created] 4dc82259d
[2/2] spark git commit: Preparing development version 2.3.3-SNAPSHOT
Preparing development version 2.3.3-SNAPSHOT Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/29a04036 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/29a04036 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/29a04036 Branch: refs/heads/branch-2.3 Commit: 29a040361c4de5c6438c909ded9959ccd53e1a7c Parents: 4dc8225 Author: Saisai Shao Authored: Tue Aug 14 02:55:19 2018 + Committer: Saisai Shao Committed: Tue Aug 14 02:55:19 2018 + -- R/pkg/DESCRIPTION | 2 +- assembly/pom.xml | 2 +- common/kvstore/pom.xml| 2 +- common/network-common/pom.xml | 2 +- common/network-shuffle/pom.xml| 2 +- common/network-yarn/pom.xml | 2 +- common/sketch/pom.xml | 2 +- common/tags/pom.xml | 2 +- common/unsafe/pom.xml | 2 +- core/pom.xml | 2 +- docs/_config.yml | 4 ++-- examples/pom.xml | 2 +- external/docker-integration-tests/pom.xml | 2 +- external/flume-assembly/pom.xml | 2 +- external/flume-sink/pom.xml | 2 +- external/flume/pom.xml| 2 +- external/kafka-0-10-assembly/pom.xml | 2 +- external/kafka-0-10-sql/pom.xml | 2 +- external/kafka-0-10/pom.xml | 2 +- external/kafka-0-8-assembly/pom.xml | 2 +- external/kafka-0-8/pom.xml| 2 +- external/kinesis-asl-assembly/pom.xml | 2 +- external/kinesis-asl/pom.xml | 2 +- external/spark-ganglia-lgpl/pom.xml | 2 +- graphx/pom.xml| 2 +- hadoop-cloud/pom.xml | 2 +- launcher/pom.xml | 2 +- mllib-local/pom.xml | 2 +- mllib/pom.xml | 2 +- pom.xml | 2 +- python/pyspark/version.py | 2 +- repl/pom.xml | 2 +- resource-managers/kubernetes/core/pom.xml | 2 +- resource-managers/mesos/pom.xml | 2 +- resource-managers/yarn/pom.xml| 2 +- sql/catalyst/pom.xml | 2 +- sql/core/pom.xml | 2 +- sql/hive-thriftserver/pom.xml | 2 +- sql/hive/pom.xml | 2 +- streaming/pom.xml | 2 +- tools/pom.xml | 2 +- 41 files changed, 42 insertions(+), 42 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/29a04036/R/pkg/DESCRIPTION -- diff --git a/R/pkg/DESCRIPTION b/R/pkg/DESCRIPTION index 8df2635..6ec4966 100644 --- a/R/pkg/DESCRIPTION +++ b/R/pkg/DESCRIPTION @@ -1,6 +1,6 @@ Package: SparkR Type: Package -Version: 2.3.2 +Version: 2.3.3 Title: R Frontend for Apache Spark Description: Provides an R Frontend for Apache Spark. 
Authors@R: c(person("Shivaram", "Venkataraman", role = c("aut", "cre"), http://git-wip-us.apache.org/repos/asf/spark/blob/29a04036/assembly/pom.xml -- diff --git a/assembly/pom.xml b/assembly/pom.xml index 57485fc..f8b15cc 100644 --- a/assembly/pom.xml +++ b/assembly/pom.xml @@ -21,7 +21,7 @@ org.apache.spark spark-parent_2.11 -2.3.2 +2.3.3-SNAPSHOT ../pom.xml http://git-wip-us.apache.org/repos/asf/spark/blob/29a04036/common/kvstore/pom.xml -- diff --git a/common/kvstore/pom.xml b/common/kvstore/pom.xml index 53e58c2..e412a47 100644 --- a/common/kvstore/pom.xml +++ b/common/kvstore/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.11 -2.3.2 +2.3.3-SNAPSHOT ../../pom.xml http://git-wip-us.apache.org/repos/asf/spark/blob/29a04036/common/network-common/pom.xml -- diff --git a/common/network-common/pom.xml b/common/network-common/pom.xml index d05647c..d8f9a3d 100644 --- a/common/network-common/pom.xml +++ b/common/network-common/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.11 -2.3.2 +2.3.3-SNAPSHOT ../../pom.xml http://git-wip-us.apache.org/repos/asf/spark/blob/29a04036/common/network-shuffle/pom.xml -- diff --git a/common/network-shuffle/pom.xml b/common/network-shuffle/pom.xml index 8d46761..a1a4f87 100644 --- a/common/network-shuffle/pom.xml +++ b/common/network-shuffle/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.11 -2.3
svn commit: r28702 - /dev/spark/v2.3.2-rc5-bin/
Author: jshao Date: Tue Aug 14 04:02:50 2018 New Revision: 28702 Log: Apache Spark v2.3.2-rc5 Added: dev/spark/v2.3.2-rc5-bin/ dev/spark/v2.3.2-rc5-bin/SparkR_2.3.2.tar.gz (with props) dev/spark/v2.3.2-rc5-bin/SparkR_2.3.2.tar.gz.asc dev/spark/v2.3.2-rc5-bin/SparkR_2.3.2.tar.gz.sha512 dev/spark/v2.3.2-rc5-bin/pyspark-2.3.2.tar.gz (with props) dev/spark/v2.3.2-rc5-bin/pyspark-2.3.2.tar.gz.asc dev/spark/v2.3.2-rc5-bin/pyspark-2.3.2.tar.gz.sha512 dev/spark/v2.3.2-rc5-bin/spark-2.3.2-bin-hadoop2.6.tgz (with props) dev/spark/v2.3.2-rc5-bin/spark-2.3.2-bin-hadoop2.6.tgz.asc dev/spark/v2.3.2-rc5-bin/spark-2.3.2-bin-hadoop2.6.tgz.sha512 dev/spark/v2.3.2-rc5-bin/spark-2.3.2-bin-hadoop2.7.tgz (with props) dev/spark/v2.3.2-rc5-bin/spark-2.3.2-bin-hadoop2.7.tgz.asc dev/spark/v2.3.2-rc5-bin/spark-2.3.2-bin-hadoop2.7.tgz.sha512 dev/spark/v2.3.2-rc5-bin/spark-2.3.2-bin-without-hadoop.tgz (with props) dev/spark/v2.3.2-rc5-bin/spark-2.3.2-bin-without-hadoop.tgz.asc dev/spark/v2.3.2-rc5-bin/spark-2.3.2-bin-without-hadoop.tgz.sha512 dev/spark/v2.3.2-rc5-bin/spark-2.3.2.tgz (with props) dev/spark/v2.3.2-rc5-bin/spark-2.3.2.tgz.asc dev/spark/v2.3.2-rc5-bin/spark-2.3.2.tgz.sha512 Added: dev/spark/v2.3.2-rc5-bin/SparkR_2.3.2.tar.gz == Binary file - no diff available. Propchange: dev/spark/v2.3.2-rc5-bin/SparkR_2.3.2.tar.gz -- svn:mime-type = application/octet-stream Added: dev/spark/v2.3.2-rc5-bin/SparkR_2.3.2.tar.gz.asc == --- dev/spark/v2.3.2-rc5-bin/SparkR_2.3.2.tar.gz.asc (added) +++ dev/spark/v2.3.2-rc5-bin/SparkR_2.3.2.tar.gz.asc Tue Aug 14 04:02:50 2018 @@ -0,0 +1,16 @@ +-BEGIN PGP SIGNATURE- + +iQIcBAABCgAGBQJbcktnAAoJENsLIaASlz/Qx1AQALpg9+8iDcJ/rW+q4GxLAsBB +76So/oYAWQSRpj4AeBDnJbfiyVjFsny1x26+IyKLyz90A5G3astBx1j92LpVWqag +ii4C3u9HyHYmfSriWlAxeJYhDt7MhdsM+Es31Q+uO+3QPB2Up+DuGYA9PzrE/jSA +QY5NQ+jVGH83KIynMQXHVTbz1MMYQrtwIVOImrBDrf+vgTTm3Whz5xYxMQpVcNDY +C+VQigGKoqq0rxjJd1lqer3F5KjCqSoHk7xIBBh7C/Kjk3Wv1x6y3O88r3v1WWPe +Nww/UXhFDD9QKY+8T9TvhW/OEA6dgHm87zko3AXOMaPIHdoyU57L/5uUdICt72iW +YT7YMdecZgzd7QCU6rneEwZgU6WS1TvdcvAGi8JvAszGNuQeYqKw7c+EkzMiv7Ys +h3Ymcwq5ODULtQh8UQbiECcpeECmp4h1Vnq9FQDUco3XYEkGesuAUET9wMjCWeqN +ahR08j/cbcW7yxbOwKpsl4RuSyAqQwQIRkM9GK8g+z091V2MJFfq241Wip2eHZK5 +pakWR8XFemVCqFUppIzrIbIAve5Hk0YRZL/l6bGcZSfKu3aCr3ndges5SJfufuYV +EKlQjyGnz8o6QsZ+qMi/LRZl5Wxh9eHamn/Eg96H36jYc8I1V5xf1ZGOdVlngK5K +Dub/tLAYfVPJJfSziOBK +=6LOG +-END PGP SIGNATURE- Added: dev/spark/v2.3.2-rc5-bin/SparkR_2.3.2.tar.gz.sha512 == --- dev/spark/v2.3.2-rc5-bin/SparkR_2.3.2.tar.gz.sha512 (added) +++ dev/spark/v2.3.2-rc5-bin/SparkR_2.3.2.tar.gz.sha512 Tue Aug 14 04:02:50 2018 @@ -0,0 +1,3 @@ +SparkR_2.3.2.tar.gz: 5C580581 A27AAEC3 0F1176AD EF0817E7 8D58CE8A 1BAEE405 + 2CE70766 D3BCCE9B D8531F79 CFAB75E9 59ACF879 A1BAB6A8 + 2E7EA2AD 37D6742D F57EC3E9 42D964B3 Added: dev/spark/v2.3.2-rc5-bin/pyspark-2.3.2.tar.gz == Binary file - no diff available. 
Propchange: dev/spark/v2.3.2-rc5-bin/pyspark-2.3.2.tar.gz -- svn:mime-type = application/octet-stream Added: dev/spark/v2.3.2-rc5-bin/pyspark-2.3.2.tar.gz.asc == --- dev/spark/v2.3.2-rc5-bin/pyspark-2.3.2.tar.gz.asc (added) +++ dev/spark/v2.3.2-rc5-bin/pyspark-2.3.2.tar.gz.asc Tue Aug 14 04:02:50 2018 @@ -0,0 +1,16 @@ +-BEGIN PGP SIGNATURE- + +iQIcBAABCgAGBQJbck+7AAoJENsLIaASlz/Q1KMP/jvhiZw5ImbWJpwOFjkO73G+ +dF3Oo20IZmhiRN16NRJaYO2SzdH3t8HDnEMBC5cbZIR69h+qres5aGN9K1v/DbmH +88BLUDiSnk7+XXqX+jfQGwowyqE65kj20H6QCGWBsD56m+gbzadtgJ/GMiG9lvKz +yyERagY0/shKABbXTNyiAtmKI12FR4/L/Y98WDlSs90LEYMHFxDAummHWMqPdyn4 +vF2pMV/7mvthWr7HNyt6cXtBG6KUTszt674VMAeJn5Yt3ZkCpydJslkSsm1WLu9V +TZ7H5F6R6DlxfopExdu/lGZbINlFmSdKPhzKeX9j0yqzUOjY64obhEJGZNgOl5yU +/YC/D1u1NTafIb8g2tdzXsJQGI9v3+KmCqgBKBKAcNeEycRNIdvHvswPgz9g+Jzf +gpMpHLrZHbIv62RmlzERJvd5v+PfT7195ax85Gb+p7k2Zjea0J1oC5iEj+qhRvl/ +Y3kpWd/258s3bLhrv+MUYwzZepLBm3brY/Jbs9N6VEnbEhzQeOHHLj2loIHR1R/W +CKXHLzHjQCXWvcfBCpmdF9SUGI8ZUSNZrV/96D4T6pmAA1QU3e2RC8N83SOHeAlt +iEPF/lgeqp6zClV8mKs245cIZt7MRaovPghRWapSfp6XrwomreDDPUcrlJmgpV3h +e1ronCjB3AvaJ9LOh+IA +=mxwy +-END PGP SIGNATURE- Added: dev/spark/v2.3.2-rc5-bin/pyspark-2.3.2.tar.gz.sha512 == --- dev/spark/v2.3.2-rc5-bin/pyspark-2.3.2.tar.gz.sha512 (added) +++ dev/spark/v2.3.2-rc5-bin/pyspark-2.3.2.tar.gz.sha512 Tue Aug 14 04:02:50 2018 @@ -0,0 +1,3 @@ +
spark git commit: [SPARK-25104][SQL] Avro: Validate user specified output schema
Repository: spark Updated Branches: refs/heads/master c220cc42a -> ab197308a

[SPARK-25104][SQL] Avro: Validate user specified output schema

## What changes were proposed in this pull request?

With the code changes in https://github.com/apache/spark/pull/21847, Spark can write out to an Avro file as per a user-provided output schema. To make this more robust and user friendly, we should validate the Avro schema before tasks are launched. We should also support outputting the decimal logical type as BYTES (by default we output it as FIXED).

## How was this patch tested?

Unit test.

Closes #22094 from gengliangwang/AvroSerializerMatch.

Authored-by: Gengliang Wang Signed-off-by: DB Tsai

Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/ab197308 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/ab197308 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/ab197308 Branch: refs/heads/master Commit: ab197308a79c74f0a4205a8f60438811b5e0b991 Parents: c220cc4 Author: Gengliang Wang Authored: Tue Aug 14 04:43:14 2018 + Committer: DB Tsai Committed: Tue Aug 14 04:43:14 2018 + -- .../apache/spark/sql/avro/AvroSerializer.scala | 108 +++ .../spark/sql/avro/AvroLogicalTypeSuite.scala | 40 +++ .../org/apache/spark/sql/avro/AvroSuite.scala | 57 ++ 3 files changed, 158 insertions(+), 47 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/ab197308/external/avro/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala -- diff --git a/external/avro/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala b/external/avro/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala index 3a9544c..f551c83 100644 --- a/external/avro/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala +++ b/external/avro/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala @@ -26,6 +26,7 @@ import org.apache.avro.Conversions.DecimalConversion import org.apache.avro.LogicalTypes.{TimestampMicros, TimestampMillis} import org.apache.avro.Schema import org.apache.avro.Schema.Type +import org.apache.avro.Schema.Type._ import org.apache.avro.generic.GenericData.{EnumSymbol, Fixed, Record} import org.apache.avro.generic.GenericData.Record import org.apache.avro.util.Utf8 @@ -72,62 +73,70 @@ class AvroSerializer(rootCatalystType: DataType, rootAvroType: Schema, nullable: private lazy val decimalConversions = new DecimalConversion() private def newConverter(catalystType: DataType, avroType: Schema): Converter = { -catalystType match { - case NullType => +(catalystType, avroType.getType) match { + case (NullType, NULL) => (getter, ordinal) => null - case BooleanType => + case (BooleanType, BOOLEAN) => (getter, ordinal) => getter.getBoolean(ordinal) - case ByteType => + case (ByteType, INT) => (getter, ordinal) => getter.getByte(ordinal).toInt - case ShortType => + case (ShortType, INT) => (getter, ordinal) => getter.getShort(ordinal).toInt - case IntegerType => + case (IntegerType, INT) => (getter, ordinal) => getter.getInt(ordinal) - case LongType => + case (LongType, LONG) => (getter, ordinal) => getter.getLong(ordinal) - case FloatType => + case (FloatType, FLOAT) => (getter, ordinal) => getter.getFloat(ordinal) - case DoubleType => + case (DoubleType, DOUBLE) => (getter, ordinal) => getter.getDouble(ordinal) - case d: DecimalType => + case (d: DecimalType, FIXED) +if avroType.getLogicalType == LogicalTypes.decimal(d.precision, d.scale) => (getter, ordinal) => val decimal = getter.getDecimal(ordinal, d.precision, d.scale)
decimalConversions.toFixed(decimal.toJavaBigDecimal, avroType, LogicalTypes.decimal(d.precision, d.scale)) - case StringType => avroType.getType match { -case Type.ENUM => - import scala.collection.JavaConverters._ - val enumSymbols: Set[String] = avroType.getEnumSymbols.asScala.toSet - (getter, ordinal) => -val data = getter.getUTF8String(ordinal).toString -if (!enumSymbols.contains(data)) { - throw new IncompatibleSchemaException( -"Cannot write \"" + data + "\" since it's not defined in enum \"" + - enumSymbols.mkString("\", \"") + "\"") -} -new EnumSymbol(avroType, data) -case _ => - (getter, ordinal) => new Utf8(getter.getUTF8String(ordinal).getBytes) - } - case BinaryType => avroType.getType match { -case Type.FIXED => - val size = avroType.getFixedSize(
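A hedged usage sketch of what the validation protects: supplying a user-defined output schema through the Avro data source's `avroSchema` option, here writing a decimal as the BYTES-backed decimal logical type as the PR describes. `df` is assumed to be a DataFrame with a single `DecimalType(10, 2)` column named `d`; with this patch, an incompatible user schema should be rejected before tasks are launched.

```scala
val avroSchema =
  """{"type": "record", "name": "rec", "fields": [
    |  {"name": "d", "type": {"type": "bytes", "logicalType": "decimal",
    |                         "precision": 10, "scale": 2}}]}""".stripMargin
df.write.format("avro").option("avroSchema", avroSchema).save("/tmp/avro_out")
```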
spark git commit: [SPARK-22974][ML] Attach attributes to output column of CountVectorModel
Repository: spark Updated Branches: refs/heads/master ab197308a -> 3eb52092b

[SPARK-22974][ML] Attach attributes to output column of CountVectorModel

## What changes were proposed in this pull request?

The output column from `CountVectorizerModel` lacks attributes, so a later transformer like `Interaction` can raise an error because no attributes are available.

## How was this patch tested?

Added a test.

Closes #20313 from viirya/SPARK-22974.

Authored-by: Liang-Chi Hsieh Signed-off-by: DB Tsai

Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/3eb52092 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/3eb52092 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/3eb52092 Branch: refs/heads/master Commit: 3eb52092b3aa9d7d2fc1e50ac237d47bfb3b9e92 Parents: ab19730 Author: Liang-Chi Hsieh Authored: Tue Aug 14 05:05:16 2018 + Committer: DB Tsai Committed: Tue Aug 14 05:05:16 2018 + -- .../apache/spark/ml/feature/CountVectorizer.scala | 5 - .../spark/ml/feature/CountVectorizerSuite.scala | 16 2 files changed, 20 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/3eb52092/mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala -- diff --git a/mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala b/mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala index 10c48c3..dc8eb82 100644 --- a/mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala +++ b/mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala @@ -21,6 +21,7 @@ import org.apache.hadoop.fs.Path import org.apache.spark.annotation.Since import org.apache.spark.broadcast.Broadcast import org.apache.spark.ml.{Estimator, Model} +import org.apache.spark.ml.attribute.{Attribute, AttributeGroup, NumericAttribute} import org.apache.spark.ml.linalg.{Vectors, VectorUDT} import org.apache.spark.ml.param._ import org.apache.spark.ml.param.shared.{HasInputCol, HasOutputCol} @@ -317,7 +318,9 @@ class CountVectorizerModel( Vectors.sparse(dictBr.value.size, effectiveCounts) } -dataset.withColumn($(outputCol), vectorizer(col($(inputCol +val attrs = vocabulary.map(_ => new NumericAttribute).asInstanceOf[Array[Attribute]] +val metadata = new AttributeGroup($(outputCol), attrs).toMetadata() +dataset.withColumn($(outputCol), vectorizer(col($(inputCol))), metadata) } @Since("1.5.0") http://git-wip-us.apache.org/repos/asf/spark/blob/3eb52092/mllib/src/test/scala/org/apache/spark/ml/feature/CountVectorizerSuite.scala -- diff --git a/mllib/src/test/scala/org/apache/spark/ml/feature/CountVectorizerSuite.scala b/mllib/src/test/scala/org/apache/spark/ml/feature/CountVectorizerSuite.scala index 6121766..bca580d 100644 --- a/mllib/src/test/scala/org/apache/spark/ml/feature/CountVectorizerSuite.scala +++ b/mllib/src/test/scala/org/apache/spark/ml/feature/CountVectorizerSuite.scala @@ -289,4 +289,20 @@ class CountVectorizerSuite extends MLTest with DefaultReadWriteTest { val newInstance = testDefaultReadWrite(instance) assert(newInstance.vocabulary === instance.vocabulary) } + + test("SPARK-22974: CountVectorModel should attach proper attribute to output column") { +val df = spark.createDataFrame(Seq( + (0, 1.0, Array("a", "b", "c")), + (1, 2.0, Array("a", "b", "b", "c", "a", "d")) +)).toDF("id", "features1", "words") + +val cvm = new CountVectorizerModel(Array("a", "b", "c")) + .setInputCol("words") + 
.setOutputCol("features2") + +val df1 = cvm.transform(df) +val interaction = new Interaction().setInputCols(Array("features1", "features2")) + .setOutputCol("features") +interaction.transform(df1) + } } - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
svn commit: r28704 - in /dev/spark/2.3.3-SNAPSHOT-2018_08_13_22_02-29a0403-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell Date: Tue Aug 14 05:16:02 2018 New Revision: 28704 Log: Apache Spark 2.3.3-SNAPSHOT-2018_08_13_22_02-29a0403 docs [This commit notification would consist of 1443 parts, which exceeds the 50-part limit, so it was shortened to this summary.]
svn commit: r28707 - in /dev/spark/v2.3.2-rc5-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _site/api/java/org/apache/spark
Author: jshao Date: Tue Aug 14 06:54:52 2018 New Revision: 28707 Log: Apache Spark v2.3.2-rc5 docs [This commit notification would consist of 1446 parts, which exceeds the 50-part limit, so it was shortened to this summary.]