[spark] branch master updated (ca326c4 -> 654f19d)
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from ca326c4  [SPARK-22748][SQL] Analyze __grouping__id as a literal function
 add 654f19d  [SPARK-34621][SQL] Unify output of ShowCreateTableAsSerdeCommand and ShowCreateTableCommand

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/plans/logical/v2Commands.scala    | 15 ---
 .../sql/catalyst/analysis/ResolveSessionCatalog.scala    |  6 +++---
 .../org/apache/spark/sql/execution/command/tables.scala  | 16 +++-
 .../execution/datasources/v2/DataSourceV2Strategy.scala  |  2 +-
 .../test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala |  3 ++-
 5 files changed, 25 insertions(+), 17 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org

[spark] branch master updated (358697b -> ca326c4)
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 358697b  [SPARK-34635][UI] Add trailing slashes in URLs to reduce unnecessary redirects
 add ca326c4  [SPARK-22748][SQL] Analyze __grouping__id as a literal function

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/analysis/Analyzer.scala | 25
 .../sql/catalyst/analysis/AnalysisSuite.scala  | 68 ++
 2 files changed, 83 insertions(+), 10 deletions(-)

[spark] branch master updated (75db6e7 -> 358697b)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 75db6e7  [MINOR][SQL][DOCS] Fix some spelling issues in SQL migration guide
 add 358697b  [SPARK-34635][UI] Add trailing slashes in URLs to reduce unnecessary redirects

No new revisions were added by this update.

Summary of changes:
 .../scala/org/apache/spark/deploy/master/ui/ApplicationPage.scala     | 4 ++--
 .../src/main/scala/org/apache/spark/deploy/master/ui/MasterPage.scala | 2 +-
 core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala | 4 ++--
 .../src/main/scala/org/apache/spark/deploy/worker/ui/WorkerPage.scala | 4 ++--
 4 files changed, 7 insertions(+), 7 deletions(-)

[spark] branch branch-3.0 updated (4574d99 -> c54482e)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 4574d99  [SPARK-32924][WEBUI] Make duration column in master UI sorted in the correct order
 add c54482e  [SPARK-34607][SQL][3.0] Add `Utils.isMemberClass` to fix a malformed class name error on jdk8u

No new revisions were added by this update.

Summary of changes:
 .../main/scala/org/apache/spark/util/Utils.scala   | 28 +
 .../spark/sql/catalyst/encoders/OuterScopes.scala  |  2 +-
 .../sql/catalyst/expressions/objects/objects.scala |  2 +-
 .../catalyst/encoders/ExpressionEncoderSuite.scala | 70 ++
 4 files changed, 100 insertions(+), 2 deletions(-)

[spark] branch branch-3.1 updated: [SPARK-34607][SQL][3.1] Add `Utils.isMemberClass` to fix a malformed class name error on jdk8u
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.1 by this push:
     new f3caba1  [SPARK-34607][SQL][3.1] Add `Utils.isMemberClass` to fix a malformed class name error on jdk8u

f3caba1 is described below

commit f3caba1f8d649889713de4824412a3ee2666bc6f
Author: Takeshi Yamamuro
AuthorDate: Thu Mar 4 22:40:57 2021 -0800

    [SPARK-34607][SQL][3.1] Add `Utils.isMemberClass` to fix a malformed class name error on jdk8u

    This PR intends to fix a bug in `objects.NewInstance` that occurs when a user runs Spark on jdk8u and the `cls` given to `NewInstance` is a deeply nested inner class, e.g.:

    ```
    object OuterLevelWithVeryVeryVeryLongClassName1 {
    object OuterLevelWithVeryVeryVeryLongClassName2 {
    object OuterLevelWithVeryVeryVeryLongClassName3 {
    object OuterLevelWithVeryVeryVeryLongClassName4 {
    object OuterLevelWithVeryVeryVeryLongClassName5 {
    object OuterLevelWithVeryVeryVeryLongClassName6 {
    object OuterLevelWithVeryVeryVeryLongClassName7 {
    object OuterLevelWithVeryVeryVeryLongClassName8 {
    object OuterLevelWithVeryVeryVeryLongClassName9 {
    object OuterLevelWithVeryVeryVeryLongClassName10 {
    object OuterLevelWithVeryVeryVeryLongClassName11 {
    object OuterLevelWithVeryVeryVeryLongClassName12 {
    object OuterLevelWithVeryVeryVeryLongClassName13 {
    object OuterLevelWithVeryVeryVeryLongClassName14 {
    object OuterLevelWithVeryVeryVeryLongClassName15 {
    object OuterLevelWithVeryVeryVeryLongClassName16 {
    object OuterLevelWithVeryVeryVeryLongClassName17 {
    object OuterLevelWithVeryVeryVeryLongClassName18 {
    object OuterLevelWithVeryVeryVeryLongClassName19 {
    object OuterLevelWithVeryVeryVeryLongClassName20 {
    case class MalformedNameExample2(x: Int)
    ```

    The root cause that Kris (@rednaxelafx) investigated is as follows (kudos to Kris):

    The reason the test case above is so convoluted lies in the way Scala generates class names for nested classes. In general, Scala generates the name of a nested class by inserting a dollar sign (`$`) between each level of class nesting. The problem is that this format can concatenate into a very long string that goes beyond certain limits, so Scala changes the class name format beyond a certain length threshold. For the example above, the first two levels of class nesting have class names that look like this:

    ```
    org.apache.spark.sql.catalyst.encoders.ExpressionEncoderSuite$OuterLevelWithVeryVeryVeryLongClassName1$
    org.apache.spark.sql.catalyst.encoders.ExpressionEncoderSuite$OuterLevelWithVeryVeryVeryLongClassName1$OuterLevelWithVeryVeryVeryLongClassName2$
    ```

    If we leave out the fact that Scala uses a dollar-sign (`$`) suffix for the class name of the companion object, `OuterLevelWithVeryVeryVeryLongClassName1`'s full name is a prefix (substring) of `OuterLevelWithVeryVeryVeryLongClassName2`'s. But if we keep going deeper into the levels of nesting, you'll find names that look like:

    ```
    org.apache.spark.sql.catalyst.encoders.ExpressionEncoderSuite$OuterLevelWithVeryVeryVeryLongClassNam2a1321b953c615695d7442b2adb1ryVeryLongClassName8$OuterLevelWithVeryVeryVeryLongClassName9$OuterLevelWithVeryVeryVeryLongClassName10$
    org.apache.spark.sql.catalyst.encoders.ExpressionEncoderSuite$OuterLevelWithVeryVeryVeryLongClassNam2a1321b953c615695d7442b2adb1ryVeryLongClassName8$OuterLevelWithVeryVeryVeryLongClassName9$OuterLevelWithVeryVeryVeryLongClassName10$OuterLevelWithVeryVeryVeryLongClassName11$
    org.apache.spark.sql.catalyst.encoders.ExpressionEncoderSuite$OuterLevelWithVeryVeryVeryLongClassNam85f068777e7ecf112afcbe997d461bVeryLongClassName11$OuterLevelWithVeryVeryVeryLongClassName12$
    org.apache.spark.sql.catalyst.encoders.ExpressionEncoderSuite$OuterLevelWithVeryVeryVeryLongClassNam85f068777e7ecf112afcbe997d461bVeryLongClassName11$OuterLevelWithVeryVeryVeryLongClassName12$OuterLevelWithVeryVeryVeryLongClassName13$
    ```

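The mitigation these commits build on is to stop trusting `Class.getSimpleName` on jdk8u and derive a display name from the binary class name instead. A minimal Java sketch of that idea — `safeSimpleName` is a hypothetical helper for illustration, not Spark's actual `Utils.getSimpleName` implementation:

```java
public class SafeSimpleName {
    // Fall back to parsing the binary name (Class.getName) when
    // Class.getSimpleName() throws java.lang.InternalError
    // ("Malformed class name") on jdk8u for deeply nested Scala classes.
    static String safeSimpleName(Class<?> cls) {
        try {
            return cls.getSimpleName();
        } catch (InternalError e) {
            String name = cls.getName();
            int lastDot = name.lastIndexOf('.');
            String stripped = lastDot >= 0 ? name.substring(lastDot + 1) : name;
            // Scala companion objects end with '$'; drop trailing dollars
            // before taking the last nesting segment.
            while (stripped.endsWith("$") && stripped.length() > 1) {
                stripped = stripped.substring(0, stripped.length() - 1);
            }
            int lastDollar = stripped.lastIndexOf('$');
            return lastDollar >= 0 ? stripped.substring(lastDollar + 1) : stripped;
        }
    }

    public static void main(String[] args) {
        System.out.println(safeSimpleName(java.util.Map.Entry.class)); // Entry
        System.out.println(safeSimpleName(String.class));              // String
    }
}
```

On modern JDKs `getSimpleName` no longer throws, so the catch branch only matters on the older runtimes the commit targets.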
[spark] branch master updated (dc78f33 -> 75db6e7)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from dc78f33  [SPARK-34609][SQL] Unify resolveExpressionBottomUp and resolveExpressionTopDown
 add 75db6e7  [MINOR][SQL][DOCS] Fix some spelling issues in SQL migration guide

No new revisions were added by this update.

Summary of changes:
 docs/sql-migration-guide.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

[spark] branch master updated (979b9bc -> dc78f33)
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 979b9bc  [SPARK-34608][SQL] Remove unused output of AddJarCommand
 add dc78f33  [SPARK-34609][SQL] Unify resolveExpressionBottomUp and resolveExpressionTopDown

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/analysis/Analyzer.scala | 243 +++--
 1 file changed, 130 insertions(+), 113 deletions(-)

[spark] branch branch-3.1 updated: [SPARK-34596][SQL] Use Utils.getSimpleName to avoid hitting Malformed class name in NewInstance.doGenCode
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.1 by this push:
     new 9615c0e  [SPARK-34596][SQL] Use Utils.getSimpleName to avoid hitting Malformed class name in NewInstance.doGenCode

9615c0e is described below

commit 9615c0ec1ed3f787756a453a9744f22077ce9251
Author: Kris Mok
AuthorDate: Wed Mar 3 12:22:51 2021 +0900

    [SPARK-34596][SQL] Use Utils.getSimpleName to avoid hitting Malformed class name in NewInstance.doGenCode

    ### What changes were proposed in this pull request?

    Use `Utils.getSimpleName` to avoid hitting the `Malformed class name` error in `NewInstance.doGenCode`.

    ### Why are the changes needed?

    On older JDK versions (e.g. JDK8u), nested Scala classes may trigger `java.lang.Class.getSimpleName` to throw a `java.lang.InternalError: Malformed class name` error. In this particular case, creating an `ExpressionEncoder` on such a nested Scala class would create a `NewInstance` expression under the hood, which triggers the problem during codegen. Similar to https://github.com/apache/spark/pull/29050, we should use Spark's `Utils.getSimpleName` utility function in place of `Class.getSimpleName` to avoid hitting the issue. There are two other occurrences of `java.lang.Class.getSimpleName` in the same file, but they're safe because they're guaranteed to be used only on Java classes, which don't have this problem, e.g.:

    ```scala
    // Make a copy of the data if it's unsafe-backed
    def makeCopyIfInstanceOf(clazz: Class[_ <: Any], value: String) =
      s"$value instanceof ${clazz.getSimpleName}? ${value}.copy() : $value"
    val genFunctionValue: String = lambdaFunction.dataType match {
      case StructType(_) => makeCopyIfInstanceOf(classOf[UnsafeRow], genFunction.value)
      case ArrayType(_, _) => makeCopyIfInstanceOf(classOf[UnsafeArrayData], genFunction.value)
      case MapType(_, _, _) => makeCopyIfInstanceOf(classOf[UnsafeMapData], genFunction.value)
      case _ => genFunction.value
    }
    ```

    The Unsafe-* family of types are all Java types, so they're okay.

    ### Does this PR introduce _any_ user-facing change?

    Fixes a bug that throws an error when using `ExpressionEncoder` on some nested Scala types; otherwise no changes.

    ### How was this patch tested?

    Added a test case to `org.apache.spark.sql.catalyst.encoders.ExpressionEncoderSuite`. It fails on JDK8u before the fix and passes after the fix.

    Closes #31709 from rednaxelafx/spark-34596-master.

    Authored-by: Kris Mok
    Signed-off-by: HyukjinKwon
    (cherry picked from commit ecf4811764f1ef91954c865a864e0bf6691f99a6)
    Signed-off-by: HyukjinKwon
---
 .../spark/sql/catalyst/expressions/objects/objects.scala     |  2 +-
 .../spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala | 12
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala
index f391b31..8801c7d 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala
@@ -489,7 +489,7 @@ case class NewInstance(
       // that might be defined on the companion object.
       case 0 => s"$className$$.MODULE$$.apply($argString)"
       case _ => outer.map { gen =>
-        s"${gen.value}.new ${cls.getSimpleName}($argString)"
+        s"${gen.value}.new ${Utils.getSimpleName(cls)}($argString)"
       }.getOrElse {
         s"new $className($argString)"
       }
diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala
index f2598a9..2635264 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala
@@ -205,6 +205,18 @@ class ExpressionEncoderSuite extends CodegenInterpretedPlanTest with AnalysisTes
   encodeDecodeTest(Array(Option(InnerClass(1))), "array of optional inner class")

+  // holder class to trigger Class.getSimpleName issue
+  object MalformedClassObject extends Serializable {
+    case class MalformedNameExample(x: Int)
+  }
+
+  {
+    OuterScopes.addOuterScope(MalformedClassObject)
+    encodeDecodeTest(
+      MalformedClassObject.MalformedNameExample(42),
+      "nested Scala class should work")
+  }
+
   productTest(PrimitiveData(1, 1, 1, 1,

[spark] branch master updated (43aacd5 -> 979b9bc)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 43aacd5  [SPARK-34613][SQL] Fix view does not capture disable hint config
 add 979b9bc  [SPARK-34608][SQL] Remove unused output of AddJarCommand

No new revisions were added by this update.

Summary of changes:
 .../scala/org/apache/spark/sql/execution/command/resources.scala | 8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

[spark] branch master updated (814d81c -> 43aacd5)
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 814d81c  [SPARK-34376][SQL] Support regexp as a SQL function
 add 43aacd5  [SPARK-34613][SQL] Fix view does not capture disable hint config

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/execution/command/views.scala    | 12 +++-
 .../org/apache/spark/sql/execution/SQLViewTestSuite.scala | 15 +++
 2 files changed, 26 insertions(+), 1 deletion(-)

[spark] branch branch-3.1 updated: [SPARK-34613][SQL] Fix view does not capture disable hint config
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.1 by this push:
     new b73b1dc  [SPARK-34613][SQL] Fix view does not capture disable hint config

b73b1dc is described below

commit b73b1dca54ec95d2085eb6d5daf659d695fea1a5
Author: ulysses-you
AuthorDate: Fri Mar 5 12:19:30 2021 +0800

    [SPARK-34613][SQL] Fix view does not capture disable hint config

    ### What changes were proposed in this pull request?

    Add an allow list for capturing SQL configs for views.

    ### Why are the changes needed?

    Spark stores a view as its original SQL text, then captures and stores SQL configs in the view metadata. Config capture skips configs with certain prefixes, e.g. `spark.sql.optimizer.`, but unfortunately `spark.sql.optimizer.disableHints` starts with `spark.sql.optimizer.`. We need an allow list to make sure such configs are still captured.

    ### Does this PR introduce _any_ user-facing change?

    Yes, bug fix.

    ### How was this patch tested?

    Added a test.

    Closes #31732 from ulysses-you/SPARK-34613.

    Authored-by: ulysses-you
    Signed-off-by: Wenchen Fan
    (cherry picked from commit 43aacd5069294e1215e86cd43bd0810bda998be2)
    Signed-off-by: Wenchen Fan
---
 .../org/apache/spark/sql/execution/command/views.scala    | 12 +++-
 .../org/apache/spark/sql/execution/SQLViewTestSuite.scala | 15 +++
 2 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala
index 960fe4a..ef0e90d 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala
@@ -353,8 +353,18 @@ object ViewHelper {
     "spark.sql.shuffle.",
     "spark.sql.adaptive.")

+  private val configAllowList = Seq(
+    SQLConf.DISABLE_HINTS.key
+  )
+
+  /**
+   * Capture view config either of:
+   * 1. exists in allowList
+   * 2. do not exists in denyList
+   */
   private def shouldCaptureConfig(key: String): Boolean = {
-    !configPrefixDenyList.exists(prefix => key.startsWith(prefix))
+    configAllowList.exists(prefix => key.equals(prefix)) ||
+      !configPrefixDenyList.exists(prefix => key.startsWith(prefix))
   }

   import CatalogTable._
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/SQLViewTestSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/SQLViewTestSuite.scala
index 84a20bb..88218b1 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/SQLViewTestSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/SQLViewTestSuite.scala
@@ -18,6 +18,7 @@ package org.apache.spark.sql.execution

 import org.apache.spark.sql.{AnalysisException, QueryTest, Row}
+import org.apache.spark.sql.catalyst.plans.logical.Repartition
 import org.apache.spark.sql.internal.SQLConf._
 import org.apache.spark.sql.test.{SharedSparkSession, SQLTestUtils}

@@ -278,6 +279,20 @@ abstract class SQLViewTestSuite extends QueryTest with SQLTestUtils {
       }
     }
   }
+
+  test("SPARK-34613: Fix view does not capture disable hint config") {
+    withSQLConf(DISABLE_HINTS.key -> "true") {
+      val viewName = createView("v1", "SELECT /*+ repartition(1) */ 1")
+      withView(viewName) {
+        assert(
+          sql(s"SELECT * FROM $viewName").queryExecution.analyzed.collect {
+            case e: Repartition => e
+          }.isEmpty
+        )
+        checkViewOutput(viewName, Seq(Row(1)))
+      }
+    }
+  }
 }

 class LocalTempViewTestSuite extends SQLViewTestSuite with SharedSparkSession {

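The fix's core logic is a precedence rule: an exact allow-list hit beats a deny-prefix match. A small standalone Java sketch of that check (constants mirror the prefixes quoted in the diff; this is illustrative, not Spark's `ViewHelper`):

```java
import java.util.List;

public class ConfigCapture {
    // Deny-prefix list as quoted in the commit's diff.
    static final List<String> DENY_PREFIXES =
        List.of("spark.sql.optimizer.", "spark.sql.shuffle.", "spark.sql.adaptive.");
    // Exact keys to capture even when a deny prefix matches.
    static final List<String> ALLOW_LIST =
        List.of("spark.sql.optimizer.disableHints");

    static boolean shouldCapture(String key) {
        // Allow-list wins first; this is exactly the case for
        // spark.sql.optimizer.disableHints, which the old prefix-only
        // check wrongly skipped.
        return ALLOW_LIST.contains(key)
            || DENY_PREFIXES.stream().noneMatch(key::startsWith);
    }

    public static void main(String[] args) {
        System.out.println(shouldCapture("spark.sql.optimizer.disableHints"));  // true
        System.out.println(shouldCapture("spark.sql.optimizer.maxIterations")); // false
        System.out.println(shouldCapture("spark.sql.ansi.enabled"));            // true
    }
}
```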
[spark] branch master updated (dbce74d -> 814d81c)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from dbce74d  [SPARK-34607][SQL] Add `Utils.isMemberClass` to fix a malformed class name error on jdk8u
 add 814d81c  [SPARK-34376][SQL] Support regexp as a SQL function

No new revisions were added by this update.

Summary of changes:
 .../sql/catalyst/analysis/FunctionRegistry.scala |  1 +
 .../sql-functions/sql-expression-schema.md       |  3 +-
 .../sql-tests/inputs/regexp-functions.sql        |  6 +++-
 .../sql-tests/results/regexp-functions.sql.out   | 34 +-
 4 files changed, 41 insertions(+), 3 deletions(-)

[spark] branch master updated (9ac5ee2e -> dbce74d)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 9ac5ee2e  [SPARK-32924][WEBUI] Make duration column in master UI sorted in the correct order
 add dbce74d   [SPARK-34607][SQL] Add `Utils.isMemberClass` to fix a malformed class name error on jdk8u

No new revisions were added by this update.

Summary of changes:
 .../main/scala/org/apache/spark/util/Utils.scala   | 28 +
 .../spark/sql/catalyst/encoders/OuterScopes.scala  |  2 +-
 .../sql/catalyst/expressions/objects/objects.scala |  2 +-
 .../catalyst/encoders/ExpressionEncoderSuite.scala | 70 ++
 4 files changed, 100 insertions(+), 2 deletions(-)

[spark] branch branch-2.4 updated: [SPARK-32924][WEBUI] Make duration column in master UI sorted in the correct order
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-2.4 by this push:
     new eb4601e  [SPARK-32924][WEBUI] Make duration column in master UI sorted in the correct order

eb4601e is described below

commit eb4601e7e9c29d775224068f1e7b4a1df8040da1
Author: Baohe Zhang
AuthorDate: Thu Mar 4 15:37:33 2021 -0800

    [SPARK-32924][WEBUI] Make duration column in master UI sorted in the correct order

    ### What changes were proposed in this pull request?

    Make the "duration" column in the standalone-mode master UI sort by numeric duration, so the column can be sorted in the correct order.

    Before changes: ![image](https://user-images.githubusercontent.com/26694233/110025426-f5a49300-7cf4-11eb-86f0-2febade86be9.png)

    After changes: ![image](https://user-images.githubusercontent.com/26694233/110025604-33092080-7cf5-11eb-8b34-215688faf56d.png)

    ### Why are the changes needed?

    Fix a UI bug to make the sorting consistent across different pages.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Ran several apps with different durations and verified that the duration column on the master page sorts correctly.

    Closes #31743 from baohe-zhang/SPARK-32924.

    Authored-by: Baohe Zhang
    Signed-off-by: Dongjoon Hyun
    (cherry picked from commit 9ac5ee2e17ca491eabf2e6e7d33ce7cfb5a002a7)
    Signed-off-by: Dongjoon Hyun
---
 .../src/main/scala/org/apache/spark/deploy/master/ui/MasterPage.scala | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/core/src/main/scala/org/apache/spark/deploy/master/ui/MasterPage.scala b/core/src/main/scala/org/apache/spark/deploy/master/ui/MasterPage.scala
index b8afe20..ab5090c 100644
--- a/core/src/main/scala/org/apache/spark/deploy/master/ui/MasterPage.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/master/ui/MasterPage.scala
@@ -277,7 +277,9 @@ private[ui] class MasterPage(parent: MasterWebUI) extends WebUIPage("") {
       {UIUtils.formatDate(app.submitDate)}
       {app.desc.user}
       {app.state.toString}
-      {UIUtils.formatDuration(app.duration)}
+
+        {UIUtils.formatDuration(app.duration)}
+

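The bug class here is generic: sorting a column by its human-formatted string ("1 min", "4 s", "800 ms") orders rows lexicographically, not by elapsed time. A minimal plain-Java illustration of the difference (the `formatDuration` helper is a hypothetical stand-in for `UIUtils.formatDuration`, and the commit itself fixes this by attaching a numeric sort key to the table cell):

```java
import java.util.List;
import java.util.stream.Collectors;

public class DurationSort {
    // Hypothetical formatter, standing in for UIUtils.formatDuration.
    static String formatDuration(long ms) {
        if (ms < 1000) return ms + " ms";
        if (ms < 60_000) return (ms / 1000) + " s";
        return (ms / 60_000) + " min";
    }

    public static void main(String[] args) {
        List<Long> durations = List.of(95_000L, 800L, 4_000L);

        // Buggy ordering: sorting the rendered strings is lexicographic,
        // so "1 min" sorts before "4 s" and "800 ms".
        List<String> byString = durations.stream()
            .map(DurationSort::formatDuration).sorted().collect(Collectors.toList());

        // Fixed ordering: sort on the numeric millisecond value, render afterwards.
        List<String> byNumber = durations.stream()
            .sorted().map(DurationSort::formatDuration).collect(Collectors.toList());

        System.out.println(byString); // [1 min, 4 s, 800 ms]
        System.out.println(byNumber); // [800 ms, 4 s, 1 min]
    }
}
```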
[spark] branch branch-3.0 updated: [SPARK-32924][WEBUI] Make duration column in master UI sorted in the correct order
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 4574d99  [SPARK-32924][WEBUI] Make duration column in master UI sorted in the correct order

4574d99 is described below

commit 4574d99b63b447347c4b65021418a3b81fef5db6
Author: Baohe Zhang
AuthorDate: Thu Mar 4 15:37:33 2021 -0800

    [SPARK-32924][WEBUI] Make duration column in master UI sorted in the correct order

    ### What changes were proposed in this pull request?

    Make the "duration" column in the standalone-mode master UI sort by numeric duration, so the column can be sorted in the correct order.

    Before changes: ![image](https://user-images.githubusercontent.com/26694233/110025426-f5a49300-7cf4-11eb-86f0-2febade86be9.png)

    After changes: ![image](https://user-images.githubusercontent.com/26694233/110025604-33092080-7cf5-11eb-8b34-215688faf56d.png)

    ### Why are the changes needed?

    Fix a UI bug to make the sorting consistent across different pages.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Ran several apps with different durations and verified that the duration column on the master page sorts correctly.

    Closes #31743 from baohe-zhang/SPARK-32924.

    Authored-by: Baohe Zhang
    Signed-off-by: Dongjoon Hyun
    (cherry picked from commit 9ac5ee2e17ca491eabf2e6e7d33ce7cfb5a002a7)
    Signed-off-by: Dongjoon Hyun
---
 .../src/main/scala/org/apache/spark/deploy/master/ui/MasterPage.scala | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/core/src/main/scala/org/apache/spark/deploy/master/ui/MasterPage.scala b/core/src/main/scala/org/apache/spark/deploy/master/ui/MasterPage.scala
index fcbeba9..85000a0 100644
--- a/core/src/main/scala/org/apache/spark/deploy/master/ui/MasterPage.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/master/ui/MasterPage.scala
@@ -309,7 +309,9 @@ private[ui] class MasterPage(parent: MasterWebUI) extends WebUIPage("") {
       {UIUtils.formatDate(app.submitDate)}
      {app.desc.user}
       {app.state.toString}
-      {UIUtils.formatDuration(app.duration)}
+
+        {UIUtils.formatDuration(app.duration)}
+

[spark] branch branch-3.1 updated: [SPARK-32924][WEBUI] Make duration column in master UI sorted in the correct order
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.1 by this push:
     new ee472fa  [SPARK-32924][WEBUI] Make duration column in master UI sorted in the correct order

ee472fa is described below

commit ee472fa0569f7dd7bb649fc72b77642c4311dd11
Author: Baohe Zhang
AuthorDate: Thu Mar 4 15:37:33 2021 -0800

    [SPARK-32924][WEBUI] Make duration column in master UI sorted in the correct order

    ### What changes were proposed in this pull request?

    Make the "duration" column in the standalone-mode master UI sort by numeric duration, so the column can be sorted in the correct order.

    Before changes: ![image](https://user-images.githubusercontent.com/26694233/110025426-f5a49300-7cf4-11eb-86f0-2febade86be9.png)

    After changes: ![image](https://user-images.githubusercontent.com/26694233/110025604-33092080-7cf5-11eb-8b34-215688faf56d.png)

    ### Why are the changes needed?

    Fix a UI bug to make the sorting consistent across different pages.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Ran several apps with different durations and verified that the duration column on the master page sorts correctly.

    Closes #31743 from baohe-zhang/SPARK-32924.

    Authored-by: Baohe Zhang
    Signed-off-by: Dongjoon Hyun
    (cherry picked from commit 9ac5ee2e17ca491eabf2e6e7d33ce7cfb5a002a7)
    Signed-off-by: Dongjoon Hyun
---
 .../src/main/scala/org/apache/spark/deploy/master/ui/MasterPage.scala | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/core/src/main/scala/org/apache/spark/deploy/master/ui/MasterPage.scala b/core/src/main/scala/org/apache/spark/deploy/master/ui/MasterPage.scala
index 9e1f753..1dda683 100644
--- a/core/src/main/scala/org/apache/spark/deploy/master/ui/MasterPage.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/master/ui/MasterPage.scala
@@ -309,7 +309,9 @@ private[ui] class MasterPage(parent: MasterWebUI) extends WebUIPage("") {
       {UIUtils.formatDate(app.submitDate)}
       {app.desc.user}
       {app.state.toString}
-      {UIUtils.formatDuration(app.duration)}
+
+        {UIUtils.formatDuration(app.duration)}
+

[spark] branch master updated (17601e0 -> 9ac5ee2e)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 17601e0   [SPARK-34605][SQL] Support `java.time.Duration` as an external type of the day-time interval type
 add 9ac5ee2e  [SPARK-32924][WEBUI] Make duration column in master UI sorted in the correct order

No new revisions were added by this update.

Summary of changes:
 .../src/main/scala/org/apache/spark/deploy/master/ui/MasterPage.scala | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

[spark] branch master updated (e7e0161 -> 17601e0)
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from e7e0161  [SPARK-34482][SS] Correct the active SparkSession for StreamExecution.logicalPlan
 add 17601e0  [SPARK-34605][SQL] Support `java.time.Duration` as an external type of the day-time interval type

No new revisions were added by this update.

Summary of changes:
 .../expressions/SpecializedGettersReader.java      |  3 ++
 .../main/scala/org/apache/spark/sql/Encoders.scala |  8 +
 .../sql/catalyst/CatalystTypeConverters.scala      | 16 -
 .../sql/catalyst/DeserializerBuildHelper.scala     | 11 +-
 .../apache/spark/sql/catalyst/InternalRow.scala    |  6 ++--
 .../spark/sql/catalyst/JavaTypeInference.scala     |  6
 .../spark/sql/catalyst/ScalaReflection.scala       | 14 ++--
 .../spark/sql/catalyst/SerializerBuildHelper.scala | 11 +-
 .../apache/spark/sql/catalyst/dsl/package.scala    |  5 +++
 .../spark/sql/catalyst/encoders/RowEncoder.scala   |  7
 .../expressions/InterpretedUnsafeProjection.scala  |  2 +-
 .../catalyst/expressions/SpecificInternalRow.scala |  4 +--
 .../expressions/codegen/CodeGenerator.scala        |  4 +--
 .../spark/sql/catalyst/expressions/literals.scala  | 10 --
 .../spark/sql/catalyst/util/IntervalUtils.scala    | 29
 .../org/apache/spark/sql/types/DataType.scala      |  3 +-
 .../sql/catalyst/CatalystTypeConvertersSuite.scala | 39 --
 .../sql/catalyst/encoders/RowEncoderSuite.scala    | 12 ++-
 .../expressions/LiteralExpressionSuite.scala       | 20 ++-
 .../sql/catalyst/util/IntervalUtilsSuite.scala     | 22
 .../scala/org/apache/spark/sql/SQLImplicits.scala  |  3 ++
 .../org/apache/spark/sql/JavaDatasetSuite.java     |  9 +
 .../scala/org/apache/spark/sql/DatasetSuite.scala  |  5 +++
 23 files changed, 229 insertions(+), 20 deletions(-)

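SPARK-34605 maps `java.time.Duration` onto Spark's day-time interval type, whose internal representation is a microsecond count. A rough Java sketch of the round-trip conversion involved (hypothetical helper names; this is not Spark's `IntervalUtils` API):

```java
import java.time.Duration;

public class DurationMicros {
    // Encode a Duration as total microseconds, as a day-time interval would
    // be stored; overflow-checked so out-of-range values fail loudly.
    static long toMicros(Duration d) {
        return Math.addExact(
            Math.multiplyExact(d.getSeconds(), 1_000_000L),
            d.getNano() / 1_000L);
    }

    // Decode microseconds back into a Duration (sub-microsecond precision
    // is lost in toMicros, so the round trip is exact only at micro precision).
    static Duration fromMicros(long micros) {
        return Duration.ofSeconds(micros / 1_000_000L, (micros % 1_000_000L) * 1_000L);
    }

    public static void main(String[] args) {
        Duration d = Duration.ofDays(1).plusMillis(250);
        System.out.println(toMicros(d));                       // 86400250000
        System.out.println(fromMicros(toMicros(d)).equals(d)); // true
    }
}
```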
[spark] branch branch-3.1 updated: [SPARK-34482][SS] Correct the active SparkSession for StreamExecution.logicalPlan
This is an automated email from the ASF dual-hosted git repository. wenchen pushed a commit to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.1 by this push: new b1b31ce [SPARK-34482][SS] Correct the active SparkSession for StreamExecution.logicalPlan b1b31ce is described below commit b1b31ce4e27db55aaf74920c6ebb406aa5ee327e Author: yi.wu AuthorDate: Thu Mar 4 22:41:11 2021 +0800 [SPARK-34482][SS] Correct the active SparkSession for StreamExecution.logicalPlan ### What changes were proposed in this pull request? Set the active SparkSession to `sparkSessionForStream` and diable AQE & CBO before initializing the `StreamExecution.logicalPlan`. ### Why are the changes needed? The active session should be `sparkSessionForStream`. Otherwise, settings like https://github.com/apache/spark/blob/6b34745cb9b294c91cd126c2ea44c039ee83cb84/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala#L332-L335 wouldn't take effect if callers access them from the active SQLConf, e.g., the rule of `InsertAdaptiveSparkPlan`. Besides, unlike `InsertAdaptiveSparkPlan` (which skips streaming plan), `CostBasedJoinReorder` seems to have the chance to take effect theoretically. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Tested manually. Before the fix, `InsertAdaptiveSparkPlan` would try to apply AQE on the plan(wouldn't take effect though). After this fix, the rule returns directly. Closes #31600 from Ngone51/active-session-for-stream. 
Authored-by: yi.wu
Signed-off-by: Wenchen Fan
(cherry picked from commit e7e016192f882cfb430d706c2099e58e1bcc014c)
Signed-off-by: Wenchen Fan
---
 .../sql/execution/streaming/StreamExecution.scala | 42 +++---
 .../apache/spark/sql/streaming/StreamSuite.scala  | 33 -
 2 files changed, 54 insertions(+), 21 deletions(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala
index 6b0d33b..1b145f2 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala
@@ -314,26 +314,28 @@ abstract class StreamExecution(
       startLatch.countDown()
 
       // While active, repeatedly attempt to run batches.
-      SparkSession.setActiveSession(sparkSession)
-
-      updateStatusMessage("Initializing sources")
-      // force initialization of the logical plan so that the sources can be created
-      logicalPlan
-
-      // Adaptive execution can change num shuffle partitions, disallow
-      sparkSessionForStream.conf.set(SQLConf.ADAPTIVE_EXECUTION_ENABLED.key, "false")
-      // Disable cost-based join optimization as we do not want stateful operations to be rearranged
-      sparkSessionForStream.conf.set(SQLConf.CBO_ENABLED.key, "false")
-      offsetSeqMetadata = OffsetSeqMetadata(
-        batchWatermarkMs = 0, batchTimestampMs = 0, sparkSessionForStream.conf)
-
-      if (state.compareAndSet(INITIALIZING, ACTIVE)) {
-        // Unblock `awaitInitialization`
-        initializationLatch.countDown()
-        runActivatedStream(sparkSessionForStream)
-        updateStatusMessage("Stopped")
-      } else {
-        // `stop()` is already called. Let `finally` finish the cleanup.
+      sparkSessionForStream.withActive {
+        // Adaptive execution can change num shuffle partitions, disallow
+        sparkSessionForStream.conf.set(SQLConf.ADAPTIVE_EXECUTION_ENABLED.key, "false")
+        // Disable cost-based join optimization as we do not want stateful operations
+        // to be rearranged
+        sparkSessionForStream.conf.set(SQLConf.CBO_ENABLED.key, "false")
+
+        updateStatusMessage("Initializing sources")
+        // force initialization of the logical plan so that the sources can be created
+        logicalPlan
+
+        offsetSeqMetadata = OffsetSeqMetadata(
+          batchWatermarkMs = 0, batchTimestampMs = 0, sparkSessionForStream.conf)
+
+        if (state.compareAndSet(INITIALIZING, ACTIVE)) {
+          // Unblock `awaitInitialization`
+          initializationLatch.countDown()
+          runActivatedStream(sparkSessionForStream)
+          updateStatusMessage("Stopped")
+        } else {
+          // `stop()` is already called. Let `finally` finish the cleanup.
+        }
       }
     } catch {
       case e if isInterruptedByStop(e, sparkSession.sparkContext) =>

diff --git a/sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamSuite.scala
index ed284df..0d2d00f 100644
---
[spark] branch master updated: [SPARK-34482][SS] Correct the active SparkSession for StreamExecution.logicalPlan
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new e7e0161  [SPARK-34482][SS] Correct the active SparkSession for StreamExecution.logicalPlan

e7e0161 is described below

commit e7e016192f882cfb430d706c2099e58e1bcc014c
Author: yi.wu
AuthorDate: Thu Mar 4 22:41:11 2021 +0800

    [SPARK-34482][SS] Correct the active SparkSession for StreamExecution.logicalPlan

    ### What changes were proposed in this pull request?

    Set the active SparkSession to `sparkSessionForStream` and disable AQE & CBO before initializing the `StreamExecution.logicalPlan`.

    ### Why are the changes needed?

    The active session should be `sparkSessionForStream`. Otherwise, settings like
    https://github.com/apache/spark/blob/6b34745cb9b294c91cd126c2ea44c039ee83cb84/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala#L332-L335
    wouldn't take effect if callers access them from the active SQLConf, e.g., the rule of `InsertAdaptiveSparkPlan`. Besides, unlike `InsertAdaptiveSparkPlan` (which skips streaming plans), `CostBasedJoinReorder` seems to have a chance to take effect, at least theoretically.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Tested manually. Before the fix, `InsertAdaptiveSparkPlan` would try to apply AQE on the plan (it wouldn't take effect, though). After this fix, the rule returns directly.

    Closes #31600 from Ngone51/active-session-for-stream.
Authored-by: yi.wu
Signed-off-by: Wenchen Fan
---
 .../sql/execution/streaming/StreamExecution.scala | 42 +++---
 .../apache/spark/sql/streaming/StreamSuite.scala  | 33 -
 2 files changed, 54 insertions(+), 21 deletions(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala
index 67803ad..ae010f9 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala
@@ -323,26 +323,28 @@ abstract class StreamExecution(
       startLatch.countDown()
 
       // While active, repeatedly attempt to run batches.
-      SparkSession.setActiveSession(sparkSession)
-
-      updateStatusMessage("Initializing sources")
-      // force initialization of the logical plan so that the sources can be created
-      logicalPlan
-
-      // Adaptive execution can change num shuffle partitions, disallow
-      sparkSessionForStream.conf.set(SQLConf.ADAPTIVE_EXECUTION_ENABLED.key, "false")
-      // Disable cost-based join optimization as we do not want stateful operations to be rearranged
-      sparkSessionForStream.conf.set(SQLConf.CBO_ENABLED.key, "false")
-      offsetSeqMetadata = OffsetSeqMetadata(
-        batchWatermarkMs = 0, batchTimestampMs = 0, sparkSessionForStream.conf)
-
-      if (state.compareAndSet(INITIALIZING, ACTIVE)) {
-        // Unblock `awaitInitialization`
-        initializationLatch.countDown()
-        runActivatedStream(sparkSessionForStream)
-        updateStatusMessage("Stopped")
-      } else {
-        // `stop()` is already called. Let `finally` finish the cleanup.
+      sparkSessionForStream.withActive {
+        // Adaptive execution can change num shuffle partitions, disallow
+        sparkSessionForStream.conf.set(SQLConf.ADAPTIVE_EXECUTION_ENABLED.key, "false")
+        // Disable cost-based join optimization as we do not want stateful operations
+        // to be rearranged
+        sparkSessionForStream.conf.set(SQLConf.CBO_ENABLED.key, "false")
+
+        updateStatusMessage("Initializing sources")
+        // force initialization of the logical plan so that the sources can be created
+        logicalPlan
+
+        offsetSeqMetadata = OffsetSeqMetadata(
+          batchWatermarkMs = 0, batchTimestampMs = 0, sparkSessionForStream.conf)
+
+        if (state.compareAndSet(INITIALIZING, ACTIVE)) {
+          // Unblock `awaitInitialization`
+          initializationLatch.countDown()
+          runActivatedStream(sparkSessionForStream)
+          updateStatusMessage("Stopped")
+        } else {
+          // `stop()` is already called. Let `finally` finish the cleanup.
+        }
       }
     } catch {
      case e if isInterruptedByStop(e, sparkSession.sparkContext) =>

diff --git a/sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamSuite.scala
index c4e43d2..fb6922a 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamSuite.scala
+++
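The key change in the diff above is replacing a bare `SparkSession.setActiveSession` with `sparkSessionForStream.withActive { ... }`, which restores the previously active session when the block exits. A minimal sketch of that save/set/run/restore pattern (illustrative names only, not Spark's exact implementation):

```scala
// Hedged sketch of the withActive pattern; Spark's real version lives in
// SparkSession itself. Assumes org.apache.spark.sql.SparkSession on the classpath.
import org.apache.spark.sql.SparkSession

def withActiveSession[T](session: SparkSession)(body: => T): T = {
  val previous = SparkSession.getActiveSession  // remember whatever was active
  SparkSession.setActiveSession(session)        // make this session active for `body`
  try body
  finally previous match {
    case Some(s) => SparkSession.setActiveSession(s)  // restore the old session
    case None    => SparkSession.clearActiveSession() // or clear if there was none
  }
}
```

This is why the fix is safer than the old code: any rule reading the active `SQLConf` inside the block sees `sparkSessionForStream`'s settings (AQE and CBO disabled), and callers outside the block are unaffected.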
[spark] branch master updated (2b1c170 -> 401e270)
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 2b1c170  [SPARK-34614][SQL] ANSI mode: Casting String to Boolean should throw exception on parse error
     add 401e270  [SPARK-34567][SQL] CreateTableAsSelect should update metrics too

No new revisions were added by this update.

Summary of changes:
 .../sql/execution/command/DataWritingCommand.scala | 27 --
 .../execution/command/createDataSourceTables.scala |  2 +-
 .../sql/execution/datasources/DataSource.scala     |  5 +++-
 .../sql/execution/metric/SQLMetricsSuite.scala     | 17 ++
 .../execution/CreateHiveTableAsSelectCommand.scala |  2 ++
 .../spark/sql/hive/execution/SQLMetricsSuite.scala | 27 ++
 6 files changed, 76 insertions(+), 4 deletions(-)
[spark] branch branch-3.1 updated: [SPARK-34567][SQL] CreateTableAsSelect should update metrics too
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.1 by this push:
     new e3cadc3  [SPARK-34567][SQL] CreateTableAsSelect should update metrics too

e3cadc3 is described below

commit e3cadc32102fb502e3b8ac5f6bacb39ec80df26a
Author: Angerszh
AuthorDate: Thu Mar 4 20:42:47 2021 +0800

    [SPARK-34567][SQL] CreateTableAsSelect should update metrics too

    ### What changes were proposed in this pull request?

    For the command `CreateTableAsSelect`, we use `InsertIntoHiveTable` and `InsertIntoHadoopFsRelationCommand` to insert data. We update the metrics of `InsertIntoHiveTable` and `InsertIntoHadoopFsRelationCommand` in `FileFormatWriter.write()`, but we only show `CreateTableAsSelectCommand` in the WebUI SQL tab, so we need to update `CreateTableAsSelectCommand`'s metrics too.

    Before this PR:
    ![image](https://user-images.githubusercontent.com/46485123/109411226-81f44480-79db-11eb-99cb-b9686b15bf61.png)

    After this PR:
    ![image](https://user-images.githubusercontent.com/46485123/109411232-8ae51600-79db-11eb-9111-3bea0bc2d475.png)
    ![image](https://user-images.githubusercontent.com/46485123/109905192-62aa2f80-7cd9-11eb-91f9-04b16c9238ae.png)

    ### Why are the changes needed?

    Complete the SQL metrics.

    ### Does this PR introduce _any_ user-facing change?

    No

    ### How was this patch tested?
[spark] branch master updated (53e4dba -> 2b1c170)
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 53e4dba  [SPARK-34599][SQL] Fix the issue that INSERT INTO OVERWRITE doesn't support partition columns containing dot for DSv2
     add 2b1c170  [SPARK-34614][SQL] ANSI mode: Casting String to Boolean should throw exception on parse error

No new revisions were added by this update.

Summary of changes:
 docs/sql-ref-ansi-compliance.md                    |   1 +
 .../spark/sql/catalyst/expressions/Cast.scala      |  14 +-
 .../spark/sql/catalyst/expressions/CastSuite.scala | 244 +
 .../sql-tests/results/postgreSQL/boolean.sql.out   |  85 +++
 4 files changed, 264 insertions(+), 80 deletions(-)
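The SPARK-34614 change above makes `CAST(string AS BOOLEAN)` raise an error on unparsable input when ANSI mode is on, instead of silently returning NULL. A hedged sketch of the before/after behavior (assumes a local `SparkSession` bound to `spark`; the exact exception type and message vary by Spark version):

```scala
// Sketch only: illustrates the ANSI-mode cast behavior introduced by SPARK-34614.
spark.conf.set("spark.sql.ansi.enabled", "false")
spark.sql("SELECT CAST('abc' AS BOOLEAN)").show()  // non-ANSI: the cast yields NULL

spark.conf.set("spark.sql.ansi.enabled", "true")
// ANSI mode: the same cast now throws on parse error.
// spark.sql("SELECT CAST('abc' AS BOOLEAN)").collect()  // raises an exception

// Recognized boolean literals ('true', 'false', 't', 'f', ...) still cast fine.
spark.sql("SELECT CAST('true' AS BOOLEAN)").show()
```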