[GitHub] [spark] vli-databricks opened a new pull request, #36791: [SPARK-39392][SQL][3.3] Refine ANSI error messages for try_* function…

2022-06-07 Thread GitBox
vli-databricks opened a new pull request, #36791: URL: https://github.com/apache/spark/pull/36791 … hints ### What changes were proposed in this pull request? Refine ANSI error messages and remove 'To return NULL instead' ### Why are the changes needed? Imp

[GitHub] [spark] dongjoon-hyun commented on pull request #35561: [MINOR][DOCS] Fixed closing tags in running-on-kubernetes.md

2022-06-07 Thread GitBox
dongjoon-hyun commented on PR #35561: URL: https://github.com/apache/spark/pull/35561#issuecomment-1148929814 Hi, @zr-msft. Did you check the Apache Spark 3.3 RC5 documentation? It should be there. - https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc5-docs/_site/index.html For branch

[GitHub] [spark] zr-msft commented on pull request #35561: [MINOR][DOCS] Fixed closing tags in running-on-kubernetes.md

2022-06-07 Thread GitBox
zr-msft commented on PR #35561: URL: https://github.com/apache/spark/pull/35561#issuecomment-1148923071 @dongjoon-hyun I've periodically checked the docs site and I'm not seeing any changes show up based on commits I've added from this PR: * https://spark.apache.org/docs/latest/running-o

[GitHub] [spark] pan3793 commented on pull request #36784: [SPARK-39396][SQL] Fix LDAP login exception 'error code 49 - invalid credentials'

2022-06-07 Thread GitBox
pan3793 commented on PR #36784: URL: https://github.com/apache/spark/pull/36784#issuecomment-1148900904 Thanks for pinging me. I think the current `LdapAuthenticationProviderImpl` comes from a very old version of Hive w/o UT, so the existing UT cannot cover your change. The `LdapAuthentica

[GitHub] [spark] dtenedor commented on a diff in pull request #36745: [SPARK-39359][SQL] Restrict DEFAULT columns to allowlist of supported data source types

2022-06-07 Thread GitBox
dtenedor commented on code in PR #36745: URL: https://github.com/apache/spark/pull/36745#discussion_r891452022 ## sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala: ## @@ -2881,6 +2881,15 @@ object SQLConf { .booleanConf .createWithDefault(tru

[GitHub] [spark] wangyum opened a new pull request, #36790: [SPARK-39402][SQL] Optimize ReplaceCTERefWithRepartition to support coalesce partitions

2022-06-07 Thread GitBox
wangyum opened a new pull request, #36790: URL: https://github.com/apache/spark/pull/36790 ### What changes were proposed in this pull request? Optimize `ReplaceCTERefWithRepartition` to support coalesce partitions. For example: Before this PR | After this PR -- | -- ![i

[GitHub] [spark] pan3793 opened a new pull request, #36789: [SPARK-39403] Add SPARK_SUBMIT_OPTS in spark-env.sh.template

2022-06-07 Thread GitBox
pan3793 opened a new pull request, #36789: URL: https://github.com/apache/spark/pull/36789 ### What changes were proposed in this pull request? Add SPARK_SUBMIT_OPTS in spark-env.sh.template ### Why are the changes needed? Spark supports using SPARK_SUBMIT_OPTS to

[GitHub] [spark] srielau commented on a diff in pull request #36693: [SPARK-39349] Add a centralized CheckError method for QA of error path

2022-06-07 Thread GitBox
srielau commented on code in PR #36693: URL: https://github.com/apache/spark/pull/36693#discussion_r891425678 ## core/src/main/java/org/apache/spark/memory/SparkOutOfMemoryError.java: ## @@ -39,11 +39,17 @@ public SparkOutOfMemoryError(OutOfMemoryError e) { } public

[GitHub] [spark] srielau commented on a diff in pull request #36693: [SPARK-39349] Add a centralized CheckError method for QA of error path

2022-06-07 Thread GitBox
srielau commented on code in PR #36693: URL: https://github.com/apache/spark/pull/36693#discussion_r891425369 ## core/src/test/scala/org/apache/spark/SparkFunSuite.scala: ## @@ -264,6 +264,87 @@ abstract class SparkFunSuite } } + /** + * Checks an exception with an

[GitHub] [spark] cloud-fan commented on a diff in pull request #36150: [SPARK-38864][SQL] Add melt / unpivot to Dataset

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36150: URL: https://github.com/apache/spark/pull/36150#discussion_r891409655 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala: ## @@ -736,6 +737,24 @@ abstract class TypeCoercionBase { } } + /*

[GitHub] [spark] cloud-fan commented on a diff in pull request #36150: [SPARK-38864][SQL] Add melt / unpivot to Dataset

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36150: URL: https://github.com/apache/spark/pull/36150#discussion_r891408643 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala: ## @@ -736,6 +737,24 @@ abstract class TypeCoercionBase { } } + /*

[GitHub] [spark] srielau commented on a diff in pull request #36693: [SPARK-39349] Add a centralized CheckError method for QA of error path

2022-06-07 Thread GitBox
srielau commented on code in PR #36693: URL: https://github.com/apache/spark/pull/36693#discussion_r891404260 ## sql/catalyst/src/main/scala/org/apache/spark/sql/AnalysisException.scala: ## @@ -36,13 +36,31 @@ class AnalysisException protected[sql] ( @transient val plan: Op

[GitHub] [spark] cloud-fan commented on a diff in pull request #36150: [SPARK-38864][SQL] Add melt / unpivot to Dataset

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36150: URL: https://github.com/apache/spark/pull/36150#discussion_r891400243 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala: ## @@ -806,6 +825,7 @@ abstract class TypeCoercionBase { object TypeCoercion

[GitHub] [spark] cloud-fan commented on a diff in pull request #36150: [SPARK-38864][SQL] Add melt / unpivot to Dataset

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36150: URL: https://github.com/apache/spark/pull/36150#discussion_r891397088 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala: ## @@ -1227,6 +1227,49 @@ case class Pivot( override protect

[GitHub] [spark] srielau commented on a diff in pull request #36693: [SPARK-39349] Add a centralized CheckError method for QA of error path

2022-06-07 Thread GitBox
srielau commented on code in PR #36693: URL: https://github.com/apache/spark/pull/36693#discussion_r891395982 ## core/src/test/scala/org/apache/spark/SparkFunSuite.scala: ## @@ -264,6 +264,87 @@ abstract class SparkFunSuite } } + /** + * Checks an exception with an

[GitHub] [spark] wangyum opened a new pull request, #36788: [SPARK-39401][SQL][TESTS] Replace withView with withTempView in CTEInlineSuite

2022-06-07 Thread GitBox
wangyum opened a new pull request, #36788: URL: https://github.com/apache/spark/pull/36788 ### What changes were proposed in this pull request? This PR replaces `withView` with `withTempView` in `CTEInlineSuite`. ### Why are the changes needed? To use the correct API. #

[GitHub] [spark] cloud-fan commented on a diff in pull request #36150: [SPARK-38864][SQL] Add melt / unpivot to Dataset

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36150: URL: https://github.com/apache/spark/pull/36150#discussion_r891394188 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala: ## @@ -1227,6 +1227,49 @@ case class Pivot( override protect

[GitHub] [spark] srielau commented on a diff in pull request #36693: [SPARK-39349] Add a centralized CheckError method for QA of error path

2022-06-07 Thread GitBox
srielau commented on code in PR #36693: URL: https://github.com/apache/spark/pull/36693#discussion_r891389538 ## core/src/main/scala/org/apache/spark/ErrorInfo.scala: ## @@ -98,6 +100,29 @@ private[spark] object SparkThrowableHelper { s"[$displayClass] $displayMessage$displ

[GitHub] [spark] srielau commented on a diff in pull request #36693: [SPARK-39349] Add a centralized CheckError method for QA of error path

2022-06-07 Thread GitBox
srielau commented on code in PR #36693: URL: https://github.com/apache/spark/pull/36693#discussion_r891373947 ## core/src/main/java/org/apache/spark/memory/SparkOutOfMemoryError.java: ## @@ -39,11 +39,17 @@ public SparkOutOfMemoryError(OutOfMemoryError e) { } public

[GitHub] [spark] srielau commented on a diff in pull request #36693: [SPARK-39349] Add a centralized CheckError method for QA of error path

2022-06-07 Thread GitBox
srielau commented on code in PR #36693: URL: https://github.com/apache/spark/pull/36693#discussion_r891355010 ## core/src/test/scala/org/apache/spark/SparkFunSuite.scala: ## @@ -264,6 +264,87 @@ abstract class SparkFunSuite } } + /** + * Checks an exception with an

[GitHub] [spark] srielau commented on a diff in pull request #36693: [SPARK-39349] Add a centralized CheckError method for QA of error path

2022-06-07 Thread GitBox
srielau commented on code in PR #36693: URL: https://github.com/apache/spark/pull/36693#discussion_r891349846 ## core/src/main/resources/error/error-classes.json: ## @@ -333,7 +332,7 @@ }, "SECOND_FUNCTION_ARGUMENT_NOT_INTEGER" : { "message" : [ - "The second arg

[GitHub] [spark] cloud-fan commented on a diff in pull request #36586: [SPARK-39236][SQL] Make CreateTable and ListTables be compatible with 3 layer namespace

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36586: URL: https://github.com/apache/spark/pull/36586#discussion_r891346881 ## sql/core/src/test/scala/org/apache/spark/sql/internal/CatalogSuite.scala: ## @@ -553,4 +570,100 @@ class CatalogSuite extends SharedSparkSession with AnalysisTest

[GitHub] [spark] cloud-fan commented on a diff in pull request #36586: [SPARK-39236][SQL] Make CreateTable and ListTables be compatible with 3 layer namespace

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36586: URL: https://github.com/apache/spark/pull/36586#discussion_r891345079 ## sql/core/src/test/scala/org/apache/spark/sql/internal/CatalogSuite.scala: ## @@ -299,15 +314,19 @@ class CatalogSuite extends SharedSparkSession with AnalysisTest

[GitHub] [spark] cloud-fan commented on a diff in pull request #36586: [SPARK-39236][SQL] Make CreateTable and ListTables be compatible with 3 layer namespace

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36586: URL: https://github.com/apache/spark/pull/36586#discussion_r891344711 ## sql/core/src/test/scala/org/apache/spark/sql/internal/CatalogSuite.scala: ## @@ -299,15 +314,19 @@ class CatalogSuite extends SharedSparkSession with AnalysisTest

[GitHub] [spark] cloud-fan commented on a diff in pull request #36586: [SPARK-39236][SQL] Make CreateTable and ListTables be compatible with 3 layer namespace

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36586: URL: https://github.com/apache/spark/pull/36586#discussion_r891344191 ## sql/core/src/test/scala/org/apache/spark/sql/internal/CatalogSuite.scala: ## @@ -290,7 +304,8 @@ class CatalogSuite extends SharedSparkSession with AnalysisTest {

[GitHub] [spark] cloud-fan commented on a diff in pull request #36586: [SPARK-39236][SQL] Make CreateTable and ListTables be compatible with 3 layer namespace

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36586: URL: https://github.com/apache/spark/pull/36586#discussion_r891341003 ## sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala: ## @@ -117,14 +128,44 @@ class CatalogImpl(sparkSession: SparkSession) extends Catalog {

[GitHub] [spark] cloud-fan commented on a diff in pull request #36586: [SPARK-39236][SQL] Make CreateTable and ListTables be compatible with 3 layer namespace

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36586: URL: https://github.com/apache/spark/pull/36586#discussion_r891340370 ## sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala: ## @@ -117,14 +128,44 @@ class CatalogImpl(sparkSession: SparkSession) extends Catalog {

[GitHub] [spark] cloud-fan commented on a diff in pull request #36586: [SPARK-39236][SQL] Make CreateTable and ListTables be compatible with 3 layer namespace

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36586: URL: https://github.com/apache/spark/pull/36586#discussion_r891339983 ## sql/core/src/main/scala/org/apache/spark/sql/catalog/interface.scala: ## @@ -55,7 +55,8 @@ class Database( * A table in Spark, as returned by the `listTables` met

[GitHub] [spark] cloud-fan commented on a diff in pull request #36586: [SPARK-39236][SQL] Make CreateTable and ListTables be compatible with 3 layer namespace

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36586: URL: https://github.com/apache/spark/pull/36586#discussion_r891337356 ## sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala: ## @@ -97,8 +98,18 @@ class CatalogImpl(sparkSession: SparkSession) extends Catalog {

[GitHub] [spark] Ngone51 commented on a diff in pull request #36162: [SPARK-32170][CORE] Improve the speculation through the stage task metrics.

2022-06-07 Thread GitBox
Ngone51 commented on code in PR #36162: URL: https://github.com/apache/spark/pull/36162#discussion_r891337273 ## core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala: ## @@ -1108,45 +1164,48 @@ private[spark] class TaskSetManager( // `successfulTaskDurations`

[GitHub] [spark] cloud-fan commented on a diff in pull request #36586: [SPARK-39236][SQL] Make CreateTable and ListTables be compatible with 3 layer namespace

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36586: URL: https://github.com/apache/spark/pull/36586#discussion_r891336456 ## sql/core/src/main/scala/org/apache/spark/sql/catalog/interface.scala: ## @@ -64,15 +65,34 @@ class Database( @Stable class Table( val name: String, -@Nul

[GitHub] [spark] cloud-fan commented on a diff in pull request #36586: [SPARK-39236][SQL] Make CreateTable and ListTables be compatible with 3 layer namespace

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36586: URL: https://github.com/apache/spark/pull/36586#discussion_r891335856 ## sql/core/src/main/scala/org/apache/spark/sql/catalog/interface.scala: ## @@ -64,15 +65,34 @@ class Database( @Stable class Table( val name: String, -@Nul

[GitHub] [spark] Ngone51 commented on a diff in pull request #36162: [SPARK-32170][CORE] Improve the speculation through the stage task metrics.

2022-06-07 Thread GitBox
Ngone51 commented on code in PR #36162: URL: https://github.com/apache/spark/pull/36162#discussion_r891335474 ## core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala: ## @@ -1217,6 +1260,71 @@ private[spark] class TaskSetManager( def executorAdded(): Unit = {

[GitHub] [spark] Ngone51 commented on a diff in pull request #36162: [SPARK-32170][CORE] Improve the speculation through the stage task metrics.

2022-06-07 Thread GitBox
Ngone51 commented on code in PR #36162: URL: https://github.com/apache/spark/pull/36162#discussion_r891334508 ## core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala: ## @@ -1217,6 +1260,71 @@ private[spark] class TaskSetManager( def executorAdded(): Unit = {

[GitHub] [spark] cloud-fan commented on a diff in pull request #36586: [SPARK-39236][SQL] Make CreateTable and ListTables be compatible with 3 layer namespace

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36586: URL: https://github.com/apache/spark/pull/36586#discussion_r891334793 ## sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala: ## @@ -2185,6 +2185,11 @@ object QueryCompilationErrors extends QueryErrorsBas

[GitHub] [spark] srowen commented on a diff in pull request #36737: [SPARK-39347] [SS] Generate wrong time window when (timestamp-startTime) % slideDuration…

2022-06-07 Thread GitBox
srowen commented on code in PR #36737: URL: https://github.com/apache/spark/pull/36737#discussion_r891330435 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala: ## @@ -3963,8 +3966,10 @@ object TimeWindowing extends Rule[LogicalPlan] {

[GitHub] [spark] Ngone51 commented on a diff in pull request #36162: [SPARK-32170][CORE] Improve the speculation through the stage task metrics.

2022-06-07 Thread GitBox
Ngone51 commented on code in PR #36162: URL: https://github.com/apache/spark/pull/36162#discussion_r891324885 ## core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala: ## @@ -1217,6 +1260,71 @@ private[spark] class TaskSetManager( def executorAdded(): Unit = {

[GitHub] [spark] Ngone51 commented on a diff in pull request #36162: [SPARK-32170][CORE] Improve the speculation through the stage task metrics.

2022-06-07 Thread GitBox
Ngone51 commented on code in PR #36162: URL: https://github.com/apache/spark/pull/36162#discussion_r891322239 ## core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala: ## @@ -1217,6 +1260,71 @@ private[spark] class TaskSetManager( def executorAdded(): Unit = {

[GitHub] [spark] wangyum commented on pull request #36787: [SPARK-39387][BUILD][FOLLOWUP] Upgrade hive-storage-api to 2.7.3

2022-06-07 Thread GitBox
wangyum commented on PR #36787: URL: https://github.com/apache/spark/pull/36787#issuecomment-1148767712 Version 2.7.2 throws a runtime exception: ``` 22:38:20.734 ERROR org.apache.spark.util.Utils: Aborting task java.lang.RuntimeException: Overflow of newLength. smallBuffer.length=107

[GitHub] [spark] LuciferYang commented on pull request #36694: [MINOR][BUILD] Remove redundant maven `` definition

2022-06-07 Thread GitBox
LuciferYang commented on PR #36694: URL: https://github.com/apache/spark/pull/36694#issuecomment-1148767278 > Is it redundant because of the parent POM? Yes > yeah maybe but I don't think it hurts anything and it's 2 lines so just leave these as they are? --

[GitHub] [spark] cloud-fan commented on a diff in pull request #36693: [SPARK-39349] Add a centralized CheckError method for QA of error path

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36693: URL: https://github.com/apache/spark/pull/36693#discussion_r891320687 ## sql/catalyst/src/main/scala/org/apache/spark/sql/AnalysisException.scala: ## @@ -36,13 +36,31 @@ class AnalysisException protected[sql] ( @transient val plan:

[GitHub] [spark] Ngone51 commented on a diff in pull request #36162: [SPARK-32170][CORE] Improve the speculation through the stage task metrics.

2022-06-07 Thread GitBox
Ngone51 commented on code in PR #36162: URL: https://github.com/apache/spark/pull/36162#discussion_r891318626 ## core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala: ## @@ -863,6 +872,29 @@ private[spark] class TaskSchedulerImpl( executorUpdates) }

[GitHub] [spark] tgravescs commented on pull request #36716: [SPARK-39062][CORE] Add stage level resource scheduling support for standalone cluster

2022-06-07 Thread GitBox
tgravescs commented on PR #36716: URL: https://github.com/apache/spark/pull/36716#issuecomment-1148763057 > The feature is enabled when dynamic allocation enabled in standalone cluster. So last time I checked, dynamic allocation in standalone mode had issues. Has this been addresse

[GitHub] [spark] Ngone51 commented on a diff in pull request #36162: [SPARK-32170][CORE] Improve the speculation through the stage task metrics.

2022-06-07 Thread GitBox
Ngone51 commented on code in PR #36162: URL: https://github.com/apache/spark/pull/36162#discussion_r891316334 ## core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala: ## @@ -800,6 +814,10 @@ private[spark] class TaskSetManager( info.markFinished(TaskState.FIN

[GitHub] [spark] cloud-fan commented on a diff in pull request #36693: [SPARK-39349] Add a centralized CheckError method for QA of error path

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36693: URL: https://github.com/apache/spark/pull/36693#discussion_r891314795 ## core/src/test/scala/org/apache/spark/SparkFunSuite.scala: ## @@ -264,6 +264,87 @@ abstract class SparkFunSuite } } + /** + * Checks an exception with a

[GitHub] [spark] Ngone51 commented on a diff in pull request #36162: [SPARK-32170][CORE] Improve the speculation through the stage task metrics.

2022-06-07 Thread GitBox
Ngone51 commented on code in PR #36162: URL: https://github.com/apache/spark/pull/36162#discussion_r891314239 ## core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala: ## @@ -1217,6 +1260,71 @@ private[spark] class TaskSetManager( def executorAdded(): Unit = {

[GitHub] [spark] Ngone51 commented on a diff in pull request #36162: [SPARK-32170][CORE] Improve the speculation through the stage task metrics.

2022-06-07 Thread GitBox
Ngone51 commented on code in PR #36162: URL: https://github.com/apache/spark/pull/36162#discussion_r891312743 ## core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala: ## @@ -1217,6 +1260,71 @@ private[spark] class TaskSetManager( def executorAdded(): Unit = {

[GitHub] [spark] cloud-fan commented on a diff in pull request #36693: [SPARK-39349] Add a centralized CheckError method for QA of error path

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36693: URL: https://github.com/apache/spark/pull/36693#discussion_r891312375 ## core/src/test/scala/org/apache/spark/SparkFunSuite.scala: ## @@ -264,6 +264,87 @@ abstract class SparkFunSuite } } + /** + * Checks an exception with a

[GitHub] [spark] Ngone51 commented on a diff in pull request #36162: [SPARK-32170][CORE] Improve the speculation through the stage task metrics.

2022-06-07 Thread GitBox
Ngone51 commented on code in PR #36162: URL: https://github.com/apache/spark/pull/36162#discussion_r891311964 ## core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala: ## @@ -1217,6 +1260,71 @@ private[spark] class TaskSetManager( def executorAdded(): Unit = {

[GitHub] [spark] cloud-fan commented on a diff in pull request #36693: [SPARK-39349] Add a centralized CheckError method for QA of error path

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36693: URL: https://github.com/apache/spark/pull/36693#discussion_r891311339 ## core/src/test/scala/org/apache/spark/SparkFunSuite.scala: ## @@ -264,6 +264,87 @@ abstract class SparkFunSuite } } + /** + * Checks an exception with a

[GitHub] [spark] cloud-fan commented on a diff in pull request #36693: [SPARK-39349] Add a centralized CheckError method for QA of error path

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36693: URL: https://github.com/apache/spark/pull/36693#discussion_r891309492 ## core/src/test/scala/org/apache/spark/SparkFunSuite.scala: ## @@ -264,6 +264,87 @@ abstract class SparkFunSuite } } + /** + * Checks an exception with a

[GitHub] [spark] cloud-fan commented on a diff in pull request #36693: [SPARK-39349] Add a centralized CheckError method for QA of error path

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36693: URL: https://github.com/apache/spark/pull/36693#discussion_r891302240 ## core/src/main/scala/org/apache/spark/ErrorInfo.scala: ## @@ -98,6 +100,29 @@ private[spark] object SparkThrowableHelper { s"[$displayClass] $displayMessage$dis

[GitHub] [spark] cloud-fan commented on a diff in pull request #36693: [SPARK-39349] Add a centralized CheckError method for QA of error path

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36693: URL: https://github.com/apache/spark/pull/36693#discussion_r891301726 ## core/src/main/scala/org/apache/spark/ErrorInfo.scala: ## @@ -73,18 +73,20 @@ private[spark] object SparkThrowableHelper { def getMessage( errorClass: St

[GitHub] [spark] Ngone51 commented on a diff in pull request #36162: [SPARK-32170][CORE] Improve the speculation through the stage task metrics.

2022-06-07 Thread GitBox
Ngone51 commented on code in PR #36162: URL: https://github.com/apache/spark/pull/36162#discussion_r891307188 ## core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala: ## @@ -1217,6 +1260,71 @@ private[spark] class TaskSetManager( def executorAdded(): Unit = {

[GitHub] [spark] cloud-fan commented on a diff in pull request #36693: [SPARK-39349] Add a centralized CheckError method for QA of error path

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36693: URL: https://github.com/apache/spark/pull/36693#discussion_r891304223 ## core/src/main/scala/org/apache/spark/ErrorInfo.scala: ## @@ -73,18 +73,20 @@ private[spark] object SparkThrowableHelper { def getMessage( errorClass: St

[GitHub] [spark] cloud-fan commented on a diff in pull request #36693: [SPARK-39349] Add a centralized CheckError method for QA of error path

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36693: URL: https://github.com/apache/spark/pull/36693#discussion_r891297431 ## core/src/main/java/org/apache/spark/memory/SparkOutOfMemoryError.java: ## @@ -39,11 +39,17 @@ public SparkOutOfMemoryError(OutOfMemoryError e) { } publi

[GitHub] [spark] cloud-fan commented on a diff in pull request #36693: [SPARK-39349] Add a centralized CheckError method for QA of error path

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36693: URL: https://github.com/apache/spark/pull/36693#discussion_r891293373 ## core/src/main/resources/error/error-classes.json: ## @@ -333,7 +332,7 @@ }, "SECOND_FUNCTION_ARGUMENT_NOT_INTEGER" : { "message" : [ - "The second a

[GitHub] [spark] cloud-fan commented on a diff in pull request #36693: [SPARK-39349] Add a centralized CheckError method for QA of error path

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36693: URL: https://github.com/apache/spark/pull/36693#discussion_r891292604 ## core/src/main/resources/error/error-classes.json: ## @@ -157,8 +157,7 @@ "See more details in SPARK-31404. You can set the SQL config or", "

[GitHub] [spark] cloud-fan commented on a diff in pull request #36693: [SPARK-39349] Add a centralized CheckError method for QA of error path

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36693: URL: https://github.com/apache/spark/pull/36693#discussion_r891290994 ## core/src/main/java/org/apache/spark/SparkThrowable.java: ## @@ -36,6 +36,10 @@ public interface SparkThrowable { // If null, error class is not set String get

[GitHub] [spark] srowen commented on a diff in pull request #36499: [SPARK-38846][SQL] Add explicit data mapping between Teradata Numeric Type and Spark DecimalType

2022-06-07 Thread GitBox
srowen commented on code in PR #36499: URL: https://github.com/apache/spark/pull/36499#discussion_r891287048 ## sql/core/src/main/scala/org/apache/spark/sql/jdbc/TeradataDialect.scala: ## @@ -96,4 +97,29 @@ private case object TeradataDialect extends JdbcDialect { override de

[GitHub] [spark] cloud-fan commented on a diff in pull request #36703: [SPARK-39321][SQL] Refactor TryCast to use RuntimeReplaceable

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36703: URL: https://github.com/apache/spark/pull/36703#discussion_r891256881 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala: ## @@ -1792,15 +1792,16 @@ class AstBuilder extends SqlBaseParserBaseVisitor[Any

[GitHub] [spark] AmplabJenkins commented on pull request #36778: [SPARK-39383][SQL] Support DEFAULT columns in ALTER TABLE ALTER COLUMNS to V2 data sources

2022-06-07 Thread GitBox
AmplabJenkins commented on PR #36778: URL: https://github.com/apache/spark/pull/36778#issuecomment-1148680929 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [spark] AmplabJenkins commented on pull request #36781: [SPARK-39393][SQL] Parquet data source only supports push-down predicate filters for non-repeated primitive types

2022-06-07 Thread GitBox
AmplabJenkins commented on PR #36781: URL: https://github.com/apache/spark/pull/36781#issuecomment-1148680852 Can one of the admins verify this patch?

[GitHub] [spark] AmplabJenkins commented on pull request #36784: [SPARK-39396][SQL] Fix LDAP login exception 'error code 49 - invalid credentials'

2022-06-07 Thread GitBox
AmplabJenkins commented on PR #36784: URL: https://github.com/apache/spark/pull/36784#issuecomment-1148680767 Can one of the admins verify this patch?

[GitHub] [spark] cloud-fan commented on a diff in pull request #36704: [SPARK-39346][SQL] Convert asserts/illegal state exception to internal errors on each phase

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36704: URL: https://github.com/apache/spark/pull/36704#discussion_r891224222 ## connector/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaMicroBatchSourceSuite.scala: ## @@ -666,9 +667,10 @@ abstract class KafkaMicroBatchSource

[GitHub] [spark] MaxGekk closed pull request #36753: [SPARK-39259][SQL][3.2] Evaluate timestamps consistently in subqueries

2022-06-07 Thread GitBox
MaxGekk closed pull request #36753: [SPARK-39259][SQL][3.2] Evaluate timestamps consistently in subqueries URL: https://github.com/apache/spark/pull/36753

[GitHub] [spark] olaky commented on pull request #36753: [SPARK-39259][SQL][3.2] Evaluate timestamps consistently in subqueries

2022-06-07 Thread GitBox
olaky commented on PR #36753: URL: https://github.com/apache/spark/pull/36753#issuecomment-1148633785 Merging is blocked because of a test failure that also surfaces in https://github.com/apache/spark/pull/36386

[GitHub] [spark] gengliangwang closed pull request #36745: [SPARK-39359][SQL] Restrict DEFAULT columns to allowlist of supported data source types

2022-06-07 Thread GitBox
gengliangwang closed pull request #36745: [SPARK-39359][SQL] Restrict DEFAULT columns to allowlist of supported data source types URL: https://github.com/apache/spark/pull/36745

[GitHub] [spark] gengliangwang commented on pull request #36745: [SPARK-39359][SQL] Restrict DEFAULT columns to allowlist of supported data source types

2022-06-07 Thread GitBox
gengliangwang commented on PR #36745: URL: https://github.com/apache/spark/pull/36745#issuecomment-1148625244 I am merging this one to master now. We can have a new DS API for this later.

[GitHub] [spark] Ngone51 commented on a diff in pull request #36716: [SPARK-39062][CORE] Add stage level resource scheduling support for standalone cluster

2022-06-07 Thread GitBox
Ngone51 commented on code in PR #36716: URL: https://github.com/apache/spark/pull/36716#discussion_r891178775 ## core/src/test/scala/org/apache/spark/deploy/master/MasterSuite.scala: ## @@ -530,6 +535,87 @@ class MasterSuite extends SparkFunSuite schedulingWithEverything(sp

[GitHub] [spark] olaky commented on pull request #36386: [SPARK-38918][SQL][3.2] Nested column pruning should filter out attributes that do not belong to the current relation

2022-06-07 Thread GitBox
olaky commented on PR #36386: URL: https://github.com/apache/spark/pull/36386#issuecomment-1148616709 So the only change in the plan I can see that makes the test fail is that the last plan node has a source filename in it now, for example `Scan parquet default.web_site [web_site_sk,w

[GitHub] [spark] Ngone51 commented on a diff in pull request #36716: [SPARK-39062][CORE] Add stage level resource scheduling support for standalone cluster

2022-06-07 Thread GitBox
Ngone51 commented on code in PR #36716: URL: https://github.com/apache/spark/pull/36716#discussion_r891171201 ## core/src/test/scala/org/apache/spark/deploy/master/MasterSuite.scala: ## @@ -530,6 +535,87 @@ class MasterSuite extends SparkFunSuite schedulingWithEverything(sp

[GitHub] [spark] Ngone51 commented on a diff in pull request #36716: [SPARK-39062][CORE] Add stage level resource scheduling support for standalone cluster

2022-06-07 Thread GitBox
Ngone51 commented on code in PR #36716: URL: https://github.com/apache/spark/pull/36716#discussion_r891166646 ## core/src/main/scala/org/apache/spark/deploy/client/StandaloneAppClient.scala: ## @@ -299,9 +300,10 @@ private[spark] class StandaloneAppClient( * * @return wh

[GitHub] [spark] olaky commented on pull request #36386: [SPARK-38918][SQL][3.2] Nested column pruning should filter out attributes that do not belong to the current relation

2022-06-07 Thread GitBox
olaky commented on PR #36386: URL: https://github.com/apache/spark/pull/36386#issuecomment-1148609532 I am facing the same issues here: https://github.com/apache/spark/pull/36753 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [spark] HeartSaVioR closed pull request #35484: [SPARK-38181][SS][DOCS] Update comments in KafkaDataConsumer.scala

2022-06-07 Thread GitBox
HeartSaVioR closed pull request #35484: [SPARK-38181][SS][DOCS] Update comments in KafkaDataConsumer.scala URL: https://github.com/apache/spark/pull/35484 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to g

[GitHub] [spark] HeartSaVioR commented on pull request #35484: [SPARK-38181][SS][DOCS] Update comments in KafkaDataConsumer.scala

2022-06-07 Thread GitBox
HeartSaVioR commented on PR #35484: URL: https://github.com/apache/spark/pull/35484#issuecomment-1148596916 Thanks! Merging to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] cxzl25 opened a new pull request, #36787: [SPARK-39387][BUILD][FOLLOWUP] Upgrade hive-storage-api to 2.7.3

2022-06-07 Thread GitBox
cxzl25 opened a new pull request, #36787: URL: https://github.com/apache/spark/pull/36787 ### What changes were proposed in this pull request? Add UT, test whether the Overflow of newLength problem is fixed. ### Why are the changes needed? https://github.com/apache/spark/pull

[GitHub] [spark] HeartSaVioR commented on a diff in pull request #36704: [SPARK-39346][SQL] Convert asserts/illegal state exception to internal errors on each phase

2022-06-07 Thread GitBox
HeartSaVioR commented on code in PR #36704: URL: https://github.com/apache/spark/pull/36704#discussion_r891139790 ## connector/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaMicroBatchSourceSuite.scala: ## @@ -666,9 +667,10 @@ abstract class KafkaMicroBatchSour

[GitHub] [spark] gengliangwang commented on a diff in pull request #36703: [SPARK-39321][SQL] Refactor TryCast to use RuntimeReplaceable

2022-06-07 Thread GitBox
gengliangwang commented on code in PR #36703: URL: https://github.com/apache/spark/pull/36703#discussion_r891136404 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala: ## @@ -1792,15 +1792,16 @@ class AstBuilder extends SqlBaseParserBaseVisitor

[GitHub] [spark] ulysses-you commented on pull request #36785: [SPARK-39397][SQL] Relax AliasAwareOutputExpression to support alias with expression

2022-06-07 Thread GitBox
ulysses-you commented on PR #36785: URL: https://github.com/apache/spark/pull/36785#issuecomment-1148578665 cc @cloud-fan @prakharjain09 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the speci

[GitHub] [spark] Ngone51 commented on a diff in pull request #36716: [SPARK-39062][CORE] Add stage level resource scheduling support for standalone cluster

2022-06-07 Thread GitBox
Ngone51 commented on code in PR #36716: URL: https://github.com/apache/spark/pull/36716#discussion_r891129088 ## core/src/test/scala/org/apache/spark/deploy/JsonProtocolSuite.scala: ## @@ -107,11 +107,11 @@ object JsonConstants { |{"id":"id","starttime":3,"name":"name",

[GitHub] [spark] cloud-fan closed pull request #35612: [SPARK-38289][SQL] Refactor SQL CLI exit code to make it more clear

2022-06-07 Thread GitBox
cloud-fan closed pull request #35612: [SPARK-38289][SQL] Refactor SQL CLI exit code to make it more clear URL: https://github.com/apache/spark/pull/35612 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [spark] cloud-fan commented on pull request #35612: [SPARK-38289][SQL] Refactor SQL CLI exit code to make it more clear

2022-06-07 Thread GitBox
cloud-fan commented on PR #35612: URL: https://github.com/apache/spark/pull/35612#issuecomment-1148543123 thanks, merging to master! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific c

[GitHub] [spark] cloud-fan commented on a diff in pull request #36704: [SPARK-39346][SQL] Convert asserts/illegal state exception to internal errors on each phase

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36704: URL: https://github.com/apache/spark/pull/36704#discussion_r891099418 ## connector/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaMicroBatchSourceSuite.scala: ## @@ -666,9 +667,10 @@ abstract class KafkaMicroBatchSource

[GitHub] [spark] AngersZhuuuu commented on pull request #36786: [SPARK-39400][SQL] spark-sql should remove hive resource dir in all case

2022-06-07 Thread GitBox
AngersZhuuuu commented on PR #36786: URL: https://github.com/apache/spark/pull/36786#issuecomment-1148527000 ping @cloud-fan @wangyum -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] AngersZhuuuu opened a new pull request, #36786: [SPARK-39400][SQL] spark-sql should remove hive resource dir in all case

2022-06-07 Thread GitBox
AngersZhuuuu opened a new pull request, #36786: URL: https://github.com/apache/spark/pull/36786 ### What changes were proposed in this pull request? In current code, when we use `spark-sql` `-e`, `-f` or use `ctrl + c` to close `spark-sql` session, will remain hive session resource dir

[GitHub] [spark] MaxGekk commented on pull request #36753: [SPARK-39259][SQL][3.2] Evaluate timestamps consistently in subqueries

2022-06-07 Thread GitBox
MaxGekk commented on PR #36753: URL: https://github.com/apache/spark/pull/36753#issuecomment-1148512960 @olaky The changes cause conflicts in branch-3.1. Could you open PRs w/ backports to 3.1 and 3.0, please? -- This is an automated message from the Apache Git Service. To respond to the messa

[GitHub] [spark] gengliangwang closed pull request #34970: [DO NOT MERGE] investigate test failures if we test ANSI mode in github actions

2022-06-07 Thread GitBox
gengliangwang closed pull request #34970: [DO NOT MERGE] investigate test failures if we test ANSI mode in github actions URL: https://github.com/apache/spark/pull/34970 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

[GitHub] [spark] gengliangwang closed pull request #36174: [SPARK-34659][UI] Fix wrong application ID when reverse proxy URL contains "proxy" or "history"

2022-06-07 Thread GitBox
gengliangwang closed pull request #36174: [SPARK-34659][UI] Fix wrong application ID when reverse proxy URL contains "proxy" or "history" URL: https://github.com/apache/spark/pull/36174 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to Gi

[GitHub] [spark] MaxGekk commented on pull request #36753: [SPARK-39259][SQL][3.2] Evaluate timestamps consistently in subqueries

2022-06-07 Thread GitBox
MaxGekk commented on PR #36753: URL: https://github.com/apache/spark/pull/36753#issuecomment-1148509294 +1, LGTM. Merging to 3.2 and trying to merge to 3.1/3.0. Thank you, @olaky and @JoshRosen @dongjoon-hyun for review. -- This is an automated message from the Apache Git Service. To re

[GitHub] [spark] gengliangwang commented on a diff in pull request #36745: [SPARK-39359][SQL] Restrict DEFAULT columns to allowlist of supported data source types

2022-06-07 Thread GitBox
gengliangwang commented on code in PR #36745: URL: https://github.com/apache/spark/pull/36745#discussion_r891067965 ## sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala: ## @@ -2881,6 +2881,15 @@ object SQLConf { .booleanConf .createWithDefaul

[GitHub] [spark] MaxGekk commented on pull request #36753: [SPARK-39259][SQL][3.2] Evaluate timestamps consistently in subqueries

2022-06-07 Thread GitBox
MaxGekk commented on PR #36753: URL: https://github.com/apache/spark/pull/36753#issuecomment-1148505223 I guess the failure is not related to PR's changes: ``` [info] - check simplified (tpcds-v1.4/q4) *** FAILED *** (945 milliseconds) [info] Plans did not match: ``` -- This

[GitHub] [spark] MaxGekk commented on pull request #36703: [SPARK-39321][SQL] Refactor TryCast to use RuntimeReplaceable

2022-06-07 Thread GitBox
MaxGekk commented on PR #36703: URL: https://github.com/apache/spark/pull/36703#issuecomment-1148502630 @cloud-fan Could you resolve conflicts, please. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to g

[GitHub] [spark] MaxGekk commented on pull request #36780: [SPARK-39392][SQL] Refine ANSI error messages for try_* function hints

2022-06-07 Thread GitBox
MaxGekk commented on PR #36780: URL: https://github.com/apache/spark/pull/36780#issuecomment-1148501765 @vli-databricks Could you backport the changes to branch-3.3, please. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [spark] MaxGekk closed pull request #36780: [SPARK-39392][SQL] Refine ANSI error messages for try_* function hints

2022-06-07 Thread GitBox
MaxGekk closed pull request #36780: [SPARK-39392][SQL] Refine ANSI error messages for try_* function hints URL: https://github.com/apache/spark/pull/36780 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to g

[GitHub] [spark] MaxGekk commented on pull request #36780: [SPARK-39392][SQL] Refine ANSI error messages for try_* function hints

2022-06-07 Thread GitBox
MaxGekk commented on PR #36780: URL: https://github.com/apache/spark/pull/36780#issuecomment-1148499879 +1, LGTM. Merging to master. Thank you, @vli-databricks and @gengliangwang @HyukjinKwon for review. -- This is an automated message from the Apache Git Service. To respond to the mess

[GitHub] [spark] AngersZhuuuu commented on pull request #35612: [SPARK-38289][SQL] Refactor SQL CLI exit code to make it more clear

2022-06-07 Thread GitBox
AngersZhuuuu commented on PR #35612: URL: https://github.com/apache/spark/pull/35612#issuecomment-1148499208 > @AngersZhuuuu can you retrigger the tests? GA passed now -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub
