[GitHub] [spark] wangyum commented on issue #25762: [SPARK-29056] ThriftServerSessionPage displays 1970/01/01 finish and close time when unset
wangyum commented on issue #25762: [SPARK-29056] ThriftServerSessionPage displays 1970/01/01 finish and close time when unset URL: https://github.com/apache/spark/pull/25762#issuecomment-530677333 We did the same logic in [SQLStatsTable](https://github.com/apache/spark/blob/a428f406693f1c372dc0e378f6b413eca9e367ac/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/ui/ThriftServerPage.scala#L92-L93). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
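The guard referred to in this comment can be sketched roughly as follows. This is a minimal illustration rather than the code in `ThriftServerPage.scala`: it assumes an unset finish or close time is stored as `0`, and the object name, method name, date pattern, and UTC time zone are all hypothetical choices made for the example.

```scala
import java.text.SimpleDateFormat
import java.util.{Date, TimeZone}

// Hypothetical helper, not part of Spark: renders a timestamp for a UI table.
object FinishTimeFormat {
  // Assumption: an unset time is represented as 0 (or a negative value).
  def formatTimestamp(millis: Long): String = {
    if (millis > 0) {
      val fmt = new SimpleDateFormat("yyyy/MM/dd HH:mm:ss")
      // Fixed time zone so the output does not depend on the host machine.
      fmt.setTimeZone(TimeZone.getTimeZone("UTC"))
      fmt.format(new Date(millis))
    } else {
      // Without this branch, 0 would be rendered as 1970/01/01.
      ""
    }
  }
}
```

With this guard an unset time renders as an empty cell instead of the Unix epoch date.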
[GitHub] [spark] viirya commented on a change in pull request #25710: [SPARK-29008][SQL] Define an individual method for each common subexpression in HashAggregateExec
viirya commented on a change in pull request #25710: [SPARK-29008][SQL] Define an individual method for each common subexpression in HashAggregateExec
URL: https://github.com/apache/spark/pull/25710#discussion_r323562581

## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/WholeStageCodegenSuite.scala
@@ -419,4 +419,27 @@ class WholeStageCodegenSuite extends QueryTest with SharedSparkSession {
       }
     }
   }
+
+  test("Give up splitting subexpression code if a parameter length goes over the limit") {
+    withSQLConf(
+      SQLConf.CODEGEN_SPLIT_AGGREGATE_FUNC.key -> "false",

Review comment: Must this test be run with CODEGEN_SPLIT_AGGREGATE_FUNC = false?
[GitHub] [spark] JkSelf commented on a change in pull request #25295: [SPARK-28560][SQL] Optimize shuffle reader to local shuffle reader when smj converted to bhj in adaptive execution
JkSelf commented on a change in pull request #25295: [SPARK-28560][SQL] Optimize shuffle reader to local shuffle reader when smj converted to bhj in adaptive execution
URL: https://github.com/apache/spark/pull/25295#discussion_r323562618

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/ReduceNumShufflePartitions.scala
@@ -180,25 +180,45 @@ case class ReduceNumShufflePartitions(conf: SQLConf) extends Rule[SparkPlan] {
 case class CoalescedShuffleReaderExec(
     child: QueryStageExec,
-    partitionStartIndices: Array[Int]) extends UnaryExecNode {
+    partitionStartIndices: Array[Int],
+    var isLocal: Boolean = false) extends UnaryExecNode {

   override def output: Seq[Attribute] = child.output

   override def doCanonicalize(): SparkPlan = child.canonicalized

   override def outputPartitioning: Partitioning = {

Review comment: Yes I need.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25770: [SPARK-29064][CORE] Add PrometheusResource to export Executor metrics
AmplabJenkins removed a comment on issue #25770: [SPARK-29064][CORE] Add PrometheusResource to export Executor metrics URL: https://github.com/apache/spark/pull/25770#issuecomment-530671871 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110500/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25770: [SPARK-29064][CORE] Add PrometheusResource to export Executor metrics
AmplabJenkins removed a comment on issue #25770: [SPARK-29064][CORE] Add PrometheusResource to export Executor metrics URL: https://github.com/apache/spark/pull/25770#issuecomment-530671867 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #25770: [SPARK-29064][CORE] Add PrometheusResource to export Executor metrics
AmplabJenkins commented on issue #25770: [SPARK-29064][CORE] Add PrometheusResource to export Executor metrics URL: https://github.com/apache/spark/pull/25770#issuecomment-530671871 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110500/ Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #25770: [SPARK-29064][CORE] Add PrometheusResource to export Executor metrics
AmplabJenkins commented on issue #25770: [SPARK-29064][CORE] Add PrometheusResource to export Executor metrics URL: https://github.com/apache/spark/pull/25770#issuecomment-530671867 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #25770: [SPARK-29064][CORE] Add PrometheusResource to export Executor metrics
SparkQA removed a comment on issue #25770: [SPARK-29064][CORE] Add PrometheusResource to export Executor metrics URL: https://github.com/apache/spark/pull/25770#issuecomment-530653022 **[Test build #110500 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110500/testReport)** for PR 25770 at commit [`88413eb`](https://github.com/apache/spark/commit/88413eb56b10794c6ad42a754aec1cca50bc0237).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25769: [SPARK-29032][CORE] Add PrometheusServlet to monitor Masster/Worker/Driver
AmplabJenkins removed a comment on issue #25769: [SPARK-29032][CORE] Add PrometheusServlet to monitor Masster/Worker/Driver URL: https://github.com/apache/spark/pull/25769#issuecomment-530671421 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110499/ Test FAILed.
[GitHub] [spark] SparkQA commented on issue #25770: [SPARK-29064][CORE] Add PrometheusResource to export Executor metrics
SparkQA commented on issue #25770: [SPARK-29064][CORE] Add PrometheusResource to export Executor metrics
URL: https://github.com/apache/spark/pull/25770#issuecomment-530671676

**[Test build #110500 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110500/testReport)** for PR 25770 at commit [`88413eb`](https://github.com/apache/spark/commit/88413eb56b10794c6ad42a754aec1cca50bc0237).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] [spark] viirya commented on a change in pull request #25295: [SPARK-28560][SQL] Optimize shuffle reader to local shuffle reader when smj converted to bhj in adaptive execution
viirya commented on a change in pull request #25295: [SPARK-28560][SQL] Optimize shuffle reader to local shuffle reader when smj converted to bhj in adaptive execution
URL: https://github.com/apache/spark/pull/25295#discussion_r323561410

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/ReduceNumShufflePartitions.scala
@@ -180,25 +180,45 @@ case class ReduceNumShufflePartitions(conf: SQLConf) extends Rule[SparkPlan] {
 case class CoalescedShuffleReaderExec(
     child: QueryStageExec,
-    partitionStartIndices: Array[Int]) extends UnaryExecNode {
+    partitionStartIndices: Array[Int],
+    var isLocal: Boolean = false) extends UnaryExecNode {

   override def output: Seq[Attribute] = child.output

   override def doCanonicalize(): SparkPlan = child.canonicalized

   override def outputPartitioning: Partitioning = {

Review comment: Er, don't you rely on checking whether EnsureRequirements introduces an additional shuffle exchange to decide whether to use the local shuffle reader?
[GitHub] [spark] AmplabJenkins commented on issue #25769: [SPARK-29032][CORE] Add PrometheusServlet to monitor Masster/Worker/Driver
AmplabJenkins commented on issue #25769: [SPARK-29032][CORE] Add PrometheusServlet to monitor Masster/Worker/Driver URL: https://github.com/apache/spark/pull/25769#issuecomment-530671416 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25769: [SPARK-29032][CORE] Add PrometheusServlet to monitor Masster/Worker/Driver
AmplabJenkins removed a comment on issue #25769: [SPARK-29032][CORE] Add PrometheusServlet to monitor Masster/Worker/Driver URL: https://github.com/apache/spark/pull/25769#issuecomment-530671416 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #25769: [SPARK-29032][CORE] Add PrometheusServlet to monitor Masster/Worker/Driver
AmplabJenkins commented on issue #25769: [SPARK-29032][CORE] Add PrometheusServlet to monitor Masster/Worker/Driver URL: https://github.com/apache/spark/pull/25769#issuecomment-530671421 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110499/ Test FAILed.
[GitHub] [spark] SparkQA commented on issue #25769: [SPARK-29032][CORE] Add PrometheusServlet to monitor Masster/Worker/Driver
SparkQA commented on issue #25769: [SPARK-29032][CORE] Add PrometheusServlet to monitor Masster/Worker/Driver
URL: https://github.com/apache/spark/pull/25769#issuecomment-530671261

**[Test build #110499 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110499/testReport)** for PR 25769 at commit [`27a9f0c`](https://github.com/apache/spark/commit/27a9f0cecd214fa257ea7cda67123929fad85dc8).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] [spark] planga82 removed a comment on issue #25723: [SPARK-29019][WebUI] Improve tooltip JDBC/ODBC Server tab
planga82 removed a comment on issue #25723: [SPARK-29019][WebUI] Improve tooltip JDBC/ODBC Server tab URL: https://github.com/apache/spark/pull/25723#issuecomment-530656418 retest this please
[GitHub] [spark] SparkQA removed a comment on issue #25769: [SPARK-29032][CORE] Add PrometheusServlet to monitor Masster/Worker/Driver
SparkQA removed a comment on issue #25769: [SPARK-29032][CORE] Add PrometheusServlet to monitor Masster/Worker/Driver URL: https://github.com/apache/spark/pull/25769#issuecomment-530650453 **[Test build #110499 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110499/testReport)** for PR 25769 at commit [`27a9f0c`](https://github.com/apache/spark/commit/27a9f0cecd214fa257ea7cda67123929fad85dc8).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25771: [SPARK-28970][SQL] Implement USE CATALOG/NAMESPACE for Data Source V2
AmplabJenkins removed a comment on issue #25771: [SPARK-28970][SQL] Implement USE CATALOG/NAMESPACE for Data Source V2 URL: https://github.com/apache/spark/pull/25771#issuecomment-530669811 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110501/ Test FAILed.
[GitHub] [spark] SparkQA commented on issue #25771: [SPARK-28970][SQL] Implement USE CATALOG/NAMESPACE for Data Source V2
SparkQA commented on issue #25771: [SPARK-28970][SQL] Implement USE CATALOG/NAMESPACE for Data Source V2
URL: https://github.com/apache/spark/pull/25771#issuecomment-530669790

**[Test build #110501 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110501/testReport)** for PR 25771 at commit [`0d5246d`](https://github.com/apache/spark/commit/0d5246dfdee91b7707513515b2e4686341d8863e).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25771: [SPARK-28970][SQL] Implement USE CATALOG/NAMESPACE for Data Source V2
AmplabJenkins removed a comment on issue #25771: [SPARK-28970][SQL] Implement USE CATALOG/NAMESPACE for Data Source V2 URL: https://github.com/apache/spark/pull/25771#issuecomment-530669805 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #25771: [SPARK-28970][SQL] Implement USE CATALOG/NAMESPACE for Data Source V2
AmplabJenkins commented on issue #25771: [SPARK-28970][SQL] Implement USE CATALOG/NAMESPACE for Data Source V2 URL: https://github.com/apache/spark/pull/25771#issuecomment-530669811 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110501/ Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #25771: [SPARK-28970][SQL] Implement USE CATALOG/NAMESPACE for Data Source V2
SparkQA removed a comment on issue #25771: [SPARK-28970][SQL] Implement USE CATALOG/NAMESPACE for Data Source V2 URL: https://github.com/apache/spark/pull/25771#issuecomment-530665476 **[Test build #110501 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110501/testReport)** for PR 25771 at commit [`0d5246d`](https://github.com/apache/spark/commit/0d5246dfdee91b7707513515b2e4686341d8863e).
[GitHub] [spark] AmplabJenkins commented on issue #25771: [SPARK-28970][SQL] Implement USE CATALOG/NAMESPACE for Data Source V2
AmplabJenkins commented on issue #25771: [SPARK-28970][SQL] Implement USE CATALOG/NAMESPACE for Data Source V2 URL: https://github.com/apache/spark/pull/25771#issuecomment-530669805 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25766: [SPARK-29061][SQL] Prints bytecode statistics in debugCodegen
AmplabJenkins removed a comment on issue #25766: [SPARK-29061][SQL] Prints bytecode statistics in debugCodegen URL: https://github.com/apache/spark/pull/25766#issuecomment-530666109 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25766: [SPARK-29061][SQL] Prints bytecode statistics in debugCodegen
AmplabJenkins removed a comment on issue #25766: [SPARK-29061][SQL] Prints bytecode statistics in debugCodegen URL: https://github.com/apache/spark/pull/25766#issuecomment-530666113 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110493/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25766: [SPARK-29061][SQL] Prints bytecode statistics in debugCodegen
AmplabJenkins commented on issue #25766: [SPARK-29061][SQL] Prints bytecode statistics in debugCodegen URL: https://github.com/apache/spark/pull/25766#issuecomment-530666113 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110493/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25766: [SPARK-29061][SQL] Prints bytecode statistics in debugCodegen
AmplabJenkins commented on issue #25766: [SPARK-29061][SQL] Prints bytecode statistics in debugCodegen URL: https://github.com/apache/spark/pull/25766#issuecomment-530666109 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #25766: [SPARK-29061][SQL] Prints bytecode statistics in debugCodegen
SparkQA removed a comment on issue #25766: [SPARK-29061][SQL] Prints bytecode statistics in debugCodegen URL: https://github.com/apache/spark/pull/25766#issuecomment-530618885 **[Test build #110493 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110493/testReport)** for PR 25766 at commit [`1b27080`](https://github.com/apache/spark/commit/1b27080c79eda70119d92c38015988d0d2dbc5b6).
[GitHub] [spark] SparkQA commented on issue #25766: [SPARK-29061][SQL] Prints bytecode statistics in debugCodegen
SparkQA commented on issue #25766: [SPARK-29061][SQL] Prints bytecode statistics in debugCodegen
URL: https://github.com/apache/spark/pull/25766#issuecomment-530665796

**[Test build #110493 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110493/testReport)** for PR 25766 at commit [`1b27080`](https://github.com/apache/spark/commit/1b27080c79eda70119d92c38015988d0d2dbc5b6).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `case class ByteCodeStats(maxClassCodeSize: Int, maxMethodCodeSize: Int, maxConstPoolSize: Int)`
   * ` * Returns the bytecode statistics (max class bytecode size, max method bytecode size, and`
[GitHub] [spark] AmplabJenkins removed a comment on issue #25771: [SPARK-28970][SQL] Implement USE CATALOG/NAMESPACE for Data Source V2
AmplabJenkins removed a comment on issue #25771: [SPARK-28970][SQL] Implement USE CATALOG/NAMESPACE for Data Source V2 URL: https://github.com/apache/spark/pull/25771#issuecomment-530664917 Can one of the admins verify this patch?
[GitHub] [spark] SparkQA commented on issue #25771: [SPARK-28970][SQL] Implement USE CATALOG/NAMESPACE for Data Source V2
SparkQA commented on issue #25771: [SPARK-28970][SQL] Implement USE CATALOG/NAMESPACE for Data Source V2 URL: https://github.com/apache/spark/pull/25771#issuecomment-530665476 **[Test build #110501 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110501/testReport)** for PR 25771 at commit [`0d5246d`](https://github.com/apache/spark/commit/0d5246dfdee91b7707513515b2e4686341d8863e).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25771: [SPARK-28970][SQL] Implement USE CATALOG/NAMESPACE for Data Source V2
AmplabJenkins removed a comment on issue #25771: [SPARK-28970][SQL] Implement USE CATALOG/NAMESPACE for Data Source V2 URL: https://github.com/apache/spark/pull/25771#issuecomment-530665091 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/15476/ Test PASSed.
[GitHub] [spark] imback82 commented on a change in pull request #25771: [SPARK-28970][SQL] Implement USE CATALOG/NAMESPACE for Data Source V2
imback82 commented on a change in pull request #25771: [SPARK-28970][SQL] Implement USE CATALOG/NAMESPACE for Data Source V2
URL: https://github.com/apache/spark/pull/25771#discussion_r323556054

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2SessionCatalog.scala
@@ -56,7 +56,10 @@ class V2SessionCatalog(catalog: SessionCatalog, conf: SQLConf)
   override def listTables(namespace: Array[String]): Array[Identifier] = {
     namespace match {
       case Array(db) =>
-        catalog.listTables(db).map(ident => Identifier.of(Array(db), ident.table)).toArray
+        catalog
+          .listTables(db)
+          .map(ident => Identifier.of(Array(ident.database.getOrElse("")), ident.table))

Review comment: Matching the behavior of https://github.com/apache/spark/blob/850833fa177ec1f265e143fc383e40ec2c8341a6/sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala#L776
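The difference between the removed and added lines in this diff can be sketched with simplified stand-in types. The sketch assumes that some identifiers returned by the catalog carry no database (temporary views are one plausible case); `TableId`, `Ident`, and both method names are hypothetical and are not Spark classes.

```scala
// Hypothetical stand-ins for the v1 table identifier and the v2 Identifier.
final case class TableId(table: String, database: Option[String])
final case class Ident(namespace: Seq[String], name: String)

object ListTablesSketch {
  // Mirrors the removed line: every result is labelled with the requested database.
  def fromRequestedDb(db: String, tables: Seq[TableId]): Seq[Ident] =
    tables.map(t => Ident(Seq(db), t.table))

  // Mirrors the added lines: each result keeps the database it reports,
  // with an empty string when none is set.
  def fromReportedDb(tables: Seq[TableId]): Seq[Ident] =
    tables.map(t => Ident(Seq(t.database.getOrElse("")), t.table))
}
```

Under that assumption the two versions differ only for entries whose database is `None`: the first attributes them to the requested database, the second leaves their namespace empty.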
[GitHub] [spark] AmplabJenkins commented on issue #25771: [SPARK-28970][SQL] Implement USE CATALOG/NAMESPACE for Data Source V2
AmplabJenkins commented on issue #25771: [SPARK-28970][SQL] Implement USE CATALOG/NAMESPACE for Data Source V2 URL: https://github.com/apache/spark/pull/25771#issuecomment-530665087 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25771: [SPARK-28970][SQL] Implement USE CATALOG/NAMESPACE for Data Source V2
AmplabJenkins removed a comment on issue #25771: [SPARK-28970][SQL] Implement USE CATALOG/NAMESPACE for Data Source V2 URL: https://github.com/apache/spark/pull/25771#issuecomment-530665087 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25771: [SPARK-28970][SQL] Implement USE CATALOG/NAMESPACE for Data Source V2
AmplabJenkins commented on issue #25771: [SPARK-28970][SQL] Implement USE CATALOG/NAMESPACE for Data Source V2 URL: https://github.com/apache/spark/pull/25771#issuecomment-530665091 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/15476/ Test PASSed.
[GitHub] [spark] imback82 commented on a change in pull request #25771: [SPARK-28970][SQL] Implement USE CATALOG/NAMESPACE for Data Source V2
imback82 commented on a change in pull request #25771: [SPARK-28970][SQL] Implement USE CATALOG/NAMESPACE for Data Source V2
URL: https://github.com/apache/spark/pull/25771#discussion_r323555674

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceResolution.scala
## @@ -188,18 +186,10 @@ case class DataSourceResolution(
       }

     case ShowTablesStatement(None, pattern) =>
-      defaultCatalog match {
-        case Some(catalog) =>
-          ShowTables(
-            catalog.asTableCatalog,
-            catalogManager.currentNamespace,
-            pattern)
-        case None =>
-          ShowTablesCommand(None, pattern)
-      }
+      ShowTables(currentCatalog.asTableCatalog, catalogManager.currentNamespace, pattern)

Review comment: If there is no catalog specified, this will not fall back to v1 anymore, since `currentCatalog` always returns a `CatalogPlugin`. This should be OK, right?
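The behavioral difference raised in this comment can be modeled in a few lines of plain Scala. Everything below is a hypothetical sketch: the `Plan` types and the `resolveBefore`/`resolveAfter` names are invented for illustration and catalogs are reduced to strings; only the shape of the two branches comes from the diff.

```scala
object ShowTablesSketch {
  // Hypothetical plan nodes standing in for the v2 ShowTables and v1 ShowTablesCommand.
  sealed trait Plan
  final case class V2ShowTables(catalog: String, namespace: Seq[String]) extends Plan
  case object V1ShowTablesCommand extends Plan

  // Before: the default catalog is optional, so None falls back to the v1 command.
  def resolveBefore(defaultCatalog: Option[String], ns: Seq[String]): Plan =
    defaultCatalog match {
      case Some(c) => V2ShowTables(c, ns)
      case None => V1ShowTablesCommand
    }

  // After: a current catalog always exists, so there is no v1 branch left to take.
  def resolveAfter(currentCatalog: String, ns: Seq[String]): Plan =
    V2ShowTables(currentCatalog, ns)
}
```

In this model the v1 path is reachable only through the `None` case, which is exactly the case the new code removes.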
[GitHub] [spark] AmplabJenkins commented on issue #25771: [SPARK-28970][SQL] Implement USE CATALOG/NAMESPACE for Data Source V2
AmplabJenkins commented on issue #25771: [SPARK-28970][SQL] Implement USE CATALOG/NAMESPACE for Data Source V2 URL: https://github.com/apache/spark/pull/25771#issuecomment-530664917 Can one of the admins verify this patch?
[GitHub] [spark] imback82 commented on issue #25771: [SPARK-28970][SQL] Implement USE CATALOG/NAMESPACE for Data Source V2
imback82 commented on issue #25771: [SPARK-28970][SQL] Implement USE CATALOG/NAMESPACE for Data Source V2 URL: https://github.com/apache/spark/pull/25771#issuecomment-530664381 @cloud-fan / @rdblue This will probably be refactored after https://github.com/apache/spark/pull/25747 is merged, but I wanted to send this out to get some feedback on the usage of current catalog.
[GitHub] [spark] imback82 opened a new pull request #25771: [SPARK-28970][SQL] Implement USE CATALOG/NAMESPACE for Data Source V2
imback82 opened a new pull request #25771: [SPARK-28970][SQL] Implement USE CATALOG/NAMESPACE for Data Source V2
URL: https://github.com/apache/spark/pull/25771

### What changes were proposed in this pull request?
This PR exposes USE CATALOG/USE SQL commands as described in this [SPIP](https://docs.google.com/document/d/1jEcvomPiTc5GtB9F7d2RTVVpMY64Qy7INCA_rFEd9HQ/edit#)

It also exposes `currentCatalog` in `CatalogManager`.

Finally, it changes `SHOW NAMESPACES` and `SHOW TABLES` to use the current catalog if no catalog is specified (instead of the default catalog).

### Why are the changes needed?
There is currently no mechanism to change the current catalog/namespace through SQL commands.

### Does this PR introduce any user-facing change?
Yes, you can perform the following:
```scala
// Sets the current catalog to 'testcat'
spark.sql("USE CATALOG testcat")

// Sets the current catalog to 'testcat' and current namespace to 'ns1.ns2'.
spark.sql("USE ns1.ns2 IN testcat")

// Now, the following will use 'testcat' as the current catalog and 'ns1.ns2' as the current namespace.
spark.sql("SHOW NAMESPACES")
```

### How was this patch tested?
Added new unit tests.
[GitHub] [spark] imback82 commented on issue #25771: [SPARK-28970][SQL] Implement USE CATALOG/NAMESPACE for Data Source V2
imback82 commented on issue #25771: [SPARK-28970][SQL] Implement USE CATALOG/NAMESPACE for Data Source V2 URL: https://github.com/apache/spark/pull/25771#issuecomment-530664131 cc: @cloud-fan @rdblue
[GitHub] [spark] JkSelf commented on a change in pull request #25295: [SPARK-28560][SQL] Optimize shuffle reader to local shuffle reader when smj converted to bhj in adaptive execution
JkSelf commented on a change in pull request #25295: [SPARK-28560][SQL] Optimize shuffle reader to local shuffle reader when smj converted to bhj in adaptive execution
URL: https://github.com/apache/spark/pull/25295#discussion_r323554098

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/ReduceNumShufflePartitions.scala
## @@ -180,25 +180,45 @@ case class ReduceNumShufflePartitions(conf: SQLConf) extends Rule[SparkPlan] {
 case class CoalescedShuffleReaderExec(
     child: QueryStageExec,
-    partitionStartIndices: Array[Int]) extends UnaryExecNode {
+    partitionStartIndices: Array[Int],
+    var isLocal: Boolean = false) extends UnaryExecNode {

   override def output: Seq[Attribute] = child.output

   override def doCanonicalize(): SparkPlan = child.canonicalized

   override def outputPartitioning: Partitioning = {

Review comment: @viirya Maybe we don't need to override `requiredChildDistribution`. The `requiredChildDistribution` of `CoalescedShuffleReaderExec` is `UnspecifiedDistribution` whether `isLocal` is `true` or `false`, so `EnsureRequirements` will not introduce an additional shuffle exchange.
[GitHub] [spark] xianyinxin commented on issue #25626: [SPARK-28892][SQL] Add UPDATE support for DataSource V2
xianyinxin commented on issue #25626: [SPARK-28892][SQL] Add UPDATE support for DataSource V2 URL: https://github.com/apache/spark/pull/25626#issuecomment-530662601 @rdblue, JDBC is the most important use case for this API that I've seen. Building this API for JDBC has not started yet, as the public expressions need to be improved if we plan to do that. Could you explain your concern about this API? Do you feel this is not needed, or that we must implement the "row-based" API first? Any comments or suggestions are welcome.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25710: [SPARK-29008][SQL] Define an individual method for each common subexpression in HashAggregateExec
AmplabJenkins removed a comment on issue #25710: [SPARK-29008][SQL] Define an individual method for each common subexpression in HashAggregateExec URL: https://github.com/apache/spark/pull/25710#issuecomment-530662428 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110495/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25710: [SPARK-29008][SQL] Define an individual method for each common subexpression in HashAggregateExec
AmplabJenkins removed a comment on issue #25710: [SPARK-29008][SQL] Define an individual method for each common subexpression in HashAggregateExec URL: https://github.com/apache/spark/pull/25710#issuecomment-530662426 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25710: [SPARK-29008][SQL] Define an individual method for each common subexpression in HashAggregateExec
AmplabJenkins commented on issue #25710: [SPARK-29008][SQL] Define an individual method for each common subexpression in HashAggregateExec URL: https://github.com/apache/spark/pull/25710#issuecomment-530662428 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110495/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25710: [SPARK-29008][SQL] Define an individual method for each common subexpression in HashAggregateExec
AmplabJenkins commented on issue #25710: [SPARK-29008][SQL] Define an individual method for each common subexpression in HashAggregateExec URL: https://github.com/apache/spark/pull/25710#issuecomment-530662426 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #25710: [SPARK-29008][SQL] Define an individual method for each common subexpression in HashAggregateExec
SparkQA removed a comment on issue #25710: [SPARK-29008][SQL] Define an individual method for each common subexpression in HashAggregateExec URL: https://github.com/apache/spark/pull/25710#issuecomment-530620232 **[Test build #110495 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110495/testReport)** for PR 25710 at commit [`3314954`](https://github.com/apache/spark/commit/3314954406203170fe2fff7ebf20a9e038bc689e).
[GitHub] [spark] SparkQA commented on issue #25710: [SPARK-29008][SQL] Define an individual method for each common subexpression in HashAggregateExec
SparkQA commented on issue #25710: [SPARK-29008][SQL] Define an individual method for each common subexpression in HashAggregateExec URL: https://github.com/apache/spark/pull/25710#issuecomment-530662081 **[Test build #110495 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110495/testReport)** for PR 25710 at commit [`3314954`](https://github.com/apache/spark/commit/3314954406203170fe2fff7ebf20a9e038bc689e). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] gatorsmile commented on issue #25749: [SPARK-29041][PYTHON] Allows createDataFrame to accept bytes as binary type
gatorsmile commented on issue #25749: [SPARK-29041][PYTHON] Allows createDataFrame to accept bytes as binary type URL: https://github.com/apache/spark/pull/25749#issuecomment-530661185 Do we have user-facing documentation about type mapping?
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #25734: [SPARK-28939][SQL][2.4] Propagate SQLConf for plans executed by toRdd
dongjoon-hyun commented on a change in pull request #25734: [SPARK-28939][SQL][2.4] Propagate SQLConf for plans executed by toRdd
URL: https://github.com/apache/spark/pull/25734#discussion_r323551991

## File path: docs/sql-migration-guide-upgrade.md
## @@ -7,6 +7,12 @@ displayTitle: Spark SQL Upgrading Guide
 * Table of contents
 {:toc}

+## Upgrading from Spark SQL 2.4 to 2.4.5
+
+  - Starting from 2.4.5, SQL configurations are effective also when a Dataset is converted to an RDD and its
+    plan is executed due to action on the derived RDD. The previous buggy behavior can be restored setting
+    `spark.sql.legacy.rdd.applyConf` to `false`.

Review comment: Hi, @hvanhovell, @cloud-fan and @gatorsmile. As @mgaido91 asked here, this PR will add this flag only at `branch-2.4`. In this case, is it okay that this config will be **added and deprecated** at `2.4.5` and removed at `3.0.0`? For me, we don't need to add this configuration to `master`. Did I understand correctly?
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #25734: [SPARK-28939][SQL][2.4] Propagate SQLConf for plans executed by toRdd
dongjoon-hyun commented on a change in pull request #25734: [SPARK-28939][SQL][2.4] Propagate SQLConf for plans executed by toRdd
URL: https://github.com/apache/spark/pull/25734#discussion_r323551425

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
## @@ -1298,6 +1298,14 @@ object SQLConf {
       .booleanConf
       .createWithDefault(true)

+  val USE_CONF_ON_RDD_OPERATION =
+    buildConf("spark.sql.legacy.rdd.applyConf")
+      .internal()
+      .doc("When false, SQL configurations are disregarded when operations on a RDD derived from" +
+        " a dataframe are executed. This is the (buggy) behavior up to 2.4.3.")

Review comment: Should this say `up to 2.4.4`?
[GitHub] [spark] HeartSaVioR commented on issue #25753: [SPARK-29046][SQL] Fix NPE in SQLConf.get when active SparkContext is stopping
HeartSaVioR commented on issue #25753: [SPARK-29046][SQL] Fix NPE in SQLConf.get when active SparkContext is stopping URL: https://github.com/apache/spark/pull/25753#issuecomment-530659454 Thanks all for reviewing and merging!
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #25769: [SPARK-29032][CORE] Add PrometheusServlet to monitor Masster/Worker/Driver
dongjoon-hyun commented on a change in pull request #25769: [SPARK-29032][CORE] Add PrometheusServlet to monitor Masster/Worker/Driver
URL: https://github.com/apache/spark/pull/25769#discussion_r323550125

## File path: core/src/main/scala/org/apache/spark/metrics/MetricsSystem.scala
## @@ -201,6 +203,12 @@ private[spark] class MetricsSystem private (
                 classOf[Properties], classOf[MetricRegistry], classOf[SecurityManager])
               .newInstance(kv._2, registry, securityMgr)
             metricsServlet = Some(servlet)
+          } else if (kv._1 == "prometheusServlet") {

Review comment: For this, I'll keep the current form instead of `case`, @srowen ~
[GitHub] [spark] gatorsmile commented on a change in pull request #25768: [SPARK-29063][SQL] Modify fillValue approach to support joined dataframe
gatorsmile commented on a change in pull request #25768: [SPARK-29063][SQL] Modify fillValue approach to support joined dataframe
URL: https://github.com/apache/spark/pull/25768#discussion_r323549161

## File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameNaFunctions.scala
## @@ -497,12 +497,10 @@ final class DataFrameNaFunctions private[sql](df: DataFrame) {
           throw new IllegalArgumentException(s"$targetType is not matched at fillValue")
       }
       // Only fill if the column is part of the cols list.
-      if (typeMatches && cols.exists(col => columnEquals(f.name, col))) {
-        fillCol[T](f, value)
-      } else {
-        df.col(f.name)
-      }
+      typeMatches && cols.exists(col => columnEquals(f.name, col))
+    }.map { col =>
+      (col.name, fillCol[T](col, value))
     }
-    df.select(projections : _*)
+    df.withColumns(fillColumnsInfo.map(_._1), fillColumnsInfo.map(_._2))

Review comment: What is the behavior when `df` has duplicate column names? Also, we need to add test cases to ensure the behavior stays consistent.
[GitHub] [spark] planga82 commented on issue #25723: [SPARK-29019][WebUI] Improve tooltip JDBC/ODBC Server tab
planga82 commented on issue #25723: [SPARK-29019][WebUI] Improve tooltip JDBC/ODBC Server tab URL: https://github.com/apache/spark/pull/25723#issuecomment-530656418 retest this please
[GitHub] [spark] JkSelf commented on a change in pull request #25295: [SPARK-28560][SQL] Optimize shuffle reader to local shuffle reader when smj converted to bhj in adaptive execution
JkSelf commented on a change in pull request #25295: [SPARK-28560][SQL] Optimize shuffle reader to local shuffle reader when smj converted to bhj in adaptive execution
URL: https://github.com/apache/spark/pull/25295#discussion_r323548060

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/LocalShuffledRowRDD.scala
## @@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.adaptive
+
+import org.apache.spark._
+import org.apache.spark.rdd.{RDD, ShuffledRDDPartition}
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.execution.metric.{SQLMetric, SQLShuffleReadMetricsReporter}
+
+/**
+ * This is a specialized version of [[org.apache.spark.sql.execution.ShuffledRowRDD]]. This is used
+ * in Spark SQL adaptive execution when a shuffle join is converted to broadcast join at runtime
+ * because the map output of one input table is small enough for broadcast. This RDD represents the
+ * data of another input table of the join that reads from shuffle. Each partition of the RDD reads
+ * the whole data from just one mapper output locally. So actually there is no data transferred
+ * from the network.
+
+ * This RDD takes a [[ShuffleDependency]] (`dependency`).
+ *
+ * The `dependency` has the parent RDD of this RDD, which represents the dataset before shuffle
+ * (i.e. map output). Elements of this RDD are (partitionId, Row) pairs.
+ * Partition ids should be in the range [0, numPartitions - 1].
+ * `dependency.partitioner.numPartitions` is the number of pre-shuffle partitions. (i.e. the number
+ * of partitions of the map output). The post-shuffle partition number is the same to the parent
+ * RDD's partition number.
+ */
+class LocalShuffledRowRDD(
+    var dependency: ShuffleDependency[Int, InternalRow, InternalRow],
+    metrics: Map[String, SQLMetric],
+    specifiedPartitionStartIndices: Option[Array[Int]] = None,
+    specifiedPartitionEndIndices: Option[Array[Int]] = None)

Review comment: @viirya Currently not. We may need the `specifiedPartitionEndIndices` variable to skip zero-size partitions in a follow-up optimization. I will retain it and use it when creating `LocalShuffledRowRDD` later.
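To make the role of the two index parameters concrete, here is a minimal sketch of how start indices, and optionally end indices, could describe ranges of pre-shuffle partitions. This is an assumption about the intended use, not code from the PR: `PartitionRangeSketch` and `ranges` are hypothetical names, and the rule that a missing end index defaults to the next start index (or to the partition count for the last range) is assumed here for illustration.

```scala
object PartitionRangeSketch {
  // Turn start indices (plus optional explicit end indices) into half-open [start, end) ranges.
  // Assumption: without explicit ends, each range runs up to the next start index,
  // and the last range runs up to numPreShufflePartitions.
  def ranges(
      numPreShufflePartitions: Int,
      startIndices: Array[Int],
      endIndices: Option[Array[Int]] = None): Seq[(Int, Int)] =
    startIndices.indices.map { i =>
      val end = endIndices match {
        case Some(ends) => ends(i)
        case None =>
          if (i + 1 < startIndices.length) startIndices(i + 1) else numPreShufflePartitions
      }
      (startIndices(i), end)
    }
}
```

Under these assumptions, `ranges(5, Array(0, 3), Some(Array(2, 5)))` covers partitions 0 to 1 and 3 to 4 and leaves partition 2 out, which is the kind of gap that explicit end indices would allow for an empty partition.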
[GitHub] [spark] dongjoon-hyun opened a new pull request #25770: [SPARK-29064][CORE] Add PrometheusResource to export Executor metrics
dongjoon-hyun opened a new pull request #25770: [SPARK-29064][CORE] Add PrometheusResource to export Executor metrics URL: https://github.com/apache/spark/pull/25770 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce any user-facing change? ### How was this patch tested?
[GitHub] [spark] AmplabJenkins removed a comment on issue #25770: [SPARK-29064][CORE] Add PrometheusResource to export Executor metrics
AmplabJenkins removed a comment on issue #25770: [SPARK-29064][CORE] Add PrometheusResource to export Executor metrics URL: https://github.com/apache/spark/pull/25770#issuecomment-530652703 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #25770: [SPARK-29064][CORE] Add PrometheusResource to export Executor metrics
SparkQA commented on issue #25770: [SPARK-29064][CORE] Add PrometheusResource to export Executor metrics URL: https://github.com/apache/spark/pull/25770#issuecomment-530653022 **[Test build #110500 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110500/testReport)** for PR 25770 at commit [`88413eb`](https://github.com/apache/spark/commit/88413eb56b10794c6ad42a754aec1cca50bc0237).
[GitHub] [spark] AmplabJenkins commented on issue #25770: [SPARK-29064][CORE] Add PrometheusResource to export Executor metrics
AmplabJenkins commented on issue #25770: [SPARK-29064][CORE] Add PrometheusResource to export Executor metrics URL: https://github.com/apache/spark/pull/25770#issuecomment-530652707 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/15475/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25770: [SPARK-29064][CORE] Add PrometheusResource to export Executor metrics
AmplabJenkins removed a comment on issue #25770: [SPARK-29064][CORE] Add PrometheusResource to export Executor metrics URL: https://github.com/apache/spark/pull/25770#issuecomment-530652707 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/15475/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25770: [SPARK-29064][CORE] Add PrometheusResource to export Executor metrics
AmplabJenkins commented on issue #25770: [SPARK-29064][CORE] Add PrometheusResource to export Executor metrics URL: https://github.com/apache/spark/pull/25770#issuecomment-530652703 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25769: [SPARK-29032][CORE] Add PrometheusServlet to monitor Master/Worker/Driver
AmplabJenkins removed a comment on issue #25769: [SPARK-29032][CORE] Add PrometheusServlet to monitor Master/Worker/Driver URL: https://github.com/apache/spark/pull/25769#issuecomment-530650100 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25767: [SPARK-29062][SQL] Add V1_BATCH_WRITE to the TableCapabilityChecks
AmplabJenkins removed a comment on issue #25767: [SPARK-29062][SQL] Add V1_BATCH_WRITE to the TableCapabilityChecks URL: https://github.com/apache/spark/pull/25767#issuecomment-530650336 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110494/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25769: [SPARK-29032][CORE] Add PrometheusServlet to monitor Master/Worker/Driver
AmplabJenkins removed a comment on issue #25769: [SPARK-29032][CORE] Add PrometheusServlet to monitor Master/Worker/Driver URL: https://github.com/apache/spark/pull/25769#issuecomment-530650105 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/15474/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #25769: [SPARK-29032][CORE] Add PrometheusServlet to monitor Master/Worker/Driver
SparkQA commented on issue #25769: [SPARK-29032][CORE] Add PrometheusServlet to monitor Master/Worker/Driver URL: https://github.com/apache/spark/pull/25769#issuecomment-530650453 **[Test build #110499 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110499/testReport)** for PR 25769 at commit [`27a9f0c`](https://github.com/apache/spark/commit/27a9f0cecd214fa257ea7cda67123929fad85dc8).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25767: [SPARK-29062][SQL] Add V1_BATCH_WRITE to the TableCapabilityChecks
AmplabJenkins removed a comment on issue #25767: [SPARK-29062][SQL] Add V1_BATCH_WRITE to the TableCapabilityChecks URL: https://github.com/apache/spark/pull/25767#issuecomment-530650331 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #25767: [SPARK-29062][SQL] Add V1_BATCH_WRITE to the TableCapabilityChecks
AmplabJenkins commented on issue #25767: [SPARK-29062][SQL] Add V1_BATCH_WRITE to the TableCapabilityChecks URL: https://github.com/apache/spark/pull/25767#issuecomment-530650336 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110494/ Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #25767: [SPARK-29062][SQL] Add V1_BATCH_WRITE to the TableCapabilityChecks
SparkQA removed a comment on issue #25767: [SPARK-29062][SQL] Add V1_BATCH_WRITE to the TableCapabilityChecks URL: https://github.com/apache/spark/pull/25767#issuecomment-530620211 **[Test build #110494 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110494/testReport)** for PR 25767 at commit [`a3f87bf`](https://github.com/apache/spark/commit/a3f87bf69968a7f6ebe4fd75e0f904c9db024db2).
[GitHub] [spark] AmplabJenkins removed a comment on issue #23952: [SPARK-26929][SQL]fix table owner use user instead of principal when create table through spark-sql or beeline
AmplabJenkins removed a comment on issue #23952: [SPARK-26929][SQL]fix table owner use user instead of principal when create table through spark-sql or beeline URL: https://github.com/apache/spark/pull/23952#issuecomment-530650162 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110496/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #23952: [SPARK-26929][SQL]fix table owner use user instead of principal when create table through spark-sql or beeline
AmplabJenkins removed a comment on issue #23952: [SPARK-26929][SQL]fix table owner use user instead of principal when create table through spark-sql or beeline URL: https://github.com/apache/spark/pull/23952#issuecomment-530650160 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25767: [SPARK-29062][SQL] Add V1_BATCH_WRITE to the TableCapabilityChecks
AmplabJenkins commented on issue #25767: [SPARK-29062][SQL] Add V1_BATCH_WRITE to the TableCapabilityChecks URL: https://github.com/apache/spark/pull/25767#issuecomment-530650331 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on issue #25767: [SPARK-29062][SQL] Add V1_BATCH_WRITE to the TableCapabilityChecks
SparkQA commented on issue #25767: [SPARK-29062][SQL] Add V1_BATCH_WRITE to the TableCapabilityChecks URL: https://github.com/apache/spark/pull/25767#issuecomment-530650187 **[Test build #110494 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110494/testReport)** for PR 25767 at commit [`a3f87bf`](https://github.com/apache/spark/commit/a3f87bf69968a7f6ebe4fd75e0f904c9db024db2). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on issue #25769: [SPARK-29032][CORE] Add PrometheusServlet to monitor Master/Worker/Driver
AmplabJenkins commented on issue #25769: [SPARK-29032][CORE] Add PrometheusServlet to monitor Master/Worker/Driver URL: https://github.com/apache/spark/pull/25769#issuecomment-530650100 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #23952: [SPARK-26929][SQL]fix table owner use user instead of principal when create table through spark-sql or beeline
AmplabJenkins commented on issue #23952: [SPARK-26929][SQL]fix table owner use user instead of principal when create table through spark-sql or beeline URL: https://github.com/apache/spark/pull/23952#issuecomment-530650162 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110496/ Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #23952: [SPARK-26929][SQL]fix table owner use user instead of principal when create table through spark-sql or beeline
SparkQA removed a comment on issue #23952: [SPARK-26929][SQL]fix table owner use user instead of principal when create table through spark-sql or beeline URL: https://github.com/apache/spark/pull/23952#issuecomment-530630631 **[Test build #110496 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110496/testReport)** for PR 23952 at commit [`2f37513`](https://github.com/apache/spark/commit/2f37513969306b89eb9f7650bc3d082b44705114).
[GitHub] [spark] AmplabJenkins commented on issue #25769: [SPARK-29032][CORE] Add PrometheusServlet to monitor Master/Worker/Driver
AmplabJenkins commented on issue #25769: [SPARK-29032][CORE] Add PrometheusServlet to monitor Master/Worker/Driver URL: https://github.com/apache/spark/pull/25769#issuecomment-530650105 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/15474/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #23952: [SPARK-26929][SQL]fix table owner use user instead of principal when create table through spark-sql or beeline
AmplabJenkins commented on issue #23952: [SPARK-26929][SQL]fix table owner use user instead of principal when create table through spark-sql or beeline URL: https://github.com/apache/spark/pull/23952#issuecomment-530650160 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #23952: [SPARK-26929][SQL]fix table owner use user instead of principal when create table through spark-sql or beeline
SparkQA commented on issue #23952: [SPARK-26929][SQL]fix table owner use user instead of principal when create table through spark-sql or beeline URL: https://github.com/apache/spark/pull/23952#issuecomment-530649890 **[Test build #110496 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110496/testReport)** for PR 23952 at commit [`2f37513`](https://github.com/apache/spark/commit/2f37513969306b89eb9f7650bc3d082b44705114). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] dongjoon-hyun opened a new pull request #25769: [SPARK-29032][CORE] Add PrometheusServlet to monitor Master/Worker/Driver
dongjoon-hyun opened a new pull request #25769: [SPARK-29032][CORE] Add PrometheusServlet to monitor Master/Worker/Driver
URL: https://github.com/apache/spark/pull/25769

### What changes were proposed in this pull request?

This PR aims to simplify `Prometheus` support by adding `PrometheusServlet`. The main use cases are `K8s` and `Spark Standalone` cluster environments.

### Why are the changes needed?

Prometheus.io is a CNCF project used widely with K8s.
- https://github.com/prometheus/prometheus

For `Master/Worker/Driver`, the combination of `Spark JMX Sink` and `Prometheus JMX Converter` is used in many cases. This PR supports Prometheus natively for a better UX.

### Does this PR introduce any user-facing change?

Yes. New web interfaces are added along with the existing JSON API.

| | JSON End Point | Prometheus End Point |
| --- | --- | --- |
| Master | /metrics/master/json/ | /metrics/master/prometheus/ |
| Master | /metrics/applications/json/ | /metrics/applications/prometheus/ |
| Worker | /metrics/json/ | /metrics/prometheus/ |
| Driver | /metrics/json/ | /metrics/prometheus/ |

```
$ bin/spark-shell
...
Spark context Web UI available at http://localhost:4040
...
```

```
$ curl --silent http://localhost:4040/metrics/prometheus/ | head -n5
metrics_local_1568101220707_driver_BlockManager_disk_diskSpaceUsed_MB_Value 0
metrics_local_1568101220707_driver_BlockManager_memory_maxMem_MB_Value 366
metrics_local_1568101220707_driver_BlockManager_memory_maxOffHeapMem_MB_Value 0
metrics_local_1568101220707_driver_BlockManager_memory_maxOnHeapMem_MB_Value 366
metrics_local_1568101220707_driver_BlockManager_memory_memUsed_MB_Value 0
```

### How was this patch tested?

Pass Jenkins with the updated UTs and manually connect to the new endpoints with `curl`. Or, run `prometheus --config.file=config.yaml` with the following configuration and check the results in the Prometheus UI.

**config.yaml**
```yaml
global:
  scrape_interval: 5s
  evaluation_interval: 15s
  external_labels:
    monitor: 'codelab-monitor'
rule_files:
scrape_configs:
  - job_name: 'spark-master'
    metrics_path: '/metrics/master/prometheus/'
    static_configs:
      - targets: ['localhost:8080']
  - job_name: 'spark-applications'
    metrics_path: '/metrics/applications/prometheus/'
    static_configs:
      - targets: ['localhost:8080']
  - job_name: 'spark-worker'
    metrics_path: '/metrics/prometheus/'
    static_configs:
      - targets: ['localhost:8081']
  - job_name: 'spark-driver'
    metrics_path: '/metrics/prometheus/'
    static_configs:
      - targets: ['localhost:4040']
```
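The `curl` sample in the SPARK-29032 description shows one `name value` pair per line. As a rough illustration of how a client might read such a response, here is a minimal Scala sketch. `PrometheusTextSketch` and `parseSamples` are hypothetical names, not part of Spark or of the pull request, and the parser is an assumption-laden simplification: it only handles unlabeled `name value` lines and silently skips comments, blank lines, and anything it cannot parse as a number.

```scala
import scala.util.Try

object PrometheusTextSketch {
  // Matches a line made of exactly two whitespace-separated tokens.
  private val Sample = """(\S+)\s+(\S+)""".r

  // Hypothetical helper: turns "metric_name value" lines into a map.
  // Lines starting with '#', blank lines, and non-numeric values are dropped.
  def parseSamples(body: String): Map[String, Double] =
    body.split("\n").toSeq
      .map(_.trim)
      .collect {
        case Sample(name, value) if !name.startsWith("#") =>
          (name, Try(value.toDouble).toOption)
      }
      .collect { case (name, Some(v)) => name -> v }
      .toMap

  def main(args: Array[String]): Unit = {
    // Two lines copied from the curl output in the pull request description.
    val body =
      """metrics_local_1568101220707_driver_BlockManager_disk_diskSpaceUsed_MB_Value 0
        |metrics_local_1568101220707_driver_BlockManager_memory_maxMem_MB_Value 366
        |""".stripMargin
    parseSamples(body).foreach { case (name, value) => println(s"$name -> $value") }
  }
}
```

In practice the Prometheus server does this scraping itself, as the `config.yaml` in the description shows; the sketch is only meant to make the shape of the endpoint's output concrete.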
[GitHub] [spark] AmplabJenkins removed a comment on issue #25768: [SPARK-29063][SQL] Modify fillValue approach to support joined dataframe
AmplabJenkins removed a comment on issue #25768: [SPARK-29063][SQL] Modify fillValue approach to support joined dataframe URL: https://github.com/apache/spark/pull/25768#issuecomment-530648607 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25768: [SPARK-29063][SQL] Modify fillValue approach to support joined dataframe
AmplabJenkins removed a comment on issue #25768: [SPARK-29063][SQL] Modify fillValue approach to support joined dataframe URL: https://github.com/apache/spark/pull/25768#issuecomment-530648612 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/15473/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #25768: [SPARK-29063][SQL] Modify fillValue approach to support joined dataframe
SparkQA commented on issue #25768: [SPARK-29063][SQL] Modify fillValue approach to support joined dataframe URL: https://github.com/apache/spark/pull/25768#issuecomment-530648993 **[Test build #110498 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110498/testReport)** for PR 25768 at commit [`3602807`](https://github.com/apache/spark/commit/36028079d6bdd413a306c69abdb7af515e78cc1e).
[GitHub] [spark] xuanyuanking commented on issue #25768: [SPARK-29063][SQL] Modify fillValue approach to support joined dataframe
xuanyuanking commented on issue #25768: [SPARK-29063][SQL] Modify fillValue approach to support joined dataframe URL: https://github.com/apache/spark/pull/25768#issuecomment-530648532 cc @gatorsmile
[GitHub] [spark] AmplabJenkins commented on issue #25768: [SPARK-29063][SQL] Modify fillValue approach to support joined dataframe
AmplabJenkins commented on issue #25768: [SPARK-29063][SQL] Modify fillValue approach to support joined dataframe URL: https://github.com/apache/spark/pull/25768#issuecomment-530648612 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/15473/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25768: [SPARK-29063][SQL] Modify fillValue approach to support joined dataframe
AmplabJenkins commented on issue #25768: [SPARK-29063][SQL] Modify fillValue approach to support joined dataframe URL: https://github.com/apache/spark/pull/25768#issuecomment-530648607 Merged build finished. Test PASSed.
[GitHub] [spark] xuanyuanking opened a new pull request #25768: [SPARK-29063][SQL] Modify fillValue approach to support joined dataframe
xuanyuanking opened a new pull request #25768: [SPARK-29063][SQL] Modify fillValue approach to support joined dataframe
URL: https://github.com/apache/spark/pull/25768

### What changes were proposed in this pull request?

Modify the approach in `DataFrameNaFunctions.fillValue`: the new one uses `df.withColumns`, which only addresses the columns that need to be filled. After this change, no ambiguous fields are detected for a joined dataframe.

### Why are the changes needed?

Before this change, when a joined table has the same field name from both original tables, fillna fails even if you specify a subset that does not include the 'ambiguous' fields.

```
scala> val df1 = Seq(("f1-1", "f2", null), ("f1-2", null, null), ("f1-3", "f2", "f3-1"), ("f1-4", "f2", "f3-1")).toDF("f1", "f2", "f3")
scala> val df2 = Seq(("f1-1", null, null), ("f1-2", "f2", null), ("f1-3", "f2", "f4-1")).toDF("f1", "f2", "f4")
scala> val df_join = df1.alias("df1").join(df2.alias("df2"), Seq("f1"), joinType="left_outer")
scala> df_join.na.fill("", cols=Seq("f4"))
org.apache.spark.sql.AnalysisException: Reference 'f2' is ambiguous, could be: df1.f2, df2.f2.;
```

### Does this PR introduce any user-facing change?

Yes, the fillna operation will pass and give the right answer for a joined table.

### How was this patch tested?

Local test and newly added UT.
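To make the idea in SPARK-29063 more concrete, the following is a minimal sketch of filling a single column on a joined dataframe without touching the duplicated `f2` columns. It is not the pull request's implementation: `FillJoinedColumnSketch` and `fillOnly` are hypothetical names, and the sketch uses the public `withColumn` and `coalesce` APIs rather than the internal `withColumns` the description mentions. It assumes Spark SQL is on the classpath and that only the target column (`f4` here) is resolved by name, which is why the ambiguous `f2` should not be looked up.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{coalesce, col, lit}

object FillJoinedColumnSketch {
  // Hypothetical helper: replaces nulls in one string column with `value`.
  // Only `column` is referenced by name; all other columns pass through.
  def fillOnly(df: DataFrame, column: String, value: String): DataFrame =
    df.withColumn(column, coalesce(col(column), lit(value)))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("fill-joined-column-sketch")
      .getOrCreate()
    import spark.implicits._

    // Same data as in the pull request description.
    val df1 = Seq[(String, String, String)](
      ("f1-1", "f2", null), ("f1-2", null, null),
      ("f1-3", "f2", "f3-1"), ("f1-4", "f2", "f3-1")
    ).toDF("f1", "f2", "f3")
    val df2 = Seq[(String, String, String)](
      ("f1-1", null, null), ("f1-2", "f2", null), ("f1-3", "f2", "f4-1")
    ).toDF("f1", "f2", "f4")

    // The join leaves two columns named f2; f4 stays unique.
    val joined = df1.join(df2, Seq("f1"), "left_outer")
    fillOnly(joined, "f4", "").select("f1", "f4").show()

    spark.stop()
  }
}
```

On Spark versions without the fix, a helper along these lines may serve as a workaround when `na.fill` with a column subset raises the ambiguity error shown in the description.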
[GitHub] [spark] dongjoon-hyun commented on issue #25741: [SPARK-29032][CORE] Simplify Prometheus support by adding PrometheusServlet/Resource
dongjoon-hyun commented on issue #25741: [SPARK-29032][CORE] Simplify Prometheus support by adding PrometheusServlet/Resource URL: https://github.com/apache/spark/pull/25741#issuecomment-530646672 Sorry for this change~
[GitHub] [spark] dongjoon-hyun commented on issue #25741: [SPARK-29032][CORE] Simplify Prometheus support by adding PrometheusServlet/Resource
dongjoon-hyun commented on issue #25741: [SPARK-29032][CORE] Simplify Prometheus support by adding PrometheusServlet/Resource URL: https://github.com/apache/spark/pull/25741#issuecomment-530646516 Although this is not a long PR, I decided to split it into two PRs because `PrometheusResource` is a new feature aligned with SPARK-23429 (which was added in 3.0.0). That will make the review easier.
[GitHub] [spark] dongjoon-hyun closed pull request #25741: [SPARK-29032][CORE] Simplify Prometheus support by adding PrometheusServlet/Resource
dongjoon-hyun closed pull request #25741: [SPARK-29032][CORE] Simplify Prometheus support by adding PrometheusServlet/Resource URL: https://github.com/apache/spark/pull/25741
[GitHub] [spark] AmplabJenkins removed a comment on issue #25758: [SPARK-28856][FOLLOW-UP][SQL][TEST] Add the `namespaces` keyword to TableIdentifierParserSuite
AmplabJenkins removed a comment on issue #25758: [SPARK-28856][FOLLOW-UP][SQL][TEST] Add the `namespaces` keyword to TableIdentifierParserSuite URL: https://github.com/apache/spark/pull/25758#issuecomment-530641831 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25758: [SPARK-28856][FOLLOW-UP][SQL][TEST] Add the `namespaces` keyword to TableIdentifierParserSuite
AmplabJenkins removed a comment on issue #25758: [SPARK-28856][FOLLOW-UP][SQL][TEST] Add the `namespaces` keyword to TableIdentifierParserSuite URL: https://github.com/apache/spark/pull/25758#issuecomment-530641832 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/15472/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25758: [SPARK-28856][FOLLOW-UP][SQL][TEST] Add the `namespaces` keyword to TableIdentifierParserSuite
AmplabJenkins commented on issue #25758: [SPARK-28856][FOLLOW-UP][SQL][TEST] Add the `namespaces` keyword to TableIdentifierParserSuite URL: https://github.com/apache/spark/pull/25758#issuecomment-530641831 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25758: [SPARK-28856][FOLLOW-UP][SQL][TEST] Add the `namespaces` keyword to TableIdentifierParserSuite
AmplabJenkins commented on issue #25758: [SPARK-28856][FOLLOW-UP][SQL][TEST] Add the `namespaces` keyword to TableIdentifierParserSuite URL: https://github.com/apache/spark/pull/25758#issuecomment-530641832 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/15472/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #25758: [SPARK-28856][FOLLOW-UP][SQL][TEST] Add the `namespaces` keyword to TableIdentifierParserSuite
SparkQA commented on issue #25758: [SPARK-28856][FOLLOW-UP][SQL][TEST] Add the `namespaces` keyword to TableIdentifierParserSuite URL: https://github.com/apache/spark/pull/25758#issuecomment-530640733 **[Test build #110497 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110497/testReport)** for PR 25758 at commit [`e9a78f8`](https://github.com/apache/spark/commit/e9a78f8bd736497a2d9c2b90de5ccdc8781e0bf2).
[GitHub] [spark] JkSelf commented on issue #25295: [SPARK-28560][SQL] Optimize shuffle reader to local shuffle reader when smj converted to bhj in adaptive execution
JkSelf commented on issue #25295: [SPARK-28560][SQL] Optimize shuffle reader to local shuffle reader when smj converted to bhj in adaptive execution URL: https://github.com/apache/spark/pull/25295#issuecomment-530640464 @cloud-fan The specific `ShuffleRDD` is implemented by reading all the data of one mapper's output locally, which ensures that no data is transferred over the network.
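The difference between a regular shuffle read and the local read described in the comment above can be sketched roughly as follows. This is a toy model only: `ShuffleBlock`, `readByReducer`, and `readByMapper` are hypothetical names invented for illustration and are not Spark classes or the code under review in the pull request. It assumes each map task writes one block per reduce partition.

```scala
// Hypothetical model, not Spark's actual classes: a shuffle block is identified
// by the map task that wrote it and the reduce partition it is destined for.
final case class ShuffleBlock(mapId: Int, reduceId: Int, rows: Seq[String])

object LocalShuffleReaderSketch {
  // Regular reader: one output partition per reduceId. Each partition gathers
  // its block from every mapper, which in a cluster usually means remote fetches.
  def readByReducer(blocks: Seq[ShuffleBlock]): Map[Int, Seq[String]] =
    blocks.groupBy(_.reduceId).map { case (reduceId, bs) =>
      reduceId -> bs.sortBy(_.mapId).flatMap(_.rows)
    }

  // Local reader: one output partition per mapId. Each partition reads the whole
  // output of a single mapper, so a task scheduled next to that output needs no
  // data from other nodes.
  def readByMapper(blocks: Seq[ShuffleBlock]): Map[Int, Seq[String]] =
    blocks.groupBy(_.mapId).map { case (mapId, bs) =>
      mapId -> bs.sortBy(_.reduceId).flatMap(_.rows)
    }

  // Two mappers, two reduce partitions, one row per block.
  val blocks: Seq[ShuffleBlock] = Seq(
    ShuffleBlock(0, 0, Seq("a")), ShuffleBlock(0, 1, Seq("b")),
    ShuffleBlock(1, 0, Seq("c")), ShuffleBlock(1, 1, Seq("d")))
}
```

In this model the local read gives up the grouping by key that a reducer-side read provides, which is only acceptable when the downstream operator no longer needs it, as with a broadcast hash join.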
[GitHub] [spark] JkSelf commented on a change in pull request #25295: [SPARK-28560][SQL] Optimize shuffle reader to local shuffle reader when smj converted to bhj in adaptive execution
JkSelf commented on a change in pull request #25295: [SPARK-28560][SQL] Optimize shuffle reader to local shuffle reader when smj converted to bhj in adaptive execution URL: https://github.com/apache/spark/pull/25295#discussion_r323535120 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/ReduceNumShufflePartitions.scala ## @@ -180,25 +180,45 @@ case class ReduceNumShufflePartitions(conf: SQLConf) extends Rule[SparkPlan] { case class CoalescedShuffleReaderExec( child: QueryStageExec, -partitionStartIndices: Array[Int]) extends UnaryExecNode { +partitionStartIndices: Array[Int], +var isLocal: Boolean = false) extends UnaryExecNode { Review comment: `If we change the shuffle to local shuffle reader, then the partitions become pre-shuffle partitions, and their data size is different.` @cloud-fan Here the local shuffle reader still optimizes the post-shuffle partitions, and I don't understand why the partitions would become pre-shuffle partitions.
[GitHub] [spark] wangyum commented on issue #25758: [SPARK-28856][FOLLOW-UP][SQL][TEST] Add the `namespaces` keyword to TableIdentifierParserSuite
wangyum commented on issue #25758: [SPARK-28856][FOLLOW-UP][SQL][TEST] Add the `namespaces` keyword to TableIdentifierParserSuite URL: https://github.com/apache/spark/pull/25758#issuecomment-530640285 retest this please
[GitHub] [spark] HyukjinKwon closed pull request #25753: [SPARK-29046][SQL] Fix NPE in SQLConf.get when active SparkContext is stopping
HyukjinKwon closed pull request #25753: [SPARK-29046][SQL] Fix NPE in SQLConf.get when active SparkContext is stopping URL: https://github.com/apache/spark/pull/25753