[GitHub] [spark] AmplabJenkins removed a comment on pull request #34070: [SPARK-36840][SQL] Support DPP if there is no selective predicate on the filtering side
AmplabJenkins removed a comment on pull request #34070: URL: https://github.com/apache/spark/pull/34070#issuecomment-974610838 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145473/ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34673: [WIP][SPARK-37343][SQL] Implement createIndex, IndexExists and dropIndex in JDBC (Postgres dialect)
AmplabJenkins removed a comment on pull request #34673: URL: https://github.com/apache/spark/pull/34673#issuecomment-974610792 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49947/
[GitHub] [spark] AmplabJenkins commented on pull request #34070: [SPARK-36840][SQL] Support DPP if there is no selective predicate on the filtering side
AmplabJenkins commented on pull request #34070: URL: https://github.com/apache/spark/pull/34070#issuecomment-974610838 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145473/
[GitHub] [spark] AmplabJenkins commented on pull request #34673: [WIP][SPARK-37343][SQL] Implement createIndex, IndexExists and dropIndex in JDBC (Postgres dialect)
AmplabJenkins commented on pull request #34673: URL: https://github.com/apache/spark/pull/34673#issuecomment-974610792 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49947/
[GitHub] [spark] SparkQA commented on pull request #34070: [SPARK-36840][SQL] Support DPP if there is no selective predicate on the filtering side
SparkQA commented on pull request #34070: URL: https://github.com/apache/spark/pull/34070#issuecomment-974610647 **[Test build #145473 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145473/testReport)** for PR 34070 at commit [`50484e2`](https://github.com/apache/spark/commit/50484e23c714e06057993757efa20034a261d2bf).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on pull request #34070: [SPARK-36840][SQL] Support DPP if there is no selective predicate on the filtering side
SparkQA removed a comment on pull request #34070: URL: https://github.com/apache/spark/pull/34070#issuecomment-974582054 **[Test build #145473 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145473/testReport)** for PR 34070 at commit [`50484e2`](https://github.com/apache/spark/commit/50484e23c714e06057993757efa20034a261d2bf).
[GitHub] [spark] SparkQA commented on pull request #34673: [WIP][SPARK-37343][SQL] Implement createIndex, IndexExists and dropIndex in JDBC (Postgres dialect)
SparkQA commented on pull request #34673: URL: https://github.com/apache/spark/pull/34673#issuecomment-974609349 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49947/
[GitHub] [spark] SparkQA commented on pull request #34673: [WIP][SPARK-37343][SQL] Implement createIndex, IndexExists and dropIndex in JDBC (Postgres dialect)
SparkQA commented on pull request #34673: URL: https://github.com/apache/spark/pull/34673#issuecomment-974603469 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49947/
[GitHub] [spark] dchvn commented on pull request #34673: [WIP][SPARK-37343][SQL] Implement createIndex, IndexExists and dropIndex in JDBC (Postgres dialect)
dchvn commented on pull request #34673: URL: https://github.com/apache/spark/pull/34673#issuecomment-974602693 CC @huaxingao FYI. Please take a look when you find some time, thanks!
[GitHub] [spark] dchvn commented on a change in pull request #34673: [WIP][SPARK-37343][SQL] Implement createIndex, IndexExists and dropIndex in JDBC (Postgres dialect)
dchvn commented on a change in pull request #34673:
URL: https://github.com/apache/spark/pull/34673#discussion_r753640549

## File path: sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala ##
@@ -164,4 +172,78 @@ private object PostgresDialect extends JdbcDialect {
       s"TABLESAMPLE BERNOULLI" +
         s" (${(sample.upperBound - sample.lowerBound) * 100}) REPEATABLE (${sample.seed})"
   }
+
+  // CREATE INDEX syntax
+  // https://www.postgresql.org/docs/14/sql-createindex.html
+  override def createIndex(
+      indexName: String,
+      tableName: String,
+      columns: Array[NamedReference],
+      columnsProperties: util.Map[NamedReference, util.Map[String, String]],
+      properties: util.Map[String, String]): String = {
+    val columnList = columns.map(col => quoteIdentifier(col.fieldNames.head))
+    var indexPropertiesStr: String = ""
+    var hasIndexProperties: Boolean = false
+    var indexType = ""
+
+    if (!properties.isEmpty) {
+      var indexPropertyList: Array[String] = Array.empty
+      properties.asScala.foreach { case (k, v) =>
+        if (k.equals(SupportsIndex.PROP_TYPE)) {
+          if (v.equalsIgnoreCase("BTREE") || v.equalsIgnoreCase("HASH")) {
+            indexType = s"USING $v"
+          } else {
+            throw new UnsupportedOperationException(s"Index Type $v is not supported."
+              + " The supported Index Types are: BTREE and HASH")
+          }
+        } else {
+          hasIndexProperties = true
+          indexPropertyList = indexPropertyList :+ s"$k = $v"
+        }
+      }
+      if (hasIndexProperties) {
+        indexPropertiesStr += "WITH (" + indexPropertyList.mkString(", ") + ")"
+      }
+    }
+
+    s"CREATE INDEX ${quoteIdentifier(indexName)} ON ${quoteIdentifier(tableName)}" +
+      s" $indexType (${columnList.mkString(", ")}) $indexPropertiesStr"
+  }
+
+  // SHOW INDEX syntax
+  // https://www.postgresql.org/docs/14/view-pg-indexes.html
+  override def indexExists(
+      conn: Connection,
+      indexName: String,
+      tableName: String,
+      options: JDBCOptions): Boolean = {
+    val sql = s"SELECT * FROM pg_indexes WHERE tablename = '$tableName'"
+    try {
+      JdbcUtils.checkIfIndexExists(conn, indexName, sql, "indexname", options)
+    } catch {
+      case _: Exception =>
+        logWarning("Cannot retrieved index info.")
+        false
+    }
+  }
+
+  // DROP INDEX syntax
+  // https://www.postgresql.org/docs/14/sql-dropindex.html
+  override def dropIndex(indexName: String, tableName: String): String = {
+    s"DROP INDEX ${quoteIdentifier(indexName)}"
+  }
+
+  override def classifyException(message: String, e: Throwable): AnalysisException = {
+    e match {
+      case sqlException: SQLException =>
+        sqlException.getSQLState match {
+          // https://www.postgresql.org/docs/14/errcodes-appendix.html
+          case "42P07" => throw new IndexAlreadyExistsException(message, cause = Some(e))
+          case "42704" => throw new NoSuchIndexException(message, cause = Some(e))

Review comment:
I wanted to classify the exception by its Postgres error code, but `getErrorCode` returns `0`, so I match on `getSQLState` instead.
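The `getErrorCode` versus `getSQLState` point in the review comment above can be illustrated with a small standalone sketch. It assumes nothing about Spark: `IndexAlreadyExists`, `NoSuchIndex`, and `SqlStateClassifier` are hypothetical stand-ins defined only so the snippet runs on its own, and the exception is constructed by hand rather than raised by a real Postgres driver.

```scala
import java.sql.SQLException

// Hypothetical stand-ins for Spark's index exceptions (not the real classes).
final class IndexAlreadyExists(message: String, cause: Throwable)
  extends Exception(message, cause)
final class NoSuchIndex(message: String, cause: Throwable)
  extends Exception(message, cause)

object SqlStateClassifier {
  // Maps a SQLException to a more specific exception using its SQLSTATE.
  // Anything unrecognised is returned unchanged.
  def classify(message: String, e: Throwable): Throwable = e match {
    case sql: SQLException =>
      // Option(...) guards against drivers that leave SQLSTATE unset (null).
      Option(sql.getSQLState) match {
        case Some("42P07") => new IndexAlreadyExists(message, e) // duplicate_table
        case Some("42704") => new NoSuchIndex(message, e)        // undefined_object
        case _             => e
      }
    case other => other
  }
}

object SqlStateDemo {
  def main(args: Array[String]): Unit = {
    // The two-argument constructor sets only reason and SQLSTATE,
    // so the vendor error code keeps its default value of 0.
    val duplicate = new SQLException("relation already exists", "42P07")
    println(s"errorCode=${duplicate.getErrorCode} sqlState=${duplicate.getSQLState}")
    println(SqlStateClassifier.classify("Failed to create index", duplicate).getClass.getSimpleName)
  }
}
```

Matching on the five-character SQLSTATE keeps the classification usable even when the vendor code carries no information, which is the situation the comment describes.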
[GitHub] [spark] dchvn commented on a change in pull request #34673: [WIP][SPARK-37343][SQL] Implement createIndex, IndexExists and dropIndex in JDBC (Postgres dialect)
dchvn commented on a change in pull request #34673:
URL: https://github.com/apache/spark/pull/34673#discussion_r753640303

## File path: sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala ##
@@ -164,4 +172,78 @@ private object PostgresDialect extends JdbcDialect {
       s"TABLESAMPLE BERNOULLI" +
         s" (${(sample.upperBound - sample.lowerBound) * 100}) REPEATABLE (${sample.seed})"
   }
+
+  // CREATE INDEX syntax
+  // https://www.postgresql.org/docs/14/sql-createindex.html
+  override def createIndex(
+      indexName: String,
+      tableName: String,
+      columns: Array[NamedReference],
+      columnsProperties: util.Map[NamedReference, util.Map[String, String]],
+      properties: util.Map[String, String]): String = {
+    val columnList = columns.map(col => quoteIdentifier(col.fieldNames.head))
+    var indexPropertiesStr: String = ""
+    var hasIndexProperties: Boolean = false
+    var indexType = ""
+
+    if (!properties.isEmpty) {
+      var indexPropertyList: Array[String] = Array.empty
+      properties.asScala.foreach { case (k, v) =>
+        if (k.equals(SupportsIndex.PROP_TYPE)) {
+          if (v.equalsIgnoreCase("BTREE") || v.equalsIgnoreCase("HASH")) {
+            indexType = s"USING $v"
+          } else {
+            throw new UnsupportedOperationException(s"Index Type $v is not supported."
+              + " The supported Index Types are: BTREE and HASH")
+          }
+        } else {
+          hasIndexProperties = true
+          indexPropertyList = indexPropertyList :+ s"$k = $v"
+        }
+      }
+      if (hasIndexProperties) {
+        indexPropertiesStr += "WITH (" + indexPropertyList.mkString(", ") + ")"
+      }
+    }
+
+    s"CREATE INDEX ${quoteIdentifier(indexName)} ON ${quoteIdentifier(tableName)}" +
+      s" $indexType (${columnList.mkString(", ")}) $indexPropertiesStr"
+  }
+
+  // SHOW INDEX syntax
+  // https://www.postgresql.org/docs/14/view-pg-indexes.html
+  override def indexExists(
+      conn: Connection,
+      indexName: String,
+      tableName: String,
+      options: JDBCOptions): Boolean = {
+    val sql = s"SELECT * FROM pg_indexes WHERE tablename = '$tableName'"
+    try {
+      JdbcUtils.checkIfIndexExists(conn, indexName, sql, "indexname", options)

Review comment:
I cannot use `JdbcUtils.executeQuery(conn, options, sql)` here. When I use it to run the index lookup query against Postgres, `rs.next()` throws the exception `"The ResultSet is closed"`. That function returns the `ResultSet`, but the `ResultSet` is already closed by then, because the `statement` is closed in the `finally` block. The Java 8 docs say as much: [ResultSet](https://docs.oracle.com/javase/8/docs/api/java/sql/ResultSet.html#:~:text=A%20ResultSet%20object%20is%20automatically%20closed%20when%20the%20Statement%20object%20that%20generated%20it%20is%20closed%2C%20re-executed%2C%20or%20used%20to%20retrieve%20the%20next%20result%20from%20a%20sequence%20of%20multiple%20results.). So I created `JdbcUtils.checkIfIndexExists`, which consumes the `ResultSet` inside `JdbcUtils` to check whether the index exists and returns only a boolean. WDYT, @huaxingao?
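The approach described in the comment above, reading the `ResultSet` before the `Statement` is closed and returning only a `Boolean`, might look roughly like the sketch below. This is not the PR's `JdbcUtils.checkIfIndexExists`: the object name `IndexLookup`, the method name `anyRowMatches`, and its parameter list are assumptions made for illustration, and only standard `java.sql` calls are used.

```scala
import java.sql.Connection

object IndexLookup {
  // Runs `sql` and reports whether any returned row has `indexName` in `column`.
  // All reads from the ResultSet happen inside the try block, i.e. before the
  // Statement is closed, so the caller never receives a closed ResultSet.
  def anyRowMatches(
      conn: Connection,
      indexName: String,
      sql: String,
      column: String): Boolean = {
    val statement = conn.prepareStatement(sql)
    try {
      val rs = statement.executeQuery()
      var found = false
      while (!found && rs.next()) {
        found = rs.getString(column) == indexName
      }
      found
    } finally {
      // Closing the Statement also closes the ResultSet it produced.
      statement.close()
    }
  }
}
```

Returning a plain value rather than the `ResultSet` keeps the resource lifetime inside one method, at the cost of needing a dedicated helper for each kind of lookup.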
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34642: [SPARK-37369][SQL] Avoid redundant ColumnarToRow transistion on InMemoryTableScan
AmplabJenkins removed a comment on pull request #34642: URL: https://github.com/apache/spark/pull/34642#issuecomment-974599044 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49946/
[GitHub] [spark] SparkQA commented on pull request #34673: [WIP][SPARK-37343][SQL] Implement createIndex, IndexExists and dropIndex in JDBC (Postgres dialect)
SparkQA commented on pull request #34673: URL: https://github.com/apache/spark/pull/34673#issuecomment-974599139 **[Test build #145475 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145475/testReport)** for PR 34673 at commit [`baae935`](https://github.com/apache/spark/commit/baae935fc0c0cc47c4397be662ac7d36dfbbe8b0).
[GitHub] [spark] AmplabJenkins commented on pull request #34642: [SPARK-37369][SQL] Avoid redundant ColumnarToRow transistion on InMemoryTableScan
AmplabJenkins commented on pull request #34642: URL: https://github.com/apache/spark/pull/34642#issuecomment-974599044 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49946/
[GitHub] [spark] dchvn commented on a change in pull request #34673: [WIP][SPARK-37343][SQL] Implement createIndex, IndexExists and dropIndex in JDBC (Postgres dialect)
dchvn commented on a change in pull request #34673:
URL: https://github.com/apache/spark/pull/34673#discussion_r753636069

## File path: external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/v2/V2JDBCTest.scala ##
@@ -193,6 +193,8 @@ private[v2] trait V2JDBCTest extends SharedSparkSession with DockerIntegrationFu
   test("SPARK-36895: Test INDEX Using SQL") {
     if (supportsIndex) {
+      val indexOptions = if (catalogName.equals("mysql")) "KEY_BLOCK_SIZE=10"
+        else if (catalogName.equals("postgresql")) "FILLFACTOR=70"

Review comment:
Depending on the JDBC dialect, we change the index options used in the test.
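One way to express the per-dialect choice from the comment above is a total function over the catalog name. The quoted hunk is cut off before any final `else` branch, so how the test treats other catalogs is not shown; the `None` case below is an assumption, and `IndexTestOptions` and `indexOptionsFor` are hypothetical names.

```scala
object IndexTestOptions {
  // Picks the index option string for a catalog, mirroring the two cases
  // visible in the quoted diff. Other catalogs yield None (an assumption).
  def indexOptionsFor(catalogName: String): Option[String] = catalogName match {
    case "mysql"      => Some("KEY_BLOCK_SIZE=10")
    case "postgresql" => Some("FILLFACTOR=70")
    case _            => None
  }
}
```

Returning an `Option` forces the caller to decide what to do for a dialect without a known option, rather than leaving the `if`/`else if` chain without a final branch.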
[GitHub] [spark] dchvn opened a new pull request #34673: [WIP][SPARK-37343][SQL] Implement createIndex, IndexExists and dropIndex in JDBC (Postgres dialect)
dchvn opened a new pull request #34673:
URL: https://github.com/apache/spark/pull/34673

### What changes were proposed in this pull request?
Implementing `createIndex`/`indexExists`/`dropIndex` in DS V2 JDBC for the Postgres dialect.

### Why are the changes needed?
This is a subtask of the V2 Index support. This PR implements only `createIndex`, `indexExists` and `dropIndex`, and only in the Postgres dialect. Once the changes in this PR have been reviewed, I will either open a new PR for `listIndexes` or add it to this one.

### Does this PR introduce _any_ user-facing change?
Yes, `createIndex`/`indexExists`/`dropIndex` in DS V2 JDBC.

### How was this patch tested?
New test.
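To make the proposed behaviour concrete, the statements such a dialect would generate can be sketched as plain string builders. This is a simplified illustration based on the diff quoted in the review comments, not the PR's code: `PostgresIndexSql`, its `quote` helper, and the parameter types are assumptions, and the Spark types (`NamedReference`, `JDBCOptions`) are replaced by plain strings.

```scala
object PostgresIndexSql {
  // Hypothetical identifier quoting: wrap in double quotes, doubling embedded ones.
  private def quote(id: String): String = "\"" + id.replace("\"", "\"\"") + "\""

  // Builds a CREATE INDEX statement. Only BTREE and HASH are accepted as index
  // types, matching the check in the quoted diff; properties become a WITH clause.
  def createIndex(
      indexName: String,
      tableName: String,
      columns: Seq[String],
      indexType: Option[String],
      properties: Seq[(String, String)]): String = {
    val using = indexType match {
      case Some(t) if t.equalsIgnoreCase("BTREE") || t.equalsIgnoreCase("HASH") =>
        s" USING $t"
      case Some(t) =>
        throw new UnsupportedOperationException(
          s"Index Type $t is not supported. The supported Index Types are: BTREE and HASH")
      case None => ""
    }
    val columnList = columns.map(quote).mkString(", ")
    val withClause =
      if (properties.isEmpty) ""
      else properties.map { case (k, v) => s"$k = $v" }.mkString(" WITH (", ", ", ")")
    s"CREATE INDEX ${quote(indexName)} ON ${quote(tableName)}$using ($columnList)$withClause"
  }

  // Builds a DROP INDEX statement; Postgres does not need the table name here.
  def dropIndex(indexName: String): String = s"DROP INDEX ${quote(indexName)}"
}
```

For example, `PostgresIndexSql.createIndex("i1", "t", Seq("a", "b"), Some("BTREE"), Seq("FILLFACTOR" -> "70"))` yields `CREATE INDEX "i1" ON "t" USING BTREE ("a", "b") WITH (FILLFACTOR = 70)`.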
[GitHub] [spark] HeartSaVioR commented on pull request #34630: [SPARK-37224][SS][FOLLOWUP] Add benchmark on basic state store operations
HeartSaVioR commented on pull request #34630: URL: https://github.com/apache/spark/pull/34630#issuecomment-974596292 Thanks @dongjoon-hyun and @viirya for reviewing and merging!
[GitHub] [spark] SparkQA commented on pull request #34642: [SPARK-37369][SQL] Avoid redundant ColumnarToRow transistion on InMemoryTableScan
SparkQA commented on pull request #34642: URL: https://github.com/apache/spark/pull/34642#issuecomment-974595947 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49946/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34070: [SPARK-36840][SQL] Support DPP if there is no selective predicate on the filtering side
AmplabJenkins removed a comment on pull request #34070: URL: https://github.com/apache/spark/pull/34070#issuecomment-974593431 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49945/
[GitHub] [spark] AmplabJenkins commented on pull request #34070: [SPARK-36840][SQL] Support DPP if there is no selective predicate on the filtering side
AmplabJenkins commented on pull request #34070: URL: https://github.com/apache/spark/pull/34070#issuecomment-974593431 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49945/
[GitHub] [spark] SparkQA commented on pull request #34070: [SPARK-36840][SQL] Support DPP if there is no selective predicate on the filtering side
SparkQA commented on pull request #34070: URL: https://github.com/apache/spark/pull/34070#issuecomment-974593427 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49945/
[GitHub] [spark] SparkQA commented on pull request #34642: [SPARK-37369][SQL] Avoid redundant ColumnarToRow transistion on InMemoryTableScan
SparkQA commented on pull request #34642: URL: https://github.com/apache/spark/pull/34642#issuecomment-974591704 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49946/
[GitHub] [spark] viirya commented on pull request #34642: [SPARK-37369][SQL] Avoid redundant ColumnarToRow transistion on InMemoryTableScan
viirya commented on pull request #34642: URL: https://github.com/apache/spark/pull/34642#issuecomment-974588925 cc @cloud-fan @dongjoon-hyun @sunchao
[GitHub] [spark] SparkQA commented on pull request #34642: [SPARK-37369][SQL] Avoid redundant ColumnarToRow transistion on InMemoryTableScan
SparkQA commented on pull request #34642: URL: https://github.com/apache/spark/pull/34642#issuecomment-974587333 **[Test build #145474 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145474/testReport)** for PR 34642 at commit [`5fa5aeb`](https://github.com/apache/spark/commit/5fa5aeb155dfba20c94f35325b4ea65a1c6551f8).
[GitHub] [spark] SparkQA commented on pull request #34070: [SPARK-36840][SQL] Support DPP if there is no selective predicate on the filtering side
SparkQA commented on pull request #34070: URL: https://github.com/apache/spark/pull/34070#issuecomment-974587251 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49945/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans
AmplabJenkins removed a comment on pull request #34611: URL: https://github.com/apache/spark/pull/34611#issuecomment-974586629 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145469/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34607: [SPARK-36038][CORE] Speculation metrics summary at stage level
AmplabJenkins removed a comment on pull request #34607: URL: https://github.com/apache/spark/pull/34607#issuecomment-974586626
[GitHub] [spark] AmplabJenkins commented on pull request #34070: [SPARK-36840][SQL] Support DPP if there is no selective predicate on the filtering side
AmplabJenkins commented on pull request #34070: URL: https://github.com/apache/spark/pull/34070#issuecomment-974586627 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145471/
[GitHub] [spark] AmplabJenkins commented on pull request #34607: [SPARK-36038][CORE] Speculation metrics summary at stage level
AmplabJenkins commented on pull request #34607: URL: https://github.com/apache/spark/pull/34607#issuecomment-974586628
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34070: [SPARK-36840][SQL] Support DPP if there is no selective predicate on the filtering side
AmplabJenkins removed a comment on pull request #34070: URL: https://github.com/apache/spark/pull/34070#issuecomment-974586627 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145471/
[GitHub] [spark] AmplabJenkins commented on pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans
AmplabJenkins commented on pull request #34611: URL: https://github.com/apache/spark/pull/34611#issuecomment-974586629 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145469/
[GitHub] [spark] SparkQA removed a comment on pull request #34070: [SPARK-36840][SQL] Support DPP if there is no selective predicate on the filtering side
SparkQA removed a comment on pull request #34070: URL: https://github.com/apache/spark/pull/34070#issuecomment-974560164 **[Test build #145471 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145471/testReport)** for PR 34070 at commit [`fa966bb`](https://github.com/apache/spark/commit/fa966bbb2ffa6dbccbb0689ac8b2de002bdf4080). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #34070: [SPARK-36840][SQL] Support DPP if there is no selective predicate on the filtering side
SparkQA commented on pull request #34070: URL: https://github.com/apache/spark/pull/34070#issuecomment-974586399 **[Test build #145471 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145471/testReport)** for PR 34070 at commit [`fa966bb`](https://github.com/apache/spark/commit/fa966bbb2ffa6dbccbb0689ac8b2de002bdf4080). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #34607: [SPARK-36038][CORE] Speculation metrics summary at stage level
SparkQA removed a comment on pull request #34607: URL: https://github.com/apache/spark/pull/34607#issuecomment-974563277 **[Test build #145472 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145472/testReport)** for PR 34607 at commit [`2ba73d0`](https://github.com/apache/spark/commit/2ba73d03b07eca48f5371aa93f18d5ac5a09225d). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #34607: [SPARK-36038][CORE] Speculation metrics summary at stage level
SparkQA commented on pull request #34607: URL: https://github.com/apache/spark/pull/34607#issuecomment-974586006 **[Test build #145472 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145472/testReport)** for PR 34607 at commit [`2ba73d0`](https://github.com/apache/spark/commit/2ba73d03b07eca48f5371aa93f18d5ac5a09225d). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans
SparkQA removed a comment on pull request #34611: URL: https://github.com/apache/spark/pull/34611#issuecomment-974520855 **[Test build #145469 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145469/testReport)** for PR 34611 at commit [`a299f42`](https://github.com/apache/spark/commit/a299f423157e1d18c7a4f18ff4f3695e56228dc0). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans
SparkQA commented on pull request #34611: URL: https://github.com/apache/spark/pull/34611#issuecomment-974584273 **[Test build #145469 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145469/testReport)** for PR 34611 at commit [`a299f42`](https://github.com/apache/spark/commit/a299f423157e1d18c7a4f18ff4f3695e56228dc0). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #34607: [SPARK-36038][CORE] Speculation metrics summary at stage level
SparkQA commented on pull request #34607: URL: https://github.com/apache/spark/pull/34607#issuecomment-974583164 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49944/ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #34070: [SPARK-36840][SQL] Support DPP if there is no selective predicate on the filtering side
SparkQA commented on pull request #34070: URL: https://github.com/apache/spark/pull/34070#issuecomment-974582054 **[Test build #145473 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145473/testReport)** for PR 34070 at commit [`50484e2`](https://github.com/apache/spark/commit/50484e23c714e06057993757efa20034a261d2bf). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34070: [SPARK-36840][SQL] Support DPP if there is no selective predicate on the filtering side
AmplabJenkins removed a comment on pull request #34070: URL: https://github.com/apache/spark/pull/34070#issuecomment-974579898 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49943/ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #34070: [SPARK-36840][SQL] Support DPP if there is no selective predicate on the filtering side
AmplabJenkins commented on pull request #34070: URL: https://github.com/apache/spark/pull/34070#issuecomment-974579898 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49943/ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] imback82 commented on a change in pull request #34666: [SPARK-37192][SQL] Migrate SHOW TBLPROPERTIES to use V2 command by default
imback82 commented on a change in pull request #34666: URL: https://github.com/apache/spark/pull/34666#discussion_r753622693 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/ShowTblPropertiesSuiteBase.scala ## @@ -87,4 +88,25 @@ trait ShowTblPropertiesSuiteBase extends QueryTest with DDLCommandTestUtils { assert(res.head.getString(1).contains(s"does not have property: $nonExistingKey")) } } + + test("KEEP THE LEGACY OUTPUT SCHEMA") { +Seq(true, false).foreach { keepLegacySchema => + withSQLConf(SQLConf.LEGACY_KEEP_COMMAND_OUTPUT_SCHEMA.key -> keepLegacySchema.toString) { +withNamespaceAndTable("ns1", "tbl") { tbl => + spark.sql(s"CREATE TABLE $tbl (id bigint, data string) $defaultUsing " + +s"TBLPROPERTIES ('user'='spark', 'status'='new')") Review comment: nit: `s` not needed. ## File path: sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala ## @@ -401,15 +401,9 @@ class ResolveSessionCatalog(val catalogManager: CatalogManager) throw QueryCompilationErrors.externalCatalogNotSupportShowViewsError(resolved) } -case s @ ShowTableProperties(ResolvedV1TableOrViewIdentifier(ident), propertyKey, output) => - val newOutput = -if (conf.getConf(SQLConf.LEGACY_KEEP_COMMAND_OUTPUT_SCHEMA) && propertyKey.isDefined) { - assert(output.length == 2) - output.tail -} else { - output -} - ShowTablePropertiesCommand(ident.asTableIdentifier, propertyKey, newOutput) +case s @ ShowTableProperties(ResolvedV1TableOrViewIdentifier(ident), propertyKey, output) + if conf.useV1Command => +ShowTablePropertiesCommand(ident.asTableIdentifier, propertyKey, output) Review comment: nit: ```suggestion if conf.useV1Command => ShowTablePropertiesCommand(ident.asTableIdentifier, propertyKey, output) ``` ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/ShowTblPropertiesSuite.scala ## @@ -56,7 +44,7 @@ trait ShowTblPropertiesSuiteBase extends command.ShowTblPropertiesSuiteBase } } - test("SHOW TBLPROPERTIES FOR TEMPORARY IEW") { + testV1("SHOW TBLPROPERTIES FOR TEMPORARY IEW") { Review comment: nit: IEW => VIEW -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
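The output-schema selection that the `ResolveSessionCatalog` diff above removes can be read in isolation as follows. This is a minimal sketch rather than Spark's code: column names are plain strings instead of attributes, and `LegacyOutputSchema` and `select` are hypothetical names introduced only for illustration.

```scala
// Hypothetical illustration; not part of the Spark API.
object LegacyOutputSchema {
  // Mirrors the removed branch: when the legacy flag is on and a property key
  // was given, drop the first of the two output columns; otherwise keep all.
  def select(
      output: Seq[String],
      keepLegacySchema: Boolean,
      propertyKey: Option[String]): Seq[String] =
    if (keepLegacySchema && propertyKey.isDefined) {
      assert(output.length == 2)
      output.tail
    } else {
      output
    }
}
```

Under this reading, the legacy flag only changes the result when a specific property key is requested; without a key the full output is kept either way.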
[GitHub] [spark] dongjoon-hyun closed pull request #34648: [SPARK-37282][TESTS][FOLLOWUP] Extract `Utils.isMacOnAppleSilicon` for reuse in UTs
dongjoon-hyun closed pull request #34648: URL: https://github.com/apache/spark/pull/34648 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #34070: [SPARK-36840][SQL] Support DPP if there is no selective predicate on the filtering side
SparkQA commented on pull request #34070: URL: https://github.com/apache/spark/pull/34070#issuecomment-974578359 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49943/ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #34607: [SPARK-36038][CORE] Speculation metrics summary at stage level
SparkQA commented on pull request #34607: URL: https://github.com/apache/spark/pull/34607#issuecomment-974577901 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49944/ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] mridulm commented on a change in pull request #34635: [SPARK-37205][YARN] Introduce a new config 'spark.yarn.am.tokenConfRegex' to support renewing delegation tokens in a multi-cluste
mridulm commented on a change in pull request #34635: URL: https://github.com/apache/spark/pull/34635#discussion_r753617789 ## File path: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ## @@ -340,6 +344,40 @@ private[spark] class Client( amContainer.setTokens(ByteBuffer.wrap(serializedCreds)) } + /** + * Set configurations sent from AM to RM for renewing delegation tokens. + */ + private def setTokenConf(amContainer: ContainerLaunchContext): Unit = { +// SPARK-37205: this regex is used to grep a list of configurations and send them to YARN RM +// for fetching delegation tokens. See YARN-5910 for more details. +// The feature is only supported in Hadoop 3.x and up, hence the check below. +val regex = sparkConf.get(config.AM_SEND_TOKEN_CONF) +if (regex.nonEmpty && VersionUtils.isHadoop3) { + logInfo(s"Processing token conf (spark.yarn.am.sendTokenConf) with regex $regex") + val dob = new DataOutputBuffer(); + val copy = new Configuration(false); + copy.clear(); + hadoopConf.asScala.foreach { entry => +if (entry.getKey.matches(regex.get)) { + copy.set(entry.getKey, entry.getValue) + logInfo(s"Captured key: ${entry.getKey} -> value: ${entry.getValue}") +} + } + copy.write(dob); + + // since this method was added in Hadoop 2.9 and 3.0, we use reflection here to avoid Review comment: Exactly - both 2.9 and 2.10 for example. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
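The key-filtering step in the `setTokenConf` diff above can be sketched on its own. This is a minimal sketch under stated assumptions, not the PR's code: a plain `Map[String, String]` stands in for Hadoop's `Configuration`, and `TokenConfFilter` and `selectByRegex` are hypothetical names used only for illustration.

```scala
// Hypothetical illustration; not part of the Spark or Hadoop API.
object TokenConfFilter {
  // Keep only the entries whose key fully matches the regex.
  // An empty Option means no regex was configured, so nothing is selected.
  def selectByRegex(
      conf: Map[String, String],
      regex: Option[String]): Map[String, String] =
    regex match {
      case Some(r) => conf.filter { case (key, _) => key.matches(r) }
      case None => Map.empty
    }
}
```

Note that `String.matches` requires the whole key to match the pattern, so a regex meant to select a prefix needs a trailing `.*`.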
[GitHub] [spark] SparkQA commented on pull request #34070: [SPARK-36840][SQL] Support DPP if there is no selective predicate on the filtering side
SparkQA commented on pull request #34070: URL: https://github.com/apache/spark/pull/34070#issuecomment-974568862 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49943/ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] viirya commented on pull request #34642: [SPARK-37369][SQL] Avoid redundant ColumnarToRow transistion on InMemoryTableScan
viirya commented on pull request #34642: URL: https://github.com/apache/spark/pull/34642#issuecomment-974567318 Ideally, yes, having another flag is the more general approach. In practice, I wonder whether there will be more such nodes that can choose between row-based and columnar output under certain conditions. For the in-memory relation scan here, adding a new flag `supportsRowBased` does not seem to bring much benefit. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #34607: [SPARK-36038][CORE] Speculation metrics summary at stage level
SparkQA commented on pull request #34607: URL: https://github.com/apache/spark/pull/34607#issuecomment-974563277 **[Test build #145472 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145472/testReport)** for PR 34607 at commit [`2ba73d0`](https://github.com/apache/spark/commit/2ba73d03b07eca48f5371aa93f18d5ac5a09225d). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #34607: [SPARK-36038][CORE] Speculation metrics summary at stage level
SparkQA removed a comment on pull request #34607: URL: https://github.com/apache/spark/pull/34607#issuecomment-974520884 **[Test build #145470 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145470/testReport)** for PR 34607 at commit [`175d4e6`](https://github.com/apache/spark/commit/175d4e62d6180f416ab40821282ceb8064fef0ee). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34607: [SPARK-36038][CORE] Speculation metrics summary at stage level
AmplabJenkins removed a comment on pull request #34607: URL: https://github.com/apache/spark/pull/34607#issuecomment-974561743 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145470/ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #34607: [SPARK-36038][CORE] Speculation metrics summary at stage level
AmplabJenkins commented on pull request #34607: URL: https://github.com/apache/spark/pull/34607#issuecomment-974561743 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145470/ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #34607: [SPARK-36038][CORE] Speculation metrics summary at stage level
SparkQA commented on pull request #34607: URL: https://github.com/apache/spark/pull/34607#issuecomment-974561491 **[Test build #145470 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145470/testReport)** for PR 34607 at commit [`175d4e6`](https://github.com/apache/spark/commit/175d4e62d6180f416ab40821282ceb8064fef0ee). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #34070: [SPARK-36840][SQL] Support DPP if there is no selective predicate on the filtering side
SparkQA commented on pull request #34070: URL: https://github.com/apache/spark/pull/34070#issuecomment-974560164 **[Test build #145471 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145471/testReport)** for PR 34070 at commit [`fa966bb`](https://github.com/apache/spark/commit/fa966bbb2ffa6dbccbb0689ac8b2de002bdf4080). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34655: [SPARK-37380][PYTHON] Miscellaneous Python lint infra cleanup
AmplabJenkins removed a comment on pull request #34655: URL: https://github.com/apache/spark/pull/34655#issuecomment-974559382 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145467/ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #34655: [SPARK-37380][PYTHON] Miscellaneous Python lint infra cleanup
AmplabJenkins commented on pull request #34655: URL: https://github.com/apache/spark/pull/34655#issuecomment-974559382 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145467/ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34607: [SPARK-36038][CORE] Speculation metrics summary at stage level
AmplabJenkins removed a comment on pull request #34607: URL: https://github.com/apache/spark/pull/34607#issuecomment-974558972 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49942/ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans
AmplabJenkins removed a comment on pull request #34611: URL: https://github.com/apache/spark/pull/34611#issuecomment-974558973 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49941/ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #34607: [SPARK-36038][CORE] Speculation metrics summary at stage level
AmplabJenkins commented on pull request #34607: URL: https://github.com/apache/spark/pull/34607#issuecomment-974558972 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49942/ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans
AmplabJenkins commented on pull request #34611: URL: https://github.com/apache/spark/pull/34611#issuecomment-974558973 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49941/ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #34655: [SPARK-37380][PYTHON] Miscellaneous Python lint infra cleanup
SparkQA removed a comment on pull request #34655: URL: https://github.com/apache/spark/pull/34655#issuecomment-974491127 **[Test build #145467 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145467/testReport)** for PR 34655 at commit [`5418ec5`](https://github.com/apache/spark/commit/5418ec5d186b0192dc796226937e059ea8566a24). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #34655: [SPARK-37380][PYTHON] Miscellaneous Python lint infra cleanup
SparkQA commented on pull request #34655: URL: https://github.com/apache/spark/pull/34655#issuecomment-974558827 **[Test build #145467 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145467/testReport)** for PR 34655 at commit [`5418ec5`](https://github.com/apache/spark/commit/5418ec5d186b0192dc796226937e059ea8566a24). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] wangyum commented on a change in pull request #34070: [SPARK-36840][SQL] Support DPP if there is no selective predicate on the filtering side
wangyum commented on a change in pull request #34070: URL: https://github.com/apache/spark/pull/34070#discussion_r753606004 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/dynamicpruning/PartitionPruning.scala ## @@ -225,16 +230,6 @@ object PartitionPruning extends Rule[LogicalPlan] with PredicateHelper with Join }.isDefined } - /** - * To be able to prune partitions on a join key, the filtering side needs to - * meet the following requirements: - * (1) it can not be a stream - * (2) it needs to contain a selective predicate used for filtering Review comment: Based on another case (https://github.com/apache/spark/pull/34070#issuecomment-973893038), maybe we can support DPP if the filtering side has fewer than 1000 rows (`spark.sql.optimizer.dynamicPartitionPruning.filteringSideThreshold`). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
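One way to read the suggestion above is as an extra condition alongside the two requirements quoted in the removed doc comment. The following is a minimal sketch under stated assumptions, not the rule in `PartitionPruning`: `DppEligibility`, `canPruneWithFilteringSide`, and the row-count input are hypothetical, and the default of 1000 simply mirrors the number in the comment.

```scala
// Hypothetical illustration; not part of the Spark API.
object DppEligibility {
  // Assumed default, taken from the "1000" rows mentioned in the review comment.
  val DefaultFilteringSideThreshold: Long = 1000L

  // The filtering side qualifies if it is not a stream and either contains a
  // selective predicate or is estimated to have fewer rows than the threshold.
  def canPruneWithFilteringSide(
      isStreaming: Boolean,
      hasSelectivePredicate: Boolean,
      estimatedRowCount: Option[BigInt],
      threshold: Long = DefaultFilteringSideThreshold): Boolean = {
    val smallEnough = estimatedRowCount.exists(_ < BigInt(threshold))
    !isStreaming && (hasSelectivePredicate || smallEnough)
  }
}
```

With an unknown row count (`None`), this sketch falls back to requiring a selective predicate, which matches the behavior described in the removed doc comment.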
[GitHub] [spark] SparkQA commented on pull request #34607: [SPARK-36038][CORE] Speculation metrics summary at stage level
SparkQA commented on pull request #34607: URL: https://github.com/apache/spark/pull/34607#issuecomment-974554782 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49942/
[GitHub] [spark] SparkQA commented on pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans
SparkQA commented on pull request #34611: URL: https://github.com/apache/spark/pull/34611#issuecomment-974552769 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49941/
[GitHub] [spark] github-actions[bot] commented on pull request #33662: [SPARK-36162][SQL] Support estimation of equal null safe join
github-actions[bot] commented on pull request #33662: URL: https://github.com/apache/spark/pull/33662#issuecomment-974551213 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!
[GitHub] [spark] github-actions[bot] commented on pull request #33632: [SPARK-36360][Streaming] Delete appName from StreamingSource sourceName
github-actions[bot] commented on pull request #33632: URL: https://github.com/apache/spark/pull/33632#issuecomment-974551221 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34670: [SPARK-37388][SQL] Fix NPE in WidthBucket in WholeStageCodegenExec
AmplabJenkins removed a comment on pull request #34670: URL: https://github.com/apache/spark/pull/34670#issuecomment-974543843 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145466/
[GitHub] [spark] AmplabJenkins commented on pull request #34670: [SPARK-37388][SQL] Fix NPE in WidthBucket in WholeStageCodegenExec
AmplabJenkins commented on pull request #34670: URL: https://github.com/apache/spark/pull/34670#issuecomment-974543843 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145466/
[GitHub] [spark] SparkQA removed a comment on pull request #34670: [SPARK-37388][SQL] Fix NPE in WidthBucket in WholeStageCodegenExec
SparkQA removed a comment on pull request #34670: URL: https://github.com/apache/spark/pull/34670#issuecomment-974491065 **[Test build #145466 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145466/testReport)** for PR 34670 at commit [`0cba30d`](https://github.com/apache/spark/commit/0cba30d995a5056cf0f3762fa0d5ec88d2282e05).
[GitHub] [spark] SparkQA commented on pull request #34670: [SPARK-37388][SQL] Fix NPE in WidthBucket in WholeStageCodegenExec
SparkQA commented on pull request #34670: URL: https://github.com/apache/spark/pull/34670#issuecomment-974543723 **[Test build #145466 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145466/testReport)** for PR 34670 at commit [`0cba30d`](https://github.com/apache/spark/commit/0cba30d995a5056cf0f3762fa0d5ec88d2282e05). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34671: [SPARK-37393][PySpark][ML] Merge {ml, mllib}/common.pyi into common.py
AmplabJenkins removed a comment on pull request #34671: URL: https://github.com/apache/spark/pull/34671#issuecomment-974542698 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49940/
[GitHub] [spark] AmplabJenkins commented on pull request #34671: [SPARK-37393][PySpark][ML] Merge {ml, mllib}/common.pyi into common.py
AmplabJenkins commented on pull request #34671: URL: https://github.com/apache/spark/pull/34671#issuecomment-974542698 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49940/
[GitHub] [spark] SparkQA commented on pull request #34671: [SPARK-37393][PySpark][ML] Merge {ml, mllib}/common.pyi into common.py
SparkQA commented on pull request #34671: URL: https://github.com/apache/spark/pull/34671#issuecomment-974542689 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49940/
[GitHub] [spark] AmplabJenkins commented on pull request #34672: [SPARK-37394] Skip registering with ESS if a customized shuffle manager is configured
AmplabJenkins commented on pull request #34672: URL: https://github.com/apache/spark/pull/34672#issuecomment-974542514 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34655: [SPARK-37380][PYTHON] Miscellaneous Python lint infra cleanup
AmplabJenkins removed a comment on pull request #34655: URL: https://github.com/apache/spark/pull/34655#issuecomment-974542127 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49939/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34637: Spark-37349 add SQL Rest API parsing logic
AmplabJenkins removed a comment on pull request #34637: URL: https://github.com/apache/spark/pull/34637#issuecomment-974542125 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145465/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34670: [SPARK-37388][SQL] Fix NPE in WidthBucket in WholeStageCodegenExec
AmplabJenkins removed a comment on pull request #34670: URL: https://github.com/apache/spark/pull/34670#issuecomment-974542124 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49938/
[GitHub] [spark] AmplabJenkins commented on pull request #34655: [SPARK-37380][PYTHON] Miscellaneous Python lint infra cleanup
AmplabJenkins commented on pull request #34655: URL: https://github.com/apache/spark/pull/34655#issuecomment-974542127 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49939/
[GitHub] [spark] AmplabJenkins commented on pull request #34670: [SPARK-37388][SQL] Fix NPE in WidthBucket in WholeStageCodegenExec
AmplabJenkins commented on pull request #34670: URL: https://github.com/apache/spark/pull/34670#issuecomment-974542124 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49938/
[GitHub] [spark] AmplabJenkins commented on pull request #34637: Spark-37349 add SQL Rest API parsing logic
AmplabJenkins commented on pull request #34637: URL: https://github.com/apache/spark/pull/34637#issuecomment-974542125 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145465/
[GitHub] [spark] SparkQA commented on pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans
SparkQA commented on pull request #34611: URL: https://github.com/apache/spark/pull/34611#issuecomment-974538272 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49941/
[GitHub] [spark] SparkQA removed a comment on pull request #34637: Spark-37349 add SQL Rest API parsing logic
SparkQA removed a comment on pull request #34637: URL: https://github.com/apache/spark/pull/34637#issuecomment-974452275 **[Test build #145465 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145465/testReport)** for PR 34637 at commit [`141a16f`](https://github.com/apache/spark/commit/141a16f2aa767648982f1df088ec1e4667beb86d).
[GitHub] [spark] SparkQA commented on pull request #34637: Spark-37349 add SQL Rest API parsing logic
SparkQA commented on pull request #34637: URL: https://github.com/apache/spark/pull/34637#issuecomment-974537708 **[Test build #145465 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145465/testReport)** for PR 34637 at commit [`141a16f`](https://github.com/apache/spark/commit/141a16f2aa767648982f1df088ec1e4667beb86d). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #34607: [SPARK-36038][CORE] Speculation metrics summary at stage level
SparkQA commented on pull request #34607: URL: https://github.com/apache/spark/pull/34607#issuecomment-974537283 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49942/
[GitHub] [spark] JoshRosen closed pull request #34658: [SPARK-37379][SQL] Add tree pattern pruning to CTESubstitution rule
JoshRosen closed pull request #34658: URL: https://github.com/apache/spark/pull/34658
[GitHub] [spark] yangwwei opened a new pull request #34672: [SPARK-37394] Skip registering to ESS if a customized shuffle manager is configured
yangwwei opened a new pull request #34672: URL: https://github.com/apache/spark/pull/34672 ### What changes were proposed in this pull request? This PR proposes to skip registering with the ESS if a customized shuffle manager (Remote Shuffle Service) is configured. Otherwise, when dynamic allocation is enabled without an external shuffle service in place, the Spark executor still tries to connect to the external shuffle service and fails with a connection-refused exception. ### Why are the changes needed? To make dynamic allocation work with a third-party remote shuffle service. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Tested locally.
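The behaviour described in this pull request can be sketched as a small decision function. This is an illustrative Python sketch, not the PR's code: `should_register_with_ess` is a hypothetical helper, and it assumes that "customized shuffle manager" means any `spark.shuffle.manager` value other than Spark's built-in sort-based manager.

```python
from typing import Mapping

# Values that select Spark's built-in sort-based shuffle manager.
BUILT_IN_SHUFFLE_MANAGERS = {
    "sort",
    "tungsten-sort",
    "org.apache.spark.shuffle.sort.SortShuffleManager",
}


def should_register_with_ess(conf: Mapping[str, str]) -> bool:
    """Decide whether an executor should register with the external shuffle service.

    Hypothetical helper: it mirrors the behaviour described in the PR text,
    not the actual implementation.
    """
    ess_enabled = conf.get("spark.shuffle.service.enabled", "false").lower() == "true"
    manager = conf.get("spark.shuffle.manager", "sort")
    uses_custom_manager = manager not in BUILT_IN_SHUFFLE_MANAGERS
    # Skip registration when a custom (e.g. remote) shuffle manager is in use,
    # since no external shuffle service may be listening on the node.
    return ess_enabled and not uses_custom_manager
```

With this rule, a job that sets `spark.shuffle.service.enabled=true` for dynamic allocation but plugs in a remote shuffle manager never attempts the connection that would be refused.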
[GitHub] [spark] dongjoon-hyun commented on pull request #34284: [SPARK-36900][CORE][TEST] Refactor `SPARK-36464: size returns correct positive number even with over 2GB data` to pass with Java 8, 11
dongjoon-hyun commented on pull request #34284: URL: https://github.com/apache/spark/pull/34284#issuecomment-974529623 Hi, All. I'll cherry-pick this test PR to branch-3.2 to improve `branch-3.2` stability.
[GitHub] [spark] SparkQA commented on pull request #34655: [SPARK-37380][PYTHON] Miscellaneous Python lint infra cleanup
SparkQA commented on pull request #34655: URL: https://github.com/apache/spark/pull/34655#issuecomment-974529416 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49939/
[GitHub] [spark] SparkQA commented on pull request #34670: [SPARK-37388][SQL] Fix NPE in WidthBucket in WholeStageCodegenExec
SparkQA commented on pull request #34670: URL: https://github.com/apache/spark/pull/34670#issuecomment-974529390 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49938/
[GitHub] [spark] zero323 commented on pull request #34671: [SPARK-37393][PySpark][ML] Merge {ml, mllib}/common.pyi into common.py
zero323 commented on pull request #34671: URL: https://github.com/apache/spark/pull/34671#issuecomment-974525917 > Just to be clear, are you saying I should split this PR into ml/common.py vs. mllib/common.py? > > And then have an umbrella ticket for adding type annotations to all of ml/, and another one for mllib/? For context: we're in the middle of migrating type hints from stub files to inline annotations. At the moment we have two umbrella tickets, SPARK-36845 and SPARK-36845, for SQL and core respectively. We should follow this convention for ml and mllib as well. It should be OK to have two tickets (`ml.common` and `mllib.common`) and resolve both in this PR, since you've already started working on that.
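For readers unfamiliar with the migration mentioned in this comment, the following Python sketch shows what moving a hint from a stub file into the module looks like. The function `to_float_list` and the stub file name `helpers.pyi` are hypothetical and are not taken from `pyspark.ml` or `pyspark.mllib`.

```python
from typing import List, Sequence


# Before: the module body is untyped and a separate stub file, for example a
# hypothetical helpers.pyi, carries the signature:
#
#     def to_float_list(values: Sequence[int]) -> List[float]: ...
#
# After: the same signature is written inline and the stub is deleted.
def to_float_list(values: Sequence[int]) -> List[float]:
    """Convert a sequence of ints to a list of floats."""
    return [float(v) for v in values]
```

Once the annotations live next to the implementation, the type checker validates the function body against them, which a stub-only signature does not provide.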
[GitHub] [spark] zero323 commented on a change in pull request #34671: [SPARK-37393][PySpark][ML] Merge {ml, mllib}/common.pyi into common.py
zero323 commented on a change in pull request #34671: URL: https://github.com/apache/spark/pull/34671#discussion_r753580341 ## File path: python/mypy.ini ## @@ -41,9 +41,15 @@ disallow_untyped_defs = False [mypy-pyspark.join] disallow_untyped_defs = False +[mypy-pyspark.ml.*] +disallow_untyped_defs = False + Review comment: Thanks!
[GitHub] [spark] SparkQA commented on pull request #34671: [SPARK-37393][PySpark][ML] Merge {ml, mllib}/common.pyi into common.py
SparkQA commented on pull request #34671: URL: https://github.com/apache/spark/pull/34671#issuecomment-974521227 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49940/
[GitHub] [spark] SparkQA commented on pull request #34607: [SPARK-36038][CORE] Speculation metrics summary at stage level
SparkQA commented on pull request #34607: URL: https://github.com/apache/spark/pull/34607#issuecomment-974520884 **[Test build #145470 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145470/testReport)** for PR 34607 at commit [`175d4e6`](https://github.com/apache/spark/commit/175d4e62d6180f416ab40821282ceb8064fef0ee).
[GitHub] [spark] SparkQA commented on pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans
SparkQA commented on pull request #34611: URL: https://github.com/apache/spark/pull/34611#issuecomment-974520855 **[Test build #145469 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145469/testReport)** for PR 34611 at commit [`a299f42`](https://github.com/apache/spark/commit/a299f423157e1d18c7a4f18ff4f3695e56228dc0).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34671: [SPARK-37393][PySpark][ML] Merge {ml, mllib}/common.pyi into common.py
AmplabJenkins removed a comment on pull request #34671: URL: https://github.com/apache/spark/pull/34671#issuecomment-974519565 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145468/
[GitHub] [spark] AmplabJenkins commented on pull request #34656: [SPARK-37376][SQL] Introduce a new DataSource V2 interface HasPartitionKey
AmplabJenkins commented on pull request #34656: URL: https://github.com/apache/spark/pull/34656#issuecomment-974519564 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145460/
[GitHub] [spark] AmplabJenkins commented on pull request #34671: [SPARK-37393][PySpark][ML] Merge {ml, mllib}/common.pyi into common.py
AmplabJenkins commented on pull request #34671: URL: https://github.com/apache/spark/pull/34671#issuecomment-974519565 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145468/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34637: Spark-37349 add SQL Rest API parsing logic
AmplabJenkins removed a comment on pull request #34637: URL: https://github.com/apache/spark/pull/34637#issuecomment-974519563 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49937/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34656: [SPARK-37376][SQL] Introduce a new DataSource V2 interface HasPartitionKey
AmplabJenkins removed a comment on pull request #34656: URL: https://github.com/apache/spark/pull/34656#issuecomment-974519564 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145460/