[GitHub] [spark] AmplabJenkins commented on pull request #30177: [SPARK-33277][PYSPARK][SQL] Use ContextAwareIterator to stop consuming after the task ends.
AmplabJenkins commented on pull request #30177: URL: https://github.com/apache/spark/pull/30177#issuecomment-718370669 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30177: [SPARK-33277][PYSPARK][SQL] Use ContextAwareIterator to stop consuming after the task ends.
AmplabJenkins removed a comment on pull request #30177: URL: https://github.com/apache/spark/pull/30177#issuecomment-718370669
[GitHub] [spark] SparkQA removed a comment on pull request #30177: [SPARK-33277][PYSPARK][SQL] Use ContextAwareIterator to stop consuming after the task ends.
SparkQA removed a comment on pull request #30177: URL: https://github.com/apache/spark/pull/30177#issuecomment-718295002 **[Test build #130389 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130389/testReport)** for PR 30177 at commit [`ac42502`](https://github.com/apache/spark/commit/ac42502576d428325126a38aac8c179dcbc26f88).
[GitHub] [spark] SparkQA commented on pull request #30177: [SPARK-33277][PYSPARK][SQL] Use ContextAwareIterator to stop consuming after the task ends.
SparkQA commented on pull request #30177: URL: https://github.com/apache/spark/pull/30177#issuecomment-718369946 **[Test build #130389 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130389/testReport)** for PR 30177 at commit [`ac42502`](https://github.com/apache/spark/commit/ac42502576d428325126a38aac8c179dcbc26f88).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `class ContextAwareIterator[IN](iter: Iterator[IN], context: TaskContext) extends Iterator[IN]`
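The class signature reported above suggests a thin wrapper around an iterator. The following is a minimal Java sketch of that idea, assuming (the message does not confirm this) that the wrapper simply reports no further elements once the task has ended. The `taskEnded` supplier and the class `ContextAwareIteratorSketch` are hypothetical stand-ins for illustration and are not Spark's API.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

class ContextAwareIteratorSketch {
    /** Wraps an iterator and stops handing out elements once the task has ended. */
    static final class ContextAwareIterator<T> implements Iterator<T> {
        private final Iterator<T> delegate;
        // Hypothetical stand-in for querying the task's state.
        private final BooleanSupplier taskEnded;

        ContextAwareIterator(Iterator<T> delegate, BooleanSupplier taskEnded) {
            this.delegate = delegate;
            this.taskEnded = taskEnded;
        }

        @Override
        public boolean hasNext() {
            // Check the task state first so the delegate is not touched after the task ends.
            return !taskEnded.getAsBoolean() && delegate.hasNext();
        }

        @Override
        public T next() {
            return delegate.next();
        }
    }

    public static void main(String[] args) {
        AtomicBoolean ended = new AtomicBoolean(false);
        Iterator<Integer> it =
            new ContextAwareIterator<>(Arrays.asList(1, 2, 3).iterator(), ended::get);
        System.out.println(it.next());    // prints 1
        ended.set(true);                  // simulate the task finishing
        System.out.println(it.hasNext()); // prints false, although 2 and 3 remain
    }
}
```

Checking the task state before delegating means a consumer that polls `hasNext()` stops cleanly instead of reading from resources that may already have been released.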
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29247: [SPARK-32446][SHS] Add new executor metrics summary REST APIs
AmplabJenkins removed a comment on pull request #29247: URL: https://github.com/apache/spark/pull/29247#issuecomment-718367667
[GitHub] [spark] AmplabJenkins commented on pull request #29247: [SPARK-32446][SHS] Add new executor metrics summary REST APIs
AmplabJenkins commented on pull request #29247: URL: https://github.com/apache/spark/pull/29247#issuecomment-718367667
[GitHub] [spark] SparkQA commented on pull request #29247: [SPARK-32446][SHS] Add new executor metrics summary REST APIs
SparkQA commented on pull request #29247: URL: https://github.com/apache/spark/pull/29247#issuecomment-718367645 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34997/
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #30162: [SPARK-33263][SS] Configurable StateStore compression codec
dongjoon-hyun commented on a change in pull request #30162: URL: https://github.com/apache/spark/pull/30162#discussion_r513980401

## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/StateStoreCompatibilitySuite.scala ##

@@ -0,0 +1,84 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.streaming.state
+
+import java.io.File
+
+import org.apache.commons.io.FileUtils
+
+import org.apache.spark.SparkFunSuite
+import org.apache.spark.io.CompressionCodec
+import org.apache.spark.sql.catalyst.plans.PlanTestBase
+import org.apache.spark.sql.catalyst.streaming.InternalOutputModes.Update
+import org.apache.spark.sql.execution.streaming.MemoryStream
+import org.apache.spark.sql.functions.count
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.streaming.StreamTest
+import org.apache.spark.util.Utils
+
+class StateStoreCompatibilitySuite extends StreamTest with StateStoreCodecsTest {
+  testWithAllCodec(

Review comment: Thanks, @HeartSaVioR.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30177: [SPARK-33277][PYSPARK][SQL] Use ContextAwareIterator to stop consuming after the task ends.
AmplabJenkins removed a comment on pull request #30177: URL: https://github.com/apache/spark/pull/30177#issuecomment-718363821
[GitHub] [spark] AmplabJenkins commented on pull request #30177: [SPARK-33277][PYSPARK][SQL] Use ContextAwareIterator to stop consuming after the task ends.
AmplabJenkins commented on pull request #30177: URL: https://github.com/apache/spark/pull/30177#issuecomment-718363821
[GitHub] [spark] SparkQA removed a comment on pull request #30177: [SPARK-33277][PYSPARK][SQL] Use ContextAwareIterator to stop consuming after the task ends.
SparkQA removed a comment on pull request #30177: URL: https://github.com/apache/spark/pull/30177#issuecomment-718283783 **[Test build #130388 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130388/testReport)** for PR 30177 at commit [`b34805c`](https://github.com/apache/spark/commit/b34805c9e32fc43011d789e5209ca8f83e646502).
[GitHub] [spark] SparkQA commented on pull request #30177: [SPARK-33277][PYSPARK][SQL] Use ContextAwareIterator to stop consuming after the task ends.
SparkQA commented on pull request #30177: URL: https://github.com/apache/spark/pull/30177#issuecomment-718363117 **[Test build #130388 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130388/testReport)** for PR 30177 at commit [`b34805c`](https://github.com/apache/spark/commit/b34805c9e32fc43011d789e5209ca8f83e646502).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
AmplabJenkins removed a comment on pull request #30178: URL: https://github.com/apache/spark/pull/30178#issuecomment-718362027
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
SparkQA commented on pull request #30178: URL: https://github.com/apache/spark/pull/30178#issuecomment-718362021 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34996/
[GitHub] [spark] AmplabJenkins commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
AmplabJenkins commented on pull request #30178: URL: https://github.com/apache/spark/pull/30178#issuecomment-718362027
[GitHub] [spark] SparkQA commented on pull request #29247: [SPARK-32446][SHS] Add new executor metrics summary REST APIs
SparkQA commented on pull request #29247: URL: https://github.com/apache/spark/pull/29247#issuecomment-718361145 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34997/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30162: [SPARK-33263][SS] Configurable StateStore compression codec
AmplabJenkins removed a comment on pull request #30162: URL: https://github.com/apache/spark/pull/30162#issuecomment-718359146
[GitHub] [spark] SparkQA removed a comment on pull request #30162: [SPARK-33263][SS] Configurable StateStore compression codec
SparkQA removed a comment on pull request #30162: URL: https://github.com/apache/spark/pull/30162#issuecomment-718281337 **[Test build #130387 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130387/testReport)** for PR 30162 at commit [`c828811`](https://github.com/apache/spark/commit/c8288111064b98107346e93c7686bbd48d5138a1).
[GitHub] [spark] AmplabJenkins commented on pull request #30162: [SPARK-33263][SS] Configurable StateStore compression codec
AmplabJenkins commented on pull request #30162: URL: https://github.com/apache/spark/pull/30162#issuecomment-718359146
[GitHub] [spark] SparkQA commented on pull request #30162: [SPARK-33263][SS] Configurable StateStore compression codec
SparkQA commented on pull request #30162: URL: https://github.com/apache/spark/pull/30162#issuecomment-718358492 **[Test build #130387 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130387/testReport)** for PR 30162 at commit [`c828811`](https://github.com/apache/spark/commit/c8288111064b98107346e93c7686bbd48d5138a1).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
SparkQA commented on pull request #30178: URL: https://github.com/apache/spark/pull/30178#issuecomment-718356033 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34996/
[GitHub] [spark] SparkQA commented on pull request #29247: [SPARK-32446][SHS] Add new executor metrics summary REST APIs
SparkQA commented on pull request #29247: URL: https://github.com/apache/spark/pull/29247#issuecomment-718348533 **[Test build #130394 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130394/testReport)** for PR 29247 at commit [`1854e74`](https://github.com/apache/spark/commit/1854e7465d5309a382e118bfe3fe7ada5023ef52).
[GitHub] [spark] gengliangwang commented on pull request #29247: [SPARK-32446][SHS] Add new executor metrics summary REST APIs
gengliangwang commented on pull request #29247: URL: https://github.com/apache/spark/pull/29247#issuecomment-718347748 retest this please.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30162: [SPARK-33263][SS] Configurable StateStore compression codec
AmplabJenkins removed a comment on pull request #30162: URL: https://github.com/apache/spark/pull/30162#issuecomment-718345319
[GitHub] [spark] AmplabJenkins commented on pull request #30162: [SPARK-33263][SS] Configurable StateStore compression codec
AmplabJenkins commented on pull request #30162: URL: https://github.com/apache/spark/pull/30162#issuecomment-718345319
[GitHub] [spark] SparkQA commented on pull request #30162: [SPARK-33263][SS] Configurable StateStore compression codec
SparkQA commented on pull request #30162: URL: https://github.com/apache/spark/pull/30162#issuecomment-718345308 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34995/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30162: [SPARK-33263][SS] Configurable StateStore compression codec
AmplabJenkins removed a comment on pull request #30162: URL: https://github.com/apache/spark/pull/30162#issuecomment-718344106
[GitHub] [spark] AmplabJenkins commented on pull request #30162: [SPARK-33263][SS] Configurable StateStore compression codec
AmplabJenkins commented on pull request #30162: URL: https://github.com/apache/spark/pull/30162#issuecomment-718344106
[GitHub] [spark] SparkQA commented on pull request #30162: [SPARK-33263][SS] Configurable StateStore compression codec
SparkQA commented on pull request #30162: URL: https://github.com/apache/spark/pull/30162#issuecomment-718343569 **[Test build #130386 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130386/testReport)** for PR 30162 at commit [`ab69ccd`](https://github.com/apache/spark/commit/ab69ccd424d350fc1dbc1d9e7892da5cb2854094).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on pull request #30162: [SPARK-33263][SS] Configurable StateStore compression codec
SparkQA removed a comment on pull request #30162: URL: https://github.com/apache/spark/pull/30162#issuecomment-718262767 **[Test build #130386 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130386/testReport)** for PR 30162 at commit [`ab69ccd`](https://github.com/apache/spark/commit/ab69ccd424d350fc1dbc1d9e7892da5cb2854094).
[GitHub] [spark] SparkQA commented on pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
SparkQA commented on pull request #30178: URL: https://github.com/apache/spark/pull/30178#issuecomment-718342791 **[Test build #130393 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130393/testReport)** for PR 30178 at commit [`181186c`](https://github.com/apache/spark/commit/181186c6cce9c3b4e3061dc84b667ee898dd3f40).
[GitHub] [spark] cloud-fan commented on a change in pull request #24246: [SPARK-24252][SQL] Add TableCatalog API
cloud-fan commented on a change in pull request #24246: URL: https://github.com/apache/spark/pull/24246#discussion_r513932838

## File path: sql/catalyst/src/main/java/org/apache/spark/sql/catalog/v2/TableCatalog.java ##

@@ -0,0 +1,131 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalog.v2;
+
+import org.apache.spark.sql.catalog.v2.expressions.Transform;
+import org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException;
+import org.apache.spark.sql.catalyst.analysis.NoSuchTableException;
+import org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException;
+import org.apache.spark.sql.sources.v2.Table;
+import org.apache.spark.sql.types.StructType;
+
+import java.util.Map;
+
+/**
+ * Catalog methods for working with Tables.
+ */
+public interface TableCatalog extends CatalogPlugin {
+  /**
+   * List the tables in a namespace from the catalog.
+   *
+   * If the catalog supports views, this must return identifiers for only tables and not views.
+   *
+   * @param namespace a multi-part namespace
+   * @return an array of Identifiers for tables
+   * @throws NoSuchNamespaceException If the namespace does not exist (optional).
+   */
+  Identifier[] listTables(String[] namespace) throws NoSuchNamespaceException;
+
+  /**
+   * Load table metadata by {@link Identifier identifier} from the catalog.
+   *
+   * If the catalog supports views and contains a view for the identifier and not a table, this
+   * must throw {@link NoSuchTableException}.
+   *
+   * @param ident a table identifier
+   * @return the table's metadata
+   * @throws NoSuchTableException If the table doesn't exist or is a view
+   */
+  Table loadTable(Identifier ident) throws NoSuchTableException;
+
+  /**
+   * Invalidate cached table metadata for an {@link Identifier identifier}.
+   *
+   * If the table is already loaded or cached, drop cached data. If the table does not exist or is
+   * not cached, do nothing. Calling this method should not query remote services.
+   *
+   * @param ident a table identifier
+   */
+  default void invalidateTable(Identifier ident) {

Review comment: Yes, we should add these kinds of missing actions to the v2 commands as well. I think DROP TABLE has the same issue.
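The Javadoc quoted above describes a contract for `invalidateTable`: drop cached metadata if present, do nothing otherwise, and never query remote services. A minimal Java sketch of one way a caching catalog could honor that contract follows. `MiniCatalog`, `CachingCatalog`, `putRemote`, and the string-based identifiers are hypothetical simplifications for illustration and are not Spark's `TableCatalog` API; the no-op default body is likewise an assumption, since the quoted diff is cut off before the method body.

```java
import java.util.HashMap;
import java.util.Map;

class InvalidateTableSketch {
    /** Hypothetical, simplified stand-in for a table catalog; not Spark's interface. */
    interface MiniCatalog {
        String loadTable(String ident);

        // Assumed no-op default: catalogs without a cache need not override it.
        default void invalidateTable(String ident) { }
    }

    /** Catalog that caches metadata fetched from a simulated remote store. */
    static final class CachingCatalog implements MiniCatalog {
        private final Map<String, String> remote = new HashMap<>();
        private final Map<String, String> cache = new HashMap<>();
        int remoteLookups = 0;

        void putRemote(String ident, String metadata) {
            remote.put(ident, metadata);
        }

        @Override
        public String loadTable(String ident) {
            String cached = cache.get(ident);
            if (cached != null) {
                return cached;
            }
            remoteLookups++; // counts simulated remote calls
            String metadata = remote.get(ident);
            if (metadata == null) {
                throw new IllegalArgumentException("No such table: " + ident);
            }
            cache.put(ident, metadata);
            return metadata;
        }

        @Override
        public void invalidateTable(String ident) {
            // Touches only local state; unknown or uncached identifiers are ignored.
            cache.remove(ident);
        }
    }

    public static void main(String[] args) {
        CachingCatalog catalog = new CachingCatalog();
        catalog.putRemote("db.t", "v1");
        System.out.println(catalog.loadTable("db.t")); // prints v1
        catalog.putRemote("db.t", "v2");
        System.out.println(catalog.loadTable("db.t")); // prints v1 (stale cache entry)
        catalog.invalidateTable("db.t");
        System.out.println(catalog.loadTable("db.t")); // prints v2
        catalog.invalidateTable("db.missing");         // does nothing
    }
}
```

The stale read in the middle of `main` is the kind of behavior a command can run into if it changes a table without invalidating the cached entry afterwards.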
[GitHub] [spark] beliefer opened a new pull request #30178: [SPARK-33278][SQL] Improve the performance for FIRST_VALUE
beliefer opened a new pull request #30178: URL: https://github.com/apache/spark/pull/30178

### What changes were proposed in this pull request?
https://github.com/apache/spark/pull/29800 provides a performance improvement for `NTH_VALUE`. `FIRST_VALUE` could also use the `UnboundedOffsetWindowFunctionFrame` and `UnboundedPrecedingOffsetWindowFunctionFrame`.

### Why are the changes needed?
Improve the performance of `FIRST_VALUE`.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Jenkins test.
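One way to see why an offset-based frame can help here: when the window frame starts at UNBOUNDED PRECEDING and nulls are not ignored, `FIRST_VALUE` is the first row of the ordered partition for every row, so it can be read once per partition. The Java sketch below illustrates only that observation; `firstValueUnboundedPreceding` is a hypothetical helper and is not how Spark's window frames are implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FirstValueSketch {
    /**
     * Hypothetical helper: FIRST_VALUE for each row of an already ordered partition,
     * assuming the frame starts at UNBOUNDED PRECEDING and nulls are not ignored.
     */
    static <T> List<T> firstValueUnboundedPreceding(List<T> orderedPartition) {
        List<T> out = new ArrayList<>(orderedPartition.size());
        if (orderedPartition.isEmpty()) {
            return out;
        }
        T first = orderedPartition.get(0); // read once, reused for every row
        for (int i = 0; i < orderedPartition.size(); i++) {
            out.add(first);
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [7, 7, 7]
        System.out.println(firstValueUnboundedPreceding(Arrays.asList(7, 3, 9)));
    }
}
```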
[GitHub] [spark] HeartSaVioR commented on pull request #30167: [SPARK-33240][SQL][3.0] Fail fast when fails to instantiate configured v2 session catalog
HeartSaVioR commented on pull request #30167: URL: https://github.com/apache/spark/pull/30167#issuecomment-718339453 The current behavior was already a surprise; otherwise I wouldn't insist so strongly on fixing it. Btw, I'd probably just revive the other thread, "disallow setting custom v2 session catalog until all statements are going through v2 session catalog", instead of this, as we are finding more and more issues from there.
[GitHub] [spark] SparkQA commented on pull request #30162: [SPARK-33263][SS] Configurable StateStore compression codec
SparkQA commented on pull request #30162: URL: https://github.com/apache/spark/pull/30162#issuecomment-718337953 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34995/
[GitHub] [spark] HyukjinKwon edited a comment on pull request #30167: [SPARK-33240][SQL][3.0] Fail fast when fails to instantiate configured v2 session catalog
HyukjinKwon edited a comment on pull request #30167: URL: https://github.com/apache/spark/pull/30167#issuecomment-718336795 _In a way_, a user _could_ think it's all internal as long as the tables are created somewhere and they can be read regardless of where the tables are created. And a user _could_ think "there's no functional difference". I agree ^ this isn't right, for all the reasons we discussed on the mailing list and here, but I would prefer to avoid such a potential surprise in a maintenance release.
[GitHub] [spark] HyukjinKwon commented on pull request #30167: [SPARK-33240][SQL][3.0] Fail fast when fails to instantiate configured v2 session catalog
HyukjinKwon commented on pull request #30167: URL: https://github.com/apache/spark/pull/30167#issuecomment-718336795 _In a way_, someone could think it's all internal as long as the tables are created somewhere and they can be read regardless of where the tables are created. And, someone _could_ think "there's no functional difference".
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30176: [SQL][MINOR] Update from_unixtime doc
AmplabJenkins removed a comment on pull request #30176: URL: https://github.com/apache/spark/pull/30176#issuecomment-718335189
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30156: [SPARK-33248][SQL] Add a configuration to control the legacy behavior of whether need to pad null value when value size less th
AmplabJenkins removed a comment on pull request #30156: URL: https://github.com/apache/spark/pull/30156#issuecomment-718335063
[GitHub] [spark] AmplabJenkins commented on pull request #30176: [SQL][MINOR] Update from_unixtime doc
AmplabJenkins commented on pull request #30176: URL: https://github.com/apache/spark/pull/30176#issuecomment-718335189
[GitHub] [spark] AmplabJenkins commented on pull request #30156: [SPARK-33248][SQL] Add a configuration to control the legacy behavior of whether need to pad null value when value size less then schem
AmplabJenkins commented on pull request #30156: URL: https://github.com/apache/spark/pull/30156#issuecomment-718335063
[GitHub] [spark] SparkQA commented on pull request #30156: [SPARK-33248][SQL] Add a configuration to control the legacy behavior of whether need to pad null value when value size less then schema size
SparkQA commented on pull request #30156: URL: https://github.com/apache/spark/pull/30156#issuecomment-718335052 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34994/
[GitHub] [spark] SparkQA removed a comment on pull request #30176: [SQL][MINOR] Update from_unixtime doc
SparkQA removed a comment on pull request #30176: URL: https://github.com/apache/spark/pull/30176#issuecomment-718252249 **[Test build #130384 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130384/testReport)** for PR 30176 at commit [`25efb4f`](https://github.com/apache/spark/commit/25efb4f80eb33447dd3dc9f93f4b0258ef0912dc).
[GitHub] [spark] SparkQA commented on pull request #30176: [SQL][MINOR] Update from_unixtime doc
SparkQA commented on pull request #30176: URL: https://github.com/apache/spark/pull/30176#issuecomment-718334511 **[Test build #130384 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130384/testReport)** for PR 30176 at commit [`25efb4f`](https://github.com/apache/spark/commit/25efb4f80eb33447dd3dc9f93f4b0258ef0912dc).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] [spark] HeartSaVioR commented on pull request #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore
HeartSaVioR commented on pull request #26935: URL: https://github.com/apache/spark/pull/26935#issuecomment-718334009 @viirya Would you like another pair of eyes on the review, or would a review from @gaborgsomogyi convince you?
[GitHub] [spark] HeartSaVioR edited a comment on pull request #30167: [SPARK-33240][SQL][3.0] Fail fast when fails to instantiate configured v2 session catalog
HeartSaVioR edited a comment on pull request #30167: URL: https://github.com/apache/spark/pull/30167#issuecomment-718330774
[GitHub] [spark] HeartSaVioR edited a comment on pull request #30167: [SPARK-33240][SQL][3.0] Fail fast when fails to instantiate configured v2 session catalog
HeartSaVioR edited a comment on pull request #30167: URL: https://github.com/apache/spark/pull/30167#issuecomment-718331516 Additionally, if the pattern is normal in the Spark codebase, I think we should revisit it: if users configure something (A) and Spark decides to fall back (B), that must happen only where there's no functional difference between A and B (e.g. whole-stage codegen fallback might be OK, as it should ideally differ only in performance). Otherwise Spark is silently breaking the user's intention.
[GitHub] [spark] HeartSaVioR edited a comment on pull request #30167: [SPARK-33240][SQL][3.0] Fail fast when fails to instantiate configured v2 session catalog
HeartSaVioR edited a comment on pull request #30167: URL: https://github.com/apache/spark/pull/30167#issuecomment-718331516 If the pattern is normal in the Spark codebase, I think we should revisit it: if users configure something (A) and Spark decides to fall back (B), that must happen only where there's no behavioral difference between A and B (e.g. whole-stage codegen fallback might be OK, as it should ideally differ only in performance). Otherwise Spark is silently breaking the user's intention.
[GitHub] [spark] HeartSaVioR commented on pull request #30167: [SPARK-33240][SQL][3.0] Fail fast when fails to instantiate configured v2 session catalog
HeartSaVioR commented on pull request #30167: URL: https://github.com/apache/spark/pull/30167#issuecomment-718331516 If the pattern is normal in the Spark codebase, I think we should revisit it: if users configure something (A) and Spark decides to fall back (B), that must happen only where there's no behavioral difference between A and B. Otherwise Spark is silently breaking the user's intention.
[GitHub] [spark] HeartSaVioR commented on pull request #30167: [SPARK-33240][SQL][3.0] Fail fast when fails to instantiate configured v2 session catalog
HeartSaVioR commented on pull request #30167: URL: https://github.com/apache/spark/pull/30167#issuecomment-718330774
> For example, for any reason, in a cluster the catalog was set and users used to use it with no problem. User don't want to change their existing workload and code but switch the catalog they are using. So they just reset the configuration from their cluster. After the fix, this case wouldn't work
"The catalog was set and users used to use it with no problem": we are talking about the case where the catalog was set but wasn't effectively working, as the default catalog (the fallback) isn't the same as the one they provided, right? I would describe this as "silently broken", which is "worse" than "not functioning". That is not a kind of "no problem". If you don't agree with the statement, let's not repeat our own opinions and instead hear the opinions of a broader group.
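The difference between silently falling back and failing fast, as argued in the #30167 comments above, can be illustrated with a minimal sketch. This is not Spark's code: `Catalog`, `BuiltInCatalog`, and `CatalogLoading` are hypothetical names made up for illustration, and the real session-catalog loading logic is assumed to be more involved.

```scala
import scala.util.control.NonFatal

// Hypothetical minimal catalog interface; this is NOT Spark's CatalogPlugin.
trait Catalog { def name: String }

// Hypothetical default catalog used as the fallback target.
final class BuiltInCatalog extends Catalog { val name = "built-in" }

object CatalogLoading {
  // Silent fallback: an instantiation failure is swallowed and the default
  // catalog is returned, so the user never learns that the configured
  // catalog (A) was replaced by something else (B).
  def loadWithFallback(configured: () => Catalog, default: Catalog): Catalog =
    try configured() catch { case NonFatal(_) => default }

  // Fail fast: the failure is surfaced, so a broken configuration is
  // visible immediately instead of changing behavior behind the scenes.
  def loadFailFast(configured: () => Catalog): Catalog =
    try configured() catch {
      case NonFatal(e) =>
        throw new IllegalStateException("Cannot instantiate the configured catalog", e)
    }
}
```

With a factory that throws, `loadWithFallback` hands back the built-in catalog without any signal, while `loadFailFast` raises an exception that carries the original cause.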
[GitHub] [spark] waitinfuture commented on pull request #30168: [SPARK-33208][SQL] Update the document of SparkSession#sql
waitinfuture commented on pull request #30168: URL: https://github.com/apache/spark/pull/30168#issuecomment-718329039
> @waitinfuture can you leave a comment in the JIRA so that I can assign the ticket to you?
done
[GitHub] [spark] HeartSaVioR commented on a change in pull request #30162: [SPARK-33263][SS] Configurable StateStore compression codec
HeartSaVioR commented on a change in pull request #30162: URL: https://github.com/apache/spark/pull/30162#discussion_r513903355
## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/StateStoreCompatibilitySuite.scala ##
@@ -0,0 +1,84 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.streaming.state
+
+import java.io.File
+
+import org.apache.commons.io.FileUtils
+
+import org.apache.spark.SparkFunSuite
+import org.apache.spark.io.CompressionCodec
+import org.apache.spark.sql.catalyst.plans.PlanTestBase
+import org.apache.spark.sql.catalyst.streaming.InternalOutputModes.Update
+import org.apache.spark.sql.execution.streaming.MemoryStream
+import org.apache.spark.sql.functions.count
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.streaming.StreamTest
+import org.apache.spark.util.Utils
+
+class StateStoreCompatibilitySuite extends StreamTest with StateStoreCodecsTest {
+  testWithAllCodec(

Review comment: I didn't know we apply the all-codec test here as well (I guess this case is different from the existing one, as the default value of the configuration isn't changed). Based on that, I feel it becomes a bit ambiguous where the better place to put StateStoreCodecsTest is. Any of these options would be fine: 1) here, 2) StateStoreSuite, 3) even a new file.
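The idea behind a helper such as `testWithAllCodec`, namely registering one test case per compression codec, can be sketched roughly as below. The actual body of `StateStoreCodecsTest` is not shown in this thread, so everything here is an assumption: `TinyTestRegistry` is a hypothetical stand-in for a test framework's registration hook, `CodecsTestSketch` is a made-up trait name, and the list of codec short names is chosen for illustration only.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for a test framework; it only records named test bodies.
trait TinyTestRegistry {
  val registered: ArrayBuffer[(String, () => Unit)] = ArrayBuffer.empty
  def test(name: String)(body: => Unit): Unit = registered += ((name, () => body))
}

// Hypothetical sketch of a mixin that expands one test into one test per codec.
trait CodecsTestSketch extends TinyTestRegistry {
  // Codec short names assumed for illustration.
  val codecsInShortName: Seq[String] = Seq("lz4", "lzf", "snappy", "zstd")

  def testWithAllCodec(name: String)(body: String => Unit): Unit =
    codecsInShortName.foreach { codec =>
      test(s"$name - with codec $codec")(body(codec))
    }
}

object CodecsTestDemo extends CodecsTestSketch {
  val seen: ArrayBuffer[String] = ArrayBuffer.empty

  // Registers four tests, one per codec; nothing runs until runAll() is called.
  testWithAllCodec("common functions") { codec => seen += codec }

  // Runs every registered test body in order and returns the codecs visited.
  def runAll(): Seq[String] = {
    registered.foreach { case (_, run) => run() }
    seen.toSeq
  }
}
```

Because the helper lives in a trait, any suite that mixes it in gets the per-codec expansion, which is why its placement (this suite, StateStoreSuite, or a separate file) is mostly a question of organization.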
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30162: [SPARK-33263][SS] Configurable StateStore compression codec
AmplabJenkins removed a comment on pull request #30162: URL: https://github.com/apache/spark/pull/30162#issuecomment-718327943 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/34993/ Test FAILed.
[GitHub] [spark] SparkQA commented on pull request #30156: [SPARK-33248][SQL] Add a configuration to control the legacy behavior of whether need to pad null value when value size less then schema size
SparkQA commented on pull request #30156: URL: https://github.com/apache/spark/pull/30156#issuecomment-718328603 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34994/
[GitHub] [spark] HeartSaVioR commented on a change in pull request #30162: [SPARK-33263][SS] Configurable StateStore compression codec
HeartSaVioR commented on a change in pull request #30162: URL: https://github.com/apache/spark/pull/30162#discussion_r513903355
## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/StateStoreCompatibilitySuite.scala ##
@@ -0,0 +1,84 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.streaming.state
+
+import java.io.File
+
+import org.apache.commons.io.FileUtils
+
+import org.apache.spark.SparkFunSuite
+import org.apache.spark.io.CompressionCodec
+import org.apache.spark.sql.catalyst.plans.PlanTestBase
+import org.apache.spark.sql.catalyst.streaming.InternalOutputModes.Update
+import org.apache.spark.sql.execution.streaming.MemoryStream
+import org.apache.spark.sql.functions.count
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.streaming.StreamTest
+import org.apache.spark.util.Utils
+
+class StateStoreCompatibilitySuite extends StreamTest with StateStoreCodecsTest {
+  testWithAllCodec(

Review comment: I didn't know we apply the all-codec test here as well. If so, I feel it becomes a bit ambiguous where the better place to put StateStoreCodecsTest is. Any of these options would be fine: 1) here, 2) StateStoreSuite, 3) even a new file.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30162: [SPARK-33263][SS] Configurable StateStore compression codec
AmplabJenkins removed a comment on pull request #30162: URL: https://github.com/apache/spark/pull/30162#issuecomment-718327938 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on pull request #30162: [SPARK-33263][SS] Configurable StateStore compression codec
SparkQA commented on pull request #30162: URL: https://github.com/apache/spark/pull/30162#issuecomment-718327928 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34993/
[GitHub] [spark] AmplabJenkins commented on pull request #30162: [SPARK-33263][SS] Configurable StateStore compression codec
AmplabJenkins commented on pull request #30162: URL: https://github.com/apache/spark/pull/30162#issuecomment-718327938
[GitHub] [spark] SparkQA commented on pull request #30162: [SPARK-33263][SS] Configurable StateStore compression codec
SparkQA commented on pull request #30162: URL: https://github.com/apache/spark/pull/30162#issuecomment-718326137 **[Test build #130392 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130392/testReport)** for PR 30162 at commit [`4dc153d`](https://github.com/apache/spark/commit/4dc153db5eeea904e0c2208726cbd41bd514b62f).
[GitHub] [spark] HyukjinKwon commented on a change in pull request #30177: [SPARK-33277][PYSPARK][SQL] Use ContextAwareIterator to stop consuming after the task ends.
HyukjinKwon commented on a change in pull request #30177: URL: https://github.com/apache/spark/pull/30177#discussion_r513898999
## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/python/EvalPythonExec.scala ##
@@ -137,3 +138,18 @@ trait EvalPythonExec extends UnaryExecNode {
     }
   }
 }
+
+/**
+ * A TaskContext aware iterator.
+ *
+ * As the Python evaluation consumes the parent iterator in a separate thread,
+ * it could consume more data from the parent even after the task ends and the parent is closed.
+ * Thus, we should use ContextAwareIterator to stop consuming after the task ends.
+ */
+class ContextAwareIterator[IN](iter: Iterator[IN], context: TaskContext) extends Iterator[IN] {
+
+  override def hasNext: Boolean =
+    !context.isCompleted() && !context.isInterrupted() && iter.hasNext

Review comment: BTW, one thing I would like to note is that this is not a clean fix. It is rather a band-aid, because consumption of the iterator is asynchronous with respect to the main task thread. So the close can happen at any point upstream, e.g. in the middle of `hasNext`, and it can still cause the same issue. To fix this completely, IMHO, we would have to synchronize completely, and then there would be no point in having a separate thread to process Python UDFs. I think the cause is basically similar to the `input_file_name` issue, which is due to the lack of synchronization between this thread and the main thread (see https://github.com/apache/spark/pull/24958#issuecomment-511364075). If there's a better option, that would be great, but I think this fix is good enough (given that I see a similar approach in `ContinuousQueuedDataReader`). Let me know if I missed something here.
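The behavior of the `ContextAwareIterator` in the diff above can be tried outside Spark with a small stand-in. This is a sketch under stated assumptions, not the PR's code: `TaskState` and `SimpleTaskState` are hypothetical replacements for the two `TaskContext` methods the diff calls, and `next()` simply delegating to the wrapped iterator is assumed, since the quoted hunk stops at `hasNext`.

```scala
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical stand-in for the two TaskContext methods used in the diff;
// this is NOT Spark's TaskContext.
trait TaskState {
  def isCompleted(): Boolean
  def isInterrupted(): Boolean
}

// Hypothetical thread-safe implementation whose flags can be flipped by another thread.
final class SimpleTaskState extends TaskState {
  private val completed = new AtomicBoolean(false)
  private val interrupted = new AtomicBoolean(false)
  def markCompleted(): Unit = completed.set(true)
  def markInterrupted(): Unit = interrupted.set(true)
  override def isCompleted(): Boolean = completed.get()
  override def isInterrupted(): Boolean = interrupted.get()
}

// Same shape as the class in the diff: report no further elements once the
// task has completed or been interrupted.
class ContextAwareIterator[IN](iter: Iterator[IN], context: TaskState) extends Iterator[IN] {
  override def hasNext: Boolean =
    !context.isCompleted() && !context.isInterrupted() && iter.hasNext

  // Assumed: plain delegation to the wrapped iterator.
  override def next(): IN = iter.next()
}

object ContextAwareIteratorDemo {
  // Consumes two elements, ends the task, then collects whatever is still visible.
  def run(): (List[Int], List[Int]) = {
    val state = new SimpleTaskState
    val it = new ContextAwareIterator(Iterator(1, 2, 3, 4), state)
    val before = List(it.next(), it.next())
    state.markCompleted()
    val after = it.toList
    (before, after)
  }
}
```

The sketch also shows why the review comment calls this a band-aid: the state check and the call into the wrapped iterator are two separate steps, so a task that ends between them can still let the consumer thread touch a closed parent.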
[GitHub] [spark] dongjoon-hyun edited a comment on pull request #30177: [SPARK-33277][PYSPARK][SQL] Use ContextAwareIterator to stop consuming after the task ends.
dongjoon-hyun edited a comment on pull request #30177: URL: https://github.com/apache/spark/pull/30177#issuecomment-718324895 Thanks. Could you update the affected version of SPARK-33277 accordingly?
[GitHub] [spark] viirya commented on a change in pull request #30162: [SPARK-33263][SS] Configurable StateStore compression codec
viirya commented on a change in pull request #30162: URL: https://github.com/apache/spark/pull/30162#discussion_r513897382
## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/StateStoreCompatibilitySuite.scala ##
@@ -0,0 +1,84 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.streaming.state
+
+import java.io.File
+
+import org.apache.commons.io.FileUtils
+
+import org.apache.spark.SparkFunSuite
+import org.apache.spark.io.CompressionCodec
+import org.apache.spark.sql.catalyst.plans.PlanTestBase
+import org.apache.spark.sql.catalyst.streaming.InternalOutputModes.Update
+import org.apache.spark.sql.execution.streaming.MemoryStream
+import org.apache.spark.sql.functions.count
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.streaming.StreamTest
+import org.apache.spark.util.Utils
+
+class StateStoreCompatibleSuite extends StreamTest with StateStoreCodecsTest {

Review comment: Oops!
[GitHub] [spark] dongjoon-hyun commented on pull request #30177: [SPARK-33277][PYSPARK][SQL] Use ContextAwareIterator to stop consuming after the task ends.
dongjoon-hyun commented on pull request #30177: URL: https://github.com/apache/spark/pull/30177#issuecomment-718324895 Thanks. Could you update SPARK-33277 accordingly?
[GitHub] [spark] HyukjinKwon commented on pull request #30177: [SPARK-33277][PYSPARK][SQL] Use ContextAwareIterator to stop consuming after the task ends.
HyukjinKwon commented on pull request #30177: URL: https://github.com/apache/spark/pull/30177#issuecomment-718322801 I believe this is a long-standing bug, even in the 2.4 releases.
[GitHub] [spark] dongjoon-hyun commented on pull request #30177: [SPARK-33277][PYSPARK][SQL] Use ContextAwareIterator to stop consuming after the task ends.
dongjoon-hyun commented on pull request #30177: URL: https://github.com/apache/spark/pull/30177#issuecomment-718321831 Is this only for Apache Spark 3.1 and 3.0, @ueshin and @HyukjinKwon ?
[GitHub] [spark] SparkQA commented on pull request #30162: [SPARK-33263][SS] Configurable StateStore compression codec
SparkQA commented on pull request #30162: URL: https://github.com/apache/spark/pull/30162#issuecomment-718321187 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34993/
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #30162: [SPARK-33263][SS] Configurable StateStore compression codec
dongjoon-hyun commented on a change in pull request #30162: URL: https://github.com/apache/spark/pull/30162#discussion_r513885144
## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/StateStoreCompatibilitySuite.scala ##
@@ -0,0 +1,84 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.streaming.state
+
+import java.io.File
+
+import org.apache.commons.io.FileUtils
+
+import org.apache.spark.SparkFunSuite
+import org.apache.spark.io.CompressionCodec
+import org.apache.spark.sql.catalyst.plans.PlanTestBase
+import org.apache.spark.sql.catalyst.streaming.InternalOutputModes.Update
+import org.apache.spark.sql.execution.streaming.MemoryStream
+import org.apache.spark.sql.functions.count
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.streaming.StreamTest
+import org.apache.spark.util.Utils
+
+class StateStoreCompatibleSuite extends StreamTest with StateStoreCodecsTest {

Review comment: `StateStoreCompatibleSuite` -> `StateStoreCompatibilitySuite`.
[GitHub] [spark] SparkQA commented on pull request #30156: [SPARK-33248][SQL] Add a configuration to control the legacy behavior of whether need to pad null value when value size less then schema size
SparkQA commented on pull request #30156: URL: https://github.com/apache/spark/pull/30156#issuecomment-718314992 **[Test build #130391 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130391/testReport)** for PR 30156 at commit [`3148608`](https://github.com/apache/spark/commit/314860863c592563d91fc98cc0a2a4ec06a5e837).
[GitHub] [spark] AngersZhuuuu commented on pull request #30156: [SPARK-33248][SQL] Add a configuration to control the legacy behavior of whether need to pad null value when value size less then schema
AngersZhuuuu commented on pull request #30156: URL: https://github.com/apache/spark/pull/30156#issuecomment-718314367
> Ah, sorry can you update the conflict in migration guide? @AngersZhuuuu
Done
[GitHub] [spark] viirya edited a comment on pull request #30162: [SPARK-33263][SS] Configurable StateStore compression codec
viirya edited a comment on pull request #30162: URL: https://github.com/apache/spark/pull/30162#issuecomment-718305957
> I'm sorry to try back and forth, but I feel it weird that StateStoreCodecsTest belongs to StateStoreCompatibilitySuite.scala file. Probably we can just inline StateStoreCodecsTest to StateStoreSuite.
Inlining `StateStoreCodecsTest` into `StateStoreSuite` means we need to extend `StateStoreSuite` in `StateStoreCompatibleSuite`. That would run the tests in `StateStoreSuite` a second time. `StateStoreCompatibleSuite` would also need to provide implementations for `newStoreProvider`, `newStoreProvider` and `getLatestData`, which are not used at all.
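The trade-off described in the comment above can be sketched in plain Scala. This is only an illustration under stated assumptions: `CodecsTest`, `BaseSuite`, the two `CompatSuite...` classes, the test names, and the codec list are hypothetical stand-ins, not the actual Spark classes from the PR, and no test framework is involved.

```scala
// Hypothetical stand-ins; these are NOT the real Spark test classes.
trait CodecsTest {
  // Shared helper that a suite can mix in without inheriting any tests.
  def codecs: Seq[String] = Seq("lz4", "snappy", "zstd")
}

abstract class BaseSuite {
  // Abstract hooks that every subclass is forced to implement.
  def newStoreProvider(): String
  def getLatestData(): Map[String, Int]
  // Tests registered by the base suite; every subclass runs them again.
  def testNames: Seq[String] = Seq("get/put", "commit", "abort")
}

// Option A: mix in only the trait. No inherited tests, no unused hooks.
class CompatSuiteViaTrait extends CodecsTest {
  def testNames: Seq[String] = codecs.map(c => s"compat with $c")
}

// Option B: extend the base suite. Its tests are inherited (and re-run),
// and the abstract hooks must be stubbed even though nothing uses them.
class CompatSuiteViaInheritance extends BaseSuite with CodecsTest {
  override def newStoreProvider(): String = "unused"
  override def getLatestData(): Map[String, Int] = Map.empty
  override def testNames: Seq[String] =
    super.testNames ++ codecs.map(c => s"compat with $c")
}

object Demo extends App {
  // Option A runs only its own tests; Option B also repeats the base tests.
  println(new CompatSuiteViaTrait().testNames)
  println(new CompatSuiteViaInheritance().testNames)
}
```

In this sketch the trait-based suite lists only its own three tests, while the inheriting suite lists the three base tests plus its own and carries two stub methods, which mirrors the duplication the comment objects to.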
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30177: [SPARK-33277][PYSPARK][SQL] Use ContextAwareIterator to stop consuming after the task ends.
AmplabJenkins removed a comment on pull request #30177: URL: https://github.com/apache/spark/pull/30177#issuecomment-718312288
[GitHub] [spark] AmplabJenkins commented on pull request #30177: [SPARK-33277][PYSPARK][SQL] Use ContextAwareIterator to stop consuming after the task ends.
AmplabJenkins commented on pull request #30177: URL: https://github.com/apache/spark/pull/30177#issuecomment-718312288
[GitHub] [spark] SparkQA commented on pull request #30177: [SPARK-33277][PYSPARK][SQL] Use ContextAwareIterator to stop consuming after the task ends.
SparkQA commented on pull request #30177: URL: https://github.com/apache/spark/pull/30177#issuecomment-718312272 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34992/
[GitHub] [spark] SparkQA commented on pull request #30162: [SPARK-33263][SS] Configurable StateStore compression codec
SparkQA commented on pull request #30162: URL: https://github.com/apache/spark/pull/30162#issuecomment-718308057 **[Test build #130390 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130390/testReport)** for PR 30162 at commit [`8e435a1`](https://github.com/apache/spark/commit/8e435a1a7266b13e9cfe0072c5a808f10c41ca6b).
[GitHub] [spark] viirya commented on pull request #30162: [SPARK-33263][SS] Configurable StateStore compression codec
viirya commented on pull request #30162: URL: https://github.com/apache/spark/pull/30162#issuecomment-718305957
> I'm sorry to try back and forth, but I feel it weird that StateStoreCodecsTest belongs to StateStoreCompatibilitySuite.scala file. Probably we can just inline StateStoreCodecsTest to StateStoreSuite.
Inlining `StateStoreCodecsTest` into `StateStoreSuite` means we need to extend `StateStoreSuite` in `StateStoreCompatibleSuite`. That would run the tests in `StateStoreSuite` a second time. `StateStoreCompatibleSuite` would also need to provide implementations for `newStoreProvider`, `newStoreProvider` and `getLatestData`.
[GitHub] [spark] HyukjinKwon commented on pull request #30177: [SPARK-33277][PYSPARK][SQL] Use ContextAwareIterator to stop consuming after the task ends.
HyukjinKwon commented on pull request #30177: URL: https://github.com/apache/spark/pull/30177#issuecomment-718305873 cc @BryanCutler for another look if you find some time.
[GitHub] [spark] SparkQA commented on pull request #30177: [SPARK-33277][PYSPARK][SQL] Use ContextAwareIterator to stop consuming after the task ends.
SparkQA commented on pull request #30177: URL: https://github.com/apache/spark/pull/30177#issuecomment-718305551 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34992/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30162: [SPARK-33263][SS] Configurable StateStore compression codec
AmplabJenkins removed a comment on pull request #30162: URL: https://github.com/apache/spark/pull/30162#issuecomment-718305004 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/130385/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30162: [SPARK-33263][SS] Configurable StateStore compression codec
AmplabJenkins removed a comment on pull request #30162: URL: https://github.com/apache/spark/pull/30162#issuecomment-718304998 Merged build finished. Test FAILed.
[GitHub] [spark] HyukjinKwon commented on pull request #30156: [SPARK-33248][SQL] Add a configuration to control the legacy behavior of whether need to pad null value when value size less then schema
HyukjinKwon commented on pull request #30156: URL: https://github.com/apache/spark/pull/30156#issuecomment-718305127 Ah, sorry can you update the conflict in migration guide? @AngersZhuuuu
[GitHub] [spark] SparkQA removed a comment on pull request #30162: [SPARK-33263][SS] Configurable StateStore compression codec
SparkQA removed a comment on pull request #30162: URL: https://github.com/apache/spark/pull/30162#issuecomment-718260070 **[Test build #130385 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130385/testReport)** for PR 30162 at commit [`f16c563`](https://github.com/apache/spark/commit/f16c563cddc24d9df596a7a2b690457b514c8464).
[GitHub] [spark] AmplabJenkins commented on pull request #30162: [SPARK-33263][SS] Configurable StateStore compression codec
AmplabJenkins commented on pull request #30162: URL: https://github.com/apache/spark/pull/30162#issuecomment-718304998
[GitHub] [spark] SparkQA commented on pull request #30162: [SPARK-33263][SS] Configurable StateStore compression codec
SparkQA commented on pull request #30162: URL: https://github.com/apache/spark/pull/30162#issuecomment-718304768 **[Test build #130385 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130385/testReport)** for PR 30162 at commit [`f16c563`](https://github.com/apache/spark/commit/f16c563cddc24d9df596a7a2b690457b514c8464).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `class StateStoreCompatibleSuite extends StreamTest with StateStoreCodecsTest `
  * `trait StateStoreCodecsTest extends SparkFunSuite with PlanTestBase `
[GitHub] [spark] HyukjinKwon closed pull request #30172: [SPARK-33270][SQL] Return SQL schema instead of Catalog string from the `SchemaOfJson` expression
HyukjinKwon closed pull request #30172: URL: https://github.com/apache/spark/pull/30172
[GitHub] [spark] HyukjinKwon closed pull request #30176: [SQL][MINOR] Update from_unixtime doc
HyukjinKwon closed pull request #30176: URL: https://github.com/apache/spark/pull/30176
[GitHub] [spark] HyukjinKwon commented on pull request #30172: [SPARK-33270][SQL] Return SQL schema instead of Catalog string from the `SchemaOfJson` expression
HyukjinKwon commented on pull request #30172: URL: https://github.com/apache/spark/pull/30172#issuecomment-718302297 Merged to master.
[GitHub] [spark] HyukjinKwon commented on a change in pull request #30172: [SPARK-33270][SQL] Return SQL schema instead of Catalog string from the `SchemaOfJson` expression
HyukjinKwon commented on a change in pull request #30172: URL: https://github.com/apache/spark/pull/30172#discussion_r513859872
## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala ##
@@ -801,7 +801,7 @@ case class SchemaOfJson(
 }
 }
-UTF8String.fromString(dt.catalogString)
+UTF8String.fromString(dt.sql)
Review comment: Okay .. NVM ..
[GitHub] [spark] SparkQA commented on pull request #30177: [SPARK-33277][PYSPARK][SQL] Use ContextAwareIterator to stop consuming after the task ends.
SparkQA commented on pull request #30177: URL: https://github.com/apache/spark/pull/30177#issuecomment-718301803 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34991/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30177: [SPARK-33277][PYSPARK][SQL] Use ContextAwareIterator to stop consuming after the task ends.
AmplabJenkins removed a comment on pull request #30177: URL: https://github.com/apache/spark/pull/30177#issuecomment-718301816
[GitHub] [spark] AmplabJenkins commented on pull request #30177: [SPARK-33277][PYSPARK][SQL] Use ContextAwareIterator to stop consuming after the task ends.
AmplabJenkins commented on pull request #30177: URL: https://github.com/apache/spark/pull/30177#issuecomment-718301816
[GitHub] [spark] HyukjinKwon commented on pull request #30176: [SQL][MINOR] Update from_unixtime doc
HyukjinKwon commented on pull request #30176: URL: https://github.com/apache/spark/pull/30176#issuecomment-718301543 Merged to master and branch-3.0.
[GitHub] [spark] HyukjinKwon commented on pull request #30174: [SPARK-33271] load HADOOP_HOME and SPARK_DIST_CLASSPATH in class path
HyukjinKwon commented on pull request #30174: URL: https://github.com/apache/spark/pull/30174#issuecomment-718300403 @BigaDev, looks like this is a backport of SPARK-29574. You'll have to pick the commits and PR description as are, and keep the JIRA IDs in the PR title.
[GitHub] [spark] HyukjinKwon edited a comment on pull request #30174: [SPARK-33271] load HADOOP_HOME and SPARK_DIST_CLASSPATH in class path
HyukjinKwon edited a comment on pull request #30174: URL: https://github.com/apache/spark/pull/30174#issuecomment-718300403 @BigaDev, looks like this is a backport of SPARK-29574. You'll have to pick the commits and keep the PR description as are, and keep the JIRA IDs in the PR title.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30162: [SPARK-33263][SS] Configurable StateStore compression codec
AmplabJenkins removed a comment on pull request #30162: URL: https://github.com/apache/spark/pull/30162#issuecomment-718299797
[GitHub] [spark] AmplabJenkins commented on pull request #30162: [SPARK-33263][SS] Configurable StateStore compression codec
AmplabJenkins commented on pull request #30162: URL: https://github.com/apache/spark/pull/30162#issuecomment-718299797
[GitHub] [spark] SparkQA commented on pull request #30162: [SPARK-33263][SS] Configurable StateStore compression codec
SparkQA commented on pull request #30162: URL: https://github.com/apache/spark/pull/30162#issuecomment-718299791 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34990/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30162: [SPARK-33263][SS] Configurable StateStore compression codec
AmplabJenkins removed a comment on pull request #30162: URL: https://github.com/apache/spark/pull/30162#issuecomment-718296486
[GitHub] [spark] AmplabJenkins commented on pull request #30162: [SPARK-33263][SS] Configurable StateStore compression codec
AmplabJenkins commented on pull request #30162: URL: https://github.com/apache/spark/pull/30162#issuecomment-718296486
[GitHub] [spark] SparkQA removed a comment on pull request #30162: [SPARK-33263][SS] Configurable StateStore compression codec
SparkQA removed a comment on pull request #30162: URL: https://github.com/apache/spark/pull/30162#issuecomment-718191772 **[Test build #130383 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130383/testReport)** for PR 30162 at commit [`92b0bee`](https://github.com/apache/spark/commit/92b0beefcce229acaa93e3d76b2b7ffa52ae0369).
[GitHub] [spark] SparkQA commented on pull request #30162: [SPARK-33263][SS] Configurable StateStore compression codec
SparkQA commented on pull request #30162: URL: https://github.com/apache/spark/pull/30162#issuecomment-718295797 **[Test build #130383 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130383/testReport)** for PR 30162 at commit [`92b0bee`](https://github.com/apache/spark/commit/92b0beefcce229acaa93e3d76b2b7ffa52ae0369).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.