[GitHub] [spark] SparkQA commented on pull request #34070: [SPARK-36840][SQL] Support DPP if there is no selective predicate on the filtering side
SparkQA commented on pull request #34070: URL: https://github.com/apache/spark/pull/34070#issuecomment-976236122 **[Test build #145541 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145541/testReport)** for PR 34070 at commit [`30deb9d`](https://github.com/apache/spark/commit/30deb9d84e56869af0d55b0da6c933462e3e0785). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans
SparkQA commented on pull request #34611: URL: https://github.com/apache/spark/pull/34611#issuecomment-976235724 **[Test build #145540 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145540/testReport)** for PR 34611 at commit [`af97fb3`](https://github.com/apache/spark/commit/af97fb3a629d07628105868d73ae3ba9d8e6dc90).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34677: [SPARK-37436][PYTHON] Uses Python's standard string formatter for SQL API in pandas API on Spark
AmplabJenkins removed a comment on pull request #34677: URL: https://github.com/apache/spark/pull/34677#issuecomment-976233797 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50007/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34687: [SPARK-36231][PYTHON] Support arithmetic operations of decimal(nan) series
AmplabJenkins removed a comment on pull request #34687: URL: https://github.com/apache/spark/pull/34687#issuecomment-976233532 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145537/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34686: [SPARK-37444][SQL] ALTER NAMESPACE ... SET LOCATION should handle empty location consistently across v1 and v2 command
AmplabJenkins removed a comment on pull request #34686: URL: https://github.com/apache/spark/pull/34686#issuecomment-976233533 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145525/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34688: [WIP][SPARK-32079][PYTHON] Remove namedtuple hack by replace built-in pickle to cloudpickle
AmplabJenkins removed a comment on pull request #34688: URL: https://github.com/apache/spark/pull/34688#issuecomment-976233534
[GitHub] [spark] SparkQA commented on pull request #34677: [SPARK-37436][PYTHON] Uses Python's standard string formatter for SQL API in pandas API on Spark
SparkQA commented on pull request #34677: URL: https://github.com/apache/spark/pull/34677#issuecomment-976233773 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50007/
[GitHub] [spark] AmplabJenkins commented on pull request #34677: [SPARK-37436][PYTHON] Uses Python's standard string formatter for SQL API in pandas API on Spark
AmplabJenkins commented on pull request #34677: URL: https://github.com/apache/spark/pull/34677#issuecomment-976233797 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50007/
[GitHub] [spark] AmplabJenkins commented on pull request #34687: [SPARK-36231][PYTHON] Support arithmetic operations of decimal(nan) series
AmplabJenkins commented on pull request #34687: URL: https://github.com/apache/spark/pull/34687#issuecomment-976233532 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145537/
[GitHub] [spark] AmplabJenkins commented on pull request #34686: [SPARK-37444][SQL] ALTER NAMESPACE ... SET LOCATION should handle empty location consistently across v1 and v2 command
AmplabJenkins commented on pull request #34686: URL: https://github.com/apache/spark/pull/34686#issuecomment-976233533 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145525/
[GitHub] [spark] AmplabJenkins commented on pull request #34688: [WIP][SPARK-32079][PYTHON] Remove namedtuple hack by replace built-in pickle to cloudpickle
AmplabJenkins commented on pull request #34688: URL: https://github.com/apache/spark/pull/34688#issuecomment-976233534
[GitHub] [spark] SparkQA removed a comment on pull request #34686: [SPARK-37444][SQL] ALTER NAMESPACE ... SET LOCATION should handle empty location consistently across v1 and v2 command
SparkQA removed a comment on pull request #34686: URL: https://github.com/apache/spark/pull/34686#issuecomment-976116428 **[Test build #145525 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145525/testReport)** for PR 34686 at commit [`28ce116`](https://github.com/apache/spark/commit/28ce116e2cdd35333aae6f58ed579d18d1989597).
[GitHub] [spark] SparkQA commented on pull request #34686: [SPARK-37444][SQL] ALTER NAMESPACE ... SET LOCATION should handle empty location consistently across v1 and v2 command
SparkQA commented on pull request #34686: URL: https://github.com/apache/spark/pull/34686#issuecomment-976230590 **[Test build #145525 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145525/testReport)** for PR 34686 at commit [`28ce116`](https://github.com/apache/spark/commit/28ce116e2cdd35333aae6f58ed579d18d1989597).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `case class AlterDatabaseSetLocationCommand(databaseName: String, location: URI)`
[GitHub] [spark] SparkQA commented on pull request #34688: [WIP][SPARK-32079][PYTHON] Remove namedtuple hack by replace built-in pickle to cloudpickle
SparkQA commented on pull request #34688: URL: https://github.com/apache/spark/pull/34688#issuecomment-976228942 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50009/
[GitHub] [spark] SparkQA commented on pull request #34687: [SPARK-36231][PYTHON] Support arithmetic operations of decimal(nan) series
SparkQA commented on pull request #34687: URL: https://github.com/apache/spark/pull/34687#issuecomment-976228510 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50010/
[GitHub] [spark] yangwwei commented on pull request #34672: [SPARK-37394][CORE] Skip registering with ESS if a customized shuffle manager is configured
yangwwei commented on pull request #34672: URL: https://github.com/apache/spark/pull/34672#issuecomment-976225163

Thank you @HyukjinKwon, @attilapiros, @tgravescs

> What about extending ShuffleManager trait with a new method indicating whether this shuffle manager implementation works with the external shuffle manager or not. It can have a default implementation giving back true and only needed to be overridden when the external shuffle manager is not supported.

I really like this idea, thank you @attilapiros. How about adding a new method, `supportExternalShuffleService()`? This method gives each shuffle manager implementation a way to tell whether the external shuffle service is needed for that shuffle manager to work. By default it returns true, and the block manager then registers with the external shuffle server; otherwise, that registration can be skipped.

> So my first reaction is: you have a 3rd party shuffle manager that is an external shuffle service because it supports dynamic allocation, then why is it failing... is it because you didn't override something, or because you couldn't override something? In this case it's creating a ExternalBlockStoreClient, which I think isn't setup to be overridden. I think it comes down to we just haven't really added support to allow this.

We actually found this issue while using [Uber's Remote Shuffle Service](https://github.com/uber/RemoteShuffleService) with dynamic allocation (DA) enabled. It is caused by [this part of code](https://github.com/apache/spark/blob/5d3a6573a56f9c00ccc513c8131c037de7d29000/core/src/main/scala/org/apache/spark/storage/BlockManager.scala#L532-L534), which is hardcoded to register with the external shuffle service even when a 3rd party shuffle service is used. We will need a more general way to handle this. Please let me know your thoughts, thanks!
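The proposal in the comment above can be sketched roughly as follows. This is a minimal illustration in plain Java, not Spark's actual code: `ShuffleManagerSketch`, `SortShuffleManagerSketch`, `RemoteShuffleManagerSketch`, and `BlockManagerSketch.shouldRegister` are hypothetical stand-ins, and only the method name `supportExternalShuffleService()` and its default of true come from the discussion.

```java
// Hypothetical stand-in for Spark's ShuffleManager trait (not the real API).
interface ShuffleManagerSketch {
    // Proposed hook: defaults to true so existing implementations keep
    // registering with the external shuffle service without any change.
    default boolean supportExternalShuffleService() {
        return true;
    }
}

// A built-in style manager that relies on the default (true).
final class SortShuffleManagerSketch implements ShuffleManagerSketch { }

// A 3rd party manager that brings its own shuffle service and opts out.
final class RemoteShuffleManagerSketch implements ShuffleManagerSketch {
    @Override
    public boolean supportExternalShuffleService() {
        return false;
    }
}

// Hypothetical stand-in for the registration decision in the block manager.
final class BlockManagerSketch {
    // Assumption: registration is currently driven by an "enabled" flag alone;
    // the sketch adds the shuffle manager's answer as a second condition.
    static boolean shouldRegister(boolean externalShuffleServiceEnabled,
                                  ShuffleManagerSketch manager) {
        return externalShuffleServiceEnabled && manager.supportExternalShuffleService();
    }

    public static void main(String[] args) {
        System.out.println(shouldRegister(true, new SortShuffleManagerSketch()));   // true
        System.out.println(shouldRegister(true, new RemoteShuffleManagerSketch())); // false
    }
}
```

A default method keeps the change backward compatible: only shuffle managers that do not work with the external shuffle service need to override it.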
[GitHub] [spark] SparkQA commented on pull request #34688: [WIP][SPARK-32079][PYTHON] Remove namedtuple hack by replace built-in pickle to cloudpickle
SparkQA commented on pull request #34688: URL: https://github.com/apache/spark/pull/34688#issuecomment-976224175 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50006/
[GitHub] [spark] SparkQA removed a comment on pull request #34687: [SPARK-36231][PYTHON] Support arithmetic operations of decimal(nan) series
SparkQA removed a comment on pull request #34687: URL: https://github.com/apache/spark/pull/34687#issuecomment-976207876 **[Test build #145537 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145537/testReport)** for PR 34687 at commit [`a643b5e`](https://github.com/apache/spark/commit/a643b5eff512a397723732c07e51a56aa5044f30).
[GitHub] [spark] SparkQA commented on pull request #34687: [SPARK-36231][PYTHON] Support arithmetic operations of decimal(nan) series
SparkQA commented on pull request #34687: URL: https://github.com/apache/spark/pull/34687#issuecomment-976221908 **[Test build #145537 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145537/testReport)** for PR 34687 at commit [`a643b5e`](https://github.com/apache/spark/commit/a643b5eff512a397723732c07e51a56aa5044f30).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on pull request #34688: [WIP][SPARK-32079][PYTHON] Remove namedtuple hack by replace built-in pickle to cloudpickle
SparkQA removed a comment on pull request #34688: URL: https://github.com/apache/spark/pull/34688#issuecomment-976207820 **[Test build #145536 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145536/testReport)** for PR 34688 at commit [`8962685`](https://github.com/apache/spark/commit/8962685238818b506aed70baa8d9336f7c8cc472).
[GitHub] [spark] kazuyukitanimura commented on a change in pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans
kazuyukitanimura commented on a change in pull request #34611: URL: https://github.com/apache/spark/pull/34611#discussion_r754852154

## File path: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedPlainValuesReader.java

@@ -53,19 +53,47 @@ public void skip() {
     throw new UnsupportedOperationException();
   }
 
+  private void updateCurrentByte() {
+    try {
+      currentByte = (byte) in.read();
+    } catch (IOException e) {
+      throw new ParquetDecodingException("Failed to read a byte", e);
+    }
+  }
+
   @Override
   public final void readBooleans(int total, WritableColumnVector c, int rowId) {
-    // TODO: properly vectorize this
-    for (int i = 0; i < total; i++) {
-      c.putBoolean(rowId + i, readBoolean());
+    int i = 0;
+    if (bitOffset > 0) {
+      i = Math.min(8 - bitOffset, total);

Review comment:

The code works for `total = 9` and `bitOffset = 0`.

```
public final void readBooleans(...
  if (bitOffset > 0) { // it will not enter here
  ...
  for (; i + 7 < total; i += 8) {
    updateCurrentByte(); // the whole one byte (8bits) is read here
  ...
  if (i < total) {
    updateCurrentByte(); // the last 1 bit is read here
```

There is already a similar test for `total=8`, `bitOffset=1`, but I added one for `total = 9` and `bitOffset = 0` too.
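The `total = 9`, `bitOffset = 0` walk-through above can be reproduced with a small stand-alone sketch. It mirrors the loop structure of `readBooleans` from the diff under review, but it is not the Spark class: `BooleanReaderSketch`, its `boolean[]` output, the private `putBooleans` helper, and the `bytesRead` counter are hypothetical stand-ins for `WritableColumnVector` and the real reader state. It assumes Parquet's PLAIN boolean layout, where bits are packed least-significant bit first.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stand-in for the reader under review; not Spark's class.
final class BooleanReaderSketch {
    private final InputStream in;
    private byte currentByte = 0;
    int bitOffset = 0;   // bits of currentByte already consumed
    int bytesRead = 0;   // bookkeeping for the example only

    BooleanReaderSketch(byte[] data) {
        this.in = new ByteArrayInputStream(data);
    }

    private void updateCurrentByte() {
        try {
            currentByte = (byte) in.read();
            bytesRead++;
        } catch (IOException e) {
            throw new IllegalStateException("Failed to read a byte", e);
        }
    }

    // Stand-in for WritableColumnVector.putBooleans: unpack `count` bits of
    // `src`, starting at bit `srcOffset`, least-significant bit first.
    private static void putBooleans(boolean[] out, int rowId, int count, byte src, int srcOffset) {
        for (int k = 0; k < count; k++) {
            out[rowId + k] = ((src >>> (srcOffset + k)) & 1) == 1;
        }
    }

    void readBooleans(int total, boolean[] out, int rowId) {
        int i = 0;
        if (bitOffset > 0) {                    // finish a partially consumed byte
            i = Math.min(8 - bitOffset, total);
            putBooleans(out, rowId, i, currentByte, bitOffset);
            bitOffset = (bitOffset + i) & 7;
        }
        for (; i + 7 < total; i += 8) {         // whole bytes, 8 values at a time
            updateCurrentByte();
            putBooleans(out, rowId + i, 8, currentByte, 0);
        }
        if (i < total) {                        // trailing bits from one more byte
            updateCurrentByte();
            bitOffset = total - i;
            putBooleans(out, rowId + i, bitOffset, currentByte, 0);
        }
    }

    public static void main(String[] args) {
        BooleanReaderSketch r = new BooleanReaderSketch(new byte[] {(byte) 0xAA, 0x01});
        boolean[] out = new boolean[9];
        r.readBooleans(9, out, 0);
        System.out.println(r.bytesRead + " bytes read, bitOffset=" + r.bitOffset);
    }
}
```

With two input bytes, the first branch is skipped, the loop consumes one full byte, and the final branch reads the second byte for the ninth value, leaving `bitOffset` at 1.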
[GitHub] [spark] SparkQA commented on pull request #34688: [WIP][SPARK-32079][PYTHON] Remove namedtuple hack by replace built-in pickle to cloudpickle
SparkQA commented on pull request #34688: URL: https://github.com/apache/spark/pull/34688#issuecomment-976220161 **[Test build #145536 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145536/testReport)** for PR 34688 at commit [`8962685`](https://github.com/apache/spark/commit/8962685238818b506aed70baa8d9336f7c8cc472).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `class CloudPickleSerializer(FramedSerializer):`
[GitHub] [spark] SparkQA commented on pull request #34688: [WIP][SPARK-32079][PYTHON] Remove namedtuple hack by replace built-in pickle to cloudpickle
SparkQA commented on pull request #34688: URL: https://github.com/apache/spark/pull/34688#issuecomment-976219716 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50005/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34681: [SPARK-37438][SQL] ANSI mode: Use store assignment rules for resolving function invocation
AmplabJenkins removed a comment on pull request #34681: URL: https://github.com/apache/spark/pull/34681#issuecomment-976212876 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145528/
[GitHub] [spark] AmplabJenkins commented on pull request #34681: [SPARK-37438][SQL] ANSI mode: Use store assignment rules for resolving function invocation
AmplabJenkins commented on pull request #34681: URL: https://github.com/apache/spark/pull/34681#issuecomment-976212876 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145528/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34647: [SPARK-36180][SQL] Support TimestampNTZ type in Hive
AmplabJenkins removed a comment on pull request #34647: URL: https://github.com/apache/spark/pull/34647#issuecomment-976211655 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145524/
[GitHub] [spark] SparkQA removed a comment on pull request #34681: [SPARK-37438][SQL] ANSI mode: Use store assignment rules for resolving function invocation
SparkQA removed a comment on pull request #34681: URL: https://github.com/apache/spark/pull/34681#issuecomment-976120474 **[Test build #145528 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145528/testReport)** for PR 34681 at commit [`b7d383e`](https://github.com/apache/spark/commit/b7d383e81ef067477aa11c9d4df40ccb0e8c04e4).
[GitHub] [spark] SparkQA commented on pull request #34681: [SPARK-37438][SQL] ANSI mode: Use store assignment rules for resolving function invocation
SparkQA commented on pull request #34681: URL: https://github.com/apache/spark/pull/34681#issuecomment-976212426 **[Test build #145528 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145528/testReport)** for PR 34681 at commit [`b7d383e`](https://github.com/apache/spark/commit/b7d383e81ef067477aa11c9d4df40ccb0e8c04e4).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] [spark] kazuyukitanimura commented on a change in pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans
kazuyukitanimura commented on a change in pull request #34611: URL: https://github.com/apache/spark/pull/34611#discussion_r754843844

## File path: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedPlainValuesReader.java

@@ -53,19 +53,50 @@ public void skip() {
     throw new UnsupportedOperationException();
   }
 
+  private void updateCurrentByte() {
+    try {
+      currentByte = (byte) in.read();
+    } catch (IOException e) {
+      throw new ParquetDecodingException("Failed to read a byte", e);
+    }
+  }
+
   @Override
   public final void readBooleans(int total, WritableColumnVector c, int rowId) {
-    // TODO: properly vectorize this
-    for (int i = 0; i < total; i++) {
-      c.putBoolean(rowId + i, readBoolean());
+    int i = 0;
+    if (bitOffset > 0) {
+      i = Math.min(8 - bitOffset, total);
+      c.putBooleans(rowId, i, currentByte, bitOffset);
+      bitOffset = (bitOffset + i) & 7;
+    }
+    for (; i + 7 < total; i += 8) {
+      updateCurrentByte();
+      c.putBooleans(rowId + i, currentByte);
+    }
+    if (i < total) {
+      updateCurrentByte();
+      bitOffset = total - i;
+      c.putBooleans(rowId + i, bitOffset, currentByte, 0);
     }
   }
 
   @Override
   public final void skipBooleans(int total) {
-    // TODO: properly vectorize this
-    for (int i = 0; i < total; i++) {
-      readBoolean();
+    // Using >>3 instead of /8 below. The difference is important when (total-(8-bitOffset))<0.
+    // E.g. (-1)>>3=(-1) vs. (-1)/8=0. The latter incorrectly enters the if(numBytesToSkip>=0){.

Review comment:

Let's say `total=8`, `bitOffset=1`; then there are `(8-bitOffset)=7` bits to skip in the `currentByte`. There is still 1 more bit to skip, as `total=8`. So `updateCurrentByte()` needs to be called to update the `currentByte`, and `bitOffset` will again be `1`. In the future, the remaining 7 bits may be read from the updated `currentByte`. For that reason, we need to go into the if statement when `numBytesToSkip = (8 - (8 - 1)) >> 3 = 1>>3 = 0`.

The following is a few lines longer but an equivalent condition.

```
if (numBytesToSkip > 0) {
  try {
    in.skipFully(numBytesToSkip);
  } catch (IOException e) {...}
}
if (numBytesToSkip >= 0 && bitOffset > 0) {
  updateCurrentByte();
}
```

The scenario is tested at https://github.com/apache/spark/pull/34611/files#diff-b84cbbb2eadfa9d267b9ab8be2e6be579f28c1813623785c9e667a864f7960e1R194
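The `>>3` versus `/8` distinction discussed above comes down to rounding direction for negative operands in Java: an arithmetic right shift rounds toward negative infinity, while integer division rounds toward zero. A small sketch of just that arithmetic follows; `SkipArithmeticSketch` and its two helper methods are hypothetical names, and the expression `total - (8 - bitOffset)` is taken from the code comment in the diff, assuming `bitOffset > 0` as in the example.

```java
// Hypothetical helper illustrating the arithmetic only; not Spark's code.
final class SkipArithmeticSketch {
    // Bits remaining after the current byte is exhausted, converted to whole
    // bytes with an arithmetic shift (rounds toward negative infinity).
    static int numBytesToSkipShift(int total, int bitOffset) {
        return (total - (8 - bitOffset)) >> 3;
    }

    // Same quantity with integer division (rounds toward zero).
    static int numBytesToSkipDiv(int total, int bitOffset) {
        return (total - (8 - bitOffset)) / 8;
    }

    public static void main(String[] args) {
        // total=8, bitOffset=1: one bit spills into the next byte.
        System.out.println(numBytesToSkipShift(8, 1)); // 0
        // total=6, bitOffset=1: the skip ends inside the current byte.
        System.out.println(numBytesToSkipShift(6, 1)); // -1
        System.out.println(numBytesToSkipDiv(6, 1));   // 0
    }
}
```

With the shift, a skip that ends inside the current byte yields `-1` and fails a `numBytesToSkip >= 0` check, whereas division yields `0` and would be indistinguishable from the `total=8`, `bitOffset=1` case that does need a new byte.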
[GitHub] [spark] AmplabJenkins commented on pull request #34647: [SPARK-36180][SQL] Support TimestampNTZ type in Hive
AmplabJenkins commented on pull request #34647: URL: https://github.com/apache/spark/pull/34647#issuecomment-976211655 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145524/
[GitHub] [spark] SparkQA removed a comment on pull request #34647: [SPARK-36180][SQL] Support TimestampNTZ type in Hive
SparkQA removed a comment on pull request #34647: URL: https://github.com/apache/spark/pull/34647#issuecomment-976098359 **[Test build #145524 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145524/testReport)** for PR 34647 at commit [`99b603f`](https://github.com/apache/spark/commit/99b603f7a5b4a52b44e9ebde94c8b3e526e866c2).
[GitHub] [spark] SparkQA commented on pull request #34647: [SPARK-36180][SQL] Support TimestampNTZ type in Hive
SparkQA commented on pull request #34647: URL: https://github.com/apache/spark/pull/34647#issuecomment-976210466 **[Test build #145524 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145524/testReport)** for PR 34647 at commit [`99b603f`](https://github.com/apache/spark/commit/99b603f7a5b4a52b44e9ebde94c8b3e526e866c2).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans
SparkQA commented on pull request #34611: URL: https://github.com/apache/spark/pull/34611#issuecomment-976209809 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50008/
[GitHub] [spark] SparkQA commented on pull request #34689: [SPARK-37445][BUILD] Upgrade hadoop profile to hadoop-3.3 since we support hadoop-3.3 as default now
SparkQA commented on pull request #34689: URL: https://github.com/apache/spark/pull/34689#issuecomment-976209156 **[Test build #145539 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145539/testReport)** for PR 34689 at commit [`6a446eb`](https://github.com/apache/spark/commit/6a446eb47a7810b685b2e6a5adb9f074a3f1b844).
[GitHub] [spark] SparkQA commented on pull request #34677: [SPARK-37436][PYTHON] Uses Python's standard string formatter for SQL API in pandas API on Spark
SparkQA commented on pull request #34677: URL: https://github.com/apache/spark/pull/34677#issuecomment-976208891 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50007/
[GitHub] [spark] AngersZhuuuu opened a new pull request #34689: [SPARK-37445][BUILD] Upgrade hadoop profile to hadoop-3.3 since we support hadoop-3.3 as default now
AngersZhuuuu opened a new pull request #34689: URL: https://github.com/apache/spark/pull/34689
### What changes were proposed in this pull request?
Upgrade the hadoop profile to hadoop-3.3, since hadoop-3.3 is now the default. In the current project, the deps path still points to hadoop-3.2, which is incorrect.
### Why are the changes needed?
Upgrade the hadoop profile.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Not needed.
[GitHub] [spark] kazuyukitanimura commented on a change in pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans
kazuyukitanimura commented on a change in pull request #34611: URL: https://github.com/apache/spark/pull/34611#discussion_r754840444
## File path: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedPlainValuesReader.java ##
@@ -53,19 +53,47 @@ public void skip() {
     throw new UnsupportedOperationException();
   }

+  private void updateCurrentByte() {
+    try {
+      currentByte = (byte) in.read();
+    } catch (IOException e) {
+      throw new ParquetDecodingException("Failed to read a byte", e);
+    }
+  }
+
   @Override
   public final void readBooleans(int total, WritableColumnVector c, int rowId) {
-    // TODO: properly vectorize this
-    for (int i = 0; i < total; i++) {
-      c.putBoolean(rowId + i, readBoolean());
+    int i = 0;
+    if (bitOffset > 0) {
+      i = Math.min(8 - bitOffset, total);
+      c.putBooleans(rowId, i, currentByte, bitOffset);
+      bitOffset = (bitOffset + i) & 7;
+    }
+    for (; i + 7 < total; i += 8) {
+      updateCurrentByte();
+      c.putBooleans(rowId + i, currentByte);
+    }
+    if (i < total) {
+      updateCurrentByte();
+      bitOffset = total - i;
+      c.putBooleans(rowId + i, bitOffset, currentByte, 0);
     }
   }

   @Override
   public final void skipBooleans(int total) {
-    // TODO: properly vectorize this
-    for (int i = 0; i < total; i++) {
-      readBoolean();
+    // using >>3 instead of /8 below since Java division rounds towards zero i.e. (-1)/8=0

Review comment:
Thanks. For the record, I will try to explain here. Let's say `total=8` and `bitOffset=1`; then there are `(8-bitOffset)=7` bits to skip in the `currentByte`. There is still 1 more bit to skip, since `total=8`. So `updateCurrentByte()` needs to be called to update the `currentByte`, and `bitOffset` will again be `1`. Later, the remaining 7 bits may be read from the updated `currentByte`. For that reason, we need to go into the if statement when `numBytesToSkip = (8 - (8 - 1)) >> 3 = 1>>3 = 0`. The following is a few lines longer but is an equivalent condition.
```
if (numBytesToSkip > 0) {
  try {
    in.skipFully(numBytesToSkip);
  } catch (IOException e) {...}
}
if (numBytesToSkip >= 0 && bitOffset > 0) {
  updateCurrentByte();
}
```
The scenario is tested at https://github.com/apache/spark/pull/34611/files#diff-b84cbbb2eadfa9d267b9ab8be2e6be579f28c1813623785c9e667a864f7960e1R194
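Editorial note, not part of the PR: the arithmetic in the review comment above can be checked with a small Java sketch. Only the expression `(total - (8 - bitOffset)) >> 3` comes from the thread; the class and method names (`SkipArithmetic`, `bytesToSkipShift`, `bytesToSkipDiv`) are hypothetical and exist only for illustration.

```java
// Hypothetical helper class; it only evaluates the expression quoted in the thread.
final class SkipArithmetic {
  // Arithmetic right shift rounds toward negative infinity.
  static int bytesToSkipShift(int total, int bitOffset) {
    return (total - (8 - bitOffset)) >> 3;
  }

  // Java integer division rounds toward zero.
  static int bytesToSkipDiv(int total, int bitOffset) {
    return (total - (8 - bitOffset)) / 8;
  }

  public static void main(String[] args) {
    // total=8, bitOffset=1: 7 bits are left in currentByte and 1 bit spills over.
    System.out.println(bytesToSkipShift(8, 1)); // prints 0
    // total=3, bitOffset=4: all 3 bits fit in currentByte, so the numerator is -1.
    System.out.println(bytesToSkipShift(3, 4)); // prints -1
    System.out.println(bytesToSkipDiv(3, 4));   // prints 0
  }
}
```

With the shift, the "everything fits in the current byte" case stays negative and can be told apart from the "exactly one more byte is needed" case, which evaluates to 0. With division both cases collapse to 0.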
[GitHub] [spark] SparkQA commented on pull request #34687: [SPARK-36231][PYTHON] Support arithmetic operations of decimal(nan) series
SparkQA commented on pull request #34687: URL: https://github.com/apache/spark/pull/34687#issuecomment-976207876 **[Test build #145537 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145537/testReport)** for PR 34687 at commit [`a643b5e`](https://github.com/apache/spark/commit/a643b5eff512a397723732c07e51a56aa5044f30).
[GitHub] [spark] SparkQA commented on pull request #34677: [SPARK-37436][PYTHON] Uses Python's standard string formatter for SQL API in pandas API on Spark
SparkQA commented on pull request #34677: URL: https://github.com/apache/spark/pull/34677#issuecomment-976207850 **[Test build #145538 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145538/testReport)** for PR 34677 at commit [`0b67651`](https://github.com/apache/spark/commit/0b6765150798799a418d39209b4e5f6d4a16276e).
[GitHub] [spark] SparkQA commented on pull request #34688: [WIP][SPARK-32079][PYTHON] Remove namedtuple hack by replace built-in pickle to cloudpickle
SparkQA commented on pull request #34688: URL: https://github.com/apache/spark/pull/34688#issuecomment-976207820 **[Test build #145536 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145536/testReport)** for PR 34688 at commit [`8962685`](https://github.com/apache/spark/commit/8962685238818b506aed70baa8d9336f7c8cc472).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34676: [SPARK-37434][BUILD] Add a new profile to auto disable unsupported UTs on MacOs using Apple Silicon
AmplabJenkins removed a comment on pull request #34676: URL: https://github.com/apache/spark/pull/34676#issuecomment-976207428 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145531/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34688: [WIP][SPARK-32079][PYTHON] Remove namedtuple hack by replace built-in pickle to cloudpickle
AmplabJenkins removed a comment on pull request #34688: URL: https://github.com/apache/spark/pull/34688#issuecomment-976207427
[GitHub] [spark] AmplabJenkins commented on pull request #34688: [WIP][SPARK-32079][PYTHON] Remove namedtuple hack by replace built-in pickle to cloudpickle
AmplabJenkins commented on pull request #34688: URL: https://github.com/apache/spark/pull/34688#issuecomment-976207427
[GitHub] [spark] AmplabJenkins commented on pull request #34676: [SPARK-37434][BUILD] Add a new profile to auto disable unsupported UTs on MacOs using Apple Silicon
AmplabJenkins commented on pull request #34676: URL: https://github.com/apache/spark/pull/34676#issuecomment-976207428 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145531/
[GitHub] [spark] SparkQA commented on pull request #34688: [WIP][SPARK-32079][PYTHON] Remove namedtuple hack by replace built-in pickle to cloudpickle
SparkQA commented on pull request #34688: URL: https://github.com/apache/spark/pull/34688#issuecomment-976205593 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50006/
[GitHub] [spark] sadikovi commented on a change in pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans
sadikovi commented on a change in pull request #34611: URL: https://github.com/apache/spark/pull/34611#discussion_r754834288
## File path: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedPlainValuesReader.java ##
@@ -53,19 +53,50 @@ public void skip() {
     throw new UnsupportedOperationException();
   }

+  private void updateCurrentByte() {
+    try {
+      currentByte = (byte) in.read();
+    } catch (IOException e) {
+      throw new ParquetDecodingException("Failed to read a byte", e);
+    }
+  }
+
   @Override
   public final void readBooleans(int total, WritableColumnVector c, int rowId) {
-    // TODO: properly vectorize this
-    for (int i = 0; i < total; i++) {
-      c.putBoolean(rowId + i, readBoolean());
+    int i = 0;
+    if (bitOffset > 0) {
+      i = Math.min(8 - bitOffset, total);
+      c.putBooleans(rowId, i, currentByte, bitOffset);
+      bitOffset = (bitOffset + i) & 7;
+    }
+    for (; i + 7 < total; i += 8) {
+      updateCurrentByte();
+      c.putBooleans(rowId + i, currentByte);
+    }
+    if (i < total) {
+      updateCurrentByte();
+      bitOffset = total - i;
+      c.putBooleans(rowId + i, bitOffset, currentByte, 0);
     }
   }

   @Override
   public final void skipBooleans(int total) {
-    // TODO: properly vectorize this
-    for (int i = 0; i < total; i++) {
-      readBoolean();
+    // Using >>3 instead of /8 below. The difference is important when (total-(8-bitOffset))<0.
+    // E.g. (-1)>>3=(-1) vs. (-1)/8=0. The latter incorrectly enters the if(numBytesToSkip>=0){.

Review comment:
Why do you even need to enter if numBytesToSkip is 0?
[GitHub] [spark] SparkQA commented on pull request #34688: [WIP][SPARK-32079][PYTHON] Remove namedtuple hack by replace built-in pickle to cloudpickle
SparkQA commented on pull request #34688: URL: https://github.com/apache/spark/pull/34688#issuecomment-976201538 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50005/
[GitHub] [spark] SparkQA removed a comment on pull request #34676: [SPARK-37434][BUILD] Add a new profile to auto disable unsupported UTs on MacOs using Apple Silicon
SparkQA removed a comment on pull request #34676: URL: https://github.com/apache/spark/pull/34676#issuecomment-976149315 **[Test build #145531 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145531/testReport)** for PR 34676 at commit [`f87467b`](https://github.com/apache/spark/commit/f87467b48d7989fdd026d0b337de617b3f4f9e6d).
[GitHub] [spark] SparkQA removed a comment on pull request #34688: [WIP][SPARK-32079][PYTHON] Remove namedtuple hack by replace built-in pickle to cloudpickle
SparkQA removed a comment on pull request #34688: URL: https://github.com/apache/spark/pull/34688#issuecomment-976187346 **[Test build #145534 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145534/testReport)** for PR 34688 at commit [`7fe5438`](https://github.com/apache/spark/commit/7fe5438fc98a8419cf62af9934cccac62c57fdac).
[GitHub] [spark] SparkQA commented on pull request #34688: [WIP][SPARK-32079][PYTHON] Remove namedtuple hack by replace built-in pickle to cloudpickle
SparkQA commented on pull request #34688: URL: https://github.com/apache/spark/pull/34688#issuecomment-976198971 **[Test build #145534 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145534/testReport)** for PR 34688 at commit [`7fe5438`](https://github.com/apache/spark/commit/7fe5438fc98a8419cf62af9934cccac62c57fdac).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `class CloudPickleSerializer(FramedSerializer):`
[GitHub] [spark] SparkQA commented on pull request #34676: [SPARK-37434][BUILD] Add a new profile to auto disable unsupported UTs on MacOs using Apple Silicon
SparkQA commented on pull request #34676: URL: https://github.com/apache/spark/pull/34676#issuecomment-976198419 **[Test build #145531 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145531/testReport)** for PR 34676 at commit [`f87467b`](https://github.com/apache/spark/commit/f87467b48d7989fdd026d0b337de617b3f4f9e6d).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] sadikovi commented on a change in pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans
sadikovi commented on a change in pull request #34611: URL: https://github.com/apache/spark/pull/34611#discussion_r754830520
## File path: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedPlainValuesReader.java ##
@@ -53,19 +53,47 @@ public void skip() {
     throw new UnsupportedOperationException();
   }

+  private void updateCurrentByte() {
+    try {
+      currentByte = (byte) in.read();
+    } catch (IOException e) {
+      throw new ParquetDecodingException("Failed to read a byte", e);
+    }
+  }
+
   @Override
   public final void readBooleans(int total, WritableColumnVector c, int rowId) {
-    // TODO: properly vectorize this
-    for (int i = 0; i < total; i++) {
-      c.putBoolean(rowId + i, readBoolean());
+    int i = 0;
+    if (bitOffset > 0) {
+      i = Math.min(8 - bitOffset, total);

Review comment:
Does the code work for total = 9 and bitOffset = 0? Can you add a test case for this?
[GitHub] [spark] SparkQA removed a comment on pull request #34688: [WIP][SPARK-32079][PYTHON] Remove namedtuple hack by replace built-in pickle to cloudpickle
SparkQA removed a comment on pull request #34688: URL: https://github.com/apache/spark/pull/34688#issuecomment-976186208 **[Test build #145533 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145533/testReport)** for PR 34688 at commit [`a7c71a2`](https://github.com/apache/spark/commit/a7c71a28aed1e9463df7563fb8a590675a6d8417).
[GitHub] [spark] SparkQA commented on pull request #34688: [WIP][SPARK-32079][PYTHON] Remove namedtuple hack by replace built-in pickle to cloudpickle
SparkQA commented on pull request #34688: URL: https://github.com/apache/spark/pull/34688#issuecomment-976196451 **[Test build #145533 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145533/testReport)** for PR 34688 at commit [`a7c71a2`](https://github.com/apache/spark/commit/a7c71a28aed1e9463df7563fb8a590675a6d8417).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `class CloudPickleSerializer(FramedSerializer):`
[GitHub] [spark] kazuyukitanimura commented on a change in pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans
kazuyukitanimura commented on a change in pull request #34611: URL: https://github.com/apache/spark/pull/34611#discussion_r754828616
## File path: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedPlainValuesReader.java ##
@@ -53,19 +53,47 @@ public void skip() {
     throw new UnsupportedOperationException();
   }

+  private void updateCurrentByte() {
+    try {
+      currentByte = (byte) in.read();
+    } catch (IOException e) {
+      throw new ParquetDecodingException("Failed to read a byte", e);
+    }
+  }
+
   @Override
   public final void readBooleans(int total, WritableColumnVector c, int rowId) {
-    // TODO: properly vectorize this
-    for (int i = 0; i < total; i++) {
-      c.putBoolean(rowId + i, readBoolean());
+    int i = 0;
+    if (bitOffset > 0) {
+      i = Math.min(8 - bitOffset, total);

Review comment:
`readBooleans()` can be called multiple times. The scenario is tested in `ColumnarBatchSuite.scala`. `readBooleans()` also calls `updateCurrentByte()` right before using it.
```
public final void readBooleans(...
  if (bitOffset > 0) { // means there are still bits to be read in currentByte
  ...
  for (; i + 7 < total; i += 8) {
    updateCurrentByte(); // calling it here
  ...
  if (i < total) {
    updateCurrentByte(); // calling it here
```
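Editorial note, not part of the PR: the behaviour described above (repeated `readBooleans()` calls that resume mid-byte, including the `total = 9`, `bitOffset = 0` case raised in review) can be simulated without Spark. This is a minimal sketch under stated assumptions: `BitPackedBooleanReader` is a hypothetical class, a `byte[]` stands in for the input stream, a `boolean[]` plus the local `putBooleans` helper stand in for `WritableColumnVector`, and bits are taken least-significant-bit first as in Parquet's PLAIN boolean encoding. The body of `readBooleans` follows the diff quoted in the thread.

```java
import java.util.Arrays;

// Hypothetical stand-in for the reader state discussed in the thread.
final class BitPackedBooleanReader {
  private final byte[] data;     // bit-packed booleans, LSB first
  private int pos = 0;           // index of the next unread byte
  private byte currentByte = 0;
  private int bitOffset = 0;     // bits of currentByte already consumed; 0 means "fetch a new byte"

  BitPackedBooleanReader(byte[] data) { this.data = data; }

  private void updateCurrentByte() { currentByte = data[pos++]; }

  // Bit-by-bit reference reader, used only to check the batched version.
  boolean readBoolean() {
    if (bitOffset == 0) updateCurrentByte();
    boolean v = (currentByte & (1 << bitOffset)) != 0;
    bitOffset = (bitOffset + 1) & 7;
    return v;
  }

  // Local helper standing in for the column vector: unpack `count` bits of `src` from `srcIndex`.
  private static void putBooleans(boolean[] out, int rowId, int count, byte src, int srcIndex) {
    for (int k = 0; k < count; k++) {
      out[rowId + k] = ((src >> (srcIndex + k)) & 1) != 0;
    }
  }

  // Same control flow as the readBooleans diff quoted in the thread.
  void readBooleans(int total, boolean[] out, int rowId) {
    int i = 0;
    if (bitOffset > 0) {                       // leftover bits in currentByte
      i = Math.min(8 - bitOffset, total);
      putBooleans(out, rowId, i, currentByte, bitOffset);
      bitOffset = (bitOffset + i) & 7;
    }
    for (; i + 7 < total; i += 8) {            // whole bytes
      updateCurrentByte();
      putBooleans(out, rowId + i, 8, currentByte, 0);
    }
    if (i < total) {                           // partial trailing byte
      updateCurrentByte();
      bitOffset = total - i;
      putBooleans(out, rowId + i, bitOffset, currentByte, 0);
    }
  }

  static boolean[] readInChunks(byte[] data, int[] chunks) {
    BitPackedBooleanReader r = new BitPackedBooleanReader(data);
    boolean[] out = new boolean[Arrays.stream(chunks).sum()];
    int rowId = 0;
    for (int c : chunks) {
      r.readBooleans(c, out, rowId);
      rowId += c;
    }
    return out;
  }

  static boolean[] readOneByOne(byte[] data, int n) {
    BitPackedBooleanReader r = new BitPackedBooleanReader(data);
    boolean[] out = new boolean[n];
    for (int k = 0; k < n; k++) out[k] = r.readBoolean();
    return out;
  }

  public static void main(String[] args) {
    byte[] data = {(byte) 0xA5, (byte) 0x3C, (byte) 0xF0};
    // The first chunk of 9 starts at bitOffset 0; later chunks resume mid-byte.
    boolean[] batched = readInChunks(data, new int[] {9, 3, 5, 7});
    System.out.println(Arrays.equals(batched, readOneByOne(data, 24))); // prints true
  }
}
```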
[GitHub] [spark] sadikovi commented on a change in pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans
sadikovi commented on a change in pull request #34611: URL: https://github.com/apache/spark/pull/34611#discussion_r754826874
## File path: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedPlainValuesReader.java ##
@@ -53,19 +53,47 @@ public void skip() {
     throw new UnsupportedOperationException();
   }

+  private void updateCurrentByte() {
+    try {
+      currentByte = (byte) in.read();
+    } catch (IOException e) {
+      throw new ParquetDecodingException("Failed to read a byte", e);
+    }
+  }
+
   @Override
   public final void readBooleans(int total, WritableColumnVector c, int rowId) {
-    // TODO: properly vectorize this
-    for (int i = 0; i < total; i++) {
-      c.putBoolean(rowId + i, readBoolean());
+    int i = 0;
+    if (bitOffset > 0) {
+      i = Math.min(8 - bitOffset, total);
+      c.putBooleans(rowId, i, currentByte, bitOffset);
+      bitOffset = (bitOffset + i) & 7;
+    }
+    for (; i + 7 < total; i += 8) {
+      updateCurrentByte();
+      c.putBooleans(rowId + i, currentByte);
+    }
+    if (i < total) {
+      updateCurrentByte();
+      bitOffset = total - i;
+      c.putBooleans(rowId + i, bitOffset, currentByte, 0);
     }
   }

   @Override
   public final void skipBooleans(int total) {
-    // TODO: properly vectorize this
-    for (int i = 0; i < total; i++) {
-      readBoolean();
+    // using >>3 instead of /8 below since Java division rounds towards zero i.e. (-1)/8=0

Review comment:
My concern is that the testing coverage is low and we could introduce subtle bugs that could be difficult to debug. I guess it is fine to keep as is.
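Editorial note, not part of the PR: one way to address the coverage concern above would be an exhaustive comparison of a byte-level skip against the original bit-by-bit loop over every starting `bitOffset` and a range of `total` values. The sketch below is an assumption-laden illustration. `SkippingBitReader` and `skipsAgree` are hypothetical names, and this `skipBooleans` mirrors the structure of the quoted `readBooleans` change; it is not the PR's `skipBooleans`, which this archive does not show in full.

```java
// Hypothetical reader over a byte[]; only the state names follow the thread.
final class SkippingBitReader {
  private final byte[] data;
  private int pos = 0;           // index of the next unread byte
  private byte currentByte = 0;
  private int bitOffset = 0;     // bits of currentByte already consumed

  SkippingBitReader(byte[] data) { this.data = data; }

  private void updateCurrentByte() { currentByte = data[pos++]; }

  boolean readBoolean() {
    if (bitOffset == 0) updateCurrentByte();
    boolean v = (currentByte & (1 << bitOffset)) != 0;
    bitOffset = (bitOffset + 1) & 7;
    return v;
  }

  // The loop that the PR replaces: decode and discard one bit at a time.
  void skipOneByOne(int total) {
    for (int i = 0; i < total; i++) readBoolean();
  }

  // Byte-level skip written for this sketch (an assumption, not the PR's code).
  void skipBooleans(int total) {
    int i = 0;
    if (bitOffset > 0) {                 // finish the partially consumed byte first
      i = Math.min(8 - bitOffset, total);
      bitOffset = (bitOffset + i) & 7;
    }
    int remaining = total - i;           // never negative here, so >> 3 and / 8 agree
    pos += remaining >> 3;               // whole bytes are skipped without decoding
    int tail = remaining & 7;
    if (tail > 0) {                      // land inside the next byte
      updateCurrentByte();
      bitOffset = tail;
    }
  }

  // True if both strategies yield the same next 8 booleans for every tested case.
  static boolean skipsAgree() {
    byte[] data = new byte[8];
    for (int k = 0; k < data.length; k++) data[k] = (byte) (37 * k + 11);
    for (int start = 0; start < 8; start++) {
      for (int total = 0; total <= 24; total++) {
        SkippingBitReader a = new SkippingBitReader(data);
        SkippingBitReader b = new SkippingBitReader(data);
        a.skipOneByOne(start);           // bring both readers to the same bitOffset
        b.skipOneByOne(start);
        a.skipOneByOne(total);
        b.skipBooleans(total);
        for (int k = 0; k < 8; k++) {
          if (a.readBoolean() != b.readBoolean()) return false;
        }
      }
    }
    return true;
  }

  public static void main(String[] args) {
    System.out.println(skipsAgree());
  }
}
```

Because this version only shifts a non-negative `remaining`, it avoids the negative-numerator case that the `>>3` versus `/8` comments in the thread deal with; that is a design difference from the code under review, not a claim about it.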
[GitHub] [spark] sadikovi commented on a change in pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans
sadikovi commented on a change in pull request #34611: URL: https://github.com/apache/spark/pull/34611#discussion_r754826446
## File path: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedPlainValuesReader.java ##
@@ -53,19 +53,47 @@ public void skip() {
     throw new UnsupportedOperationException();
   }

+  private void updateCurrentByte() {
+    try {
+      currentByte = (byte) in.read();
+    } catch (IOException e) {
+      throw new ParquetDecodingException("Failed to read a byte", e);
+    }
+  }
+
   @Override
   public final void readBooleans(int total, WritableColumnVector c, int rowId) {
-    // TODO: properly vectorize this
-    for (int i = 0; i < total; i++) {
-      c.putBoolean(rowId + i, readBoolean());
+    int i = 0;
+    if (bitOffset > 0) {
+      i = Math.min(8 - bitOffset, total);
+      c.putBooleans(rowId, i, currentByte, bitOffset);
+      bitOffset = (bitOffset + i) & 7;
+    }
+    for (; i + 7 < total; i += 8) {
+      updateCurrentByte();
+      c.putBooleans(rowId + i, currentByte);
+    }
+    if (i < total) {
+      updateCurrentByte();
+      bitOffset = total - i;
+      c.putBooleans(rowId + i, bitOffset, currentByte, 0);
     }
   }

   @Override
   public final void skipBooleans(int total) {
-    // TODO: properly vectorize this
-    for (int i = 0; i < total; i++) {
-      readBoolean();
+    // using >>3 instead of /8 below since Java division rounds towards zero i.e. (-1)/8=0

Review comment:
Hmm.. Why do you even need to go into the if statement if numBytesToSkip is 0?
[GitHub] [spark] SparkQA commented on pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans
SparkQA commented on pull request #34611: URL: https://github.com/apache/spark/pull/34611#issuecomment-976191725 **[Test build #145535 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145535/testReport)** for PR 34611 at commit [`f5327bc`](https://github.com/apache/spark/commit/f5327bc8ba14e8f00f3a296889f4d65792848f68).
[GitHub] [spark] HyukjinKwon commented on pull request #34677: [SPARK-37436][PYTHON] Uses Python's standard string formatter for SQL API in pandas API on Spark
HyukjinKwon commented on pull request #34677: URL: https://github.com/apache/spark/pull/34677#issuecomment-976191464 retest this please
[GitHub] [spark] gengliangwang closed pull request #34681: [SPARK-37438][SQL] ANSI mode: Use store assignment rules for resolving function invocation
gengliangwang closed pull request #34681: URL: https://github.com/apache/spark/pull/34681
[GitHub] [spark] gengliangwang commented on pull request #34681: [SPARK-37438][SQL] ANSI mode: Use store assignment rules for resolving function invocation
gengliangwang commented on pull request #34681: URL: https://github.com/apache/spark/pull/34681#issuecomment-976191032 Merging to master
[GitHub] [spark] kazuyukitanimura commented on a change in pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans
kazuyukitanimura commented on a change in pull request #34611: URL: https://github.com/apache/spark/pull/34611#discussion_r754823543
## File path: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedPlainValuesReader.java ##
@@ -53,19 +53,47 @@ public void skip() {
     throw new UnsupportedOperationException();
   }

+  private void updateCurrentByte() {
+    try {
+      currentByte = (byte) in.read();
+    } catch (IOException e) {
+      throw new ParquetDecodingException("Failed to read a byte", e);
+    }
+  }
+
   @Override
   public final void readBooleans(int total, WritableColumnVector c, int rowId) {
-    // TODO: properly vectorize this
-    for (int i = 0; i < total; i++) {
-      c.putBoolean(rowId + i, readBoolean());
+    int i = 0;
+    if (bitOffset > 0) {
+      i = Math.min(8 - bitOffset, total);
+      c.putBooleans(rowId, i, currentByte, bitOffset);
+      bitOffset = (bitOffset + i) & 7;
+    }
+    for (; i + 7 < total; i += 8) {
+      updateCurrentByte();
+      c.putBooleans(rowId + i, currentByte);
+    }
+    if (i < total) {
+      updateCurrentByte();
+      bitOffset = total - i;
+      c.putBooleans(rowId + i, bitOffset, currentByte, 0);
     }
   }

   @Override
   public final void skipBooleans(int total) {
-    // TODO: properly vectorize this
-    for (int i = 0; i < total; i++) {
-      readBoolean();
+    // using >>3 instead of /8 below since Java division rounds towards zero i.e. (-1)/8=0

Review comment: Oh I see. The difference is important when `(total-(8-bitOffset))<0`. E.g. `(-1)>>3=(-1)` vs. `(-1)/8=0`. The latter incorrectly enters the `if(numBytesToSkip>=0){`. I updated the comment, hopefully it is clearer now. We should not call
```
if (bitOffset > 0) {
  updateCurrentByte();
}
```
outside of the `if (numBytesToSkip >= 0) {` clause. That is because `numBytesToSkip<0` <=> `(total-(8-bitOffset))<0` means there will still be unread bits left in `currentByte`.
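The rounding difference described in the review comment on `skipBooleans` can be checked in isolation. The following is a minimal sketch, not code from the PR: the class and method names are hypothetical, and it assumes the byte count is derived from `total - (8 - bitOffset)` as the comment implies.

```java
// Hypothetical helper for illustration only; none of these names come from the PR.
final class ShiftVsDivide {
    // Arithmetic right shift rounds toward negative infinity, so a negative
    // difference (unread bits remain in the current byte) stays negative.
    static int bytesToSkipWithShift(int total, int bitOffset) {
        return (total - (8 - bitOffset)) >> 3;
    }

    // Integer division rounds toward zero, so small negative differences
    // collapse to 0 and are indistinguishable from an exact byte boundary.
    static int bytesToSkipWithDivide(int total, int bitOffset) {
        return (total - (8 - bitOffset)) / 8;
    }

    public static void main(String[] args) {
        // bitOffset = 3 leaves 5 unread bits; skipping 4 of them gives 4 - 5 = -1.
        System.out.println("shift:    " + bytesToSkipWithShift(4, 3));
        System.out.println("divide:   " + bytesToSkipWithDivide(4, 3));
        // Math.floorDiv matches the shift for a positive power-of-two divisor.
        System.out.println("floorDiv: " + Math.floorDiv(4 - (8 - 3), 8));
    }
}
```

With the shift, a check such as `numBytesToSkip >= 0` is false whenever bits of `currentByte` are still unread; with `/8` it would be true for differences from -7 to -1. `Math.floorDiv` is an alternative that states the intended rounding explicitly.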
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34687: [SPARK-36231][PYTHON] Support arithmetic operations of decimal(nan) series
AmplabJenkins removed a comment on pull request #34687: URL: https://github.com/apache/spark/pull/34687#issuecomment-976188106 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50004/
[GitHub] [spark] AmplabJenkins commented on pull request #34687: [SPARK-36231][PYTHON] Support arithmetic operations of decimal(nan) series
AmplabJenkins commented on pull request #34687: URL: https://github.com/apache/spark/pull/34687#issuecomment-976188106 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50004/
[GitHub] [spark] SparkQA commented on pull request #34687: [SPARK-36231][PYTHON] Support arithmetic operations of decimal(nan) series
SparkQA commented on pull request #34687: URL: https://github.com/apache/spark/pull/34687#issuecomment-976188085 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50004/
[GitHub] [spark] SparkQA commented on pull request #34688: [WIP][SPARK-32079][PYTHON] Remove namedtuple hack by replace built-in pickle to cloudpickle
SparkQA commented on pull request #34688: URL: https://github.com/apache/spark/pull/34688#issuecomment-976187346 **[Test build #145534 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145534/testReport)** for PR 34688 at commit [`7fe5438`](https://github.com/apache/spark/commit/7fe5438fc98a8419cf62af9934cccac62c57fdac).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34677: [SPARK-37436][PYTHON] Uses Python's standard string formatter for SQL API in pandas API on Spark
AmplabJenkins removed a comment on pull request #34677: URL: https://github.com/apache/spark/pull/34677#issuecomment-976186987 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145530/
[GitHub] [spark] AmplabJenkins commented on pull request #34677: [SPARK-37436][PYTHON] Uses Python's standard string formatter for SQL API in pandas API on Spark
AmplabJenkins commented on pull request #34677: URL: https://github.com/apache/spark/pull/34677#issuecomment-976186987 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145530/
[GitHub] [spark] SparkQA removed a comment on pull request #34677: [SPARK-37436][PYTHON] Uses Python's standard string formatter for SQL API in pandas API on Spark
SparkQA removed a comment on pull request #34677: URL: https://github.com/apache/spark/pull/34677#issuecomment-976149296 **[Test build #145530 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145530/testReport)** for PR 34677 at commit [`0b67651`](https://github.com/apache/spark/commit/0b6765150798799a418d39209b4e5f6d4a16276e).
[GitHub] [spark] SparkQA commented on pull request #34677: [SPARK-37436][PYTHON] Uses Python's standard string formatter for SQL API in pandas API on Spark
SparkQA commented on pull request #34677: URL: https://github.com/apache/spark/pull/34677#issuecomment-976186594 **[Test build #145530 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145530/testReport)** for PR 34677 at commit [`0b67651`](https://github.com/apache/spark/commit/0b6765150798799a418d39209b4e5f6d4a16276e).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `class SQLStringFormatter(string.Formatter):`
[GitHub] [spark] SparkQA commented on pull request #34688: [WIP][SPARK-32079][PYTHON] Remove namedtuple hack by replace built-in pickle to cloudpickle
SparkQA commented on pull request #34688: URL: https://github.com/apache/spark/pull/34688#issuecomment-976186208 **[Test build #145533 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145533/testReport)** for PR 34688 at commit [`a7c71a2`](https://github.com/apache/spark/commit/a7c71a28aed1e9463df7563fb8a590675a6d8417).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34687: [SPARK-36231][PYTHON] Support arithmetic operations of decimal(nan) series
AmplabJenkins removed a comment on pull request #34687: URL: https://github.com/apache/spark/pull/34687#issuecomment-976185595 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50001/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34677: [SPARK-37436][PYTHON] Uses Python's standard string formatter for SQL API in pandas API on Spark
AmplabJenkins removed a comment on pull request #34677: URL: https://github.com/apache/spark/pull/34677#issuecomment-976185596 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50002/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34676: [SPARK-37434][BUILD] Add a new profile to auto disable unsupported UTs on MacOs using Apple Silicon
AmplabJenkins removed a comment on pull request #34676: URL: https://github.com/apache/spark/pull/34676#issuecomment-976185598 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50003/
[GitHub] [spark] AmplabJenkins commented on pull request #34676: [SPARK-37434][BUILD] Add a new profile to auto disable unsupported UTs on MacOs using Apple Silicon
AmplabJenkins commented on pull request #34676: URL: https://github.com/apache/spark/pull/34676#issuecomment-976185598 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50003/
[GitHub] [spark] AmplabJenkins commented on pull request #34677: [SPARK-37436][PYTHON] Uses Python's standard string formatter for SQL API in pandas API on Spark
AmplabJenkins commented on pull request #34677: URL: https://github.com/apache/spark/pull/34677#issuecomment-976185596 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50002/
[GitHub] [spark] AmplabJenkins commented on pull request #34687: [SPARK-36231][PYTHON] Support arithmetic operations of decimal(nan) series
AmplabJenkins commented on pull request #34687: URL: https://github.com/apache/spark/pull/34687#issuecomment-976185595 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50001/
[GitHub] [spark] HyukjinKwon opened a new pull request #34688: [WIP][SPARK-32079][PYTHON] Remove namedtuple hack by replace built-in pickle to cloudpickle
HyukjinKwon opened a new pull request #34688: URL: https://github.com/apache/spark/pull/34688
### What changes were proposed in this pull request?
This PR proposes to replace Python's built-in CPickle with CPickle-based cloudpickle (requires Python 3.8+). For Python 3.7 and below, it still uses the legacy built-in CPickle for performance reasons.
### Why are the changes needed?
To remove the namedtuple hack behind issues such as SPARK-32079, SPARK-22674 and SPARK-27810.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Existing test cases should cover this change.
[GitHub] [spark] SparkQA commented on pull request #34677: [SPARK-37436][PYTHON] Uses Python's standard string formatter for SQL API in pandas API on Spark
SparkQA commented on pull request #34677: URL: https://github.com/apache/spark/pull/34677#issuecomment-976179811 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50002/
[GitHub] [spark] SparkQA commented on pull request #34687: [SPARK-36231][PYTHON] Support arithmetic operations of decimal(nan) series
SparkQA commented on pull request #34687: URL: https://github.com/apache/spark/pull/34687#issuecomment-976178977 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50001/
[GitHub] [spark] SparkQA commented on pull request #34676: [SPARK-37434][BUILD] Add a new profile to auto disable unsupported UTs on MacOs using Apple Silicon
SparkQA commented on pull request #34676: URL: https://github.com/apache/spark/pull/34676#issuecomment-976177156 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50003/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34685: [SPARK-37443][PYTHON] Provide a profiler for Python/Pandas UDFs
AmplabJenkins removed a comment on pull request #34685: URL: https://github.com/apache/spark/pull/34685#issuecomment-976172532 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145526/
[GitHub] [spark] AmplabJenkins commented on pull request #34685: [SPARK-37443][PYTHON] Provide a profiler for Python/Pandas UDFs
AmplabJenkins commented on pull request #34685: URL: https://github.com/apache/spark/pull/34685#issuecomment-976172532 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145526/
[GitHub] [spark] SparkQA removed a comment on pull request #34685: [SPARK-37443][PYTHON] Provide a profiler for Python/Pandas UDFs
SparkQA removed a comment on pull request #34685: URL: https://github.com/apache/spark/pull/34685#issuecomment-976116444 **[Test build #145526 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145526/testReport)** for PR 34685 at commit [`b9ede68`](https://github.com/apache/spark/commit/b9ede68a8d4d1a49164b2f887ce619c2166cf504).
[GitHub] [spark] SparkQA commented on pull request #34685: [SPARK-37443][PYTHON] Provide a profiler for Python/Pandas UDFs
SparkQA commented on pull request #34685: URL: https://github.com/apache/spark/pull/34685#issuecomment-976172067 **[Test build #145526 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145526/testReport)** for PR 34685 at commit [`b9ede68`](https://github.com/apache/spark/commit/b9ede68a8d4d1a49164b2f887ce619c2166cf504).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34681: [SPARK-37438][SQL] ANSI mode: Use store assignment rules for resolving function invocation
AmplabJenkins removed a comment on pull request #34681: URL: https://github.com/apache/spark/pull/34681#issuecomment-976171560 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/5/
[GitHub] [spark] SparkQA commented on pull request #34681: [SPARK-37438][SQL] ANSI mode: Use store assignment rules for resolving function invocation
SparkQA commented on pull request #34681: URL: https://github.com/apache/spark/pull/34681#issuecomment-976171544 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/5/
[GitHub] [spark] AmplabJenkins commented on pull request #34681: [SPARK-37438][SQL] ANSI mode: Use store assignment rules for resolving function invocation
AmplabJenkins commented on pull request #34681: URL: https://github.com/apache/spark/pull/34681#issuecomment-976171560 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/5/
[GitHub] [spark] SparkQA commented on pull request #34687: [SPARK-36231][PYTHON] Support arithmetic operations of decimal(nan) series
SparkQA commented on pull request #34687: URL: https://github.com/apache/spark/pull/34687#issuecomment-976169810 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50004/
[GitHub] [spark] sadikovi commented on a change in pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans
sadikovi commented on a change in pull request #34611: URL: https://github.com/apache/spark/pull/34611#discussion_r754804617
## File path: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedPlainValuesReader.java ##
@@ -53,19 +53,47 @@ public void skip() {
     throw new UnsupportedOperationException();
   }

+  private void updateCurrentByte() {
+    try {
+      currentByte = (byte) in.read();
+    } catch (IOException e) {
+      throw new ParquetDecodingException("Failed to read a byte", e);
+    }
+  }
+
   @Override
   public final void readBooleans(int total, WritableColumnVector c, int rowId) {
-    // TODO: properly vectorize this
-    for (int i = 0; i < total; i++) {
-      c.putBoolean(rowId + i, readBoolean());
+    int i = 0;
+    if (bitOffset > 0) {
+      i = Math.min(8 - bitOffset, total);

Review comment: Yes, I understand, but what about the `readBooleans()` method? Does it mean that I can only call `readBooleans()` once?
[GitHub] [spark] kazuyukitanimura commented on a change in pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans
kazuyukitanimura commented on a change in pull request #34611: URL: https://github.com/apache/spark/pull/34611#discussion_r754803738
## File path: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedPlainValuesReader.java ##
@@ -53,19 +53,47 @@ public void skip() {
     throw new UnsupportedOperationException();
   }

+  private void updateCurrentByte() {
+    try {
+      currentByte = (byte) in.read();
+    } catch (IOException e) {
+      throw new ParquetDecodingException("Failed to read a byte", e);
+    }
+  }
+
   @Override
   public final void readBooleans(int total, WritableColumnVector c, int rowId) {
-    // TODO: properly vectorize this
-    for (int i = 0; i < total; i++) {
-      c.putBoolean(rowId + i, readBoolean());
+    int i = 0;
+    if (bitOffset > 0) {
+      i = Math.min(8 - bitOffset, total);

Review comment: We do not need to call `updateCurrentByte()` here when `total == 8 - bitOffset`. If you scroll down to the one-bit version of the reader in the same file, you will find
```
public final boolean readBoolean() {
  if (bitOffset == 0) {
    updateCurrentByte();
  }
```
It calls `updateCurrentByte` right before reading. So it is always the next reader's responsibility to call the method.
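The "next reader refills" convention from the review comment above can be tried out with a small stand-alone reader. This is a sketch under stated assumptions, not Spark's class: `LazyBitReader`, `putBits`, the `boolean[]` target (standing in for `WritableColumnVector`) and the LSB-first bit order are all hypothetical choices made for illustration, and `UncheckedIOException` replaces `ParquetDecodingException` so that the example has no dependencies.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical stand-in for the reader under review; only the control flow
// of readBoolean/readBooleans follows the quoted diff.
final class LazyBitReader {
    private final InputStream in;
    private byte currentByte;
    private int bitOffset; // bits of currentByte already consumed; 0 means "refill first"

    LazyBitReader(byte[] data) {
        this.in = new ByteArrayInputStream(data);
    }

    private void updateCurrentByte() {
        try {
            currentByte = (byte) in.read();
        } catch (IOException e) {
            throw new UncheckedIOException("Failed to read a byte", e);
        }
    }

    // One-bit reader: refills lazily, right before the first bit of a byte is read.
    boolean readBoolean() {
        if (bitOffset == 0) {
            updateCurrentByte();
        }
        boolean value = (currentByte & (1 << bitOffset)) != 0;
        bitOffset = (bitOffset + 1) & 7;
        return value;
    }

    // Batched reader with the same three phases as the diff: leftover bits,
    // whole bytes, then a partial trailing byte.
    void readBooleans(int total, boolean[] out, int rowId) {
        int i = 0;
        if (bitOffset > 0) {
            i = Math.min(8 - bitOffset, total);
            putBits(out, rowId, i, currentByte, bitOffset);
            bitOffset = (bitOffset + i) & 7; // may become 0: the next call refills
        }
        for (; i + 7 < total; i += 8) {
            updateCurrentByte();
            putBits(out, rowId + i, 8, currentByte, 0);
        }
        if (i < total) {
            updateCurrentByte();
            bitOffset = total - i;
            putBits(out, rowId + i, bitOffset, currentByte, 0);
        }
    }

    // Copies `count` bits of `src`, starting at bit `srcOffset`, into `out`.
    private static void putBits(boolean[] out, int start, int count, byte src, int srcOffset) {
        for (int k = 0; k < count; k++) {
            out[start + k] = (src & (1 << (srcOffset + k))) != 0;
        }
    }

    public static void main(String[] args) {
        LazyBitReader reader = new LazyBitReader(new byte[] {(byte) 0xB4, (byte) 0x03});
        boolean[] out = new boolean[10];
        reader.readBooleans(3, out, 0); // stops mid-byte
        reader.readBooleans(5, out, 3); // total == 8 - bitOffset: ends on a byte boundary, no refill
        reader.readBooleans(2, out, 8); // this call refills before reading
        System.out.println(Arrays.toString(out));
    }
}
```

Because every entry point checks `bitOffset` before touching `currentByte`, `readBooleans` can be called repeatedly and interleaved with `readBoolean`; a call that ends exactly on a byte boundary simply leaves `bitOffset` at 0 for whichever reader runs next.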
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34685: [SPARK-37443][PYTHON] Provide a profiler for Python/Pandas UDFs
AmplabJenkins removed a comment on pull request #34685: URL: https://github.com/apache/spark/pull/34685#issuecomment-976168219 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49998/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34686: [SPARK-37444][SQL] ALTER NAMESPACE ... SET LOCATION should handle empty location consistently across v1 and v2 command
AmplabJenkins removed a comment on pull request #34686: URL: https://github.com/apache/spark/pull/34686#issuecomment-976168221 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49997/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34677: [SPARK-37436][PYTHON] Uses Python's standard string formatter for SQL API in pandas API on Spark
AmplabJenkins removed a comment on pull request #34677: URL: https://github.com/apache/spark/pull/34677#issuecomment-976168174 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/4/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34687: [SPARK-36231][PYTHON] Support arithmetic operations of decimal(nan) series
AmplabJenkins removed a comment on pull request #34687: URL: https://github.com/apache/spark/pull/34687#issuecomment-976168217
[GitHub] [spark] AmplabJenkins commented on pull request #34686: [SPARK-37444][SQL] ALTER NAMESPACE ... SET LOCATION should handle empty location consistently across v1 and v2 command
AmplabJenkins commented on pull request #34686: URL: https://github.com/apache/spark/pull/34686#issuecomment-976168221 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49997/
[GitHub] [spark] AmplabJenkins commented on pull request #34685: [SPARK-37443][PYTHON] Provide a profiler for Python/Pandas UDFs
AmplabJenkins commented on pull request #34685: URL: https://github.com/apache/spark/pull/34685#issuecomment-976168219 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49998/
[GitHub] [spark] AmplabJenkins commented on pull request #34687: [SPARK-36231][PYTHON] Support arithmetic operations of decimal(nan) series
AmplabJenkins commented on pull request #34687: URL: https://github.com/apache/spark/pull/34687#issuecomment-976168218
[GitHub] [spark] AmplabJenkins commented on pull request #34677: [SPARK-37436][PYTHON] Uses Python's standard string formatter for SQL API in pandas API on Spark
AmplabJenkins commented on pull request #34677: URL: https://github.com/apache/spark/pull/34677#issuecomment-976168174 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/4/
[GitHub] [spark] SparkQA commented on pull request #34677: [SPARK-37436][PYTHON] Uses Python's standard string formatter for SQL API in pandas API on Spark
SparkQA commented on pull request #34677: URL: https://github.com/apache/spark/pull/34677#issuecomment-976168126 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/4/