[GitHub] [spark] SparkQA commented on pull request #34677: [SPARK-37436][PYTHON] Uses Python's standard string formatter for SQL API in pandas API on Spark
SparkQA commented on pull request #34677: URL: https://github.com/apache/spark/pull/34677#issuecomment-978923616 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50092/ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan closed pull request #34700: [SPARK-36861][SQL] Use `yyyy-MM-dd` as the date pattern in partition discovery
cloud-fan closed pull request #34700: URL: https://github.com/apache/spark/pull/34700
[GitHub] [spark] cloud-fan commented on pull request #34700: [SPARK-36861][SQL] Use `yyyy-MM-dd` as the date pattern in partition discovery
cloud-fan commented on pull request #34700: URL: https://github.com/apache/spark/pull/34700#issuecomment-978921472 thanks, merging to master!
[GitHub] [spark] SparkQA removed a comment on pull request #34707: [SPARK-37459][BUILD] Upgrade commons-cli to 1.5.0
SparkQA removed a comment on pull request #34707: URL: https://github.com/apache/spark/pull/34707#issuecomment-978839406 **[Test build #145616 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145616/testReport)** for PR 34707 at commit [`9a39778`](https://github.com/apache/spark/commit/9a397780b3bfa3bf7d45f4b4f1184f8e0d1c0156).
[GitHub] [spark] SparkQA commented on pull request #34707: [SPARK-37459][BUILD] Upgrade commons-cli to 1.5.0
SparkQA commented on pull request #34707: URL: https://github.com/apache/spark/pull/34707#issuecomment-978919092 **[Test build #145616 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145616/testReport)** for PR 34707 at commit [`9a39778`](https://github.com/apache/spark/commit/9a397780b3bfa3bf7d45f4b4f1184f8e0d1c0156). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #34367: [SPARK-37099][SQL] Impl a rank-based filter to optimize top-k computation
SparkQA commented on pull request #34367: URL: https://github.com/apache/spark/pull/34367#issuecomment-978918945 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50093/
[GitHub] [spark] cloud-fan closed pull request #34610: [SPARK-34332][SQL][TEST] Unify v1 and v2 ALTER TABLE .. SET LOCATION tests
cloud-fan closed pull request #34610: URL: https://github.com/apache/spark/pull/34610
[GitHub] [spark] cloud-fan commented on pull request #34610: [SPARK-34332][SQL][TEST] Unify v1 and v2 ALTER TABLE .. SET LOCATION tests
cloud-fan commented on pull request #34610: URL: https://github.com/apache/spark/pull/34610#issuecomment-978917541 thanks, merging to master!
[GitHub] [spark] beliefer commented on pull request #34696: [SPARK-37389][SQL][3.1] Check unclosed bracketed comments
beliefer commented on pull request #34696: URL: https://github.com/apache/spark/pull/34696#issuecomment-978916715 @cloud-fan Thank you.
[GitHub] [spark] SparkQA commented on pull request #34596: [SPARK-37326][SQL] Support TimestampNTZ in CSV data source
SparkQA commented on pull request #34596: URL: https://github.com/apache/spark/pull/34596#issuecomment-978916245 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50091/
[GitHub] [spark] SparkQA commented on pull request #34610: [SPARK-34332][SQL][TEST] Unify v1 and v2 ALTER TABLE .. SET LOCATION tests
SparkQA commented on pull request #34610: URL: https://github.com/apache/spark/pull/34610#issuecomment-978914362 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50089/
[GitHub] [spark] sarutak edited a comment on pull request #34593: [SPARK-37324][SQL] Adds support for decimal rounding mode up, down, half_down
sarutak edited a comment on pull request #34593: URL: https://github.com/apache/spark/pull/34593#issuecomment-978897180 I investigated whether the major DBMS and query processing systems (PostgreSQL, BigQuery, Snowflake, MySQL, SAP HANA, MS SQL Server, Firebird, DB2, Teradata, Hive) support the third parameter of `round`, but I found that only SAP HANA and MS SQL Server do. @sathiyapk How many users need to change the rounding mode, and how often are the additional rounding modes used? If users really need `up`, `down` and `half_down`, I think it's not difficult to implement a UDF using `BigDecimal.setScale`. @HyukjinKwon @gengliangwang What do you think?
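The UDF suggestion in the comment above can be illustrated with a minimal sketch. It assumes a plain Java helper rather than anything from the Spark code base: the class `RoundingSketch`, the method `round`, and the mode strings are hypothetical names chosen for illustration. Only `BigDecimal.setScale(int, RoundingMode)` and the `RoundingMode` constants are standard JDK API. Registering such a helper as a Spark UDF is left out.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical helper (not part of Spark): maps the mode names discussed in
// the PR to java.math.RoundingMode and delegates to BigDecimal.setScale.
final class RoundingSketch {
    static BigDecimal round(BigDecimal value, int scale, String mode) {
        RoundingMode rm;
        switch (mode) {
            case "up":        rm = RoundingMode.UP;        break; // away from zero
            case "down":      rm = RoundingMode.DOWN;      break; // toward zero
            case "half_down": rm = RoundingMode.HALF_DOWN; break; // ties go toward zero
            case "half_up":   rm = RoundingMode.HALF_UP;   break; // ties go away from zero
            default:
                throw new IllegalArgumentException("unsupported mode: " + mode);
        }
        return value.setScale(scale, rm);
    }

    public static void main(String[] args) {
        BigDecimal x = new BigDecimal("2.25");
        for (String mode : new String[] {"up", "down", "half_down", "half_up"}) {
            // Rounds 2.25 to one decimal place under each mode.
            System.out.println(mode + " -> " + round(x, 1, mode));
        }
    }
}
```

With the input `2.25` and scale 1, `up` and `half_up` give `2.3` while `down` and `half_down` give `2.2`, which is the distinction the additional modes would expose.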
[GitHub] [spark] sarutak edited a comment on pull request #34593: [SPARK-37324][SQL] Adds support for decimal rounding mode up, down, half_down
sarutak edited a comment on pull request #34593: URL: https://github.com/apache/spark/pull/34593#issuecomment-978897180 I investigated whether the major DBMS and query processing systems (PostgreSQL, BigQuery, Snowflake, MySQL, SAP HANA, MS SQL Server, Firebird, DB2, Teradata, Hive) support the third parameter of `round`, but I found that only SAP HANA and MS SQL Server do. @sathiyapk How many users need to change the rounding mode, and how often are the additional rounding modes used? If users really need `up`, `down` and `half_down`, I think it's not difficult to implement using `BigDecimal.setScale`. @HyukjinKwon @gengliangwang What do you think?
[GitHub] [spark] sadikovi commented on pull request #34596: [SPARK-37326][SQL] Support TimestampNTZ in CSV data source
sadikovi commented on pull request #34596: URL: https://github.com/apache/spark/pull/34596#issuecomment-978913178 The benchmark results are roughly the same, with some variability. I think we are good here; no separate option is required.

Without the PR changes (bb9e1d92d931a064c52cbc4cc84eaa32528809f0):
```
[info] OpenJDK 64-Bit Server VM 1.8.0_292-8u292-b10-0ubuntu1~18.04-b10 on Linux 5.4.0-1045-aws
[info] Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
[info] Write dates and timestamps:               Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
[info]
[info] Create a dataset of timestamps                     1170          1233         60        8.5        117.0      1.0X
[info] to_csv(timestamp)                                  9771          9838         58        1.0        977.1      0.1X
[info] write timestamps to files                          8752          8790         34        1.1        875.2      0.1X
[info] Create a dataset of dates                          1330          1341          9        7.5        133.0      0.9X
[info] to_csv(date)                                       6502          6518         14        1.5        650.2      0.2X
[info] write dates to files                               5487          5503         14        1.8        548.7      0.2X
[info] OpenJDK 64-Bit Server VM 1.8.0_292-8u292-b10-0ubuntu1~18.04-b10 on Linux 5.4.0-1045-aws
[info] Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
[info] Read dates and timestamps:                Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
[info]
[info] read timestamp text from files                     1508          1535         26        6.6        150.8      1.0X
[info] read timestamps from files                        24018         24608        531        0.4       2401.8      0.1X
[info] infer timestamps from files                       51043         51171        111        0.2       5104.3      0.0X
[info] read date text from files                          1437          1451         15        7.0        143.7      1.0X
[info] read date from files                               9391          9433         51        1.1        939.1      0.2X
[info] infer date from files                             21983         22029         77        0.5       2198.3      0.1X
[info] timestamp strings                                  2488          2519         46        4.0        248.8      0.6X
[info] parse timestamps from Dataset[String]             27073         27108         33        0.4       2707.3      0.1X
[info] infer timestamps from Dataset[String]             53325         53399        106        0.2       5332.5      0.0X
[info] date strings                                       2802          2809          6        3.6        280.2      0.5X
[info] parse dates from Dataset[String]                  11487         11577         96        0.9       1148.7      0.1X
[info] from_csv(timestamp)                               25019         25068         55        0.4       2501.9      0.1X
[info] from_csv(date)                                    10394         10431         39        1.0       1039.4      0.1X
```

With the PR changes:
```
[info] OpenJDK 64-Bit Server VM 1.8.0_292-8u292-b10-0ubuntu1~18.04-b10 on Linux 5.4.0-1045-aws
[info] Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
[info] Write dates and timestamps:               Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
[info]
[info] Create a dataset of timestamps                     1164          1215         44        8.6        116.4      1.0X
[info] to_csv(timestamp)                                  9733          9831        125        1.0        973.3      0.1X
[info] write timestamps to files                          8810          8832         22        1.1        881.0      0.1X
[info] Create a dataset of dates                          1339          1348          9        7.5        133.9      0.9X
[info] to_csv(date)                                       6511          6519         12        1.5        651.1      0.2X
[info] write dates to files                               5488          5500         11        1.8        548.8      0.2X
[info] OpenJDK 64-Bit Server
```
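The observation that the two runs are about the same can be checked against the quoted write-side figures. The following is a minimal sketch, not part of the Spark benchmark suite: the class `WriteBenchmarkDelta` and its methods are hypothetical, and the arrays simply copy the Best Time(ms) column from the two "Write dates and timestamps" tables. Only the write cases are compared because the read results with the PR changes are cut off in this message.

```java
// Best Time(ms) values are copied from the "Write dates and timestamps"
// tables quoted in this comment. Class and method names are hypothetical.
final class WriteBenchmarkDelta {
    static final String[] CASES = {
        "Create a dataset of timestamps", "to_csv(timestamp)",
        "write timestamps to files", "Create a dataset of dates",
        "to_csv(date)", "write dates to files"
    };
    static final double[] WITHOUT_PR = {1170, 9771, 8752, 1330, 6502, 5487};
    static final double[] WITH_PR    = {1164, 9733, 8810, 1339, 6511, 5488};

    // Relative change in percent; a positive value means slower with the PR.
    static double percentChange(double before, double after) {
        return (after - before) / before * 100.0;
    }

    // Largest absolute change across all write cases.
    static double maxAbsPercentChange() {
        double max = 0.0;
        for (int i = 0; i < CASES.length; i++) {
            max = Math.max(max, Math.abs(percentChange(WITHOUT_PR[i], WITH_PR[i])));
        }
        return max;
    }

    public static void main(String[] args) {
        for (int i = 0; i < CASES.length; i++) {
            System.out.printf("%-32s %+.2f%%%n", CASES[i],
                percentChange(WITHOUT_PR[i], WITH_PR[i]));
        }
        System.out.printf("max |change| = %.2f%%%n", maxAbsPercentChange());
    }
}
```

On these figures every write case changes by less than 1% in either direction.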
[GitHub] [spark] SparkQA removed a comment on pull request #34634: [SPARK-37357][SQL] Add small partition factor for rebalance partitions
SparkQA removed a comment on pull request #34634: URL: https://github.com/apache/spark/pull/34634#issuecomment-978732278 **[Test build #145601 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145601/testReport)** for PR 34634 at commit [`e13f1fa`](https://github.com/apache/spark/commit/e13f1fa7072f7f7c684ea744299c730c70bdda14).
[GitHub] [spark] SparkQA removed a comment on pull request #34575: [SPARK-37273][SQL] Support hidden file metadata columns in Spark SQL
SparkQA removed a comment on pull request #34575: URL: https://github.com/apache/spark/pull/34575#issuecomment-978732328 **[Test build #145602 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145602/testReport)** for PR 34575 at commit [`2baccdb`](https://github.com/apache/spark/commit/2baccdbaac45f3d1178a917eaa81eee8c00e7d5b).
[GitHub] [spark] SparkQA commented on pull request #34700: [SPARK-36861][SQL] Use `yyyy-MM-dd` as the date pattern in partition discovery
SparkQA commented on pull request #34700: URL: https://github.com/apache/spark/pull/34700#issuecomment-978912270 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50090/
[GitHub] [spark] gengliangwang commented on pull request #34593: [SPARK-37324][SQL] Adds support for decimal rounding mode up, down, half_down
gengliangwang commented on pull request #34593: URL: https://github.com/apache/spark/pull/34593#issuecomment-978910772 @sarutak yeah, I did some investigation too. This feature does not seem to be commonly used.
[GitHub] [spark] cloud-fan closed pull request #34696: [SPARK-37389][SQL][3.1] Check unclosed bracketed comments
cloud-fan closed pull request #34696: URL: https://github.com/apache/spark/pull/34696
[GitHub] [spark] SparkQA commented on pull request #34634: [SPARK-37357][SQL] Add small partition factor for rebalance partitions
SparkQA commented on pull request #34634: URL: https://github.com/apache/spark/pull/34634#issuecomment-978906514 **[Test build #145601 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145601/testReport)** for PR 34634 at commit [`e13f1fa`](https://github.com/apache/spark/commit/e13f1fa7072f7f7c684ea744299c730c70bdda14). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] cloud-fan commented on pull request #34696: [SPARK-37389][SQL][3.1] Check unclosed bracketed comments
cloud-fan commented on pull request #34696: URL: https://github.com/apache/spark/pull/34696#issuecomment-978906527 thanks, merging to 3.1!
[GitHub] [spark] LuciferYang commented on pull request #34676: [SPARK-37434][BUILD] Add a new profile to auto disable unsupported UTs on MacOs using Apple Silicon
LuciferYang commented on pull request #34676: URL: https://github.com/apache/spark/pull/34676#issuecomment-978905172 I respect your choice @sarutak @HyukjinKwon @dongjoon-hyun
[GitHub] [spark] SparkQA commented on pull request #34575: [SPARK-37273][SQL] Support hidden file metadata columns in Spark SQL
SparkQA commented on pull request #34575: URL: https://github.com/apache/spark/pull/34575#issuecomment-978904700 **[Test build #145602 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145602/testReport)** for PR 34575 at commit [`2baccdb`](https://github.com/apache/spark/commit/2baccdbaac45f3d1178a917eaa81eee8c00e7d5b). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AngersZhuuuu commented on pull request #34689: [SPARK-37445][BUILD] Upgrade hadoop profile to hadoop-3.3 since we support hadoop-3.3 as default now
AngersZhuuuu commented on pull request #34689: URL: https://github.com/apache/spark/pull/34689#issuecomment-978904604 > LGTM, please fix the conflicts. Done
[GitHub] [spark] cloud-fan closed pull request #34704: [SPARK-37456][SQL] CREATE NAMESPACE should qualify location for v2 command
cloud-fan closed pull request #34704: URL: https://github.com/apache/spark/pull/34704
[GitHub] [spark] cloud-fan commented on pull request #34704: [SPARK-37456][SQL] CREATE NAMESPACE should qualify location for v2 command
cloud-fan commented on pull request #34704: URL: https://github.com/apache/spark/pull/34704#issuecomment-978903935 thanks, merging to master!
[GitHub] [spark] HyukjinKwon commented on pull request #34676: [SPARK-37434][BUILD] Add a new profile to auto disable unsupported UTs on MacOs using Apple Silicon
HyukjinKwon commented on pull request #34676: URL: https://github.com/apache/spark/pull/34676#issuecomment-978902684 I am okay .. but let me defer to @dongjoon-hyun
[GitHub] [spark] cloud-fan commented on pull request #34689: [SPARK-37445][BUILD] Upgrade hadoop profile to hadoop-3.3 since we support hadoop-3.3 as default now
cloud-fan commented on pull request #34689: URL: https://github.com/apache/spark/pull/34689#issuecomment-978902560 LGTM, please fix the conflicts.
[GitHub] [spark] sarutak commented on pull request #34676: [SPARK-37434][BUILD] Add a new profile to auto disable unsupported UTs on MacOs using Apple Silicon
sarutak commented on pull request #34676: URL: https://github.com/apache/spark/pull/34676#issuecomment-978899026 The change itself looks fine, but we need to discuss whether this change is really necessary, given @HyukjinKwon's concern. @dongjoon-hyun Do you have any opinions? I guess you sometimes run tests on Apple Silicon.
[GitHub] [spark] sarutak edited a comment on pull request #34676: [SPARK-37434][BUILD] Add a new profile to auto disable unsupported UTs on MacOs using Apple Silicon
sarutak edited a comment on pull request #34676: URL: https://github.com/apache/spark/pull/34676#issuecomment-978899026 The change itself looks fine, but we need to discuss whether this change is really necessary, given @HyukjinKwon's concern. https://github.com/apache/spark/pull/34676#discussion_r753929523 @dongjoon-hyun Do you have any opinions? I guess you sometimes run tests on Apple Silicon.
[GitHub] [spark] sadikovi commented on pull request #34596: [SPARK-37326][SQL] Support TimestampNTZ in CSV data source
sadikovi commented on pull request #34596: URL: https://github.com/apache/spark/pull/34596#issuecomment-978898135 I am following up on the benchmark.
[GitHub] [spark] sarutak commented on pull request #34593: [SPARK-37324][SQL] Adds support for decimal rounding mode up, down, half_down
sarutak commented on pull request #34593: URL: https://github.com/apache/spark/pull/34593#issuecomment-978897180 I investigated whether the major DBMS and query processing systems (PostgreSQL, BigQuery, Snowflake, MySQL, SAP HANA, MS SQL Server, Firebird, DB2, Teradata, Hive) support the third parameter of `round`, but I found that only SAP HANA and MS SQL Server do. @sathiyapk How many users need to change the rounding mode, and how often are they used? If users really need `up`, `down` and `half_down`, I think it's not difficult to implement using `BigDecimal.setScale`. @HyukjinKwon @gengliangwang What do you think?
[GitHub] [spark] AmplabJenkins commented on pull request #34676: [SPARK-37434][BUILD] Add a new profile to auto disable unsupported UTs on MacOs using Apple Silicon
AmplabJenkins commented on pull request #34676: URL: https://github.com/apache/spark/pull/34676#issuecomment-978896363 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145615/
[GitHub] [spark] HyukjinKwon closed pull request #34707: [SPARK-37459][BUILD] Upgrade commons-cli to 1.5.0
HyukjinKwon closed pull request #34707: URL: https://github.com/apache/spark/pull/34707
[GitHub] [spark] HyukjinKwon commented on pull request #34707: [SPARK-37459][BUILD] Upgrade commons-cli to 1.5.0
HyukjinKwon commented on pull request #34707: URL: https://github.com/apache/spark/pull/34707#issuecomment-978895583 Merged to master.
[GitHub] [spark] SparkQA removed a comment on pull request #34676: [SPARK-37434][BUILD] Add a new profile to auto disable unsupported UTs on MacOs using Apple Silicon
SparkQA removed a comment on pull request #34676: URL: https://github.com/apache/spark/pull/34676#issuecomment-978808535 **[Test build #145615 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145615/testReport)** for PR 34676 at commit [`3386000`](https://github.com/apache/spark/commit/3386000816c75f75b31a36177bf6bc2a73bfc393).
[GitHub] [spark] SparkQA commented on pull request #34676: [SPARK-37434][BUILD] Add a new profile to auto disable unsupported UTs on MacOs using Apple Silicon
SparkQA commented on pull request #34676: URL: https://github.com/apache/spark/pull/34676#issuecomment-978895274 **[Test build #145615 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145615/testReport)** for PR 34676 at commit [`3386000`](https://github.com/apache/spark/commit/3386000816c75f75b31a36177bf6bc2a73bfc393). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34676: [SPARK-37434][BUILD] Add a new profile to auto disable unsupported UTs on MacOs using Apple Silicon
AmplabJenkins removed a comment on pull request #34676: URL: https://github.com/apache/spark/pull/34676#issuecomment-978870967
[GitHub] [spark] SparkQA commented on pull request #34596: [SPARK-37326][SQL] Support TimestampNTZ in CSV data source
SparkQA commented on pull request #34596: URL: https://github.com/apache/spark/pull/34596#issuecomment-978893221 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50091/
[GitHub] [spark] SparkQA commented on pull request #34367: [SPARK-37099][SQL] Impl a rank-based filter to optimize top-k computation
SparkQA commented on pull request #34367: URL: https://github.com/apache/spark/pull/34367#issuecomment-978893149 **[Test build #145621 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145621/testReport)** for PR 34367 at commit [`873083f`](https://github.com/apache/spark/commit/873083fe8e55c32ba7310e621430950c0a8328a3).
[GitHub] [spark] SparkQA commented on pull request #34677: [SPARK-37436][PYTHON] Uses Python's standard string formatter for SQL API in pandas API on Spark
SparkQA commented on pull request #34677: URL: https://github.com/apache/spark/pull/34677#issuecomment-978892893 **[Test build #145620 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145620/testReport)** for PR 34677 at commit [`36e8f75`](https://github.com/apache/spark/commit/36e8f759ca098340e93b98eaba77f3669a492de8).
[GitHub] [spark] SparkQA commented on pull request #34676: [SPARK-37434][BUILD] Add a new profile to auto disable unsupported UTs on MacOs using Apple Silicon
SparkQA commented on pull request #34676: URL: https://github.com/apache/spark/pull/34676#issuecomment-978892647 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50087/
[GitHub] [spark] AmplabJenkins commented on pull request #34676: [SPARK-37434][BUILD] Add a new profile to auto disable unsupported UTs on MacOs using Apple Silicon
AmplabJenkins commented on pull request #34676: URL: https://github.com/apache/spark/pull/34676#issuecomment-978892679 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50087/
[GitHub] [spark] beliefer commented on pull request #33588: [SPARK-36346][SQL] Support TimestampNTZ type in Orc file source
beliefer commented on pull request #33588: URL: https://github.com/apache/spark/pull/33588#issuecomment-978892156 @cloud-fan Thank you for the ping.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34677: [SPARK-37436][PYTHON] Uses Python's standard string formatter for SQL API in pandas API on Spark
AmplabJenkins removed a comment on pull request #34677: URL: https://github.com/apache/spark/pull/34677#issuecomment-978891398 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145608/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34704: [SPARK-37456][SQL] CREATE NAMESPACE should qualify location for v2 command
AmplabJenkins removed a comment on pull request #34704: URL: https://github.com/apache/spark/pull/34704#issuecomment-978891399 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50086/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34689: [SPARK-37445][BUILD] Upgrade hadoop profile to hadoop-3.3 since we support hadoop-3.3 as default now
AmplabJenkins removed a comment on pull request #34689: URL: https://github.com/apache/spark/pull/34689#issuecomment-978891395 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145606/
[GitHub] [spark] AmplabJenkins commented on pull request #34707: [SPARK-37459][BUILD] Upgrade commons-cli to 1.5.0
AmplabJenkins commented on pull request #34707: URL: https://github.com/apache/spark/pull/34707#issuecomment-978891397 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50088/
[GitHub] [spark] AmplabJenkins commented on pull request #34689: [SPARK-37445][BUILD] Upgrade hadoop profile to hadoop-3.3 since we support hadoop-3.3 as default now
AmplabJenkins commented on pull request #34689: URL: https://github.com/apache/spark/pull/34689#issuecomment-978891395 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145606/
[GitHub] [spark] AmplabJenkins commented on pull request #34704: [SPARK-37456][SQL] CREATE NAMESPACE should qualify location for v2 command
AmplabJenkins commented on pull request #34704: URL: https://github.com/apache/spark/pull/34704#issuecomment-978891399 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50086/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34706: [SPARK-37458][SS] Remove unnecessary SerializeFromObject from the plan of foreachBatch
AmplabJenkins removed a comment on pull request #34706: URL: https://github.com/apache/spark/pull/34706#issuecomment-978891396 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50085/
[GitHub] [spark] AmplabJenkins commented on pull request #34677: [SPARK-37436][PYTHON] Uses Python's standard string formatter for SQL API in pandas API on Spark
AmplabJenkins commented on pull request #34677: URL: https://github.com/apache/spark/pull/34677#issuecomment-978891398 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145608/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34707: [SPARK-37459][BUILD] Upgrade commons-cli to 1.5.0
AmplabJenkins removed a comment on pull request #34707: URL: https://github.com/apache/spark/pull/34707#issuecomment-978891397 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50088/
[GitHub] [spark] AmplabJenkins commented on pull request #34706: [SPARK-37458][SS] Remove unnecessary SerializeFromObject from the plan of foreachBatch
AmplabJenkins commented on pull request #34706: URL: https://github.com/apache/spark/pull/34706#issuecomment-978891396 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50085/
[GitHub] [spark] SparkQA commented on pull request #34707: [SPARK-37459][BUILD] Upgrade commons-cli to 1.5.0
SparkQA commented on pull request #34707: URL: https://github.com/apache/spark/pull/34707#issuecomment-978891075 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50088/
[GitHub] [spark] beliefer commented on pull request #34696: [SPARK-37389][SQL][3.1] Check unclosed bracketed comments
beliefer commented on pull request #34696: URL: https://github.com/apache/spark/pull/34696#issuecomment-978890666 ping @cloud-fan
[GitHub] [spark] HeartSaVioR commented on pull request #34706: [SPARK-37458][SS] Remove unnecessary SerializeFromObject from the plan of foreachBatch
HeartSaVioR commented on pull request #34706: URL: https://github.com/apache/spark/pull/34706#issuecomment-978889873 cc. @tdas @zsxwing @viirya @xuanyuanking
[GitHub] [spark] SparkQA commented on pull request #34610: [SPARK-34332][SQL][TEST] Unify v1 and v2 ALTER TABLE .. SET LOCATION tests
SparkQA commented on pull request #34610: URL: https://github.com/apache/spark/pull/34610#issuecomment-97970 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50089/
[GitHub] [spark] SparkQA commented on pull request #34700: [SPARK-36861][SQL] Use `yyyy-MM-dd` as the date pattern in partition discovery
SparkQA commented on pull request #34700: URL: https://github.com/apache/spark/pull/34700#issuecomment-97654 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50090/
[GitHub] [spark] zhengruifeng edited a comment on pull request #34367: [SPARK-37099][SQL] Impl a rank-based filter to optimize top-k computation
zhengruifeng edited a comment on pull request #34367: URL: https://github.com/apache/spark/pull/34367#issuecomment-978884601 @wangyum this PR was updated to support `rank` and `dense_rank` ``` scala> spark.conf.set("spark.sql.rankLimit.enabled", "true") scala> spark.sql("""SELECT a, b, rank() OVER (PARTITION BY a ORDER BY b) as rk FROM VALUES ('A1', 1), ('A1', 1), ('A2', 3), ('A1', 1) tab(a, b);""").where("rk = 1").queryExecution.optimizedPlan res1: org.apache.spark.sql.catalyst.plans.logical.LogicalPlan = Filter (rk#0 = 1) +- Window [rank(b#4) windowspecdefinition(a#3, b#4 ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS rk#0], [a#3], [b#4 ASC NULLS FIRST] +- RankLimit [a#3], [b#4 ASC NULLS FIRST], rank(b#4), 1 +- LocalRelation [a#3, b#4] scala> spark.sql("""SELECT a, b, rank() OVER (PARTITION BY a ORDER BY b) as rk FROM VALUES ('A1', 1), ('A1', 1), ('A2', 3), ('A1', 1) tab(a, b);""").where("rk = 1").show +---+---+---+ | a| b| rk| +---+---+---+ | A1| 1| 1| | A1| 1| 1| | A1| 1| 1| | A2| 3| 1| +---+---+---+ ``` ![image](https://user-images.githubusercontent.com/7322292/143392842-c046c52d-a31d-4af9-aed9-ef16714ebb45.png) ``` == Physical Plan == AdaptiveSparkPlan (17) +- == Final Plan == * Project (10) +- * Filter (9) +- Window (8) +- * Sort (7) +- AQEShuffleRead (6) +- ShuffleQueryStage (5) +- Exchange (4) +- RankLimit (3) +- * Sort (2) +- * LocalTableScan (1) +- == Initial Plan == Project (16) +- Filter (15) +- Window (14) +- Sort (13) +- Exchange (12) +- RankLimit (11) +- Sort (2) +- LocalTableScan (1) (1) LocalTableScan [codegen id : 1] Output [2]: [a#17, b#18] Arguments: [a#17, b#18] (2) Sort [codegen id : 1] Input [2]: [a#17, b#18] Arguments: [a#17 ASC NULLS FIRST, b#18 ASC NULLS FIRST], false, 0 (3) RankLimit Input [2]: [a#17, b#18] Arguments: [a#17], [b#18 ASC NULLS FIRST], rank(b#18), 1 (4) Exchange Input [2]: [a#17, b#18] Arguments: hashpartitioning(a#17, 200), ENSURE_REQUIREMENTS, [id=#37] (5) ShuffleQueryStage 
Output [2]: [a#17, b#18] Arguments: 0 (6) AQEShuffleRead Input [2]: [a#17, b#18] Arguments: coalesced (7) Sort [codegen id : 2] Input [2]: [a#17, b#18] Arguments: [a#17 ASC NULLS FIRST, b#18 ASC NULLS FIRST], false, 0 (8) Window Input [2]: [a#17, b#18] Arguments: [rank(b#18) windowspecdefinition(a#17, b#18 ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS rk#14], [a#17], [b#18 ASC NULLS FIRST] (9) Filter [codegen id : 3] Input [3]: [a#17, b#18, rk#14] Condition : (rk#14 = 1) (10) Project [codegen id : 3] Output [3]: [a#17, cast(b#18 as string) AS b#32, cast(rk#14 as string) AS rk#33] Input [3]: [a#17, b#18, rk#14] (11) RankLimit Input [2]: [a#17, b#18] Arguments: [a#17], [b#18 ASC NULLS FIRST], rank(b#18), 1 (12) Exchange Input [2]: [a#17, b#18] Arguments: hashpartitioning(a#17, 200), ENSURE_REQUIREMENTS, [id=#23] (13) Sort Input [2]: [a#17, b#18] Arguments: [a#17 ASC NULLS FIRST, b#18 ASC NULLS FIRST], false, 0 (14) Window Input [2]: [a#17, b#18] Arguments: [rank(b#18) windowspecdefinition(a#17, b#18 ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS rk#14], [a#17], [b#18 ASC NULLS FIRST] (15) Filter Input [3]: [a#17, b#18, rk#14] Condition : (rk#14 = 1) (16) Project Output [3]: [a#17, cast(b#18 as string) AS b#32, cast(rk#14 as string) AS rk#33] Input [3]: [a#17, b#18, rk#14] (17) AdaptiveSparkPlan Output [3]: [a#17, b#32, rk#33] Arguments: isFinalPlan=true ```
[GitHub] [spark] zhengruifeng commented on pull request #34367: [SPARK-37099][SQL] Impl a rank-based filter to optimize top-k computation
zhengruifeng commented on pull request #34367: URL: https://github.com/apache/spark/pull/34367#issuecomment-978884601 @wangyum this PR was updated to support `rank` and `dense_rank` ``` scala> spark.conf.set("spark.sql.rankLimit.enabled", "true") scala> spark.sql("""SELECT a, b, rank() OVER (PARTITION BY a ORDER BY b) as rk FROM VALUES ('A1', 1), ('A1', 1), ('A2', 3), ('A1', 1) tab(a, b);""").where("rk = 1").queryExecution.optimizedPlan res1: org.apache.spark.sql.catalyst.plans.logical.LogicalPlan = Filter (rk#0 = 1) +- Window [rank(b#4) windowspecdefinition(a#3, b#4 ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS rk#0], [a#3], [b#4 ASC NULLS FIRST] +- RankLimit [a#3], [b#4 ASC NULLS FIRST], rank(b#4), 1 +- LocalRelation [a#3, b#4] scala> spark.sql("""SELECT a, b, rank() OVER (PARTITION BY a ORDER BY b) as rk FROM VALUES ('A1', 1), ('A1', 1), ('A2', 3), ('A1', 1) tab(a, b);""").where("rk = 1").show +---+---+---+ | a| b| rk| +---+---+---+ | A1| 1| 1| | A1| 1| 1| | A1| 1| 1| | A2| 3| 1| +---+---+---+ ``` ![image](https://user-images.githubusercontent.com/7322292/143392842-c046c52d-a31d-4af9-aed9-ef16714ebb45.png)
[GitHub] [spark] zhengruifeng commented on a change in pull request #34367: [SPARK-37099][SQL] Impl a rank-based filter to optimize top-k computation
zhengruifeng commented on a change in pull request #34367: URL: https://github.com/apache/spark/pull/34367#discussion_r756544659 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/window/RankLimitExec.scala ## @@ -0,0 +1,93 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + + +package org.apache.spark.sql.execution.window + +import org.apache.spark.rdd.RDD +import org.apache.spark.sql.catalyst.InternalRow +import org.apache.spark.sql.catalyst.expressions._ +import org.apache.spark.sql.catalyst.expressions.codegen.LazilyGeneratedOrdering +import org.apache.spark.sql.catalyst.plans.physical._ +import org.apache.spark.sql.execution.{GroupedIterator, SparkPlan, UnaryExecNode} +import org.apache.spark.util.collection.Utils + +/** + * This operator is designed to filter out unnecessary rows before WindowExec, + * for top-k computation. 
+ * @param partitionSpec Should be the same as [[WindowExec#partitionSpec]] + * @param orderSpec Should be the same as [[WindowExec#orderSpec]] + */ +case class RankLimitExec( +partitionSpec: Seq[Expression], +orderSpec: Seq[SortOrder], +limit: Int, +child: SparkPlan) extends UnaryExecNode { + assert(orderSpec.nonEmpty && limit > 0) + + // apply Utils.takeOrdered when partitionSpec is empty and row_number is used. + private def applyTakeOrdered: Boolean = partitionSpec.isEmpty + + override def output: Seq[Attribute] = child.output + + override def requiredChildOrdering: Seq[Seq[SortOrder]] = { +if (applyTakeOrdered) { + super.requiredChildOrdering +} else { + // Should be the same as [[WindowExec#requiredChildOrdering]] + Seq(partitionSpec.map(SortOrder(_, Ascending)) ++ orderSpec) +} + } + + override def outputOrdering: Seq[SortOrder] = { +if (applyTakeOrdered) { + orderSpec +} else { + child.outputOrdering +} + } + + override def outputPartitioning: Partitioning = child.outputPartitioning + + // TODO: support rank and dense_rank Review comment: Within each map task: 1. sort the partition by `(a, b)`; 2. group the iterator by `a`: `GroupedIterator.apply(stream, partitionSpec, output)`; 3. within each group, group again by `b`, compute `rk`, and skip the row if `rk` > 1.
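The three steps in the review comment above can be sketched on plain Scala collections. This is only an illustration under stated assumptions, not the PR's `RankLimitExec` code: `Row`, `RankLimitSketch`, and `rankLimit` are hypothetical names, and the input is assumed to be already sorted by `(a, b)` as step 1 requires.

```scala
// Hypothetical stand-in for an input row: `a` is the partition key, `b` the order key.
final case class Row(a: String, b: Int)

object RankLimitSketch {
  // Keeps rows whose rank() within their `a` group is <= limit.
  // Assumption: `sorted` is already ordered by (a, b) (step 1).
  def rankLimit(sorted: Seq[Row], limit: Int): Seq[Row] = {
    require(limit > 0, "limit must be positive")
    val out = Seq.newBuilder[Row]
    var rest = sorted
    while (rest.nonEmpty) {
      // Step 2: take the run of rows that share the same partition key `a`.
      val key = rest.head.a
      val (group, afterGroup) = rest.span(_.a == key)
      // Step 3: walk runs of equal `b`; all rows in one run share one rank.
      var rank = 1
      var inGroup = group
      while (inGroup.nonEmpty && rank <= limit) {
        val b = inGroup.head.b
        val (ties, afterTies) = inGroup.span(_.b == b)
        out ++= ties
        rank += ties.length // rank() jumps ahead by the number of tied rows
        inGroup = afterTies
      }
      rest = afterGroup
    }
    out.result()
  }
}
```

With the sorted version of the data from the `spark.sql` example in this thread (three `('A1', 1)` rows and one `('A2', 3)` row) and `limit = 1`, all four rows are kept, because the tied rows share rank 1. Replacing `rank += ties.length` with `rank += 1` would give `dense_rank` semantics instead.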
[GitHub] [spark] SparkQA commented on pull request #34706: [SPARK-37458][SS] Remove unnecessary SerializeFromObject from the plan of foreachBatch
SparkQA commented on pull request #34706: URL: https://github.com/apache/spark/pull/34706#issuecomment-978882761 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50085/
[GitHub] [spark] sadikovi edited a comment on pull request #34596: [SPARK-37326][SQL] Support TimestampNTZ in CSV data source
sadikovi edited a comment on pull request #34596: URL: https://github.com/apache/spark/pull/34596#issuecomment-978881155 @gengliangwang @cloud-fan I updated the code, can you review again? Thank you.
[GitHub] [spark] SparkQA removed a comment on pull request #34689: [SPARK-37445][BUILD] Upgrade hadoop profile to hadoop-3.3 since we support hadoop-3.3 as default now
SparkQA removed a comment on pull request #34689: URL: https://github.com/apache/spark/pull/34689#issuecomment-978777539 **[Test build #145606 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145606/testReport)** for PR 34689 at commit [`3e132f3`](https://github.com/apache/spark/commit/3e132f3e2c9a5d9ffbaad75d3bf939f91f4ff43f).
[GitHub] [spark] SparkQA removed a comment on pull request #34677: [SPARK-37436][PYTHON] Uses Python's standard string formatter for SQL API in pandas API on Spark
SparkQA removed a comment on pull request #34677: URL: https://github.com/apache/spark/pull/34677#issuecomment-978777602 **[Test build #145608 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145608/testReport)** for PR 34677 at commit [`0f84b8c`](https://github.com/apache/spark/commit/0f84b8c5e9b028ceb142ebec1c8540e0943899cf).
[GitHub] [spark] sadikovi commented on pull request #34596: [SPARK-37326][SQL] Support TimestampNTZ in CSV data source
sadikovi commented on pull request #34596: URL: https://github.com/apache/spark/pull/34596#issuecomment-978881155 @gengliangwang @cloud-fan I updated the code, can you review again?
[GitHub] [spark] SparkQA removed a comment on pull request #34685: [SPARK-37443][PYTHON] Provide a profiler for Python/Pandas UDFs
SparkQA removed a comment on pull request #34685: URL: https://github.com/apache/spark/pull/34685#issuecomment-978779266 **[Test build #145612 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145612/testReport)** for PR 34685 at commit [`0a2a313`](https://github.com/apache/spark/commit/0a2a31322bb43845bf530182cde769f0d89f825b).
[GitHub] [spark] gengliangwang commented on pull request #34593: [SPARK-37324][SQL] Adds support for decimal rounding mode up, down, half_down
gengliangwang commented on pull request #34593: URL: https://github.com/apache/spark/pull/34593#issuecomment-978880111
```
Sql Server does something similar to this : ROUND ( numeric_expression , length [ ,function ] )
REF : https://docs.microsoft.com/en-us/sql/t-sql/functions/round-transact-sql?view=sql-server-ver15
```
This doesn't seem related to the proposal in this PR. Is there any other DBMS that supports the rounding mode as the third parameter?
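For readers following the rounding-mode discussion, here is a minimal sketch of how the four modes named in this thread (`half_up`, `half_down`, `up`, `down`) differ on the same input. It uses Python's standard `decimal` module purely as an illustration; it is not the code from this PR, and the helper name `round_with_mode` and the `MODES` mapping are hypothetical.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_DOWN, ROUND_HALF_UP, ROUND_UP

# Hypothetical mapping from the mode names used in the thread to the
# rounding constants of Python's decimal module (an assumption, not Spark code).
MODES = {
    "half_up": ROUND_HALF_UP,      # ties go away from zero
    "half_down": ROUND_HALF_DOWN,  # ties go toward zero
    "up": ROUND_UP,                # always away from zero
    "down": ROUND_DOWN,            # always toward zero (truncate)
}


def round_with_mode(value: str, scale: int = 0, mode: str = "half_up") -> Decimal:
    """Round a decimal string to `scale` fractional digits using `mode`."""
    quantum = Decimal(1).scaleb(-scale)  # 10 ** -scale, e.g. 0.01 for scale=2
    return Decimal(value).quantize(quantum, rounding=MODES[mode])


if __name__ == "__main__":
    for mode in MODES:
        print(mode, round_with_mode("2.5", 0, mode))
```

With `2.5` and scale `0`, `half_up` and `up` give `3`, while `half_down` and `down` give `2`, which is why the default mode matters for the SQL string shown to external systems.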
[GitHub] [spark] SparkQA commented on pull request #34704: [SPARK-37456][SQL] CREATE NAMESPACE should qualify location for v2 command
SparkQA commented on pull request #34704: URL: https://github.com/apache/spark/pull/34704#issuecomment-978877379 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50086/
[GitHub] [spark] SparkQA commented on pull request #34677: [SPARK-37436][PYTHON] Uses Python's standard string formatter for SQL API in pandas API on Spark
SparkQA commented on pull request #34677: URL: https://github.com/apache/spark/pull/34677#issuecomment-978876498 **[Test build #145608 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145608/testReport)** for PR 34677 at commit [`0f84b8c`](https://github.com/apache/spark/commit/0f84b8c5e9b028ceb142ebec1c8540e0943899cf). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #34689: [SPARK-37445][BUILD] Upgrade hadoop profile to hadoop-3.3 since we support hadoop-3.3 as default now
SparkQA commented on pull request #34689: URL: https://github.com/apache/spark/pull/34689#issuecomment-978876318 **[Test build #145606 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145606/testReport)** for PR 34689 at commit [`3e132f3`](https://github.com/apache/spark/commit/3e132f3e2c9a5d9ffbaad75d3bf939f91f4ff43f). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] sadikovi commented on pull request #34659: [SPARK-34863][SQL] Support complex types for Parquet vectorized reader
sadikovi commented on pull request #34659: URL: https://github.com/apache/spark/pull/34659#issuecomment-978875609 @sunchao Let me know if you would like me to revisit the code in this PR. Thanks.
[GitHub] [spark] AmplabJenkins commented on pull request #34685: [SPARK-37443][PYTHON] Provide a profiler for Python/Pandas UDFs
AmplabJenkins commented on pull request #34685: URL: https://github.com/apache/spark/pull/34685#issuecomment-978873405 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145612/
[GitHub] [spark] SparkQA commented on pull request #34685: [SPARK-37443][PYTHON] Provide a profiler for Python/Pandas UDFs
SparkQA commented on pull request #34685: URL: https://github.com/apache/spark/pull/34685#issuecomment-978872419 **[Test build #145612 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145612/testReport)** for PR 34685 at commit [`0a2a313`](https://github.com/apache/spark/commit/0a2a31322bb43845bf530182cde769f0d89f825b). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #34707: [SPARK-37459][BUILD] Upgrade commons-cli to 1.5.0
SparkQA commented on pull request #34707: URL: https://github.com/apache/spark/pull/34707#issuecomment-978871652 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50088/
[GitHub] [spark] AmplabJenkins commented on pull request #34676: [SPARK-37434][BUILD] Add a new profile to auto disable unsupported UTs on MacOs using Apple Silicon
AmplabJenkins commented on pull request #34676: URL: https://github.com/apache/spark/pull/34676#issuecomment-978870967 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50083/
[GitHub] [spark] SparkQA commented on pull request #34676: [SPARK-37434][BUILD] Add a new profile to auto disable unsupported UTs on MacOs using Apple Silicon
SparkQA commented on pull request #34676: URL: https://github.com/apache/spark/pull/34676#issuecomment-978870946 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50083/
[GitHub] [spark] sadikovi commented on a change in pull request #34596: [SPARK-37326][SQL] Support TimestampNTZ in CSV data source
sadikovi commented on a change in pull request #34596: URL: https://github.com/apache/spark/pull/34596#discussion_r756596801
## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVOptions.scala ##
@@ -164,6 +164,10 @@ class CSVOptions(
 s"${DateFormatter.defaultPattern}'T'HH:mm:ss[.SSS][XXX]"
 })
+ val timestampNTZFormatInRead: Option[String] = parameters.get("timestampNTZFormat")
Review comment: Updated!
[GitHub] [spark] SparkQA commented on pull request #34596: [SPARK-37326][SQL] Support TimestampNTZ in CSV data source
SparkQA commented on pull request #34596: URL: https://github.com/apache/spark/pull/34596#issuecomment-978870015 **[Test build #145619 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145619/testReport)** for PR 34596 at commit [`662460f`](https://github.com/apache/spark/commit/662460f8c8a365d51fc634c1d8cc37e61e7aef08).
[GitHub] [spark] cloud-fan commented on pull request #33588: [SPARK-36346][SQL] Support TimestampNTZ type in Orc file source
cloud-fan commented on pull request #33588: URL: https://github.com/apache/spark/pull/33588#issuecomment-978869923 BTW, it seems the same problem happens with the old timestamp LTZ type as well. We should have a separate PR to fix it and backport it. Also cc @gengliangwang @MaxGekk
[GitHub] [spark] cloud-fan commented on pull request #33588: [SPARK-36346][SQL] Support TimestampNTZ type in Orc file source
cloud-fan commented on pull request #33588: URL: https://github.com/apache/spark/pull/33588#issuecomment-978869440 @bersprockets good catch! I hope ORC can allow us to write int64 directly as timestamp, like Parquet does. Otherwise, we probably have to do some shifting as you suggested. @beliefer can you look into https://github.com/apache/spark/compare/master...bersprockets:orc_ntz_issue_play and make a fix? Thanks!
[GitHub] [spark] AngersZhuuuu commented on pull request #34689: [SPARK-37445][BUILD] Upgrade hadoop profile to hadoop-3.3 since we support hadoop-3.3 as default now
AngersZhuuuu commented on pull request #34689: URL: https://github.com/apache/spark/pull/34689#issuecomment-978869261 > how about the file name of the final release binaries? Also ping @vanzin, since this part of the code was changed by him.
[GitHub] [spark] AngersZhuuuu commented on pull request #34689: [SPARK-37445][BUILD] Upgrade hadoop profile to hadoop-3.3 since we support hadoop-3.3 as default now
AngersZhuuuu commented on pull request #34689: URL: https://github.com/apache/spark/pull/34689#issuecomment-978867355
> how about the file name of the final release binaries?

In `make-distribution.sh`, the package name is passed by the option `--name`:
```
if [ "$MAKE_TGZ" == "true" ]; then
  TARDIR_NAME=spark-$VERSION-bin-$NAME
  TARDIR="$SPARK_HOME/$TARDIR_NAME"
  rm -rf "$TARDIR"
  cp -r "$DISTDIR" "$TARDIR"
  tar czf "spark-$VERSION-bin-$NAME.tgz" -C "$SPARK_HOME" "$TARDIR_NAME"
  rm -rf "$TARDIR"
fi
```
In `release-build.sh`, it uses the parameters in `BINARY_PKGS_ARGS`:
```
if [[ $PUBLISH_SCALA_2_12 = 1 ]]; then
  echo "Packages to build: ${!BINARY_PKGS_ARGS[@]}"
  for key in ${!BINARY_PKGS_ARGS[@]}; do
    args=${BINARY_PKGS_ARGS[$key]}
    extra=${BINARY_PKGS_EXTRA[$key]}
    if ! make_binary_release "$key" "$SCALA_2_12_PROFILES $args" "$extra" "2.12"; then
      error "Failed to build $key package. Check logs for details."
    fi
  done
fi
```
In this PR I have changed `BINARY_PKGS_ARGS`, so the release package name should be `hadoop3.3`. I don't have committer authority, so I can't do an end-to-end test.
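The naming rule described in the comment above can be sketched as a small function. This is only an illustration of the `spark-$VERSION-bin-$NAME.tgz` pattern quoted from `make-distribution.sh`; the function name is hypothetical, the version string `3.3.0` is a placeholder, and it is an assumption here that each `BINARY_PKGS_ARGS` key ends up as the `--name` value.

```python
def binary_tarball_name(version: str, name: str) -> str:
    """Mirror the quoted shell line: tar czf "spark-$VERSION-bin-$NAME.tgz"."""
    return f"spark-{version}-bin-{name}.tgz"


if __name__ == "__main__":
    # "hadoop3.3" is the package name stated in the comment;
    # "3.3.0" is a placeholder version chosen for illustration only.
    print(binary_tarball_name("3.3.0", "hadoop3.3"))
```

Under those assumptions the resulting file name would carry `hadoop3.3` after `-bin-`, which is what the reviewer's question about the release binaries is asking about.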
[GitHub] [spark] SparkQA commented on pull request #34700: [SPARK-36861][SQL] Use `yyyy-MM-dd` as the date pattern in partition discovery
SparkQA commented on pull request #34700: URL: https://github.com/apache/spark/pull/34700#issuecomment-978866793 **[Test build #145618 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145618/testReport)** for PR 34700 at commit [`505f8ff`](https://github.com/apache/spark/commit/505f8ffbba076f3bd2f61480ebb608785e00fc3e).
[GitHub] [spark] SparkQA commented on pull request #34676: [SPARK-37434][BUILD] Add a new profile to auto disable unsupported UTs on MacOs using Apple Silicon
SparkQA commented on pull request #34676: URL: https://github.com/apache/spark/pull/34676#issuecomment-978866335 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50087/
[GitHub] [spark] gengliangwang commented on a change in pull request #34593: [SPARK-37324][SQL] Adds support for decimal rounding mode up, down, half_down
gengliangwang commented on a change in pull request #34593: URL: https://github.com/apache/spark/pull/34593#discussion_r756592780
## File path: sql/core/src/test/resources/sql-functions/sql-expression-schema.md ##
@@ -242,7 +242,7 @@
 | org.apache.spark.sql.catalyst.expressions.Reverse | reverse | SELECT reverse('Spark SQL') | struct |
 | org.apache.spark.sql.catalyst.expressions.Right | right | SELECT right('Spark SQL', 3) | struct |
 | org.apache.spark.sql.catalyst.expressions.Rint | rint | SELECT rint(12.3456) | struct |
-| org.apache.spark.sql.catalyst.expressions.Round | round | SELECT round(2.5, 0) | struct |
+| org.apache.spark.sql.catalyst.expressions.Round | round | SELECT round(2.5, 0) | struct |
Review comment: Not every DBMS supports the third parameter. Some of the external connectors rely on the SQL representation string of expressions. So, shall we hide the third parameter if it is the default "half_up"?
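The suggestion in the review comment above, omitting the third parameter from the SQL string when it is the default, might look roughly like the sketch below. It is not Spark's implementation: the function name `round_sql`, the constant `DEFAULT_MODE`, and the quoted-string form of the third argument are all assumptions made for illustration.

```python
DEFAULT_MODE = "half_up"  # the default mode named in the review comment


def round_sql(child: str, scale: int, mode: str = DEFAULT_MODE) -> str:
    """Render a round(...) call, hiding the mode when it is the default."""
    if mode == DEFAULT_MODE:
        # Two-argument form, which systems without a rounding-mode
        # parameter can still parse.
        return f"round({child}, {scale})"
    # Assumed rendering of a non-default mode as a quoted third argument.
    return f"round({child}, {scale}, '{mode}')"


if __name__ == "__main__":
    print(round_sql("2.5", 0))
    print(round_sql("2.5", 0, "half_down"))
```

The design choice is that the default case produces exactly the string connectors saw before the PR, so only queries that opt into a non-default mode expose the new argument.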
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34685: [SPARK-37443][PYTHON] Provide a profiler for Python/Pandas UDFs
AmplabJenkins removed a comment on pull request #34685: URL: https://github.com/apache/spark/pull/34685#issuecomment-978865309 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50084/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34677: [SPARK-37436][PYTHON] Uses Python's standard string formatter for SQL API in pandas API on Spark
AmplabJenkins removed a comment on pull request #34677: URL: https://github.com/apache/spark/pull/34677#issuecomment-978865308 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50082/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34688: [SPARK-32079][PYTHON] Remove namedtuple hack by replacing built-in pickle to cloudpickle
AmplabJenkins removed a comment on pull request #34688: URL: https://github.com/apache/spark/pull/34688#issuecomment-978865311 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50081/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34696: [SPARK-37389][SQL][3.1] Check unclosed bracketed comments
AmplabJenkins removed a comment on pull request #34696: URL: https://github.com/apache/spark/pull/34696#issuecomment-978865305 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145599/
[GitHub] [spark] AmplabJenkins commented on pull request #34677: [SPARK-37436][PYTHON] Uses Python's standard string formatter for SQL API in pandas API on Spark
AmplabJenkins commented on pull request #34677: URL: https://github.com/apache/spark/pull/34677#issuecomment-978865308 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50082/
[GitHub] [spark] AmplabJenkins commented on pull request #34696: [SPARK-37389][SQL][3.1] Check unclosed bracketed comments
AmplabJenkins commented on pull request #34696: URL: https://github.com/apache/spark/pull/34696#issuecomment-978865305 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145599/
[GitHub] [spark] AmplabJenkins commented on pull request #34688: [SPARK-32079][PYTHON] Remove namedtuple hack by replacing built-in pickle to cloudpickle
AmplabJenkins commented on pull request #34688: URL: https://github.com/apache/spark/pull/34688#issuecomment-978865311 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50081/
[GitHub] [spark] AmplabJenkins commented on pull request #34685: [SPARK-37443][PYTHON] Provide a profiler for Python/Pandas UDFs
AmplabJenkins commented on pull request #34685: URL: https://github.com/apache/spark/pull/34685#issuecomment-978865309 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50084/
[GitHub] [spark] cloud-fan commented on pull request #34704: [SPARK-37456][SQL] CREATE NAMESPACE should qualify location for v2 command
cloud-fan commented on pull request #34704: URL: https://github.com/apache/spark/pull/34704#issuecomment-978864964 retest please
[GitHub] [spark] SparkQA commented on pull request #34685: [SPARK-37443][PYTHON] Provide a profiler for Python/Pandas UDFs
SparkQA commented on pull request #34685: URL: https://github.com/apache/spark/pull/34685#issuecomment-978863985 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50084/
[GitHub] [spark] MaxGekk commented on a change in pull request #34700: [SPARK-36861][SQL] Use `yyyy-MM-dd` as the date pattern in partition discovery
MaxGekk commented on a change in pull request #34700: URL: https://github.com/apache/spark/pull/34700#discussion_r756589929
## File path: mllib/src/test/scala/org/apache/spark/ml/source/image/ImageFileFormatSuite.scala ##
@@ -96,14 +95,14 @@ class ImageFileFormatSuite extends SparkFunSuite with MLlibTestSparkContext {
 .collect()
 assert(Set(result: _*) === Set(
- Row("29.5.a_b_EGDP022204.jpg", "kittens", Date.valueOf("2018-01-01")),
Review comment: I reverted the changes made by https://github.com/apache/spark/pull/33709/files#r688851936. Now the test looks the same as in `branch-3.2`.
[GitHub] [spark] SparkQA commented on pull request #34706: [SPARK-37458][SS] Remove unnecessary SerializeFromObject from the plan of foreachBatch
SparkQA commented on pull request #34706: URL: https://github.com/apache/spark/pull/34706#issuecomment-97886 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50085/
[GitHub] [spark] SparkQA commented on pull request #34677: [SPARK-37436][PYTHON] Uses Python's standard string formatter for SQL API in pandas API on Spark
SparkQA commented on pull request #34677: URL: https://github.com/apache/spark/pull/34677#issuecomment-978862257 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50082/