[GitHub] [spark] attilapiros commented on pull request #38348: [SPARK-40884][BUILD] Upgrade fabric8io - `kubernetes-client` to 6.2.0
attilapiros commented on PR #38348: URL: https://github.com/apache/spark/pull/38348#issuecomment-1288004072

ok to test

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HeartSaVioR commented on a diff in pull request #38288: [SPARK-40821][SQL][CORE][PYTHON][SS] Introduce window_time function to extract event time from the window column
HeartSaVioR commented on code in PR #38288: URL: https://github.com/apache/spark/pull/38288#discussion_r1002572937

## sql/core/src/test/scala/org/apache/spark/sql/DataFrameTimeWindowingSuite.scala: ##

```
@@ -575,4 +575,64 @@ class DataFrameTimeWindowingSuite extends QueryTest with SharedSparkSession {
     validateWindowColumnInSchema(schema2, "window")
   }
 }
+
+  test("window_time function on raw window column") {
+    val df = Seq(
+      ("2016-03-27 19:38:18"), ("2016-03-27 19:39:25")
+    ).toDF("time")
+
+    checkAnswer(
+      df.select(window($"time", "10 seconds").as("window"))
+        .select(
+          $"window.end".cast("string"),
+          window_time($"window").cast("string")
+        ),
+      Seq(
+        Row("2016-03-27 19:38:20", "2016-03-27 19:38:19.99"),
+        Row("2016-03-27 19:39:30", "2016-03-27 19:39:29.99")
+      )
+    )
+  }
+
+  test("2 window_time functions on raw window column") {
```

Review Comment:
The actual test code which fails due to the rule is the following:

```
test("2 window_time functions on raw window column") {
  val df = Seq(
    ("2016-03-27 19:38:18"), ("2016-03-27 19:39:25")
  ).toDF("time")

  val df2 = df
    .withColumn("time2", expr("time - INTERVAL 5 minutes"))
    .select(window($"time", "10 seconds", "5 seconds").as("window1"), $"time2")
    .select($"window1", window($"time2", "10 seconds", "5 seconds").as("window2"))

  /*
  unresolved operator 'Project [window1#10.end AS end#19, unresolvedalias(window_time(window1#10), Some(org.apache.spark.sql.Column$$Lambda$1637/93974967@5d7dd549)), window2#15.end AS end#20, unresolvedalias(window_time(window2#15), Some(org.apache.spark.sql.Column$$Lambda$1637/93974967@5d7dd549))];
  'Project [window1#10.end AS end#19, unresolvedalias(window_time(window1#10), Some(org.apache.spark.sql.Column$$Lambda$1637/93974967@5d7dd549)), window2#15.end AS end#20, unresolvedalias(window_time(window2#15), Some(org.apache.spark.sql.Column$$Lambda$1637/93974967@5d7dd549))]
  +- Project [window1#10, window#16 AS window2#15]
     +- Filter isnotnull(cast(time2#6 as timestamp))
        +- Expand [[named_struct(start, precisetimestampconversion(((precisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - (((precisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 0), LongType, TimestampType), end, precisetimestampconversion((((precisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - (((precisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 0) + 1000), LongType, TimestampType)), window1#10, time2#6], [named_struct(start, precisetimestampconversion(((precisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - (((precisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 500), LongType, TimestampType), end, precisetimestampconversion((((precisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - (((precisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 500) + 1000), LongType, TimestampType)), window1#10, time2#6]], [window#16, window1#10, time2#6]
           +- Project [window#11 AS window1#10, time2#6]
              +- Filter isnotnull(cast(time#4 as timestamp))
                 +- Expand [[named_struct(start, precisetimestampconversion(((precisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - (((precisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 0), LongType, TimestampType), end, precisetimestampconversion((((precisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - (((precisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 0) + 1000), LongType, TimestampType)), time#4, time2#6], [named_struct(start, precisetimestampconversion(((precisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - (((precisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 500), LongType, TimestampType), end, precisetimestampconversion((((precisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - (((precisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 500) + 1000), LongType, TimestampType)), time#4, time2#6]], [window#11, time#4, time2#6]
                    +- Project [time#4, cast(time#4 - INTERVAL '05' MINUTE as string) AS time2#6]
                       +- Project [value#1 AS time#4]
```
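For readers tracing the expected answers in the test above: `window_time` returns the last instant that belongs to the window, i.e. `window.end` minus one unit of event-time precision. A toy pure-Python model of that semantics for a 10-second tumbling window (the helper names are hypothetical and microsecond precision is an assumption here; this is not Spark code):

```python
from datetime import datetime, timedelta

def tumbling_window(ts, size):
    # Assign a timestamp to its epoch-aligned tumbling window [start, end).
    epoch = datetime(1970, 1, 1)
    start = ts - (ts - epoch) % size
    return start, start + size

def window_time(window):
    # Model of window_time: the window end minus one unit of precision,
    # i.e. the last instant inside the half-open window.
    return window[1] - timedelta(microseconds=1)

w = tumbling_window(datetime(2016, 3, 27, 19, 38, 18), timedelta(seconds=10))
print(w[1])            # 2016-03-27 19:38:20
print(window_time(w))  # 2016-03-27 19:38:19.999999
```

This reproduces the `window.end` / `window_time` pairing checked by the test rows (the test renders both values cast to string).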
[GitHub] [spark] wangyum closed pull request #38349: Pull request
wangyum closed pull request #38349: Pull request
URL: https://github.com/apache/spark/pull/38349
[GitHub] [spark] LuciferYang commented on pull request #38328: [SPARK-40863][BUILD] Upgrade dropwizard metrics 4.2.12
LuciferYang commented on PR #38328: URL: https://github.com/apache/spark/pull/38328#issuecomment-1287995156

Thanks @srowen @HyukjinKwon
[GitHub] [spark] LuciferYang commented on pull request #38329: [SPARK-40865][BUILD] Upgrade jodatime to 2.12.0
LuciferYang commented on PR #38329: URL: https://github.com/apache/spark/pull/38329#issuecomment-1287995115

Thanks @srowen @HyukjinKwon
[GitHub] [spark] panbingkun commented on pull request #38351: [SPARK-40391][SQL][TESTS] Test the error class UNSUPPORTED_FEATURE.JDBC_TRANSACTION
panbingkun commented on PR #38351: URL: https://github.com/apache/spark/pull/38351#issuecomment-1287979218

> @panbingkun Could you fix scala style failure:
>
> ```
> [error] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryExecutionErrorsSuite.scala:34:0: org.apache.spark.sql.execution.datasources.jdbc. is in wrong order relative to
> ```

Done.
[GitHub] [spark] Yikun opened a new pull request, #38354: [SPARK-40882][INFRA] Upgrade actions/setup-java to v3 with distribution specified
Yikun opened a new pull request, #38354: URL: https://github.com/apache/spark/pull/38354

### What changes were proposed in this pull request?
Upgrade actions/setup-java to v3 with the distribution specified.

### Why are the changes needed?
- The `distribution` input is required since v2; this keeps `zulu` (the same distribution as v1): https://github.com/actions/setup-java/releases/tag/v2.0.0
- https://github.com/actions/setup-java/releases/tag/v3.0.0: upgrades Node
- https://github.com/actions/setup-java/releases/tag/v3.6.0: cleans up the `set-output` warning

### Does this PR introduce _any_ user-facing change?
No, dev only.

### How was this patch tested?
CI passed.
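Concretely, the upgrade amounts to bumping the action tag and adding the now-required `distribution` input. A minimal sketch of the workflow step (the step name and Java version are illustrative, not taken from Spark's actual workflow files):

```yaml
# Illustrative GitHub Actions step: setup-java v3 requires `distribution`.
- name: Install Java
  uses: actions/setup-java@v3
  with:
    distribution: zulu   # the distribution that setup-java v1 installed
    java-version: 8
```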
[GitHub] [spark] Yikun opened a new pull request, #38353: [SPARK-40881][INFRA] Upgrade actions/cache to v3 and actions/upload-artifact to v3
Yikun opened a new pull request, #38353: URL: https://github.com/apache/spark/pull/38353

### What changes were proposed in this pull request?
Upgrade actions/cache to v3 and actions/upload-artifact to v3.

### Why are the changes needed?
- actions/cache@v3: moves from Node 12 to Node 16.
- actions/upload-artifact@v3: cleans up the `set-output` warning.

### Does this PR introduce _any_ user-facing change?
No, dev only.

### How was this patch tested?
CI passed.
[GitHub] [spark] zhengruifeng commented on pull request #38346: [SPARK-40880][SQL] Reimplement `summary` with dataframe operations
zhengruifeng commented on PR #38346: URL: https://github.com/apache/spark/pull/38346#issuecomment-1287965144

cc @HyukjinKwon
[GitHub] [spark] github-actions[bot] commented on pull request #36332: [SPARK-36730][SQL] Use V2 Filter in V2 file source
github-actions[bot] commented on PR #36332: URL: https://github.com/apache/spark/pull/36332#issuecomment-1287960765

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!
[GitHub] [spark] github-actions[bot] commented on pull request #37053: [SPARK-39452][GraphX] Extend EdgePartition1D with Destination based Strategy
github-actions[bot] commented on PR #37053: URL: https://github.com/apache/spark/pull/37053#issuecomment-1287960756

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!
[GitHub] [spark] github-actions[bot] commented on pull request #37183: [SPARK-39770][SQL][AVRO] Support Avro schema evolution
github-actions[bot] commented on PR #37183: URL: https://github.com/apache/spark/pull/37183#issuecomment-1287960748

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!
[GitHub] [spark] AmplabJenkins commented on pull request #38345: [SPARK-40879][CONNECT] Support Join UsingColumns in proto
AmplabJenkins commented on PR #38345: URL: https://github.com/apache/spark/pull/38345#issuecomment-1287953624

Can one of the admins verify this patch?
[GitHub] [spark] srowen commented on pull request #38228: [SPARK-40739][SPARK-40738] Fixes for cygwin/msys2/mingw sbt build and bash scripts
srowen commented on PR #38228: URL: https://github.com/apache/spark/pull/38228#issuecomment-1287945182

Seems OK now
[GitHub] [spark] AmplabJenkins commented on pull request #38347: [SPARK-40883][CONNECT] Support Range in Connect proto
AmplabJenkins commented on PR #38347: URL: https://github.com/apache/spark/pull/38347#issuecomment-1287931807

Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on pull request #38348: [SPARK-40884][BUILD] Upgrade fabric8io - `kubernetes-client` to 6.2.0
AmplabJenkins commented on PR #38348: URL: https://github.com/apache/spark/pull/38348#issuecomment-1287931800

Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on pull request #38349: Pull request
AmplabJenkins commented on PR #38349: URL: https://github.com/apache/spark/pull/38349#issuecomment-1287931788

Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on pull request #38350: [WIP][SPARK-40752][SQL] Migrate type check failures of misc expressions onto error classes
AmplabJenkins commented on PR #38350: URL: https://github.com/apache/spark/pull/38350#issuecomment-1287931775

Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on pull request #38352: [SPARK-40801][BUILD][3.2] Upgrade `Apache commons-text` to 1.10
AmplabJenkins commented on PR #38352: URL: https://github.com/apache/spark/pull/38352#issuecomment-1287931763

Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on pull request #38351: [SPARK-40391][SQL][TESTS] Test the error class UNSUPPORTED_FEATURE.JDBC_TRANSACTION
AmplabJenkins commented on PR #38351: URL: https://github.com/apache/spark/pull/38351#issuecomment-1287931766

Can one of the admins verify this patch?
[GitHub] [spark] HeartSaVioR commented on a diff in pull request #38288: [SPARK-40821][SQL][CORE][PYTHON][SS] Introduce window_time function to extract event time from the window column
HeartSaVioR commented on code in PR #38288: URL: https://github.com/apache/spark/pull/38288#discussion_r1002574932

## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala: ##

```
@@ -4201,6 +4219,73 @@ object SessionWindowing extends Rule[LogicalPlan] {
   }
 }
 
+/**
+ * Resolves the window_time expression which extracts the correct window time from the
+ * window column generated as the output of the window aggregating operators. The
+ * window column is of type struct { start: TimestampType, end: TimestampType }.
+ * The correct window time for further aggregations is window.end - 1.
+ */
+object ResolveWindowTime extends Rule[LogicalPlan] {
+  override def apply(plan: LogicalPlan): LogicalPlan = plan.resolveOperatorsUp {
+    case p: LogicalPlan if p.children.size == 1 =>
+      val child = p.children.head
+      val windowTimeExpressions =
+        p.expressions.flatMap(_.collect { case w: WindowTime => w }).toSet
+
+      if (windowTimeExpressions.size == 1 &&
```

Review Comment:
I left a comment to work around the ops checker error and still hit this condition.
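The `windowTimeExpressions.size == 1` guard in the diff above is why a projection carrying two distinct `window_time` calls can be left unresolved. A toy Python illustration of that guard (strings stand in for Catalyst expression trees; this sketches only the condition, not the Analyzer code):

```python
def collect_window_times(expressions):
    # Collect the distinct window_time(...) references in a projection,
    # mirroring p.expressions.flatMap(_.collect { case w: WindowTime => w }).toSet
    return {e for e in expressions if e.startswith("window_time(")}

single = ["window.end", "window_time(window1)"]
double = ["window_time(window1)", "window_time(window2)"]

print(len(collect_window_times(single)))  # 1 -> the rule rewrites the plan
print(len(collect_window_times(double)))  # 2 -> guard fails, plan stays unresolved
```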
[GitHub] [spark] HeartSaVioR commented on a diff in pull request #38288: [SPARK-40821][SQL][CORE][PYTHON][SS] Introduce window_time function to extract event time from the window column
HeartSaVioR commented on code in PR #38288: URL: https://github.com/apache/spark/pull/38288#discussion_r1002572937 ## sql/core/src/test/scala/org/apache/spark/sql/DataFrameTimeWindowingSuite.scala: ## @@ -575,4 +575,64 @@ class DataFrameTimeWindowingSuite extends QueryTest with SharedSparkSession { validateWindowColumnInSchema(schema2, "window") } } + + test("window_time function on raw window column") { +val df = Seq( + ("2016-03-27 19:38:18"), ("2016-03-27 19:39:25") +).toDF("time") + +checkAnswer( + df.select(window($"time", "10 seconds").as("window")) +.select( + $"window.end".cast("string"), + window_time($"window").cast("string") +), + Seq( +Row("2016-03-27 19:38:20", "2016-03-27 19:38:19.99"), +Row("2016-03-27 19:39:30", "2016-03-27 19:39:29.99") + ) +) + } + + test("2 window_time functions on raw window column") { Review Comment: The actual test code which fails due to the rule is following: ``` test("2 window_time functions on raw window column") { val df = Seq( ("2016-03-27 19:38:18"), ("2016-03-27 19:39:25") ).toDF("time") val df2 = df .withColumn("time2", expr("time - INTERVAL 5 minutes")) .select(window($"time", "10 seconds", "5 seconds").as("window1"), $"time2") .select($"window1", window($"time2", "10 seconds", "5 seconds").as("window2")) /* unresolved operator 'Project [window1#10.end AS end#19, unresolvedalias(window_time(window1#10), Some(org.apache.spark.sql.Column$$Lambda$1637/93974967@5d7dd549)), window2#15.end AS end#20, unresolvedalias(window_time(window2#15), Some(org.apache.spark.sql.Column$$Lambda$1637/93974967@5d7dd549))]; 'Project [window1#10.end AS end#19, unresolvedalias(window_time(window1#10), Some(org.apache.spark.sql.Column$$Lambda$1637/93974967@5d7dd549)), window2#15.end AS end#20, unresolvedalias(window_time(window2#15), Some(org.apache.spark.sql.Column$$Lambda$1637/93974967@5d7dd549))] +- Project [window1#10, window#16 AS window2#15] +- Filter isnotnull(cast(time2#6 as timestamp)) +- Expand [[named_struct(start, 
precisetimestampconversion(((precisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - (((precisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 0), LongType, TimestampType), end, precisetimestampconversionprecisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - (((precisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 0) + 1000), LongType, TimestampType)), window1#10, time2#6], [named_struct(start, precisetimestampconversion(((precisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - (((precisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 500), LongType, TimestampType), end, precisetimestampconversionprecisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - (((p recisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 500) + 1000), LongType, TimestampType)), window1#10, time2#6]], [window#16, window1#10, time2#6] +- Project [window#11 AS window1#10, time2#6] +- Filter isnotnull(cast(time#4 as timestamp)) +- Expand [[named_struct(start, precisetimestampconversion(((precisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - (((precisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 0), LongType, TimestampType), end, precisetimestampconversionprecisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - (((precisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 0) + 1000), LongType, TimestampType)), time#4, time2#6], [named_struct(start, precisetimestampconversion(((precisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - (((precisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 500), 
LongType, TimestampType), end, precisetimestampconversionprecisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - (((pre cisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 500) + 1000), LongType, TimestampType)), time#4, time2#6]], [window#11, time#4, time2#6] +- Project [time#4, cast(time#4 - INTERVAL '05' MINUTE as string) AS time2#6] +- Project [value#1 AS time#4]
[GitHub] [spark] HeartSaVioR commented on a diff in pull request #38288: [SPARK-40821][SQL][CORE][PYTHON][SS] Introduce window_time function to extract event time from the window column
HeartSaVioR commented on code in PR #38288: URL: https://github.com/apache/spark/pull/38288#discussion_r1002572937 ## sql/core/src/test/scala/org/apache/spark/sql/DataFrameTimeWindowingSuite.scala: ## @@ -575,4 +575,64 @@ class DataFrameTimeWindowingSuite extends QueryTest with SharedSparkSession { validateWindowColumnInSchema(schema2, "window") } } + + test("window_time function on raw window column") { +val df = Seq( + ("2016-03-27 19:38:18"), ("2016-03-27 19:39:25") +).toDF("time") + +checkAnswer( + df.select(window($"time", "10 seconds").as("window")) +.select( + $"window.end".cast("string"), + window_time($"window").cast("string") +), + Seq( +Row("2016-03-27 19:38:20", "2016-03-27 19:38:19.99"), +Row("2016-03-27 19:39:30", "2016-03-27 19:39:29.99") + ) +) + } + + test("2 window_time functions on raw window column") { Review Comment: The actual test code which fails due to the rule is following: ``` test("2 window_time functions on raw window column") { val df = Seq( ("2016-03-27 19:38:18"), ("2016-03-27 19:39:25") ).toDF("time") val df2 = df .withColumn("time2", expr("time - INTERVAL 5 minutes")) .select(window($"time", "10 seconds", "5 seconds").as("window1"), $"time2") .select($"window1", window($"time2", "10 seconds", "5 seconds").as("window2")) /* unresolved operator 'Project [window1#10.end AS end#19, unresolvedalias(window_time(window1#10), Some(org.apache.spark.sql.Column$$Lambda$1637/93974967@5d7dd549)), window2#15.end AS end#20, unresolvedalias(window_time(window2#15), Some(org.apache.spark.sql.Column$$Lambda$1637/93974967@5d7dd549))]; 'Project [window1#10.end AS end#19, unresolvedalias(window_time(window1#10), Some(org.apache.spark.sql.Column$$Lambda$1637/93974967@5d7dd549)), window2#15.end AS end#20, unresolvedalias(window_time(window2#15), Some(org.apache.spark.sql.Column$$Lambda$1637/93974967@5d7dd549))] +- Project [window1#10, window#16 AS window2#15] +- Filter isnotnull(cast(time2#6 as timestamp)) +- Expand [[named_struct(start, 
precisetimestampconversion(((precisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - (((precisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 0), LongType, TimestampType), end, precisetimestampconversionprecisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - (((precisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 0) + 1000), LongType, TimestampType)), window1#10, time2#6], [named_struct(start, precisetimestampconversion(((precisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - (((precisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 500), LongType, TimestampType), end, precisetimestampconversionprecisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - (((p recisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 500) + 1000), LongType, TimestampType)), window1#10, time2#6]], [window#16, window1#10, time2#6] +- Project [window#11 AS window1#10, time2#6] +- Filter isnotnull(cast(time#4 as timestamp)) +- Expand [[named_struct(start, precisetimestampconversion(((precisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - (((precisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 0), LongType, TimestampType), end, precisetimestampconversionprecisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - (((precisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 0) + 1000), LongType, TimestampType)), time#4, time2#6], [named_struct(start, precisetimestampconversion(((precisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - (((precisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 500), 
LongType, TimestampType), end, precisetimestampconversionprecisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - (((pre cisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 500) + 1000), LongType, TimestampType)), time#4, time2#6]], [window#11, time#4, time2#6] +- Project [time#4, cast(time#4 - INTERVAL '05' MINUTE as string) AS time2#6] +- Project [value#1 AS time#4]
[GitHub] [spark] HeartSaVioR commented on a diff in pull request #38288: [SPARK-40821][SQL][CORE][PYTHON][SS] Introduce window_time function to extract event time from the window column
HeartSaVioR commented on code in PR #38288: URL: https://github.com/apache/spark/pull/38288#discussion_r1002572937

sql/core/src/test/scala/org/apache/spark/sql/DataFrameTimeWindowingSuite.scala:

```
@@ -575,4 +575,64 @@ class DataFrameTimeWindowingSuite extends QueryTest with SharedSparkSession {
     validateWindowColumnInSchema(schema2, "window")
   }
 }
+
+  test("window_time function on raw window column") {
+    val df = Seq(
+      ("2016-03-27 19:38:18"), ("2016-03-27 19:39:25")
+    ).toDF("time")
+
+    checkAnswer(
+      df.select(window($"time", "10 seconds").as("window"))
+        .select(
+          $"window.end".cast("string"),
+          window_time($"window").cast("string")
+        ),
+      Seq(
+        Row("2016-03-27 19:38:20", "2016-03-27 19:38:19.99"),
+        Row("2016-03-27 19:39:30", "2016-03-27 19:39:29.99")
+      )
+    )
+  }
+
+  test("2 window_time functions on raw window column") {
```

Review Comment: The actual test code which fails due to the rule is the following:

```
test("2 window_time functions on raw window column") {
  val df = Seq(
    ("2016-03-27 19:38:18"), ("2016-03-27 19:39:25")
  ).toDF("time")

  val df2 = df
    .withColumn("time2", expr("time - INTERVAL 5 minutes"))
    .select(window($"time", "10 seconds", "5 seconds").as("window1"), $"time2")
    .select($"window1", window($"time2", "10 seconds", "5 seconds").as("window2"))

  /* unresolved operator
  'Project [window1#10.end AS end#19, unresolvedalias(window_time(window1#10), Some(org.apache.spark.sql.Column$$Lambda$1637/93974967@5d7dd549)), window2#15.end AS end#20, unresolvedalias(window_time(window2#15), Some(org.apache.spark.sql.Column$$Lambda$1637/93974967@5d7dd549))];
  'Project [window1#10.end AS end#19, unresolvedalias(window_time(window1#10), Some(org.apache.spark.sql.Column$$Lambda$1637/93974967@5d7dd549)), window2#15.end AS end#20, unresolvedalias(window_time(window2#15), Some(org.apache.spark.sql.Column$$Lambda$1637/93974967@5d7dd549))]
  +- Project [window1#10, window#16 AS window2#15]
     +- Filter isnotnull(cast(time2#6 as timestamp))
        +- Expand [[named_struct(start, precisetimestampconversion(((precisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - (((precisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 0), LongType, TimestampType), end, precisetimestampconversion((((precisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - (((precisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 0) + 1000), LongType, TimestampType)), window1#10, time2#6], [named_struct(start, precisetimestampconversion(((precisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - (((precisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 500), LongType, TimestampType), end, precisetimestampconversion((((precisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - (((precisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 500) + 1000), LongType, TimestampType)), window1#10, time2#6]], [window#16, window1#10, time2#6]
           +- Project [window#11 AS window1#10, time2#6]
              +- Filter isnotnull(cast(time#4 as timestamp))
                 +- Expand [[named_struct(start, precisetimestampconversion(((precisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - (((precisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 0), LongType, TimestampType), end, precisetimestampconversion((((precisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - (((precisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 0) + 1000), LongType, TimestampType)), time#4, time2#6], [named_struct(start, precisetimestampconversion(((precisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - (((precisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 500), LongType, TimestampType), end, precisetimestampconversion((((precisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - (((precisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 500) + 1000), LongType, TimestampType)), time#4, time2#6]], [window#11, time#4, time2#6]
                    +- Project [time#4, cast(time#4 - INTERVAL '05' MINUTE as string) AS time2#6]
                       +- Project [value#1 AS time#4]
  */
```
[GitHub] [spark] amaliujia commented on pull request #38347: [SPARK-40883][CONNECT] Support Range in Connect proto
amaliujia commented on PR #38347: URL: https://github.com/apache/spark/pull/38347#issuecomment-1287884286

R: @cloud-fan
[GitHub] [spark] MaxGekk commented on pull request #38351: [SPARK-40391][SQL][TESTS] Test the error class UNSUPPORTED_FEATURE.JDBC_TRANSACTION
MaxGekk commented on PR #38351: URL: https://github.com/apache/spark/pull/38351#issuecomment-1287882749

@panbingkun Could you fix the scalastyle failure:
```
[error] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryExecutionErrorsSuite.scala:34:0: org.apache.spark.sql.execution.datasources.jdbc. is in wrong order relative to
```
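For context, Spark's scalastyle check expects imports in ordered groups. A hedged sketch of a compliant layout (the concrete imports below are an assumption for illustration, not taken from the PR; the scalastyle convention groups java/javax first, then scala, then third-party, then org.apache.spark):

```scala
// Hypothetical layout only: groups are separated by blank lines and
// alphabetized within each group, per Spark's scalastyle-config.
import java.sql.Connection

import scala.util.Random

import org.apache.spark.sql.QueryTest
import org.apache.spark.sql.execution.datasources.jdbc._
```

This snippet is not compilable standalone (it depends on the Spark test classpath); it only illustrates the ordering the checker enforces.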
[GitHub] [spark] bjornjorgensen commented on pull request #38352: [SPARK-40801][BUILD][3.2] Upgrade `Apache commons-text` to 1.10
bjornjorgensen commented on PR #38352: URL: https://github.com/apache/spark/pull/38352#issuecomment-1287875700

@wangyum
[GitHub] [spark] bjornjorgensen opened a new pull request, #38352: [SPARK-40801][BUILD][3.2] Upgrade `Apache commons-text` to 1.10
bjornjorgensen opened a new pull request, #38352: URL: https://github.com/apache/spark/pull/38352

### What changes were proposed in this pull request?
Upgrade Apache commons-text from 1.6 to 1.10.0.

### Why are the changes needed?
[CVE-2022-42889](https://nvd.nist.gov/vuln/detail/CVE-2022-42889) is rated [9.8 CRITICAL](https://nvd.nist.gov/vuln-metrics/cvss/v3-calculator?name=CVE-2022-42889&vector=AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H&version=3.1&source=NIST).

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Pass GA.
[GitHub] [spark] alex-balikov commented on a diff in pull request #38288: [SPARK-40821][SQL][CORE][PYTHON][SS] Introduce window_time function to extract event time from the window column
alex-balikov commented on code in PR #38288: URL: https://github.com/apache/spark/pull/38288#discussion_r1002539045

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:

```
@@ -4201,6 +4219,73 @@ object SessionWindowing extends Rule[LogicalPlan] {
   }
 }
+
+/**
+ * Resolves the window_time expression which extracts the correct window time from the
+ * window column generated as the output of the window aggregating operators. The
+ * window column is of type struct { start: TimestampType, end: TimestampType }.
+ * The correct window time for further aggregations is window.end - 1.
+ */
+object ResolveWindowTime extends Rule[LogicalPlan] {
+  override def apply(plan: LogicalPlan): LogicalPlan = plan.resolveOperatorsUp {
+    case p: LogicalPlan if p.children.size == 1 =>
+      val child = p.children.head
+      val windowTimeExpressions =
+        p.expressions.flatMap(_.collect { case w: WindowTime => w }).toSet
+
+      if (windowTimeExpressions.size == 1 &&
```

Review Comment: Modified the test. Indeed the scenario fails with the unsupported ops checker error.
[GitHub] [spark] alex-balikov commented on a diff in pull request #38288: [SPARK-40821][SQL][CORE][PYTHON][SS] Introduce window_time function to extract event time from the window column
alex-balikov commented on code in PR #38288: URL: https://github.com/apache/spark/pull/38288#discussion_r1002536939

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:

```
@@ -4201,6 +4219,73 @@ object SessionWindowing extends Rule[LogicalPlan] {
   }
 }
+
+/**
+ * Resolves the window_time expression which extracts the correct window time from the
+ * window column generated as the output of the window aggregating operators. The
+ * window column is of type struct { start: TimestampType, end: TimestampType }.
+ * The correct window time for further aggregations is window.end - 1.
+ */
+object ResolveWindowTime extends Rule[LogicalPlan] {
+  override def apply(plan: LogicalPlan): LogicalPlan = plan.resolveOperatorsUp {
+    case p: LogicalPlan if p.children.size == 1 =>
+      val child = p.children.head
+      val windowTimeExpressions =
+        p.expressions.flatMap(_.collect { case w: WindowTime => w }).toSet
+
+      if (windowTimeExpressions.size == 1 &&
+          windowTimeExpressions.head.windowColumn.resolved &&
+          windowTimeExpressions.head.checkInputDataTypes().isSuccess) {
+
+        val windowTime = windowTimeExpressions.head
+
+        val metadata = windowTime.windowColumn match {
+          case a: Attribute => a.metadata
+          case _ => Metadata.empty
+        }
+
+        if (!metadata.contains(TimeWindow.marker) &&
+            !metadata.contains(SessionWindow.marker)) {
+          // FIXME: error framework?
+          throw new AnalysisException("The input is not a correct window column!")
+        }
+
+        val newMetadata = new MetadataBuilder()
+          .withMetadata(metadata)
+          .remove(TimeWindow.marker)
+          .remove(SessionWindow.marker)
+          .build()
+
+        val attr = AttributeReference(
+          "window_time", windowTime.dataType, metadata = newMetadata)()
+
+        // NOTE: "window.end" is "exclusive" upper bound of window, so if we use this value as
+        // it is, it is going to be bound to the different window even if we apply the same window
+        // spec. Decrease 1 microsecond from window.end to let the window_time be bound to the
+        // correct window range.
+        val subtractExpr =
+          PreciseTimestampConversion(
+            Subtract(PreciseTimestampConversion(
+              // FIXME: better handling of window.end
+              GetStructField(windowTime.windowColumn, 1),
+              windowTime.dataType, LongType), Literal(1L)),
+            LongType,
+            windowTime.dataType)
+
+        // FIXME: Can there already be a window_time column? Will this lead to conflict?
```

Review Comment: removed the comment
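The one-microsecond adjustment described in the NOTE above can be illustrated outside Spark. This is a sketch using plain Scala and java.time, not Spark's actual PreciseTimestampConversion expression tree: since window.end is the exclusive upper bound of a window, subtracting one microsecond yields an event time that still lies inside the half-open interval [start, end).

```scala
import java.time.Instant
import java.time.temporal.ChronoUnit

// Sketch only: window_time is window.end minus 1 microsecond, so the result
// remains inside the window's half-open interval [start, end).
def windowTime(windowEnd: Instant): Instant =
  windowEnd.minus(1, ChronoUnit.MICROS)

val end = Instant.parse("2016-03-27T19:38:20Z")
println(windowTime(end)) // 2016-03-27T19:38:19.999999Z
```

Binding the adjusted value instead of the raw end avoids attributing the record to the next window when the same window spec is applied again.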
[GitHub] [spark] panbingkun commented on pull request #38351: [SPARK-40391][SQL][TESTS] Test the error class UNSUPPORTED_FEATURE.JDBC_TRANSACTION
panbingkun commented on PR #38351: URL: https://github.com/apache/spark/pull/38351#issuecomment-1287833044

cc @MaxGekk
[GitHub] [spark] panbingkun commented on a diff in pull request #38351: [SPARK-40391][SQL][TESTS] Test the error class UNSUPPORTED_FEATURE.JDBC_TRANSACTION
panbingkun commented on code in PR #38351: URL: https://github.com/apache/spark/pull/38351#discussion_r1002513617

sql/core/src/test/resources/mockito-extensions/org.mockito.plugins.MockMaker:

```
@@ -0,0 +1 @@
+mock-maker-inline
```

Review Comment: https://www.baeldung.com/mockito-final#configure-mocktio ("Configure Mockito for Final Methods and Classes"; screenshot: https://user-images.githubusercontent.com/15246973/197348725-9a192ef9-b002-4d1b-be1f-3cff13c937f4.png)
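For context, the single-line resource file added in this diff is Mockito's standard plugin-discovery mechanism: a file named org.mockito.plugins.MockMaker containing `mock-maker-inline` on the test classpath switches Mockito to the inline mock maker, which can mock final classes and methods (the subject of the linked article). The layout, as added by the PR:

```
sql/core/src/test/resources/mockito-extensions/org.mockito.plugins.MockMaker
  contents: mock-maker-inline
```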
[GitHub] [spark] panbingkun opened a new pull request, #38351: [SPARK-40391][SQL][TESTS] Test the error class UNSUPPORTED_FEATURE.JDBC_TRANSACTION
panbingkun opened a new pull request, #38351: URL: https://github.com/apache/spark/pull/38351

### What changes were proposed in this pull request?
Add a test for the error class UNSUPPORTED_FEATURE.JDBC_TRANSACTION to QueryExecutionErrorsSuite.

### Why are the changes needed?
To cover the error class UNSUPPORTED_FEATURE.JDBC_TRANSACTION in QueryExecutionErrorsSuite.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Manual test:
```
./build/sbt "sql/testOnly *QueryExecutionErrorsSuite*"
```
All tests passed.
[GitHub] [spark] wangyum commented on pull request #38262: [SPARK-40801][BUILD] Upgrade `Apache commons-text` to 1.10
wangyum commented on PR #38262: URL: https://github.com/apache/spark/pull/38262#issuecomment-1287810903

@bjornjorgensen +1. Please backport this to branch-3.2.
[GitHub] [spark] MaxGekk commented on pull request #38237: [SPARK-40760][SQL] Migrate type check failures of interval expressions onto error classes
MaxGekk commented on PR #38237: URL: https://github.com/apache/spark/pull/38237#issuecomment-1287797217

@cloud-fan Could you approve the PR, please?
[GitHub] [spark] MaxGekk commented on pull request #38273: [SPARK-37945][SQL][CORE] Use error classes in the execution errors of arithmetic ops
MaxGekk commented on PR #38273: URL: https://github.com/apache/spark/pull/38273#issuecomment-1287797132

@khalidmammadov Could you re-trigger the tests/builds by merging the recent master, please?
[GitHub] [spark] MaxGekk closed pull request #38319: [SPARK-40856][SQL] Update the error template of WRONG_NUM_PARAMS
MaxGekk closed pull request #38319: [SPARK-40856][SQL] Update the error template of WRONG_NUM_PARAMS URL: https://github.com/apache/spark/pull/38319
[GitHub] [spark] MaxGekk commented on pull request #38319: [SPARK-40856][SQL] Update the error template of WRONG_NUM_PARAMS
MaxGekk commented on PR #38319: URL: https://github.com/apache/spark/pull/38319#issuecomment-1287796420

+1, LGTM. Merging to master. Thank you, @panbingkun.
[GitHub] [spark] AmplabJenkins commented on pull request #38337: [SPARK-39404][SS][3.3] Minor fix for querying _metadata in streaming
AmplabJenkins commented on PR #38337: URL: https://github.com/apache/spark/pull/38337#issuecomment-1287769832

Can one of the admins verify this patch?
[GitHub] [spark] panbingkun opened a new pull request, #38350: [SPARK-40752][SQL] Migrate type check failures of misc expressions onto error classes
panbingkun opened a new pull request, #38350: URL: https://github.com/apache/spark/pull/38350

### What changes were proposed in this pull request?

### Why are the changes needed?

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?
[GitHub] [spark] HeartSaVioR closed pull request #38337: [SPARK-39404][SS][3.3] Minor fix for querying _metadata in streaming
HeartSaVioR closed pull request #38337: [SPARK-39404][SS][3.3] Minor fix for querying _metadata in streaming URL: https://github.com/apache/spark/pull/38337
[GitHub] [spark] HeartSaVioR commented on pull request #38337: [SPARK-39404][SS][3.3] Minor fix for querying _metadata in streaming
HeartSaVioR commented on PR #38337: URL: https://github.com/apache/spark/pull/38337#issuecomment-1287754620

Thanks! Merging to 3.3.
[GitHub] [spark] HeartSaVioR commented on pull request #38337: [SPARK-39404][SS][3.3] Minor fix for querying _metadata in streaming
HeartSaVioR commented on PR #38337: URL: https://github.com/apache/spark/pull/38337#issuecomment-1287754383

https://github.com/Yaohua628/spark/runs/9040960126

Build passed; it looks like the status just is not reflected here.
[GitHub] [spark] kingofDaniel opened a new pull request, #38349: Pull request
kingofDaniel opened a new pull request, #38349: URL: https://github.com/apache/spark/pull/38349

### What changes were proposed in this pull request?

### Why are the changes needed?

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?
[GitHub] [spark] bjornjorgensen commented on pull request #38348: [SPARK-40884][BUILD] Upgrade fabric8io - `kubernetes-client` to 6.2.0
bjornjorgensen commented on PR #38348: URL: https://github.com/apache/spark/pull/38348#issuecomment-1287683204

@dongjoon-hyun
[GitHub] [spark] bjornjorgensen opened a new pull request, #38348: [SPARK-40884][BUILD] Upgrade fabric8io - `kubernetes-client` to 6.2.0
bjornjorgensen opened a new pull request, #38348: URL: https://github.com/apache/spark/pull/38348

### What changes were proposed in this pull request?
Upgrade fabric8io `kubernetes-client` from 6.1.0 to 6.2.0.

### Why are the changes needed?
[Release notes](https://github.com/fabric8io/kubernetes-client/releases/tag/v6.2.0)
[Snakeyaml version should be updated to mitigate CVE-2022-28857](https://github.com/fabric8io/kubernetes-client/issues/4383)

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Pass GA.
[GitHub] [spark] amaliujia opened a new pull request, #38347: [SPARK-40883][CONNECT] Support Range in Connect proto
amaliujia opened a new pull request, #38347: URL: https://github.com/apache/spark/pull/38347

### What changes were proposed in this pull request?
1. Support `Range` in Connect proto.
2. Refactor `SparkConnectDeduplicateSuite` to `SparkConnectSessionBasedSuite`.

### Why are the changes needed?
Improve API coverage.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
UT
[GitHub] [spark] peter-toth commented on pull request #38334: [SPARK-40874][PYTHON] Fix broadcasts in Python UDFs when encryption enabled
peter-toth commented on PR #38334: URL: https://github.com/apache/spark/pull/38334#issuecomment-1287665617

Thanks @HyukjinKwon for the quick review! The bug was introduced in https://github.com/apache/spark/commit/58419b92673c46911c25bc6c6b13397f880c6424#diff-ed4fb5ce30273e8eefcc7d4b0152ea7a60fb4f8f709d4da8ea1ab56aeda26001R307-R323 in Spark 3.0.
[GitHub] [spark] Yikun commented on pull request #38341: [SPARK-40871][INFRA] Upgrade actions/github-script to v6 and fix notify workflow
Yikun commented on PR #38341: URL: https://github.com/apache/spark/pull/38341#issuecomment-1287659874

@HyukjinKwon Thanks, it works: https://github.com/apache/spark/pull/38343
[GitHub] [spark] Yikun commented on pull request #38343: [DONT MERGE] Test upgrade v6
Yikun commented on PR #38343: URL: https://github.com/apache/spark/pull/38343#issuecomment-1287659814

Screenshot: https://user-images.githubusercontent.com/1736354/197326477-f024d29c-83e8-4c50-8348-0ef6e9b368ef.png
[GitHub] [spark] Yikun closed pull request #38343: [DONT MERGE] Test upgrade v6
Yikun closed pull request #38343: [DONT MERGE] Test upgrade v6 URL: https://github.com/apache/spark/pull/38343
[GitHub] [spark] Yikun commented on pull request #38342: [SPARK-40870][INFRA] Upgrade docker actions to cleanup warning
Yikun commented on PR #38342: URL: https://github.com/apache/spark/pull/38342#issuecomment-1287659692

@HyukjinKwon Thanks, merge to master (3.4.0).
[GitHub] [spark] Yikun closed pull request #38342: [SPARK-40870][INFRA] Upgrade docker actions to cleanup warning
Yikun closed pull request #38342: [SPARK-40870][INFRA] Upgrade docker actions to cleanup warning URL: https://github.com/apache/spark/pull/38342
[GitHub] [spark] zhengruifeng opened a new pull request, #38346: [SPARK-40880][SQL] Reimplement `summary` with dataframe operations
zhengruifeng opened a new pull request, #38346: URL: https://github.com/apache/spark/pull/38346 ### What changes were proposed in this pull request? Reimplement `summary` with dataframe operations ### Why are the changes needed? 1, do not truncate the sql plan; 2, enable sql optimization like column pruning: ``` scala> val df = spark.range(0, 3, 1, 10).withColumn("value", lit("str")) df: org.apache.spark.sql.DataFrame = [id: bigint, value: string] scala> df.summary("max", "50%").show +---+---+-+ |summary| id|value| +---+---+-+ |max| 2| str| |50%| 1| null| +---+---+-+ scala> df.summary("max", "50%").select("id").show +---+ | id| +---+ | 2| | 1| +---+ scala> df.summary("max", "50%").select("id").queryExecution.optimizedPlan res4: org.apache.spark.sql.catalyst.plans.logical.LogicalPlan = Project [element_at(id#367, summary#376, None, false) AS id#371] +- Generate explode([max,50%]), false, [summary#376] +- Aggregate [map(max, cast(max(id#153L) as string), 50%, cast(percentile_approx(id#153L, [0.5], 1, 0, 0)[0] as string)) AS id#367] +- Range (0, 3, step=1, splits=Some(10)) ``` ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? existing UTs and manually check -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org