[GitHub] [spark] attilapiros commented on pull request #38348: [SPARK-40884][BUILD] Upgrade fabric8io - `kubernetes-client` to 6.2.0

2022-10-22 Thread GitBox


attilapiros commented on PR #38348:
URL: https://github.com/apache/spark/pull/38348#issuecomment-1288004072

   ok to test


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] HeartSaVioR commented on a diff in pull request #38288: [SPARK-40821][SQL][CORE][PYTHON][SS] Introduce window_time function to extract event time from the window column

2022-10-22 Thread GitBox


HeartSaVioR commented on code in PR #38288:
URL: https://github.com/apache/spark/pull/38288#discussion_r1002572937


##
sql/core/src/test/scala/org/apache/spark/sql/DataFrameTimeWindowingSuite.scala:
##
@@ -575,4 +575,64 @@ class DataFrameTimeWindowingSuite extends QueryTest with SharedSparkSession {
       validateWindowColumnInSchema(schema2, "window")
     }
   }
+
+  test("window_time function on raw window column") {
+    val df = Seq(
+      ("2016-03-27 19:38:18"), ("2016-03-27 19:39:25")
+    ).toDF("time")
+
+    checkAnswer(
+      df.select(window($"time", "10 seconds").as("window"))
+        .select(
+          $"window.end".cast("string"),
+          window_time($"window").cast("string")
+        ),
+      Seq(
+        Row("2016-03-27 19:38:20", "2016-03-27 19:38:19.99"),
+        Row("2016-03-27 19:39:30", "2016-03-27 19:39:29.99")
+      )
+    )
+  }
+
+  test("2 window_time functions on raw window column") {

Review Comment:
   The actual test code that fails due to the rule is as follows:
   
   ```
   test("2 window_time functions on raw window column") {
     val df = Seq(
       ("2016-03-27 19:38:18"), ("2016-03-27 19:39:25")
     ).toDF("time")

     val df2 = df
       .withColumn("time2", expr("time - INTERVAL 5 minutes"))
       .select(window($"time", "10 seconds", "5 seconds").as("window1"), $"time2")
       .select($"window1", window($"time2", "10 seconds", "5 seconds").as("window2"))

     /*
       unresolved operator 'Project [window1#10.end AS end#19, unresolvedalias(window_time(window1#10), Some(org.apache.spark.sql.Column$$Lambda$1637/93974967@5d7dd549)), window2#15.end AS end#20, unresolvedalias(window_time(window2#15), Some(org.apache.spark.sql.Column$$Lambda$1637/93974967@5d7dd549))];
       'Project [window1#10.end AS end#19, unresolvedalias(window_time(window1#10), Some(org.apache.spark.sql.Column$$Lambda$1637/93974967@5d7dd549)), window2#15.end AS end#20, unresolvedalias(window_time(window2#15), Some(org.apache.spark.sql.Column$$Lambda$1637/93974967@5d7dd549))]
       +- Project [window1#10, window#16 AS window2#15]
          +- Filter isnotnull(cast(time2#6 as timestamp))
             +- Expand [[named_struct(start, precisetimestampconversion(((precisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - (((precisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 0), LongType, TimestampType), end, precisetimestampconversionprecisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - (((precisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 0) + 1000), LongType, TimestampType)), window1#10, time2#6], [named_struct(start, precisetimestampconversion(((precisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - (((precisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 500), LongType, TimestampType), end, precisetimestampconversionprecisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - (((precisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 500) + 1000), LongType, TimestampType)), window1#10, time2#6]], [window#16, window1#10, time2#6]
                +- Project [window#11 AS window1#10, time2#6]
                   +- Filter isnotnull(cast(time#4 as timestamp))
                      +- Expand [[named_struct(start, precisetimestampconversion(((precisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - (((precisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 0), LongType, TimestampType), end, precisetimestampconversionprecisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - (((precisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 0) + 1000), LongType, TimestampType)), time#4, time2#6], [named_struct(start, precisetimestampconversion(((precisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - (((precisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 500), LongType, TimestampType), end, precisetimestampconversionprecisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - (((precisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - 0) + 500) % 500)) - 500) + 1000), LongType, TimestampType)), time#4, time2#6]], [window#11, time#4, time2#6]
                         +- Project [time#4, cast(time#4 - INTERVAL '05' MINUTE as string) AS time2#6]
                            +- Project [value#1 AS time#4]
     */
   ```

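[Editorial note: the following sketch is not part of the original thread. It is a plain-Python illustration (not Spark's implementation) of the window arithmetic the expected rows in the test above encode, assuming microsecond timestamp precision.]

```python
from datetime import datetime, timedelta

def tumbling_window(ts: datetime, size: timedelta) -> tuple:
    """Assign a timestamp to its tumbling window [start, end)."""
    epoch = datetime(1970, 1, 1)
    # Distance past the most recent window boundary.
    start = ts - (ts - epoch) % size
    return start, start + size

# 2016-03-27 19:38:18 falls in the 10-second window [19:38:10, 19:38:20).
start, end = tumbling_window(datetime(2016, 3, 27, 19, 38, 18),
                             timedelta(seconds=10))

# window_time yields the last instant still inside the half-open window,
# i.e. window.end minus one time unit (one microsecond here).
event_time = end - timedelta(microseconds=1)
print(start, end, event_time)
```

This makes it visible why `window.end` itself cannot be used as the event time for chained aggregations: it belongs to the *next* window.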
[GitHub] [spark] wangyum closed pull request #38349: Pull request

2022-10-22 Thread GitBox


wangyum closed pull request #38349: Pull request 
URL: https://github.com/apache/spark/pull/38349





[GitHub] [spark] LuciferYang commented on pull request #38328: [SPARK-40863][BUILD] Upgrade dropwizard metrics 4.2.12

2022-10-22 Thread GitBox


LuciferYang commented on PR #38328:
URL: https://github.com/apache/spark/pull/38328#issuecomment-1287995156

   Thanks @srowen @HyukjinKwon 





[GitHub] [spark] LuciferYang commented on pull request #38329: [SPARK-40865][BUILD] Upgrade jodatime to 2.12.0

2022-10-22 Thread GitBox


LuciferYang commented on PR #38329:
URL: https://github.com/apache/spark/pull/38329#issuecomment-1287995115

   Thanks @srowen @HyukjinKwon 





[GitHub] [spark] panbingkun commented on pull request #38351: [SPARK-40391][SQL][TESTS] Test the error class UNSUPPORTED_FEATURE.JDBC_TRANSACTION

2022-10-22 Thread GitBox


panbingkun commented on PR #38351:
URL: https://github.com/apache/spark/pull/38351#issuecomment-1287979218

   > @panbingkun Could you fix scala style failure:
   > 
   > ```
   > [error] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryExecutionErrorsSuite.scala:34:0: org.apache.spark.sql.execution.datasources.jdbc. is in wrong order relative to 
   > ```
   
   Done.
   





[GitHub] [spark] Yikun opened a new pull request, #38354: [SPARK-40882][INFRA] Upgrade actions/setup-java to v3 with distribution specified

2022-10-22 Thread GitBox


Yikun opened a new pull request, #38354:
URL: https://github.com/apache/spark/pull/38354

   ### What changes were proposed in this pull request?
   Upgrade actions/setup-java to v3 with distribution specified
   
   
   ### Why are the changes needed?
   
   - The `distribution` input is required since v2; this change keeps `zulu` (the same distribution v1 used): https://github.com/actions/setup-java/releases/tag/v2.0.0
   - https://github.com/actions/setup-java/releases/tag/v3.0.0: upgrades the Node runtime
   - https://github.com/actions/setup-java/releases/tag/v3.6.0: cleans up the `set-output` warning
   
   ### Does this PR introduce _any_ user-facing change?
   No, dev only
   
   
   ### How was this patch tested?
   CI passed
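[Editorial note: a rough sketch of what the upgraded workflow step would look like; the step name and Java version below are placeholders, not the exact diff in this PR.]

```yaml
# Hypothetical excerpt of a GitHub Actions workflow file
- name: Install JDK
  uses: actions/setup-java@v3
  with:
    distribution: zulu   # required since v2; zulu matches the v1 default
    java-version: 8      # placeholder; the real workflow sets its own versions
```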





[GitHub] [spark] Yikun opened a new pull request, #38353: [SPARK-40881][INFRA] Upgrade actions/cache to v3 and actions/upload-artifact to v3

2022-10-22 Thread GitBox


Yikun opened a new pull request, #38353:
URL: https://github.com/apache/spark/pull/38353

   ### What changes were proposed in this pull request?
   Upgrade actions/cache to v3 and actions/upload-artifact to v3
   
   ### Why are the changes needed?
   - Since actions/cache@v3: the Node runtime moves from Node 12 to Node 16.
   - Since actions/upload-artifact@v3: cleans up the `set-output` warning.
   
   ### Does this PR introduce _any_ user-facing change?
   No, dev only
   
   ### How was this patch tested?
   CI passed





[GitHub] [spark] zhengruifeng commented on pull request #38346: [SPARK-40880][SQL] Reimplement `summary` with dataframe operations

2022-10-22 Thread GitBox


zhengruifeng commented on PR #38346:
URL: https://github.com/apache/spark/pull/38346#issuecomment-1287965144

   cc @HyukjinKwon 





[GitHub] [spark] github-actions[bot] commented on pull request #36332: [SPARK-36730][SQL] Use V2 Filter in V2 file source

2022-10-22 Thread GitBox


github-actions[bot] commented on PR #36332:
URL: https://github.com/apache/spark/pull/36332#issuecomment-1287960765

   We're closing this PR because it hasn't been updated in a while. This isn't 
a judgement on the merit of the PR in any way. It's just a way of keeping the 
PR queue manageable.
   If you'd like to revive this PR, please reopen it and ask a committer to 
remove the Stale tag!





[GitHub] [spark] github-actions[bot] commented on pull request #37053: [SPARK-39452][GraphX] Extend EdgePartition1D with Destination based Strategy

2022-10-22 Thread GitBox


github-actions[bot] commented on PR #37053:
URL: https://github.com/apache/spark/pull/37053#issuecomment-1287960756

   We're closing this PR because it hasn't been updated in a while. This isn't 
a judgement on the merit of the PR in any way. It's just a way of keeping the 
PR queue manageable.
   If you'd like to revive this PR, please reopen it and ask a committer to 
remove the Stale tag!





[GitHub] [spark] github-actions[bot] commented on pull request #37183: [SPARK-39770][SQL][AVRO] Support Avro schema evolution

2022-10-22 Thread GitBox


github-actions[bot] commented on PR #37183:
URL: https://github.com/apache/spark/pull/37183#issuecomment-1287960748

   We're closing this PR because it hasn't been updated in a while. This isn't 
a judgement on the merit of the PR in any way. It's just a way of keeping the 
PR queue manageable.
   If you'd like to revive this PR, please reopen it and ask a committer to 
remove the Stale tag!





[GitHub] [spark] AmplabJenkins commented on pull request #38345: [SPARK-40879][CONNECT] Support Join UsingColumns in proto

2022-10-22 Thread GitBox


AmplabJenkins commented on PR #38345:
URL: https://github.com/apache/spark/pull/38345#issuecomment-1287953624

   Can one of the admins verify this patch?





[GitHub] [spark] srowen commented on pull request #38228: [SPARK-40739][SPARK-40738] Fixes for cygwin/msys2/mingw sbt build and bash scripts

2022-10-22 Thread GitBox


srowen commented on PR #38228:
URL: https://github.com/apache/spark/pull/38228#issuecomment-1287945182

   Seems OK now





[GitHub] [spark] AmplabJenkins commented on pull request #38347: [SPARK-40883][CONNECT] Support Range in Connect proto

2022-10-22 Thread GitBox


AmplabJenkins commented on PR #38347:
URL: https://github.com/apache/spark/pull/38347#issuecomment-1287931807

   Can one of the admins verify this patch?





[GitHub] [spark] AmplabJenkins commented on pull request #38348: [SPARK-40884][BUILD] Upgrade fabric8io - `kubernetes-client` to 6.2.0

2022-10-22 Thread GitBox


AmplabJenkins commented on PR #38348:
URL: https://github.com/apache/spark/pull/38348#issuecomment-1287931800

   Can one of the admins verify this patch?





[GitHub] [spark] AmplabJenkins commented on pull request #38349: Pull request

2022-10-22 Thread GitBox


AmplabJenkins commented on PR #38349:
URL: https://github.com/apache/spark/pull/38349#issuecomment-1287931788

   Can one of the admins verify this patch?





[GitHub] [spark] AmplabJenkins commented on pull request #38350: [WIP][SPARK-40752][SQL] Migrate type check failures of misc expressions onto error classes

2022-10-22 Thread GitBox


AmplabJenkins commented on PR #38350:
URL: https://github.com/apache/spark/pull/38350#issuecomment-1287931775

   Can one of the admins verify this patch?





[GitHub] [spark] AmplabJenkins commented on pull request #38352: [SPARK-40801][BUILD][3.2] Upgrade `Apache commons-text` to 1.10

2022-10-22 Thread GitBox


AmplabJenkins commented on PR #38352:
URL: https://github.com/apache/spark/pull/38352#issuecomment-1287931763

   Can one of the admins verify this patch?





[GitHub] [spark] AmplabJenkins commented on pull request #38351: [SPARK-40391][SQL][TESTS] Test the error class UNSUPPORTED_FEATURE.JDBC_TRANSACTION

2022-10-22 Thread GitBox


AmplabJenkins commented on PR #38351:
URL: https://github.com/apache/spark/pull/38351#issuecomment-1287931766

   Can one of the admins verify this patch?





[GitHub] [spark] HeartSaVioR commented on a diff in pull request #38288: [SPARK-40821][SQL][CORE][PYTHON][SS] Introduce window_time function to extract event time from the window column

2022-10-22 Thread GitBox


HeartSaVioR commented on code in PR #38288:
URL: https://github.com/apache/spark/pull/38288#discussion_r1002574932


##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##
@@ -4201,6 +4219,73 @@ object SessionWindowing extends Rule[LogicalPlan] {
   }
 }
 
+/**
+ * Resolves the window_time expression which extracts the correct window time from the
+ * window column generated as the output of the window aggregating operators. The
+ * window column is of type struct { start: TimestampType, end: TimestampType }.
+ * The correct window time for further aggregations is window.end - 1.
+ */
+object ResolveWindowTime extends Rule[LogicalPlan] {
+  override def apply(plan: LogicalPlan): LogicalPlan = plan.resolveOperatorsUp {
+    case p: LogicalPlan if p.children.size == 1 =>
+      val child = p.children.head
+      val windowTimeExpressions =
+        p.expressions.flatMap(_.collect { case w: WindowTime => w }).toSet
+
+      if (windowTimeExpressions.size == 1 &&

Review Comment:
   I left a comment about how to work around the ops checker error while still hitting this condition.






 'Project [window1#10.end AS end#19, 
unresolvedalias(window_time(window1#10), 
Some(org.apache.spark.sql.Column$$Lambda$1637/93974967@5d7dd549)), 
window2#15.end AS end#20, unresolvedalias(window_time(window2#15), 
Some(org.apache.spark.sql.Column$$Lambda$1637/93974967@5d7dd549))]
 +- Project [window1#10, window#16 AS window2#15]
+- Filter isnotnull(cast(time2#6 as timestamp))
   +- Expand [[named_struct(start, 
precisetimestampconversion(((precisetimestampconversion(cast(time2#6 as 
timestamp), TimestampType, LongType) - 
(((precisetimestampconversion(cast(time2#6 as timestamp), TimestampType, 
LongType) - 0) + 500) % 500)) - 0), LongType, TimestampType), end, 
precisetimestampconversionprecisetimestampconversion(cast(time2#6 as 
timestamp), TimestampType, LongType) - 
(((precisetimestampconversion(cast(time2#6 as timestamp), TimestampType, 
LongType) - 0) + 500) % 500)) - 0) + 1000), LongType, 
TimestampType)), window1#10, time2#6], [named_struct(start, 
precisetimestampconversion(((precisetimestampconversion(cast(time2#6 as 
timestamp), TimestampType, LongType) - 
(((precisetimestampconversion(cast(time2#6 as timestamp), TimestampType, 
LongType) - 0) + 500) % 500)) - 500), LongType, TimestampType), 
end, precisetimestampconversionprecisetimestampconversion(cast(time2#6 as 
timestamp), TimestampType, LongType) - (((precisetimestampconversion(cast(time2#6 as timestamp), TimestampType, LongType) 
- 0) + 500) % 500)) - 500) + 1000), LongType, TimestampType)), 
window1#10, time2#6]], [window#16, window1#10, time2#6]
  +- Project [window#11 AS window1#10, time2#6]
 +- Filter isnotnull(cast(time#4 as timestamp))
+- Expand [[named_struct(start, 
precisetimestampconversion(((precisetimestampconversion(cast(time#4 as 
timestamp), TimestampType, LongType) - 
(((precisetimestampconversion(cast(time#4 as timestamp), TimestampType, 
LongType) - 0) + 500) % 500)) - 0), LongType, TimestampType), end, 
precisetimestampconversionprecisetimestampconversion(cast(time#4 as 
timestamp), TimestampType, LongType) - 
(((precisetimestampconversion(cast(time#4 as timestamp), TimestampType, 
LongType) - 0) + 500) % 500)) - 0) + 1000), LongType, 
TimestampType)), time#4, time2#6], [named_struct(start, 
precisetimestampconversion(((precisetimestampconversion(cast(time#4 as 
timestamp), TimestampType, LongType) - 
(((precisetimestampconversion(cast(time#4 as timestamp), TimestampType, 
LongType) - 0) + 500) % 500)) - 500), LongType, TimestampType), 
end, precisetimestampconversionprecisetimestampconversion(cast(time#4 as 
timestamp), TimestampType, LongType) - (((precisetimestampconversion(cast(time#4 as timestamp), TimestampType, LongType) - 
0) + 500) % 500)) - 500) + 1000), LongType, TimestampType)), 
time#4, time2#6]], [window#11, time#4, time2#6]
   +- Project [time#4, cast(time#4 - INTERVAL '05' 
MINUTE as string) AS time2#6]
  +- Project [value#1 AS time#4]
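The expected rows in the quoted test (window end 19:38:20 paired with window_time 19:38:19.99...) follow from window_time being defined as window.end minus one microsecond, so the value stays inside the half-open window [start, end). A minimal Python sketch of that arithmetic (an illustration only, not Spark code):

```python
from datetime import datetime, timedelta

# Illustration only (not Spark code): window_time(w) = w.end - 1 microsecond,
# which keeps the value inside the half-open window [start, end).
def window_time(window_end: datetime) -> datetime:
    return window_end - timedelta(microseconds=1)

end = datetime(2016, 3, 27, 19, 38, 20)   # window [19:38:10, 19:38:20)
start = end - timedelta(seconds=10)
wt = window_time(end)
assert start <= wt < end  # bound to the same window; `end` itself is not
print(wt)  # 2016-03-27 19:38:19.999999
```

Using `end` directly would bind the value to the next window under the same window spec, which is why the resolution rule subtracts one microsecond.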

[GitHub] [spark] amaliujia commented on pull request #38347: [SPARK-40883][CONNECT] Support Range in Connect proto

2022-10-22 Thread GitBox


amaliujia commented on PR #38347:
URL: https://github.com/apache/spark/pull/38347#issuecomment-1287884286

   R: @cloud-fan 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] MaxGekk commented on pull request #38351: [SPARK-40391][SQL][TESTS] Test the error class UNSUPPORTED_FEATURE.JDBC_TRANSACTION

2022-10-22 Thread GitBox


MaxGekk commented on PR #38351:
URL: https://github.com/apache/spark/pull/38351#issuecomment-1287882749

   @panbingkun Could you fix the Scala style failure:
   ```
   [error] 
/home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryExecutionErrorsSuite.scala:34:0:
 org.apache.spark.sql.execution.datasources.jdbc. is in wrong order relative to 
   ```





[GitHub] [spark] bjornjorgensen commented on pull request #38352: [SPARK-40801][BUILD][3.2] Upgrade `Apache commons-text` to 1.10

2022-10-22 Thread GitBox


bjornjorgensen commented on PR #38352:
URL: https://github.com/apache/spark/pull/38352#issuecomment-1287875700

   @wangyum 





[GitHub] [spark] bjornjorgensen opened a new pull request, #38352: [SPARK-40801][BUILD][3.2] Upgrade `Apache commons-text` to 1.10

2022-10-22 Thread GitBox


bjornjorgensen opened a new pull request, #38352:
URL: https://github.com/apache/spark/pull/38352

   
   ### What changes were proposed in this pull request?
   Upgrade Apache commons-text from 1.6 to 1.10.0
   
   
   ### Why are the changes needed?
   [CVE-2022-42889](https://nvd.nist.gov/vuln/detail/CVE-2022-42889) 
   this is a [9.8 
CRITICAL](https://nvd.nist.gov/vuln-metrics/cvss/v3-calculator?name=CVE-2022-42889&vector=AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H&version=3.1&source=NIST)
   
   ### Does this PR introduce _any_ user-facing change?
   No.
   
   
   ### How was this patch tested?
   Pass GA
   





[GitHub] [spark] alex-balikov commented on a diff in pull request #38288: [SPARK-40821][SQL][CORE][PYTHON][SS] Introduce window_time function to extract event time from the window column

2022-10-22 Thread GitBox


alex-balikov commented on code in PR #38288:
URL: https://github.com/apache/spark/pull/38288#discussion_r1002539045


##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##
@@ -4201,6 +4219,73 @@ object SessionWindowing extends Rule[LogicalPlan] {
   }
 }
 
+/**
+ * Resolves the window_time expression which extracts the correct window time 
from the
+ * window column generated as the output of the window aggregating operators. 
The
+ * window column is of type struct { start: TimestampType, end: TimestampType 
}.
+ * The correct window time for further aggregations is window.end - 1.
+ * */
+object ResolveWindowTime extends Rule[LogicalPlan] {
+  override def apply(plan: LogicalPlan): LogicalPlan = plan.resolveOperatorsUp 
{
+case p: LogicalPlan if p.children.size == 1 =>
+  val child = p.children.head
+  val windowTimeExpressions =
+p.expressions.flatMap(_.collect { case w: WindowTime => w }).toSet
+
+  if (windowTimeExpressions.size == 1 &&

Review Comment:
   Modified the test. Indeed the scenario fails with the unsupported ops 
checker error.






[GitHub] [spark] alex-balikov commented on a diff in pull request #38288: [SPARK-40821][SQL][CORE][PYTHON][SS] Introduce window_time function to extract event time from the window column

2022-10-22 Thread GitBox


alex-balikov commented on code in PR #38288:
URL: https://github.com/apache/spark/pull/38288#discussion_r1002536939


##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##
@@ -4201,6 +4219,73 @@ object SessionWindowing extends Rule[LogicalPlan] {
   }
 }
 
+/**
+ * Resolves the window_time expression which extracts the correct window time 
from the
+ * window column generated as the output of the window aggregating operators. 
The
+ * window column is of type struct { start: TimestampType, end: TimestampType 
}.
+ * The correct window time for further aggregations is window.end - 1.
+ * */
+object ResolveWindowTime extends Rule[LogicalPlan] {
+  override def apply(plan: LogicalPlan): LogicalPlan = plan.resolveOperatorsUp 
{
+case p: LogicalPlan if p.children.size == 1 =>
+  val child = p.children.head
+  val windowTimeExpressions =
+p.expressions.flatMap(_.collect { case w: WindowTime => w }).toSet
+
+  if (windowTimeExpressions.size == 1 &&
+windowTimeExpressions.head.windowColumn.resolved &&
+windowTimeExpressions.head.checkInputDataTypes().isSuccess) {
+
+val windowTime = windowTimeExpressions.head
+
+val metadata = windowTime.windowColumn match {
+  case a: Attribute => a.metadata
+  case _ => Metadata.empty
+}
+
+if (!metadata.contains(TimeWindow.marker) &&
+  !metadata.contains(SessionWindow.marker)) {
+  // FIXME: error framework?
+  throw new AnalysisException("The input is not a correct window 
column!")
+}
+
+val newMetadata = new MetadataBuilder()
+  .withMetadata(metadata)
+  .remove(TimeWindow.marker)
+  .remove(SessionWindow.marker)
+  .build()
+
+val attr = AttributeReference(
+  "window_time", windowTime.dataType, metadata = newMetadata)()
+
+// NOTE: "window.end" is "exclusive" upper bound of window, so if we 
use this value as
+// it is, it is going to be bound to the different window even if we 
apply the same window
+// spec. Decrease 1 microsecond from window.end to let the window_time 
be bound to the
+// correct window range.
+val subtractExpr =
+PreciseTimestampConversion(
+  Subtract(PreciseTimestampConversion(
+// FIXME: better handling of window.end
+GetStructField(windowTime.windowColumn, 1),
+windowTime.dataType, LongType), Literal(1L)),
+  LongType,
+  windowTime.dataType)
+
+// FIXME: Can there already be a window_time column? Will this lead to 
conflict?

Review Comment:
   removed the comment
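
The marker handling in the quoted ResolveWindowTime diff can be summarized as: the input column must carry a TimeWindow or SessionWindow marker in its metadata, and the resolved output drops the marker. A hypothetical Python sketch of that logic (the marker key names below are invented for illustration and do not match Spark's internal keys):

```python
# Hypothetical sketch of the metadata handling in ResolveWindowTime:
# reject inputs without a window marker, strip the marker on output.
# The marker key names are invented for illustration.
TIME_WINDOW_MARKER = "spark.timeWindow"
SESSION_WINDOW_MARKER = "spark.sessionWindow"

def resolve_window_time_metadata(metadata: dict) -> dict:
    if TIME_WINDOW_MARKER not in metadata and SESSION_WINDOW_MARKER not in metadata:
        raise ValueError("The input is not a correct window column!")
    # Build new metadata without the window markers, mirroring
    # MetadataBuilder().withMetadata(...).remove(...).remove(...).build()
    return {k: v for k, v in metadata.items()
            if k not in (TIME_WINDOW_MARKER, SESSION_WINDOW_MARKER)}

print(resolve_window_time_metadata({TIME_WINDOW_MARKER: True, "other": 1}))
# {'other': 1}
```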






[GitHub] [spark] panbingkun commented on pull request #38351: [SPARK-40391][SQL][TESTS] Test the error class UNSUPPORTED_FEATURE.JDBC_TRANSACTION

2022-10-22 Thread GitBox


panbingkun commented on PR #38351:
URL: https://github.com/apache/spark/pull/38351#issuecomment-1287833044

   cc @MaxGekk 





[GitHub] [spark] panbingkun commented on a diff in pull request #38351: [SPARK-40391][SQL][TESTS] Test the error class UNSUPPORTED_FEATURE.JDBC_TRANSACTION

2022-10-22 Thread GitBox


panbingkun commented on code in PR #38351:
URL: https://github.com/apache/spark/pull/38351#discussion_r1002513617


##
sql/core/src/test/resources/mockito-extensions/org.mockito.plugins.MockMaker:
##
@@ -0,0 +1 @@
+mock-maker-inline

Review Comment:
   https://www.baeldung.com/mockito-final#configure-mocktio
   Configure Mockito for Final Methods and Classes
   (screenshot: https://user-images.githubusercontent.com/15246973/197348725-9a192ef9-b002-4d1b-be1f-3cff13c937f4.png)
   






[GitHub] [spark] panbingkun opened a new pull request, #38351: [SPARK-40391][SQL][TESTS] Test the error class UNSUPPORTED_FEATURE.JDBC_TRANSACTION

2022-10-22 Thread GitBox


panbingkun opened a new pull request, #38351:
URL: https://github.com/apache/spark/pull/38351

   ### What changes were proposed in this pull request?
   This PR aims to add a test for the error class 
UNSUPPORTED_FEATURE.JDBC_TRANSACTION to QueryExecutionErrorsSuite.
   
   ### Why are the changes needed?
   Add one test for the error class UNSUPPORTED_FEATURE.JDBC_TRANSACTION to 
QueryExecutionErrorsSuite.
   
   
   ### Does this PR introduce _any_ user-facing change?
   No.
   
   ### How was this patch tested?
   - Manual test:
   
   ```
   ./build/sbt "sql/testOnly *QueryExecutionErrorsSuite*"
   ```
   
   All tests passed.
   





[GitHub] [spark] wangyum commented on pull request #38262: [SPARK-40801][BUILD] Upgrade `Apache commons-text` to 1.10

2022-10-22 Thread GitBox


wangyum commented on PR #38262:
URL: https://github.com/apache/spark/pull/38262#issuecomment-1287810903

   @bjornjorgensen +1. Please backport this branch-3.2.





[GitHub] [spark] MaxGekk commented on pull request #38237: [SPARK-40760][SQL] Migrate type check failures of interval expressions onto error classes

2022-10-22 Thread GitBox


MaxGekk commented on PR #38237:
URL: https://github.com/apache/spark/pull/38237#issuecomment-1287797217

   @cloud-fan Could you approve the PR, please.





[GitHub] [spark] MaxGekk commented on pull request #38273: [SPARK-37945][SQL][CORE] Use error classes in the execution errors of arithmetic ops

2022-10-22 Thread GitBox


MaxGekk commented on PR #38273:
URL: https://github.com/apache/spark/pull/38273#issuecomment-1287797132

   @khalidmammadov Could you re-trigger tests/builds by merging the recent 
master, please.





[GitHub] [spark] MaxGekk closed pull request #38319: [SPARK-40856][SQL] Update the error template of WRONG_NUM_PARAMS

2022-10-22 Thread GitBox


MaxGekk closed pull request #38319: [SPARK-40856][SQL] Update the error 
template of WRONG_NUM_PARAMS
URL: https://github.com/apache/spark/pull/38319





[GitHub] [spark] MaxGekk commented on pull request #38319: [SPARK-40856][SQL] Update the error template of WRONG_NUM_PARAMS

2022-10-22 Thread GitBox


MaxGekk commented on PR #38319:
URL: https://github.com/apache/spark/pull/38319#issuecomment-1287796420

   +1, LGTM. Merging to master.
   Thank you, @panbingkun.





[GitHub] [spark] AmplabJenkins commented on pull request #38337: [SPARK-39404][SS][3.3] Minor fix for querying _metadata in streaming

2022-10-22 Thread GitBox


AmplabJenkins commented on PR #38337:
URL: https://github.com/apache/spark/pull/38337#issuecomment-1287769832

   Can one of the admins verify this patch?





[GitHub] [spark] panbingkun opened a new pull request, #38350: [SPARK-40752][SQL] Migrate type check failures of misc expressions onto error classes

2022-10-22 Thread GitBox


panbingkun opened a new pull request, #38350:
URL: https://github.com/apache/spark/pull/38350

   ### What changes were proposed in this pull request?
   
   
   ### Why are the changes needed?
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   ### How was this patch tested?
   





[GitHub] [spark] HeartSaVioR closed pull request #38337: [SPARK-39404][SS][3.3] Minor fix for querying _metadata in streaming

2022-10-22 Thread GitBox


HeartSaVioR closed pull request #38337: [SPARK-39404][SS][3.3] Minor fix for 
querying _metadata in streaming
URL: https://github.com/apache/spark/pull/38337





[GitHub] [spark] HeartSaVioR commented on pull request #38337: [SPARK-39404][SS][3.3] Minor fix for querying _metadata in streaming

2022-10-22 Thread GitBox


HeartSaVioR commented on PR #38337:
URL: https://github.com/apache/spark/pull/38337#issuecomment-1287754620

   Thanks! Merging to 3.3.





[GitHub] [spark] HeartSaVioR commented on pull request #38337: [SPARK-39404][SS][3.3] Minor fix for querying _metadata in streaming

2022-10-22 Thread GitBox


HeartSaVioR commented on PR #38337:
URL: https://github.com/apache/spark/pull/38337#issuecomment-1287754383

   https://github.com/Yaohua628/spark/runs/9040960126
   
   Build passed - it looks like it is not reflected here.





[GitHub] [spark] kingofDaniel opened a new pull request, #38349: Pull request

2022-10-22 Thread GitBox


kingofDaniel opened a new pull request, #38349:
URL: https://github.com/apache/spark/pull/38349

   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   





[GitHub] [spark] bjornjorgensen commented on pull request #38348: [SPARK-40884][BUILD] Upgrade fabric8io - `kubernetes-client` to 6.2.0

2022-10-22 Thread GitBox


bjornjorgensen commented on PR #38348:
URL: https://github.com/apache/spark/pull/38348#issuecomment-1287683204

   @dongjoon-hyun 





[GitHub] [spark] bjornjorgensen opened a new pull request, #38348: [SPARK-40884][BUILD] Upgrade fabric8io - `kubernetes-client` to 6.2.0

2022-10-22 Thread GitBox


bjornjorgensen opened a new pull request, #38348:
URL: https://github.com/apache/spark/pull/38348

   ### What changes were proposed in this pull request?
   Upgrade fabric8io - kubernetes-client from 6.1.0 to 6.2.0
   
   ### Why are the changes needed?
   
   [Release 
notes](https://github.com/fabric8io/kubernetes-client/releases/tag/v6.2.0)
   [Snakeyaml version should be updated to mitigate 
CVE-2022-25857](https://github.com/fabric8io/kubernetes-client/issues/4383)
   
   
   ### Does this PR introduce _any_ user-facing change?
   No.
   
   
   ### How was this patch tested?
   Pass GA
   





[GitHub] [spark] amaliujia opened a new pull request, #38347: [SPARK-40883][CONNECT] Support Range in Connect proto

2022-10-22 Thread GitBox


amaliujia opened a new pull request, #38347:
URL: https://github.com/apache/spark/pull/38347

   
   
   ### What changes were proposed in this pull request?
   
   1. Support `Range` in Connect proto.
   2. Refactor `SparkConnectDeduplicateSuite` to `SparkConnectSessionBasedSuite`
   
   ### Why are the changes needed?
   
   Improve API coverage.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No
   
   ### How was this patch tested?
   
   UT





[GitHub] [spark] peter-toth commented on pull request #38334: [SPARK-40874][PYTHON] Fix broadcasts in Python UDFs when encryption enabled

2022-10-22 Thread GitBox


peter-toth commented on PR #38334:
URL: https://github.com/apache/spark/pull/38334#issuecomment-1287665617

   Thanks @HyukjinKwon for the quick review!
   
   The bug was introduced in 
https://github.com/apache/spark/commit/58419b92673c46911c25bc6c6b13397f880c6424#diff-ed4fb5ce30273e8eefcc7d4b0152ea7a60fb4f8f709d4da8ea1ab56aeda26001R307-R323
 in Spark 3.0.





[GitHub] [spark] Yikun commented on pull request #38341: [SPARK-40871][INFRA] Upgrade actions/github-script to v6 and fix notify workflow

2022-10-22 Thread GitBox


Yikun commented on PR #38341:
URL: https://github.com/apache/spark/pull/38341#issuecomment-1287659874

   @HyukjinKwon Thanks, it works: https://github.com/apache/spark/pull/38343





[GitHub] [spark] Yikun commented on pull request #38343: [DONT MERGE] Test upgrade v6

2022-10-22 Thread GitBox


Yikun commented on PR #38343:
URL: https://github.com/apache/spark/pull/38343#issuecomment-1287659814

   (screenshot: https://user-images.githubusercontent.com/1736354/197326477-f024d29c-83e8-4c50-8348-0ef6e9b368ef.png)
   





[GitHub] [spark] Yikun closed pull request #38343: [DONT MERGE] Test upgrade v6

2022-10-22 Thread GitBox


Yikun closed pull request #38343: [DONT MERGE] Test upgrade v6
URL: https://github.com/apache/spark/pull/38343





[GitHub] [spark] Yikun commented on pull request #38342: [SPARK-40870][INFRA] Upgrade docker actions to cleanup warning

2022-10-22 Thread GitBox


Yikun commented on PR #38342:
URL: https://github.com/apache/spark/pull/38342#issuecomment-1287659692

   @HyukjinKwon Thanks, merge to master (3.4.0).





[GitHub] [spark] Yikun closed pull request #38342: [SPARK-40870][INFRA] Upgrade docker actions to cleanup warning

2022-10-22 Thread GitBox


Yikun closed pull request #38342: [SPARK-40870][INFRA] Upgrade docker actions to cleanup warning
URL: https://github.com/apache/spark/pull/38342





[GitHub] [spark] zhengruifeng opened a new pull request, #38346: [SPARK-40880][SQL] Reimplement `summary` with dataframe operations

2022-10-22 Thread GitBox


zhengruifeng opened a new pull request, #38346:
URL: https://github.com/apache/spark/pull/38346

   ### What changes were proposed in this pull request?
   Reimplement `summary` with dataframe operations
   
   ### Why are the changes needed?
   1. Do not truncate the SQL plan;
   2. Enable SQL optimizations such as column pruning:
   
   ``` 
   scala> val df = spark.range(0, 3, 1, 10).withColumn("value", lit("str"))
   df: org.apache.spark.sql.DataFrame = [id: bigint, value: string]
   
   scala> df.summary("max", "50%").show
   +---+---+-+
   |summary| id|value|
   +---+---+-+
   |max|  2|  str|
   |50%|  1| null|
   +---+---+-+
   
   
   scala> df.summary("max", "50%").select("id").show
   +---+
   | id|
   +---+
   |  2|
   |  1|
   +---+
   
   
   scala> df.summary("max", "50%").select("id").queryExecution.optimizedPlan
   res4: org.apache.spark.sql.catalyst.plans.logical.LogicalPlan =
   Project [element_at(id#367, summary#376, None, false) AS id#371]
   +- Generate explode([max,50%]), false, [summary#376]
   +- Aggregate [map(max, cast(max(id#153L) as string), 50%, cast(percentile_approx(id#153L, [0.5], 1, 0, 0)[0] as string)) AS id#367]
 +- Range (0, 3, step=1, splits=Some(10))
   
   
   ```
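   
   Conceptually, the explode-based rewrite shown in the optimized plan computes one map of statistic name to value per column, then emits one row per requested statistic, so selecting a single column afterwards only needs that column's aggregate. A minimal plain-Scala sketch of that shape (not Spark's actual implementation; `summarySketch` is a hypothetical name, and the crude median stands in for `percentile_approx`):
   
   ```scala
   // One aggregate pass builds a statistic-name -> value map for the column,
   // mirroring the `map(max, ..., 50%, ...)` expression in the plan above.
   def summarySketch(ids: Seq[Long], stats: Seq[String]): Seq[(String, String)] = {
     val byStat: Map[String, String] = Map(
       "max" -> ids.max.toString,
       "50%" -> { val s = ids.sorted; s(s.size / 2).toString } // crude median stand-in
     )
     // "explode" step: one output row per requested statistic
     stats.map(name => (name, byStat(name)))
   }
   
   println(summarySketch(Seq(0L, 1L, 2L), Seq("max", "50%")))
   // prints List((max,2), (50%,1)) -- matching the `id` column of the summary above
   ```
   
   Because each column's statistics live in an independent map, dropping a column from the projection drops its whole aggregate, which is exactly what column pruning exploits in the optimized plan.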
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   Existing UTs and manual checks.

