Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-20 Thread via GitHub


hvanhovell commented on PR #48818:
URL: https://github.com/apache/spark/pull/48818#issuecomment-2488858590

   @cloud-fan they all use SparkSession in their interface. SparkSession also changes.
   
   For the record, I am fine going either way. The SparkSession interface works in most cases, and where it does not you can import the ClassicConversions object. This would make migration easier for some. However, I do think there is a benefit in being clearer here, because the only implementation of SparkSession that you will see is the classic one.
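
   A minimal sketch of what that migration path could look like, assuming the conversions end up in an object like `org.apache.spark.sql.classic.ClassicConversions` (illustrative, not the final API):

```scala
// Library code keeps programming against the shared interface.
import org.apache.spark.sql.SparkSession
// Hypothetical opt-in conversions from the interface types to the classic ones.
import org.apache.spark.sql.classic.ClassicConversions._

def parsePlan(spark: SparkSession, query: String) = {
  // With the conversions in scope, the interface session can be used where the
  // classic implementation is required (e.g. to reach sessionState).
  spark.sessionState.sqlParser.parsePlan(query)
}
```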





Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-20 Thread via GitHub


cloud-fan commented on PR #48818:
URL: https://github.com/apache/spark/pull/48818#issuecomment-2487832869

   > Please note that this does change a number of developer APIs:
   > Streaming Sink
   > Streaming Source
   > FileFormat
   > Command
   
   How are FileFormat and Command related to the DataFrame API?





Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-18 Thread via GitHub


hvanhovell commented on PR #48818:
URL: https://github.com/apache/spark/pull/48818#issuecomment-2484599750

   cc @HyukjinKwon @cloud-fan @dongjoon-hyun this is the last big change. PTAL.





Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-14 Thread via GitHub


hvanhovell commented on PR #48818:
URL: https://github.com/apache/spark/pull/48818#issuecomment-2477752757

   For anyone interested: this is mostly ready for review. I am still looking at some of the developer APIs; most of them use classic now, but in most cases the interfaces should suffice. Moving them back might make the lives of Spark library developers a bit easier.





Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-12 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1839175358


##
sql/core/src/test/scala/org/apache/spark/sql/SparkSessionBuilderSuite.scala:
##
@@ -330,8 +330,10 @@ class SparkSessionBuilderSuite extends SparkFunSuite with 
Eventually {
   .set(wh, "./data1")
   .set(td, "bob")
 
-val sc = new SparkContext(conf)
+// This creates an active SparkContext, which will be picked up by the session.
+new SparkContext(conf)
 
+// The only way this could have ever worked is if this is effectively a no-op.

Review Comment:
   Will remove in the next iteration.






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-12 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1839174515


##
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/connect/client/SparkConnectClientSuite.scala:
##


Review Comment:
   I can spin this off in a separate PR.






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-12 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1838627147


##
sql/core/src/test/scala/org/apache/spark/sql/SparkSessionBuilderSuite.scala:
##
@@ -330,8 +333,6 @@ class SparkSessionBuilderSuite extends SparkFunSuite with 
Eventually {
   .set(wh, "./data1")
   .set(td, "bob")
 
-val sc = new SparkContext(conf)

Review Comment:
   Oops...






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-12 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1838391574


##
sql/core/src/main/scala/org/apache/spark/sql/classic/Dataset.scala:
##
@@ -1244,24 +1243,6 @@ class Dataset[T] private[sql](
 select(replacedAndExistingColumns ++ newColumns : _*)
   }
 
-  /** @inheritdoc */
-  private[spark] def withColumns(

Review Comment:
   Move to super class.






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-12 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1838391574


##
sql/core/src/main/scala/org/apache/spark/sql/classic/Dataset.scala:
##
@@ -1244,24 +1243,6 @@ class Dataset[T] private[sql](
 select(replacedAndExistingColumns ++ newColumns : _*)
   }
 
-  /** @inheritdoc */
-  private[spark] def withColumns(

Review Comment:
   Moved to super class.






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-12 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1838392272


##
sql/core/src/main/scala/org/apache/spark/sql/classic/SparkSession.scala:
##
@@ -693,21 +693,6 @@ class SparkSession private(
 }.toImmutableArraySeq
   }
 
-  /**
-   * Execute a block of code with this session set as the active session, and 
restore the
-   * previous session on completion.
-   */
-  private[sql] def withActive[T](block: => T): T = {

Review Comment:
   Moved to super class.






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-12 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1838393149


##
sql/api/src/main/scala/org/apache/spark/sql/SparkSession.scala:
##
@@ -822,36 +937,50 @@ object SparkSession extends SparkSessionCompanion {
 private[sql] abstract class SparkSessionCompanion {
   private[sql] type Session <: SparkSession
 
+  /** The active SparkSession for the current thread. */
+  private val activeThreadSession = new InheritableThreadLocal[Session]

Review Comment:
   This is done.






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-12 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1836775290


##
sql/connect/common/src/main/scala/org/apache/spark/sql/connect/UDFRegistration.scala:
##
@@ -39,4 +39,10 @@ class UDFRegistration(session: SparkSession) extends 
api.UDFRegistration {
 session.registerUdf(UdfToProtoUtils.toProto(named))
 named
   }
+
+  /** @inheritdoc */
+  override def register(
+  name: String,
+  udaf: UserDefinedAggregateFunction): UserDefinedAggregateFunction =
+throw ConnectClientUnsupportedErrors.registerUdaf()

Review Comment:
   We will (probably) add support in a follow-up.






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-11 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1836914439


##
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/Source.scala:
##


Review Comment:
   This makes Source use the classic API.






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-11 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1836914439


##
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/Source.scala:
##


Review Comment:
   This makes Source use the classic API.






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-11 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1836914069


##
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/Sink.scala:
##


Review Comment:
   This makes Sink use the classic API.






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-11 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1836872444


##
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSink.scala:
##


Review Comment:
   This moves FileStreamSink to classic.SparkSession.






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-11 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1836905794


##
sql/core/src/test/scala/org/apache/spark/sql/util/ExecutionListenerManagerSuite.scala:
##
@@ -53,7 +55,7 @@ class ExecutionListenerManagerSuite extends SparkFunSuite 
with LocalSparkSession
   .set(QUERY_EXECUTION_LISTENERS, 
Seq(classOf[SQLConfQueryExecutionListener].getName()))
   .set("spark.aaa", "aaa")
 val sc = new SparkContext(conf)
-spark = SparkSession.builder()
+spark = classic.SparkSession.builder()

Review Comment:
   Undo this.






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-11 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1836902493


##
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala:
##
@@ -34,7 +35,7 @@ class HiveContext private[hive](_sparkSession: SparkSession)
   self =>
 
   def this(sc: SparkContext) = {
-
this(SparkSession.builder().enableHiveSupport().sparkContext(sc).getOrCreate())
+
this(classic.SparkSession.builder().enableHiveSupport().sparkContext(sc).getOrCreate())

Review Comment:
   Undo






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-11 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1836899807


##
sql/core/src/test/scala/org/apache/spark/sql/internal/ColumnNodeToExpressionConverterSuite.scala:
##


Review Comment:
   Move test suite.






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-11 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1836883913


##
sql/core/src/test/scala/org/apache/spark/sql/DataFrameTimeWindowingSuite.scala:
##
@@ -596,7 +596,9 @@ class DataFrameTimeWindowingSuite extends QueryTest with 
SharedSparkSession {
   df <- Seq(df1, df2)
   nullable <- Seq(true, false)
 } {
-  val dfWithDesiredNullability = new DataFrame(df.queryExecution, 
ExpressionEncoder(
+  val dfWithDesiredNullability = new classic.DataFrame(
+df.queryExecution,
+ExpressionEncoder(

Review Comment:
   Style...






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-11 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1836881419


##
sql/core/src/test/scala/org/apache/spark/sql/ApproxCountDistinctForIntervalsQuerySuite.scala:
##
@@ -82,7 +83,7 @@ class ApproxCountDistinctForIntervalsQuerySuite extends 
QueryTest with SharedSpa
 ApproxCountDistinctForIntervals(dtAttr, 
CreateArray(endpoints.map(Literal(_
   val dtAggExpr = dtAggFunc.toAggregateExpression()
   val dtNamedExpr = Alias(dtAggExpr, dtAggExpr.toString)()
-  val result = Dataset.ofRows(spark, Aggregate(Nil, Seq(ymNamedExpr, 
dtNamedExpr), relation))
+  val result = ofRows(spark, Aggregate(Nil, Seq(ymNamedExpr, dtNamedExpr), 
relation))

Review Comment:
   use classic.Dataset.ofRows...






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-11 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1836880699


##
sql/core/src/test/java/test/org/apache/spark/sql/JavaDatasetSuite.java:
##
@@ -867,7 +867,7 @@ public void testRandomSplit() {
 Dataset ds = spark.createDataset(data, Encoders.STRING());
 double[] arraySplit = {1, 2, 3};
 
-List> randomSplit =  ds.randomSplitAsList(arraySplit, 1);
+List> randomSplit = 
ds.randomSplitAsList(arraySplit, 1);

Review Comment:
   This is a minor breaking change. I can just force the implementations to 
return a `sql.Dataset`.









Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-11 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1836870600


##
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/FileScan.scala:
##


Review Comment:
   We should probably also add the FileScanBuilder here.






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-11 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1836814168


##
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/FileDataSourceV2.scala:
##


Review Comment:
   This makes FileDataSourceV2 use classic.SparkSession.



##
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/FileScan.scala:
##


Review Comment:
   This makes FileScan use classic.SparkSession.






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-11 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1836809436


##
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormat.scala:
##


Review Comment:
   This makes FileFormat use classic.SparkSession






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-11 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1836805659


##
sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeTablesCommand.scala:
##
@@ -21,13 +21,15 @@ import scala.util.control.NonFatal
 
 import org.apache.spark.internal.LogKeys.{DATABASE_NAME, ERROR, TABLE_NAME}
 import org.apache.spark.internal.MDC
-import org.apache.spark.sql.{Row, SparkSession}
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.classic.SparkSession
 
 
 /**
  * Analyzes all tables in the given database to generate statistics.
  */
-case class AnalyzeTablesCommand(
+case class
+AnalyzeTablesCommand(

Review Comment:
   Fix me...






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-11 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1836802998


##
sql/core/src/main/scala/org/apache/spark/sql/classic/UDFRegistration.scala:
##
@@ -15,25 +15,27 @@
  * limitations under the License.
  */
 
-package org.apache.spark.sql
+package org.apache.spark.sql.classic
 
 import java.lang.reflect.ParameterizedType
 
 import org.apache.spark.annotation.Stable
 import org.apache.spark.api.python.PythonEvalType
 import org.apache.spark.internal.Logging
+import org.apache.spark.sql
 import org.apache.spark.sql.api.java._
 import org.apache.spark.sql.catalyst.JavaTypeInference
 import org.apache.spark.sql.catalyst.analysis.FunctionRegistry
 import org.apache.spark.sql.catalyst.expressions.Expression
+import org.apache.spark.sql.classic.UserDefinedFunctionUtils.toScalaUDF
 import org.apache.spark.sql.errors.QueryCompilationErrors
 import org.apache.spark.sql.execution.aggregate.{ScalaAggregator, ScalaUDAF}
 import org.apache.spark.sql.execution.python.UserDefinedPythonFunction
 import org.apache.spark.sql.expressions.{SparkUserDefinedFunction, 
UserDefinedAggregateFunction, UserDefinedAggregator, UserDefinedFunction}
-import org.apache.spark.sql.internal.UserDefinedFunctionUtils.toScalaUDF
 import org.apache.spark.sql.types.DataType
 import org.apache.spark.util.Utils
 
+

Review Comment:
   Remove






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-11 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1836798356


##
sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala:
##


Review Comment:
   We plan to make SQLContext part of the interface in a follow-up.






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-11 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1836797698


##
sql/connect/server/src/test/scala/org/apache/spark/sql/connect/planner/SparkConnectWithSessionExtensionSuite.scala:
##


Review Comment:
   Document story around extensions.






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-11 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1836776941


##
sql/connect/common/src/main/scala/org/apache/spark/sql/connect/columnNodeSupport.scala:
##
@@ -29,6 +29,7 @@ import org.apache.spark.sql.catalyst.trees.{CurrentOrigin, 
Origin}
 import org.apache.spark.sql.connect.common.DataTypeProtoConverter
 import 
org.apache.spark.sql.connect.common.LiteralValueProtoConverter.toLiteralProtoBuilder
 import org.apache.spark.sql.expressions.{Aggregator, UserDefinedAggregator, 
UserDefinedFunction}
+import org.apache.spark.sql.internal._

Review Comment:
   Fix wild card imports.






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-11 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1836775290


##
sql/connect/common/src/main/scala/org/apache/spark/sql/connect/UDFRegistration.scala:
##
@@ -39,4 +39,10 @@ class UDFRegistration(session: SparkSession) extends 
api.UDFRegistration {
 session.registerUdf(UdfToProtoUtils.toProto(named))
 named
   }
+
+  /** @inheritdoc */
+  override def register(
+  name: String,
+  udaf: UserDefinedAggregateFunction): UserDefinedAggregateFunction =
+throw ConnectClientUnsupportedErrors.registerUdaf()

Review Comment:
   We will (probably) add support in a follow-up.






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-11 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1836773794


##
sql/connect/common/src/main/scala/org/apache/spark/sql/connect/StreamingQueryListenerBus.scala:
##
@@ -15,17 +15,17 @@
  * limitations under the License.
  */
 
-package org.apache.spark.sql.streaming
+package org.apache.spark.sql.connect
 
 import java.util.concurrent.CopyOnWriteArrayList
 
 import scala.jdk.CollectionConverters._
 
 import org.apache.spark.connect.proto.{Command, ExecutePlanResponse, Plan, 
StreamingQueryEventType}
 import org.apache.spark.internal.{Logging, LogKeys, MDC}
-import org.apache.spark.sql.SparkSession
 import org.apache.spark.sql.connect.client.CloseableIterator
-import org.apache.spark.sql.streaming.StreamingQueryListener.{Event, 
QueryIdleEvent, QueryProgressEvent, QueryStartedEvent, QueryTerminatedEvent}
+import org.apache.spark.sql.streaming.StreamingQueryListener
+import org.apache.spark.sql.streaming.StreamingQueryListener._

Review Comment:
   Fix wildcard import.






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-11 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1836772897


##
sql/connect/common/src/main/scala/org/apache/spark/sql/connect/SparkSession.scala:
##
@@ -649,15 +648,19 @@ object SparkSession extends api.BaseSparkSessionCompanion 
with Logging {
*/
   def builder(): Builder = new Builder()
 
-  class Builder() extends api.SparkSessionBuilder {
+  class Builder() extends SparkSessionBuilder {
 // Initialize the connection string of the Spark Connect client builder 
from SPARK_REMOTE
 // by default, if it exists. The connection string can be overridden using
 // the remote() function, as it takes precedence over the SPARK_REMOTE 
environment variable.
 private val builder = SparkConnectClient.builder().loadFromEnvironment()
 private var client: SparkConnectClient = _
 
 /** @inheritdoc */
-def remote(connectionString: String): this.type = {
+override private[spark] def sparkContext(sparkContext: SparkContext) =
+  throw ConnectClientUnsupportedErrors.sparkContext()

Review Comment:
   Either throw an exception, or ignore it, not both...






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-11 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1836770731


##
sql/connect/common/src/main/scala/org/apache/spark/sql/connect/KeyValueGroupedDataset.scala:
##
@@ -24,15 +24,17 @@ import scala.jdk.CollectionConverters._
 
 import org.apache.spark.api.java.function._
 import org.apache.spark.connect.proto
+import org.apache.spark.sql
+import org.apache.spark.sql.{Column, Encoder, TypedColumn}
 import org.apache.spark.sql.catalyst.encoders.AgnosticEncoder
 import 
org.apache.spark.sql.catalyst.encoders.AgnosticEncoders.{agnosticEncoderFor, 
ProductEncoder}
+import org.apache.spark.sql.connect.ColumnNodeToProtoConverter.toExpr
 import org.apache.spark.sql.connect.ConnectConversions._
 import org.apache.spark.sql.connect.common.UdfUtils
 import org.apache.spark.sql.expressions.SparkUserDefinedFunction
 import org.apache.spark.sql.functions.col
-import org.apache.spark.sql.internal.ColumnNodeToProtoConverter.toExpr
 import org.apache.spark.sql.internal.UDFAdaptors
-import org.apache.spark.sql.streaming.{GroupState, GroupStateTimeout, 
OutputMode, StatefulProcessor, StatefulProcessorWithInitialState, TimeMode}
+import org.apache.spark.sql.streaming._

Review Comment:
   Fix imports.






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-11 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1836768940


##
sql/connect/common/src/main/scala/org/apache/spark/sql/connect/DataFrameWriterV2Impl.scala:
##


Review Comment:
   Can rename this to DataFrameWriterV2. Same for all the other Impl classes...






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-11 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1836759622


##
sql/api/src/main/scala/org/apache/spark/sql/SparkSession.scala:
##
@@ -1105,6 +1195,14 @@ abstract class SparkSessionBuilder {
*/
   def withExtensions(f: SparkSessionExtensions => Unit): this.type
 
+  /**
+   * Set the [[SparkContext]] to use for the [[SparkSession]].
+   *
+   * @note
+   *   * this method is not supported in Spark Connect.
+   */
+  private[spark] def sparkContext(sparkContext: SparkContext): this.type

Review Comment:
   Added to make the MLlib migration easier.






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-11 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1836758221


##
sql/api/src/main/scala/org/apache/spark/sql/SparkSession.scala:
##
@@ -822,36 +937,50 @@ object SparkSession extends SparkSessionCompanion {
 private[sql] abstract class SparkSessionCompanion {
   private[sql] type Session <: SparkSession
 
+  /** The active SparkSession for the current thread. */
+  private val activeThreadSession = new InheritableThreadLocal[Session]

Review Comment:
   TODO Active/default session clean-up still needs work.






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-11 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1836752279


##
sql/api/src/main/scala/org/apache/spark/sql/Dataset.scala:
##
@@ -2137,14 +2130,41 @@ abstract class Dataset[T] extends Serializable {
* @group untypedrel
* @since 3.3.0
*/
-  def withColumns(colsMap: util.Map[String, Column]): Dataset[Row] = 
withColumns(
+  def withColumns(colsMap: util.Map[String, Column]): DataFrame = withColumns(
 colsMap.asScala.toMap)
 
   /**
* Returns a new Dataset by adding columns or replacing the existing columns 
that has the same
* names.
*/
-  protected def withColumns(colNames: Seq[String], cols: Seq[Column]): 
Dataset[Row]
+  private[spark] def withColumns(colNames: Seq[String], cols: Seq[Column]): 
DataFrame
+
+  /**
+   * Returns a new Dataset by adding columns with metadata.
+   */
+  private[spark] def withColumns(

Review Comment:
   This was added to make the migration for MLlib (a lot) easier.
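
   For reference, a small usage sketch of the public `withColumns(Map[String, Column])` overload that sits next to this private variant (assumes an existing SparkSession `spark`):

```scala
import org.apache.spark.sql.functions._

val df = spark.range(3).toDF("id")
// Add or replace several columns in one call.
val out = df.withColumns(Map(
  "doubled" -> (col("id") * 2),
  "label"   -> lit("x")))
out.show()
```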






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-11 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1836756844


##
sql/api/src/main/scala/org/apache/spark/sql/SparkSession.scala:
##
@@ -784,35 +787,147 @@ abstract class SparkSession extends Serializable with 
Closeable {
 object SparkSession extends SparkSessionCompanion {
   type Session = SparkSession
 
-  private[this] val companion: SparkSessionCompanion = {
-val cls = SparkClassUtils.classForName("org.apache.spark.sql.SparkSession")
+  // Config key/values used to set the SparkSession API mode.
+  val API_MODE_KEY: String = "spark.api.mode"
+  val API_MODE_CLASSIC: String = "classic"
+  val API_MODE_CONNECT: String = "connect"
+
+  // Implementation specific companions
+  private lazy val CLASSIC_COMPANION = lookupCompanion(
+"org.apache.spark.sql.classic.SparkSession")
+  private lazy val CONNECT_COMPANION = lookupCompanion(
+"org.apache.spark.sql.connect.SparkSession")
+  private def DEFAULT_COMPANION =
+Try(CLASSIC_COMPANION).orElse(Try(CONNECT_COMPANION)).getOrElse {
+  throw new IllegalStateException(
+"Cannot find an SparkSession implementation on the Classpath.")
+}
+
+  private[this] def lookupCompanion(name: String): SparkSessionCompanion = {
+val cls = SparkClassUtils.classForName(name)
 val mirror = scala.reflect.runtime.currentMirror
 val module = mirror.classSymbol(cls).companion.asModule
 mirror.reflectModule(module).instance.asInstanceOf[SparkSessionCompanion]
   }
 
   /** @inheritdoc */
-  override def builder(): SparkSessionBuilder = companion.builder()
+  override def builder(): Builder = new Builder
 
   /** @inheritdoc */
-  override def setActiveSession(session: SparkSession): Unit =
-companion.setActiveSession(session.asInstanceOf[companion.Session])
+  override def setActiveSession(session: SparkSession): Unit = 
super.setActiveSession(session)
 
   /** @inheritdoc */
-  override def clearActiveSession(): Unit = companion.clearActiveSession()
+  override def setDefaultSession(session: SparkSession): Unit = 
super.setDefaultSession(session)
 
   /** @inheritdoc */
-  override def setDefaultSession(session: SparkSession): Unit =
-companion.setDefaultSession(session.asInstanceOf[companion.Session])
+  override def getActiveSession: Option[SparkSession] = super.getActiveSession
 
   /** @inheritdoc */
-  override def clearDefaultSession(): Unit = companion.clearDefaultSession()
+  override def getDefaultSession: Option[SparkSession] = 
super.getDefaultSession
 
-  /** @inheritdoc */
-  override def getActiveSession: Option[SparkSession] = 
companion.getActiveSession
+  class Builder extends SparkSessionBuilder {

Review Comment:
   This is by far the most important change: this builder supports both implementations.
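
   A rough sketch of how the unified builder might be driven by the `spark.api.mode` key shown in the diff above (semantics are still being settled in this PR; the endpoint below is illustrative):

```scala
import org.apache.spark.sql.SparkSession

// "classic" builds a classic session; "connect" builds a Spark Connect session.
val spark = SparkSession.builder()
  .config("spark.api.mode", "connect")
  .remote("sc://localhost") // illustrative Connect connection string
  .getOrCreate()

spark.range(10).count()
```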






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-11 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1836754390


##
sql/api/src/main/scala/org/apache/spark/sql/UDFRegistration.scala:
##
@@ -63,6 +63,26 @@ abstract class UDFRegistration {
 register(name, udf, "scala_udf", validateParameterCount = false)
   }
 
+  /**
+   * Registers a user-defined aggregate function (UDAF).
+   *
+   * @param name
+   *   the name of the UDAF.
+   * @param udaf
+   *   the UDAF needs to be registered.
+   * @return
+   *   the registered UDAF.
+   * @since 1.5.0
+   * @deprecated
+   *   this method and the use of UserDefinedAggregateFunction are deprecated. 
Aggregator[IN, BUF,
+   *   OUT] should now be registered as a UDF via the functions.udaf(agg) 
method.
+   */
+  @deprecated(
+"Aggregator[IN, BUF, OUT] should now be registered as a UDF" +
+  " via the functions.udaf(agg) method.",
+"3.0.0")
+  def register(name: String, udaf: UserDefinedAggregateFunction): 
UserDefinedAggregateFunction

Review Comment:
   Added this to be completely on par with Classic.
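
   For context, the non-deprecated path the note points to, registering an `Aggregator` through `functions.udaf` (sketch; assumes an existing SparkSession `spark`):

```scala
import org.apache.spark.sql.{Encoder, Encoders}
import org.apache.spark.sql.expressions.Aggregator
import org.apache.spark.sql.functions.udaf

// A trivial sum Aggregator standing in for a legacy UserDefinedAggregateFunction.
object LongSum extends Aggregator[Long, Long, Long] {
  def zero: Long = 0L
  def reduce(buf: Long, in: Long): Long = buf + in
  def merge(b1: Long, b2: Long): Long = b1 + b2
  def finish(buf: Long): Long = buf
  def bufferEncoder: Encoder[Long] = Encoders.scalaLong
  def outputEncoder: Encoder[Long] = Encoders.scalaLong
}

spark.udf.register("long_sum", udaf(LongSum))
spark.sql("SELECT long_sum(id) FROM range(10)").show()
```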






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-11 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1836752934


##
sql/api/src/main/scala/org/apache/spark/sql/KeyValueGroupedDataset.scala:
##
@@ -31,7 +30,6 @@ import org.apache.spark.sql.streaming.{GroupState, 
GroupStateTimeout, OutputMode
  * @since 2.0.0
  */
 abstract class KeyValueGroupedDataset[K, V] extends Serializable {
-  type KVDS[KL, VL] <: KeyValueGroupedDataset[KL, VL]

Review Comment:
   Removed self type because it was not working well.






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-11 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1836751385


##
sql/api/src/main/scala/org/apache/spark/sql/Dataset.scala:
##
@@ -123,9 +123,8 @@ import org.apache.spark.util.SparkClassUtils
  */
 @Stable
 abstract class Dataset[T] extends Serializable {
-  type DS[U] <: Dataset[U]
 
-  def sparkSession: SparkSession
+  val sparkSession: SparkSession

Review Comment:
   Needed to be a val to allow for `import df.sparkSession.implicits._`...
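
   The Scala rule behind this, in a short sketch (assumes an existing SparkSession `spark`): an import path must consist of stable identifiers, which a `def` is not, so `import df.sparkSession.implicits._` only compiles when `sparkSession` is a `val`.

```scala
val df = spark.range(5)
// Compiles because df.sparkSession is now a stable (val) path.
import df.sparkSession.implicits._

val ds = Seq(1, 2, 3).toDS()
ds.show()
```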






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-11 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1836750755


##
sql/api/src/main/scala/org/apache/spark/sql/Dataset.scala:
##
@@ -123,9 +123,8 @@ import org.apache.spark.util.SparkClassUtils
  */
 @Stable
 abstract class Dataset[T] extends Serializable {
-  type DS[U] <: Dataset[U]

Review Comment:
   I removed the self type. It was not working well enough in the interface.






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-11 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1836747118


##
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/connect/SparkSessionSuite.scala:
##
@@ -14,16 +14,16 @@
  * See the License for the specific language governing permissions and
  * limitations under the License.
  */
-package org.apache.spark.sql
+package org.apache.spark.sql.connect
 
 import java.util.concurrent.{Executors, Phaser}
 
 import scala.util.control.NonFatal
 
-import io.grpc.{CallOptions, Channel, ClientCall, ClientInterceptor, 
MethodDescriptor}
+import io.grpc._

Review Comment:
   Mweh






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-11 Thread via GitHub


hvanhovell commented on code in PR #48818:
URL: https://github.com/apache/spark/pull/48818#discussion_r1836746610


##
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/connect/SQLImplicitsTestSuite.scala:
##
@@ -14,10 +14,10 @@
  * See the License for the specific language governing permissions and
  * limitations under the License.
  */
-package org.apache.spark.sql
+package org.apache.spark.sql.connect
 
 import java.sql.{Date, Timestamp}
-import java.time.{Duration, Instant, LocalDate, LocalDateTime, Period}
+import java.time._

Review Comment:
   Mweh






Re: [PR] [SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic [spark]

2024-11-11 Thread via GitHub


hvanhovell commented on PR #48818:
URL: https://github.com/apache/spark/pull/48818#issuecomment-2468294511

   For anyone interested: I have opened this up so people can take a look. I still need to make a bunch of changes.

