jdesjean commented on code in PR #42454:
URL: https://github.com/apache/spark/pull/42454#discussion_r1291566570


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/execution/SparkConnectPlanExecution.scala:
##########
@@ -120,11 +121,12 @@ private[execution] class SparkConnectPlanExecution(executeHolder: ExecuteHolder)
       response.setArrowBatch(batch)
       responseObserver.onNext(response.build())
       numSent += 1
+      totalNumRows += count
     }
 
     dataframe.queryExecution.executedPlan match {
       case LocalTableScanExec(_, rows) =>
-        executePlan.eventsManager.postFinished()

Review Comment:
   I realize we don't have a test case for `LocalTableScanExec`. We should add one.
   I'm not sure of the exact syntax:
   ```scala
   val request = proto.ExecutePlanRequest
           .newBuilder()
           .setPlan(proto.Plan
             .newBuilder()
             .setRoot(proto.Relation
               .newBuilder()
               .getLocalRelationBuilder
               .setSchema(new StructType()
                 .add("id", "long").json)
               .setData(?)
               .build())
             .build())
           .setUserContext(context)
           .setSessionId(UUID.randomUUID.toString())
           .build()
   ```
   
   `totalNumRows` will only be populated after `sendBatch`, so we'll need to move `postFinished` below it to accommodate this.
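   To illustrate the ordering problem, here is a minimal, self-contained sketch. The names `sendBatch`, `postFinished`, and `totalNumRows` mirror the PR, but the bodies are stand-ins, not the actual Spark Connect implementation:
   ```scala
   // Stand-alone sketch: the row count is only accumulated inside
   // sendBatch, so postFinished must run after all batches are sent.
   object PostFinishedOrdering {
     private var totalNumRows: Long = 0L

     // Stand-in for sendBatch: this is the only place the count grows.
     private def sendBatch(rows: Seq[Long]): Unit = {
       // ... build and send the Arrow batch ...
       totalNumRows += rows.size
     }

     // Stand-in for eventsManager.postFinished: reads the final count.
     private def postFinished(): Long = totalNumRows

     def main(args: Array[String]): Unit = {
       val batches = Seq(Seq(1L, 2L, 3L), Seq(4L, 5L))
       // Posting finished here, before sending, would report 0 rows;
       // it has to come after every sendBatch call.
       batches.foreach(sendBatch)
       println(postFinished())
     }
   }
   ```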



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

