HeartSaVioR commented on a change in pull request #26109: [SPARK-29461][SQL] Measure the number of records being updated for JDBC writer
URL: https://github.com/apache/spark/pull/26109#discussion_r337858397
 
 

 ##########
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala
 ##########
 @@ -615,7 +615,7 @@ object JdbcUtils extends Logging {
       batchSize: Int,
       dialect: JdbcDialect,
       isolationLevel: Int,
-      options: JDBCOptions): Iterator[Byte] = {
+      options: JDBCOptions): Long = {
 
 Review comment:
   I guess it would be cleaner to handle the metric outside of this method: with the return-value approach, the metric is simply not updated when `savePartition` throws an exception. If we wanted to do the same inside `savePartition`, we would have to add the metric update logic at the end of a `finally` block, which doesn't seem any cleaner.
   
   In other words, this approach supports neither incremental updates of the metric nor any update for a partition that is partially written and then fails. Not updating makes total sense when the database supports transactions, but when it doesn't and some records are left behind on failure, I'm not sure whether we should still skip updating the metric. How are we dealing with partial output?
