HeartSaVioR commented on a change in pull request #26109: [SPARK-29461][SQL]
Measure the number of records being updated for JDBC writer
URL: https://github.com/apache/spark/pull/26109#discussion_r337858397
##########
File path:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala
##########
@@ -615,7 +615,7 @@ object JdbcUtils extends Logging {
       batchSize: Int,
       dialect: JdbcDialect,
       isolationLevel: Int,
-      options: JDBCOptions): Iterator[Byte] = {
+      options: JDBCOptions): Long = {
Review comment:
I guess it would be cleaner to handle the metric outside of the method, although the metric will then not be updated if `savePartition` throws an exception. If we wanted the same behavior inside `savePartition`, we would have to add the metric update to the end of the `finally` block, which doesn't seem any cleaner.
In other words, this approach doesn't support iterative updates of the metric, and there is no update when a partition is partially written and then fails. Not updating would make total sense if the database supports transactions, but if it doesn't and some records are left behind on failure, I'm not sure whether we should update the metric or not. How are we dealing with partial output?
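
For illustration, a minimal self-contained sketch of what "handling the metric outside of the method" could look like. The names here (`outputRows`, `writePartition`, and this simplified `savePartition`) are hypothetical stand-ins, not the actual Spark internals:

```scala
import java.util.concurrent.atomic.AtomicLong

object MetricSketch {
  // Hypothetical stand-in for the "number of output rows" SQL metric.
  val outputRows = new AtomicLong(0L)

  // Simplified stand-in for savePartition: writes the partition and
  // returns the number of records written, or throws on failure.
  def savePartition(rows: Iterator[String]): Long = {
    var written = 0L
    rows.foreach { _ =>
      // ... write the row over JDBC ...
      written += 1
    }
    written
  }

  // Caller-side metric handling: the metric is bumped only after
  // savePartition returns, so a partition that fails halfway through
  // contributes nothing, even if the database kept some of its rows.
  def writePartition(rows: Iterator[String]): Unit =
    outputRows.addAndGet(savePartition(rows))
}
```

The alternative mentioned above, updating from the `finally` block inside `savePartition`, would let a partially written, failed partition contribute to the metric, which is exactly the partial-output question raised here.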