LuciferYang commented on a change in pull request #33267:
URL: https://github.com/apache/spark/pull/33267#discussion_r666631560
##########
File path: core/src/main/scala/org/apache/spark/storage/DiskBlockObjectWriter.scala
##########
@@ -288,3 +288,19 @@ private[spark] class DiskBlockObjectWriter(
bs.flush()
}
}
+
+private[spark] object DiskBlockObjectHelper extends Logging {
+
+  /**
+   * Reverts writes that haven't been committed yet and deletes the object file held by the
+   * DiskBlockObjectWriter.
+   */
+  def deleteAbnormalDiskBlockObjectFile(writer: DiskBlockObjectWriter): Unit = {
Review comment:
@Ngone51 That's a good question, but I think it's strange for an `OutputStream` (`DiskBlockObjectWriter` extends `OutputStream`) to delete itself.
For example, `o.a.h.fs.FileSystem` can `create` an `FSDataOutputStream`; the `FSDataOutputStream` can write data and close itself, but we still need to use the `o.a.h.fs.FileSystem` to delete the file.
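To make the analogy concrete, here is a minimal sketch of that pattern (the path and data are illustrative):

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FSDataOutputStream, FileSystem, Path}

val fs: FileSystem = FileSystem.get(new Configuration())
val path = new Path("/tmp/example")            // illustrative path
val out: FSDataOutputStream = fs.create(path)  // the FileSystem creates the stream
out.writeBytes("some data")                    // the stream can write data
out.close()                                    // ...and close itself
fs.delete(path, false)                         // but deletion goes through the FileSystem
```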
Of course, there are other API designs. For example, `java.io.File` can `createNewFile`, and the `java.io.File` instance can also `delete` itself.
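The same idea as a minimal sketch (again, the path is illustrative):

```scala
import java.io.File

val file = new File("/tmp/example") // illustrative path
file.createNewFile()                // the instance creates the file on disk
file.delete()                       // ...and the same instance can delete it
```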
For now, I chose the former because `DiskBlockObjectWriter` is an `OutputStream` rather than a `File`, but I would also accept making it a member function of `DiskBlockObjectWriter`.
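If we went that way, the member-function variant could look roughly like this (a hypothetical sketch written as if inside `DiskBlockObjectWriter`; `revertAndDeleteFile` is an illustrative name, not an existing API):

```scala
// Hypothetical sketch, assuming it lives inside DiskBlockObjectWriter.
def revertAndDeleteFile(): Unit = {
  // revertPartialWritesAndClose() is the existing method that reverts
  // uncommitted writes and returns the underlying file.
  val file = revertPartialWritesAndClose()
  if (file.exists() && !file.delete()) {
    logWarning(s"Failed to delete file ${file.getAbsolutePath}")
  }
}
```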
Do you think keeping it as a member function of `DiskBlockObjectWriter` would be better?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]