juliuszsompolski commented on code in PR #48324:
URL: https://github.com/apache/spark/pull/48324#discussion_r1792197667
##########
sql/api/src/main/scala/org/apache/spark/sql/api/Dataset.scala:
##########
@@ -363,7 +365,29 @@ abstract class Dataset[T] extends Serializable {
* @since 2.3.0
*/
def localCheckpoint(eager: Boolean): Dataset[T] =
- checkpoint(eager = eager, reliableCheckpoint = false)
+ checkpoint(eager = eager, reliableCheckpoint = false, storageLevel = None)
+
+ /**
+ * Locally checkpoints a Dataset and return the new Dataset. Checkpointing can be used to
+ * truncate the logical plan of this Dataset, which is especially useful in iterative algorithms
+ * where the plan may grow exponentially. Local checkpoints are written to executor storage and
+ * despite potentially faster they are unreliable and may compromise job completion.
+ *
+ * @param eager
+ * Whether to checkpoint this dataframe immediately
+ * @param storageLevel
+ * Option. If defined, StorageLevel with which to checkpoint the data.
+ * @note
+ * When checkpoint is used with eager = false, the final data that is checkpointed after the
+ * first action may be different from the data that was used during the job due to
+ * non-determinism of the underlying operation and retries. If checkpoint is used to achieve
+ * saving a deterministic snapshot of the data, eager = true should be used. Otherwise, it is
+ * only deterministic after the first execution, after the checkpoint was finalized.
+ * @group basic
+ * @since 4.0.0
+ */
+ def localCheckpoint(eager: Boolean, storageLevel: Option[StorageLevel]): Dataset[T] =
Review Comment:
I suppose that Option is also very bad for Java API compatibility, so
definitely no Option in this public signature.
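To illustrate the concern: a Java caller would have to wrap the argument itself (for example with `scala.Option.apply(level)` or `scala.Option.empty()`) just to call this method. One possible alternative, sketched below, is a pair of overloads that take plain types and keep `Option` internal. This is only a sketch under stated assumptions, not necessarily what the PR should adopt: `StorageLevel` and `Dataset` here are simplified stand-ins rather than the real Spark classes, and `RecordingDataset` is a hypothetical toy implementation added so the snippet is self-contained.

```scala
// Hypothetical stand-ins for illustration; these are NOT the real Spark classes.
final case class StorageLevel(name: String)

object StorageLevel {
  val MEMORY_AND_DISK: StorageLevel = StorageLevel("MEMORY_AND_DISK")
  val DISK_ONLY: StorageLevel = StorageLevel("DISK_ONLY")
}

abstract class Dataset[T] {
  // Public overloads expose only plain types, so a Java caller
  // never has to construct a scala.Option.
  def localCheckpoint(eager: Boolean): Dataset[T] =
    checkpoint(eager = eager, reliableCheckpoint = false, storageLevel = None)

  def localCheckpoint(eager: Boolean, storageLevel: StorageLevel): Dataset[T] =
    checkpoint(eager = eager, reliableCheckpoint = false, storageLevel = Some(storageLevel))

  // Option remains an internal detail of the non-public helper.
  protected def checkpoint(
      eager: Boolean,
      reliableCheckpoint: Boolean,
      storageLevel: Option[StorageLevel]): Dataset[T]
}

// Hypothetical toy implementation that only records the requested storage level.
final class RecordingDataset[T](val level: Option[StorageLevel] = None) extends Dataset[T] {
  protected def checkpoint(
      eager: Boolean,
      reliableCheckpoint: Boolean,
      storageLevel: Option[StorageLevel]): Dataset[T] =
    new RecordingDataset[T](storageLevel)
}
```

With this shape, Java code could call `ds.localCheckpoint(true, level)` directly, passing a storage level value without touching `Option`.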