Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/6990#discussion_r33741857
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala ---
@@ -833,8 +833,10 @@ private[spark] class BlockManager(
  logDebug("Put block %s locally took %s".format(blockId, Utils.getUsedTimeMs(startTimeMs)))
  // Either we're storing bytes and we asynchronously started replication, or we're storing
- // values and need to serialize and replicate them now:
- if (putLevel.replication > 1) {
+ // values and need to serialize and replicate them now.
+ // Should not replicate the block if its StorageLevel is StorageLevel.NONE or
+ // putting it to local is failed.
+ if (!putBlockInfo.isFailed && putLevel.replication > 1) {
--- End diff --
I can see that it's beneficial to throw errors if replicating to two machines
was not possible, so that the receiver can retry. However, even if the
receiver retries, there is no good way for the receiver to ensure that the
block has been replicated to the desired level even after two tries. Since
there is no feedback mechanism to check for success after retries, doing
something that merely increases the "likelihood" is not very useful. Doing
something like that and relying on it is bad design.
That's why I am more inclined towards option 3, that is, if the local put
fails, it tries to replicate the block to two machines. I agree that it is
inconsistent with MEMORY_ONLY, but it's still a better change than the above,
which does not provide anything significantly more. And I think it's okay to
break consistency because of the benefit we are getting, especially in a
scenario where the behavior is related to something as critical as fault
tolerance.
Furthermore, stepping back, for receivers, why are we even using MEMORY_ONLY
and not MEMORY_AND_DISK (with or w/o replication)? Do you get any benefit by
using the former over the latter?