Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/6990#discussion_r33429725
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala ---
    @@ -833,8 +833,10 @@ private[spark] class BlockManager(
         logDebug("Put block %s locally took %s".format(blockId, Utils.getUsedTimeMs(startTimeMs)))
     
         // Either we're storing bytes and we asynchronously started replication, or we're storing
    -    // values and need to serialize and replicate them now:
    -    if (putLevel.replication > 1) {
    +    // values and need to serialize and replicate them now.
    +    // Should not replicate the block if its StorageLevel is StorageLevel.NONE or
    +    // putting it to local is failed.
    +    if (!putBlockInfo.isFailed && putLevel.replication > 1) {
    --- End diff --
    
    Thanks for the clarification.  Yes, I think you are right about the 3 options @lianhuiwang.  I'm sorry I didn't understand what was going on earlier, and I feel bad for saying this b/c I know this has already been through a few rounds of revision -- but I'm not sure I see the point in implementing option 2, as this patch does in its current form.  It doesn't seem particularly bad to me if you request replication level 2 in memory and only end up with 1 replica in memory.  In fact, this can happen in a few other ways -- what if you do manage to store locally, but no peer has enough memory?  Or what if you get 2 replicas stored now, but later on one copy gets evicted from the cache, so you are down to a single copy in memory?
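
    (Just to make that scenario concrete, here is a minimal, hypothetical driver sketch of the user-facing side.  The object and variable names are mine, but `StorageLevel.MEMORY_ONLY_2` is the real way a user asks for two in-memory copies, and nothing in that request guarantees both copies exist at any given moment.)

    ```scala
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.storage.StorageLevel

    // Hypothetical driver program, just to illustrate the user-facing contract.
    object ReplicationSketch {
      def main(args: Array[String]): Unit = {
        // Master/deploy settings would normally come from spark-submit; replication
        // only means anything with more than one executor.
        val sc = new SparkContext(new SparkConf().setAppName("replication-sketch"))

        val rdd = sc.parallelize(1 to 1000000)

        // Ask for two in-memory copies of each block.  Whether a second replica
        // actually exists depends on peer memory, later evictions, etc.
        rdd.persist(StorageLevel.MEMORY_ONLY_2)
        rdd.count()

        sc.stop()
      }
    }
    ```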
    
    (All that being said, I do see the appeal of this patch as a *heuristic* fix -- if you can't store the data locally, you are unlikely to be able to store it on a remote node either, so maybe you shouldn't bother?  But if that is the reason we are making this change, that is exactly what I'd like the comment here to indicate.)
    
    What about option 1 (current behavior -- just 1 replica) & option 3 (change behavior -- replicate to two nodes)?  Option 3 is slightly appealing ... but for consistency, it would also suggest that if you use `StorageLevel.MEMORY_ONLY` (i.e. only 1 replica) and you can't store locally, then you would also try to find a remote node to store the block on.  And I really doubt we want to do that -- that is a pretty big change in behavior for how `StorageLevel.MEMORY_ONLY` works.
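
    (To spell out how I'm reading the three options, here is a tiny standalone model.  Everything in it -- `remoteCopiesToRequest`, `localPutFailed`, the option numbers -- is my own shorthand, not `BlockManager` code or a proposed patch; it just pins down which decision each option makes.)

    ```scala
    // Toy model of the three options; not real BlockManager code.
    object ReplicationOptionsSketch {

      /** How many copies we would try to place on remote peers. */
      def remoteCopiesToRequest(option: Int, requestedReplication: Int, localPutFailed: Boolean): Int =
        option match {
          // Option 1 (current behavior): replicate whenever replication > 1, even if
          // the local put failed -- you may end up with just 1 remote copy.
          case 1 => if (requestedReplication > 1) requestedReplication - 1 else 0

          // Option 2 (this patch): skip replication when the local put failed --
          // you may end up with 0 copies instead of 1.
          case 2 => if (!localPutFailed && requestedReplication > 1) requestedReplication - 1 else 0

          // Option 3: make up for the lost local copy so the total still matches the
          // request.  Applied consistently, even MEMORY_ONLY (replication == 1) would
          // go remote when the local put fails -- the part I doubt we want.
          case 3 =>
            val lostLocally = if (localPutFailed) 1 else 0
            (requestedReplication - 1 + lostLocally) max 0

          case _ => 0
        }

      def main(args: Array[String]): Unit = {
        // MEMORY_ONLY_2 (replication = 2) when the local put fails:
        println(remoteCopiesToRequest(1, 2, localPutFailed = true))  // 1
        println(remoteCopiesToRequest(2, 2, localPutFailed = true))  // 0
        println(remoteCopiesToRequest(3, 2, localPutFailed = true))  // 2
      }
    }
    ```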
    
    I'm really on the fence on this.  Let's see what others think.

