Github user tdas commented on the pull request:
https://github.com/apache/spark/pull/6990#issuecomment-121412485
@dibbhatt
Yes, you can close this PR.
Regarding the alternative fix of finding the blocks from the
BlockManagerMaster, I thought about it a little bit more, and it seems
non-trivial to make that change. Right now the direct route of block reporting
[ReceiverBlockHandler <--> ReceiverTracker] ensures that by the time this
message exchange completes, the block has been received reliably, especially in
the WAL case. It's easy to reason about the fault-tolerance guarantees.
If the report is reimplemented indirectly through the BlockManager, it's
non-trivial to change things such that the guarantee is maintained. So I am
not inclined to create a new JIRA for that.
Rather, what I am inclined to do is to modify this existing JIRA to flag
this issue specifically for Spark Streaming (so that people can find it), and
then close it as Won't Fix for now. This is not a big issue because: 1. people
really should not be using an obviously problematic storage level like
MEMORY_ONLY, and 2. even if it occurs, the extra blocks will eventually drop
out of memory anyway, since they won't be used and LRU eviction will push
them out.
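For context on point 1, receiver-based input streams in Spark Streaming let the caller pass an explicit StorageLevel, so avoiding MEMORY_ONLY is a one-argument change. A minimal sketch, assuming a local socket source (the host, port, and app name are illustrative, not from this thread):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setAppName("SafeStorageExample")
val ssc  = new StreamingContext(conf, Seconds(10))

// socketTextStream takes an explicit StorageLevel; a replicated,
// disk-backed level avoids the MEMORY_ONLY pitfall, since received
// blocks can spill to disk instead of being silently evicted.
val lines = ssc.socketTextStream(
  "localhost", 9999,
  StorageLevel.MEMORY_AND_DISK_SER_2)
```

With a level like this, a block that LRU pushes out of memory is still recoverable from disk or from the replica, which is why the leftover-block issue above is only a memory-footprint nuisance rather than a correctness problem.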