GitHub user mateiz commented on the pull request:
https://github.com/apache/spark/pull/9214#issuecomment-151209206
Maybe just go with version 2) above then; it seems like the simplest one.
Regarding re-engineering vs not, the problem is that if you're trying to do
a bug fix, you should introduce the least complexity possible. With fault
tolerance in particular it's possible to imagine lots of conditions that don't
really happen. For example, what if network messages get corrupted? What if
DRAM gets corrupted? You just need to pick a failure model (e.g. do we trust
the filesystem to be reliable or not) that fits your observations and make sure
things are correct within that model.