Github user advancedxy commented on the pull request:
https://github.com/apache/spark/pull/1281#issuecomment-50255139
Hi @mateiz, I think ignoring bad dirs is needed in a production cluster.
In production, disk failures are fairly likely, and I have always liked the
idea of being able to replace bad disks without service downtime. I hope this
can be supported in a Spark cluster.
Replacing disks without service downtime requires the following (see the sketch after this list):
1. the service tolerates bad dirs, which is what this PR does.
2. make the bad dir read-only or remove all permissions on it
(`chmod 000 /dir` on a Unix-like OS), so the service doesn't pick the
wrong dir.
3. replace the bad disk (modern machines support hot plugging), mount it,
and restore the permissions.
4. the service auto-detects the new good dir (disk), or provides a reload API so
that we can notify it.
I haven't dug into the code, so I don't know where `spark.local.dir` is used.
But if it's used for storage, it's better to spread files across the different
dirs (disks) to distribute the disk IO.
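For example, one simple way to spread IO over the healthy dirs is to hash the file name over them. This is only a sketch with a hypothetical `dirForFile` helper, not Spark's actual placement logic:

```scala
import java.io.File

// Sketch (hypothetical helper): spread files across the healthy local dirs by
// hashing the file name, so disk IO is distributed over all good disks.
object DirPicker {
  def dirForFile(fileName: String, healthyDirs: IndexedSeq[File]): File = {
    require(healthyDirs.nonEmpty, "no usable local dirs left")
    val bucket = (fileName.hashCode & Int.MaxValue) % healthyDirs.size
    new File(healthyDirs(bucket), fileName)
  }
}
```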
OK, back to the behavior in question. @mateiz, when a Spark service is running
and one of the configured dirs (disks) fails, I simply prefer ignoring the bad dir
rather than bringing down the entire service.
What Hadoop's DataNode and TaskTracker do is ignore bad dirs, up to a maximum
number of failures.
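A sketch of enforcing such a limit is below; the `maxFailedDirs` threshold is a hypothetical setting named only for illustration, analogous in spirit to HDFS's `dfs.datanode.failed.volumes.tolerated`:

```scala
// Sketch (hypothetical setting): tolerate at most maxFailedDirs bad dirs;
// beyond that, failing fast is safer than limping along.
object FailureBudget {
  def check(configured: Int, usable: Int, maxFailedDirs: Int): Unit = {
    val failed = configured - usable
    if (failed > maxFailedDirs)
      throw new IllegalStateException(
        s"$failed of $configured local dirs are bad (limit $maxFailedDirs); aborting")
  }
}
```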
What about a misconfiguration? If a misconfigured directory is usable, there is
nothing we can do; it's the user's mistake. If the directory is bad, ignoring it
isn't that harmful.
@YanTangZhai, I believe we should log the bad dir, so the user knows there
is a bad dir. Also, what do you think of the idea of replacing bad disks?