Github user RongGu commented on the pull request:
https://github.com/apache/spark/pull/158#issuecomment-38779460
Hey @mateiz,
> 2) What happens if someone creates a StorageLevel with useTachyon = true
> and replication > 1? Do we get two files? Do they step over each other?
> 3) Related to 2, in the first version we might want to let users put
> useTachyon only if useMemory and useDisk are false (and possibly the same with
> replication). That would reduce the number of weird interactions. I don't see
> when you'd want both Tachyon and disk for example, and furthermore Tachyon
> doesn't notify us when something falls out of it.
Currently, when a user creates a StorageLevel with useTachyon = true and
replication = n, the RDD is explicitly stored as n replicas. Each replica is
written to a different executor's directory on Tachyon, so the copies do not
step over each other.
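
To illustrate why the replicas do not collide, here is a minimal sketch of the layout described above. It is a toy model, not the code in this PR: `StorageLevelSketch`, `replica_paths`, the base directory, the executor ids, and the block id are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageLevelSketch:
    # Hypothetical stand-in for a storage level; not Spark's StorageLevel class.
    use_tachyon: bool
    replication: int = 1


def replica_paths(base_dir, executor_ids, block_id, level):
    """Return one path per replica, each under a distinct executor directory."""
    if not level.use_tachyon:
        return []
    if level.replication > len(executor_ids):
        raise ValueError("not enough executors for the requested replication")
    # Assumption: executor ids are unique, so each replica gets its own directory.
    chosen = executor_ids[:level.replication]
    return [f"{base_dir}/{executor_id}/{block_id}" for executor_id in chosen]


paths = replica_paths(
    "/tmp_spark_tachyon",                        # hypothetical base directory
    ["executor-0", "executor-1", "executor-2"],  # hypothetical executor ids
    "rdd_0_0",                                   # hypothetical block id
    StorageLevelSketch(use_tachyon=True, replication=2),
)
for path in paths:
    print(path)
```

Because the executor id is part of every path, two replicas of the same block always end up in different files, which is the property the reply above relies on.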