Bryan Duxbury wrote:
If you are considering using it as a conventional filesystem from a few
clients, then it most resembles NAS. However, I don't think it makes
sense to try and classify it as SAN or NAS. HDFS is a distributed
filesystem designed to be consumed in a massively distributed fashion,
so it does fall into its own category.
I'd describe it as a "datacentre-scale distributed file system with location awareness":
-single datacentre only; not directly for long-haul use, though its
sub-POSIX semantics map well to DAV and other long-haul front ends
-the location of data is not hidden from the job managers that want to
run work near the data.
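To make the location-awareness point concrete, here is a rough sketch of how a job manager might use block locations to place work. It does not use the real HDFS client API: the block-to-host map, the host names, and the pick_host helper are all made up for illustration.

```python
from collections import Counter

# Hypothetical block-to-host map, standing in for what a filesystem with
# location awareness would report to a job manager. Host names are made up.
BLOCK_HOSTS = {
    "block-0": ["node1", "node2", "node3"],
    "block-1": ["node2", "node3", "node4"],
    "block-2": ["node2", "node5", "node6"],
}


def pick_host(block_hosts, free_hosts):
    """Return the free host holding the most blocks, or None if none holds any."""
    counts = Counter(
        host
        for hosts in block_hosts.values()
        for host in hosts
        if host in free_hosts
    )
    if not counts:
        # No free host has a local replica; the caller would fall back to a remote read.
        return None
    # Prefer the highest block count; break ties by host name for determinism.
    return min(counts, key=lambda h: (-counts[h], h))


if __name__ == "__main__":
    print(pick_host(BLOCK_HOSTS, {"node1", "node2", "node3"}))
```

A NAS or SAN gives a scheduler nothing like BLOCK_HOSTS to work from, which is the difference being drawn here.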
I'd compare it with GFS and Amazon S3 rather than NAS/SAN systems, which
are normally trying to hide the fact that there is networked storage underneath.
-steve