Kurtis Heimerl wrote:
ok, I have an architectural question. I think I get the client-side stack.
DFSClient creates a proxy, which connects to the namenode. This all uses
ClientProtocol. So, to implement what I need I'll probably need to modify
ClientProtocol and NameNode.
Yes, ClientProtocol will need to be modified to pass the username.
Now we have the whole DistributedFileSystem and FileSystem stuff. I see the
cache in FileSystem, I just don't see where in the stack this is. It's
server-side I assume. I see where we instantiate the NameNode on the
server, but it seemingly just deals with blocks. Where's the filesystem at?
FileSystem instances are created by user code when it needs to access the
FileSystem. So a map InputFormat implementation uses a FileSystem to
open input files, a reduce OutputFormat implementation uses a FileSystem
to open output files, and the 'bin/hadoop fs' commands create a
FileSystem, etc. Does that make more sense?
Doug