Which Java file is responsible for replication? Which file chooses a random DataNode from the same rack, and which chooses a random rack?

On Wed, Apr 10, 2013 at 3:26 AM, Raj Vishwanathan <[email protected]> wrote:

> You could use the following facts.
> 1. Files are stored in blocks. So make your blocksize bigger than the
> largest file.
> 2. The first split is stored on the local node.
>
> Raj
>
> ------------------------------
> *From:* jeremy p <[email protected]>
> *To:* [email protected]
> *Sent:* Tuesday, April 9, 2013 1:49 PM
> *Subject:* When copying a file to HDFS, how to control what nodes that
> file will reside on?
>
> Hey all,
>
> I'm dealing with kind of a bizarre use case where I need to make sure that
> File A is local to Machine A, File B is local to Machine B, etc. When
> copying a file to HDFS, is there a way to control which machines that file
> will reside on? I know that any given file will be replicated across three
> machines, but I need to be able to say "File A will DEFINITELY exist on
> Machine A". I don't really care about the other two machines -- they could
> be any machines on my cluster.
>
> Thank you.
>

--
*With regards ---*
*Mohammad Mustaqeem*,
M.Tech (CSE)
MNNIT Allahabad
9026604270
