Yes, you are right. Here are the related details:

-> I have a Hadoop cluster of 7 nodes. Now there is this 8th machine, which is not part of the Hadoop cluster.
-> I want to place the data of that machine into HDFS. Before placing it in HDFS, I want to compress it, and then dump it into HDFS.
-> I have 4 datanodes in my cluster. Also, the data might grow to terabytes.
-> Also, I have set the replication factor to 2.
-> I guess, for compression, I will have to run MapReduce...? Right? Please tell me the complete approach that needs to be followed.
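To make the compress-then-dump step concrete, here is a minimal sketch of local compression before copying to HDFS. This is only an illustration, not the definitive approach: it assumes Python is available on the 8th machine, and all file paths and HDFS destinations below are hypothetical examples.

```python
import gzip
import shutil

def gzip_file(src_path, dst_path):
    """Compress src_path into dst_path with gzip, before copying it to HDFS."""
    with open(src_path, "rb") as f_in, gzip.open(dst_path, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)

def gunzip_file(src_path, dst_path):
    """Uncompress a .gz file retrieved from HDFS back to a local destination."""
    with gzip.open(src_path, "rb") as f_in, open(dst_path, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)

# After compressing, the .gz file can be copied into HDFS with, e.g.:
#   hadoop fs -put /local/data.gz /user/sugandha/data.gz
# and later retrieved and uncompressed with:
#   hadoop fs -get /user/sugandha/data.gz /local/data.gz
# (the paths here are placeholders)
```

Note that for compressing data that is still local, no MapReduce job is required; MapReduce only becomes relevant if you want to compress (or process) data that is already inside HDFS in parallel across the datanodes.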
On Mon, Aug 3, 2009 at 10:48 AM, prashant ullegaddi <[email protected]> wrote:

> By "I want to compress the data first and then place it in HDFS", do you
> mean you want to compress the data locally and then copy to DFS?
>
> What's the size of your data? What's the capacity of HDFS?
>
> On Mon, Aug 3, 2009 at 10:45 AM, Sugandha Naolekar
> <[email protected]> wrote:
>
> > I want to compress the data first and then place it in HDFS. Again, while
> > retrieving the same, I want to uncompress it and place it on the desired
> > destination. Can this be possible? How to get started? Also, I want to get
> > started with the actual coding part of compression and MapReduce. Please
> > suggest me aptly...!
> >
> > --
> > Regards!
> > Sugandha

--
Regards!
Sugandha
