Yes, you are right. Here are the details:

-> I have a Hadoop cluster of 7 nodes. There is also an 8th machine, which
is not part of the Hadoop cluster.
-> I want to place the data from that machine into HDFS. Before placing it
in HDFS, I want to compress it, and then dump it into HDFS.
-> I have 4 datanodes in my cluster. Also, the data might grow to
terabytes.
-> Also, I have set the replication factor to 2.
-> I guess that, for compression, I will have to run MapReduce? Is that
right? Please tell me the complete approach that needs to be followed.
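For what it's worth, one possible approach (not necessarily the complete answer, and assuming the 8th machine has a Hadoop client configured) is to compress the file locally first and only then copy it to HDFS, in which case no MapReduce job is needed for the compression step itself. A minimal sketch, with hypothetical file names and HDFS paths:

```python
# Sketch: stream-compress a local file with gzip before copying it to HDFS.
# File names, paths, and the HDFS destination below are hypothetical examples.
import gzip
import shutil

def compress_file(src_path, dst_path):
    """Stream-compress src_path into dst_path using gzip.

    copyfileobj streams in fixed-size chunks, so memory use stays
    constant even for very large (terabyte-scale) input files.
    """
    with open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)

if __name__ == "__main__":
    compress_file("data.log", "data.log.gz")
    # Then, from the edge machine (Hadoop client on PATH, cluster reachable):
    #   hadoop fs -put data.log.gz /user/sugandha/data.log.gz
    # To retrieve and decompress later:
    #   hadoop fs -get /user/sugandha/data.log.gz . ; gunzip data.log.gz
```

Note that a plain .gz file is not splittable, so a later MapReduce job over it would process it in a single map task; that may or may not matter depending on how the data will be used.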

On Mon, Aug 3, 2009 at 10:48 AM, prashant ullegaddi <
[email protected]> wrote:

> By "I want to compress the data first and then place it in HDFS", do you
> mean you want to compress the data
> locally and then copy to DFS?
>
> What's the size of your data? What's the capacity of HDFS?
>
> On Mon, Aug 3, 2009 at 10:45 AM, Sugandha Naolekar
> <[email protected]> wrote:
>
> > I want to compress the data first and then place it in HDFS. Again, while
> > retrieving the same, I want to uncompress it and place on the desired
> > destination. Can this be possible. How to get started? Also, I want to
> get
> > started with the actual coding part of compression and MapReduce. Please
> > suggest me aptly...!
> >
> >
> >
> > --
> > Regards!
> > Sugandha
> >
>



-- 
Regards!
Sugandha
