This is ridiculous. What do you mean by "unsubscribe"? I have a few queries, and that's why I joined this forum.
On Mon, Aug 3, 2009 at 12:33 PM, A BlueCoder <[email protected]> wrote:
> unsubscribe
>
> On Mon, Aug 3, 2009 at 12:01 AM, Sugandha Naolekar <[email protected]> wrote:
> > That's fine. But if I place the data in HDFS and then run MapReduce code
> > to compress it, the data will get compressed into sequence files, yet the
> > original data will still reside there as well, causing a kind of
> > redundancy of data.
> >
> > Can you please suggest a way out?
> >
> > On Mon, Aug 3, 2009 at 12:07 PM, prashant ullegaddi <[email protected]> wrote:
> > > I don't think you will be able to compress the data unless it's on HDFS.
> > > What you can do is:
> > > 1. Manually compress the data on the machine where it resides, then
> > >    copy it to HDFS. Or,
> > > 2. Copy the data to HDFS without compressing it, then run a job which
> > >    just emits the data as it reads it, in key/value pairs. You can set
> > >    FileOutputFormat.setOutputCompressorClass(job, GzipCodec.class) so
> > >    that the output gets gzipped.
> > >
> > > Does that solve your problem?
> > >
> > > By the way, you didn't specify your data size (how many TBs).
> > >
> > > On Mon, Aug 3, 2009 at 11:02 AM, Sugandha Naolekar <[email protected]> wrote:
> > > > Yes, you are right. Here are the related details:
> > > >
> > > > -> I have a Hadoop cluster of 7 nodes. There is an 8th machine which
> > > >    is not part of the cluster.
> > > > -> I want to place that machine's data into HDFS. Before placing it
> > > >    in HDFS, I want to compress it and then dump it into HDFS.
> > > > -> I have 4 datanodes in my cluster. Also, the data might grow to
> > > >    terabytes.
> > > > -> Also, I have set the replication factor to 2.
> > > > -> I guess that, for compression, I will have to run MapReduce,
> > > >    right? Please tell me the complete approach to follow.
> > > > On Mon, Aug 3, 2009 at 10:48 AM, prashant ullegaddi <[email protected]> wrote:
> > > > > By "I want to compress the data first and then place it in HDFS",
> > > > > do you mean you want to compress the data locally and then copy it
> > > > > to DFS?
> > > > >
> > > > > What's the size of your data? What's the capacity of HDFS?
> > > > >
> > > > > On Mon, Aug 3, 2009 at 10:45 AM, Sugandha Naolekar <[email protected]> wrote:
> > > > > > I want to compress the data first and then place it in HDFS.
> > > > > > Again, while retrieving it, I want to uncompress it and place it
> > > > > > at the desired destination. Is this possible? How do I get
> > > > > > started? Also, I want to get started with the actual coding part
> > > > > > of compression and MapReduce. Please advise!

--
Regards!
Sugandha
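For reference, the first option Prashant describes above (compress the data locally on the source machine, then copy the compressed file into HDFS) can be sketched with a short Python script. This is only a sketch: the `hadoop fs -put` step is shown as a comment because it needs a live cluster, and the file names and HDFS path are made-up examples, not anything from the thread.

```python
import gzip
import shutil
from pathlib import Path

def compress_for_upload(src: str) -> str:
    """Gzip `src` and return the path of the compressed copy."""
    dst = src + ".gz"
    with open(src, "rb") as f_in, gzip.open(dst, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)
    return dst

# Create a small sample file standing in for the 8th machine's data.
sample = Path("sample.log")
sample.write_text("line 1\nline 2\n")

gz_path = compress_for_upload(str(sample))

# Verify the round trip before deleting the original; removing the local
# uncompressed copy after upload avoids the redundancy Sugandha mentions.
with gzip.open(gz_path, "rt") as f:
    restored = f.read()

# With a cluster available, the compressed file would then be copied in, e.g.:
#   subprocess.run(["hadoop", "fs", "-put", gz_path, "/user/sugandha/"])
```

This sidesteps the redundancy concern in the thread: only the `.gz` file is uploaded, and the local original can be deleted once the upload is verified.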
