Hey Sugandha,

It's a common mistake - I think he was trying to unsubscribe from the mailing list (which is done by sending a message with the command "unsubscribe" to a specific email address), not telling you to unsubscribe.

Brian

On Aug 3, 2009, at 2:09 AM, Sugandha Naolekar wrote:

This is ridiculous. What do you mean by "unsubscribe"? I have a few queries,
and that's why I have logged in to the corresponding forum.

On Mon, Aug 3, 2009 at 12:33 PM, A BlueCoder <[email protected]> wrote:

unsubscribe

On Mon, Aug 3, 2009 at 12:01 AM, Sugandha Naolekar <[email protected]> wrote:

That's fine. But if I place the data in HDFS and then run MapReduce code to
compress it, the data will get compressed into sequence files, but the
original data will still reside there as well, leading to redundant data...

Can you please suggest a way out?

On Mon, Aug 3, 2009 at 12:07 PM, prashant ullegaddi <[email protected]> wrote:

I don't think you will be able to compress the data with a MapReduce job
unless it's on HDFS.
What you can do is either:
1. Manually compress the data on the machine where it resides, then copy
the compressed files to HDFS, or
2. Copy the uncompressed data to HDFS, then run a job which just emits the
data as it reads it, as key/value pairs. You can call
FileOutputFormat.setOutputCompressorClass(job, GzipCodec.class) so
that the output gets gzipped.
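
For option 1, the local compress step (and the matching uncompress step after retrieving the file) could be sketched roughly as below. This is only a minimal illustration, assuming plain gzip through the JDK's java.util.zip is acceptable; the class name LocalGzip and its method names are made up for this example and are not part of Hadoop.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: gzip a local file before copying it to HDFS,
// and gunzip it again after copying it back out.
public class LocalGzip {

    // Writes a gzip-compressed copy of `source` to `target`.
    public static void compress(Path source, Path target) throws IOException {
        try (OutputStream out = new GZIPOutputStream(Files.newOutputStream(target))) {
            // Closing the stream writes the gzip trailer.
            Files.copy(source, out);
        }
    }

    // Writes the uncompressed contents of gzip file `source` to `target`,
    // overwriting `target` if it already exists.
    public static void decompress(Path source, Path target) throws IOException {
        try (InputStream in = new GZIPInputStream(Files.newInputStream(source))) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

The compressed file could then be copied in with something like "hadoop fs -put data.txt.gz <hdfs-dir>" and fetched back with "hadoop fs -get" before calling decompress (the file name here is a placeholder).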

Does that solve your problem?

btw you didn't exactly specify your data size (how many TBs).

On Mon, Aug 3, 2009 at 11:02 AM, Sugandha Naolekar <[email protected]> wrote:

Yes, you are right. Here are the details:

-> I have a Hadoop cluster of 7 nodes. There is also an 8th machine, which
is not part of the Hadoop cluster.
-> I want to place the data from that machine into HDFS. So, before placing
it in HDFS, I want to compress it, and then dump it into HDFS.
-> I have 4 datanodes in my cluster. Also, the data might grow to terabytes.
-> Also, I have set the replication factor to 2.
-> I guess that, for compression, I will have to run MapReduce, right?
Please tell me the complete approach that needs to be followed.

On Mon, Aug 3, 2009 at 10:48 AM, prashant ullegaddi <[email protected]> wrote:

By "I want to compress the data first and then place it in HDFS", do you
mean you want to compress the data locally and then copy it to DFS?

What's the size of your data? What's the capacity of HDFS?

On Mon, Aug 3, 2009 at 10:45 AM, Sugandha Naolekar <[email protected]> wrote:

I want to compress the data first and then place it in HDFS. Then, while
retrieving it, I want to uncompress it and place it at the desired
destination. Is this possible? How do I get started? Also, I want to get
started with the actual coding part of compression and MapReduce. Please
advise!



--
Regards!
Sugandha