Re: :!

Brian Bockelman Mon, 03 Aug 2009 10:19:21 -0700

Hey Sugandha,

It's a common mistake - I think he was trying to unsubscribe from the mailing list (which is done by sending a message to a specific email address with the command "unsubscribe"), not telling you to unsubscribe.


Brian

On Aug 3, 2009, at 2:09 AM, Sugandha Naolekar wrote:

This is ridiculous. What do you mean by unsubscribe? I have a few queries, and that's why I have logged in to the corresponding forum.

On Mon, Aug 3, 2009 at 12:33 PM, A BlueCoder <[email protected]> wrote:

unsubscribe

On Mon, Aug 3, 2009 at 12:01 AM, Sugandha Naolekar <[email protected]> wrote:

That's fine. But if I place the data in HDFS and then run MapReduce code to provide compression, the data will get compressed into sequence files, but the original data will also still reside in the memory, thereby causing a kind of redundancy of data...

Can you please suggest a way out?

On Mon, Aug 3, 2009 at 12:07 PM, prashant ullegaddi <[email protected]> wrote:
I don't think you will be able to compress some data unless it's on HDFS.

What you can do is
1. Manually compress the data on the machine where the data resides. Then copy the same to HDFS, or
2. Copy the data to HDFS without compressing it, then run a job which just emits the data as it reads it, as key/value pairs. You can set FileOutputFormat.setOutputCompressorClass(job, GzipCodec.class) so that the output gets gzipped.

Does that solve your problem?

btw you didn't exactly specify your data size (how many TBs).
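
Option 1 above (compress on the source machine, then copy to HDFS) can be sketched with the JDK alone. This is only a minimal illustration, not a method given in the thread: the class name LocalGzip and its compress method are hypothetical, and the input and output paths are whatever the caller supplies.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper: gzip a local file before it is copied to HDFS. */
final class LocalGzip {

    /** Writes a gzip-compressed copy of source to target. */
    static void compress(Path source, Path target) throws IOException {
        try (InputStream in = Files.newInputStream(source);
             OutputStream out = new GZIPOutputStream(Files.newOutputStream(target))) {
            // Copy in fixed-size chunks so large files are never fully held in memory.
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: LocalGzip <input-file> <output-file.gz>");
            return;
        }
        compress(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

The resulting .gz file could then be copied with "hadoop fs -put". One design point to keep in mind: a gzip file is not splittable, so a later MapReduce job would read each such file with a single mapper.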

On Mon, Aug 3, 2009 at 11:02 AM, Sugandha Naolekar <[email protected]> wrote:

Yes, you are right. Here are the related details:
-> I have a Hadoop cluster of 7 nodes. Now there is this 8th machine, which is not a part of the Hadoop cluster.
-> I want to place the data of that machine into HDFS. Thus, before placing it in HDFS, I want to compress it, and then dump it into HDFS.
-> I have 4 datanodes in my cluster. Also, the data might grow up to terabytes.
-> Also, I have set the replication factor to 2.
-> I guess that, for compression, I will have to run MapReduce, right? Please tell me the complete approach that needs to be followed.

On Mon, Aug 3, 2009 at 10:48 AM, prashant ullegaddi <[email protected]> wrote:

By "I want to compress the data first and then place it in HDFS", do you mean you want to compress the data locally and then copy it to DFS?

What's the size of your data? What's the capacity of HDFS?

On Mon, Aug 3, 2009 at 10:45 AM, Sugandha Naolekar <[email protected]> wrote:

I want to compress the data first and then place it in HDFS. Again, while retrieving the same, I want to uncompress it and place it at the desired destination. Is this possible? How do I get started? Also, I want to get started with the actual coding part of compression and MapReduce. Please advise me accordingly!
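
For the retrieval half of the question, one possible sketch, assuming the file was stored as plain gzip and has already been fetched to local disk (for example with "hadoop fs -get"), is to decompress it with the JDK. The class name LocalGunzip and its decompress method are hypothetical names used only for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPInputStream;

/** Hypothetical helper: restore a gzip file fetched from HDFS to its original bytes. */
final class LocalGunzip {

    /** Reads the gzip file at source and writes the decompressed bytes to target. */
    static void decompress(Path source, Path target) throws IOException {
        try (InputStream in = new GZIPInputStream(Files.newInputStream(source));
             OutputStream out = Files.newOutputStream(target)) {
            // Stream in chunks; the decompressed data may be far larger than the input.
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        }
    }
}
```

Calling LocalGunzip.decompress on the fetched file, with the desired destination as the second argument, writes the original bytes there.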



--
Regards!
Sugandha




