Re: Doubt in reducer

Vladimir Klimontovich Thu, 27 Aug 2009 09:11:46 -0700

But the reducers can do some preparation while the maps are still running:
map output can be copied across the nodes that will work as reducers.


Copying and sorting map output is also a time-consuming process (maybe
more time-consuming than the reduce itself). For example, a piece of the
job run log on a 40-node cluster could look like this:

09/08/27 11:08:24 INFO job.JobRunningListener:  map 36% reduce 10%
09/08/27 11:08:28 INFO job.JobRunningListener:  map 37% reduce 10%
09/08/27 11:08:29 INFO job.JobRunningListener:  map 37% reduce 11%

But if you run the job on a single-node cluster, reduce will start only after the maps have finished.
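
One way to read the log above: as far as I know, Hadoop counts the reduce
percentage in three roughly equal parts (copy, sort, then the reduce() calls
themselves), so "reduce 10%" while the maps are at 36% suggests the reducers
are still only copying map output. A minimal Python sketch of that reading
follows. parse_progress and reduce_phase are hypothetical helper names, not
part of Hadoop, and because the logged figure is an average over all reduce
tasks, the mapping is only approximate.

```python
import re

# Matches the "map NN% reduce NN%" tail of a job progress log line.
LOG_LINE = re.compile(r"map\s+(\d+)%\s+reduce\s+(\d+)%")


def parse_progress(line):
    """Return (map_percent, reduce_percent) from a log line, or None."""
    match = LOG_LINE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def reduce_phase(percent):
    """Guess the reduce-side phase from the overall reduce percentage.

    Assumption: progress is split into thirds (copy, sort, reduce).
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    if percent < 100 / 3:
        return "copy"
    if percent < 200 / 3:
        return "sort"
    return "reduce"


if __name__ == "__main__":
    log = [
        "09/08/27 11:08:24 INFO job.JobRunningListener:  map 36% reduce 10%",
        "09/08/27 11:08:28 INFO job.JobRunningListener:  map 37% reduce 10%",
        "09/08/27 11:08:29 INFO job.JobRunningListener:  map 37% reduce 11%",
    ]
    for line in log:
        map_pct, reduce_pct = parse_progress(line)
        print(map_pct, reduce_pct, reduce_phase(reduce_pct))
```

For all three lines of the log above this reports the "copy" phase, i.e. no
reduce() call has run yet even though the reduce percentage is above zero.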


On Aug 27, 2009, at 4:31 PM, Harish Mallipeddi wrote:

On Thu, Aug 27, 2009 at 5:22 PM, Rakhi Khatwani<[email protected]> wrote:
but i want my reduce to run, that is, if 25% of the map is done, then i want the reduce to save that much data. even if the 2nd map fails, i don't lose data.
any pointers?
Regards,
Raakhi
What you're asking for will break the semantics of reduce(). Reduce can only
proceed after receiving all the map outputs.

--
Harish Mallipeddi
http://blog.poundbang.in


---
Vladimir Klimontovich,
skype: klimontovich
GoogleTalk/Jabber: [email protected]
Cell phone: +7926 890 2349
