The first approach works assuming the output keyspace is the same, or at least compatible... I was thinking I'd have to do something like a side file.

Thanks for the quick response.

Mark

-----Original Message-----
From: Owen O'Malley [mailto:[EMAIL PROTECTED]
Sent: Tuesday, May 22, 2007 11:44 AM
To: [email protected]
Subject: Re: Is it possible to have one map operation "talking" to multiple Reduce operations?

On May 22, 2007, at 11:31 AM, Mark Meissonnier wrote:

> Say you have a complicated function that is being called by a map
> method, but it produces a lot of information that can be used to
> produce two types of indices, is it possible to have 2 "map" outputs,
> which branch off respectively to a reduce1 method and a reduce2
> method?

No. There are a couple of ways around it. Probably the most efficient is to make the reduces act differently based on their partition id. So you'd say that reduces 0...999 are doing X and reduces 1000...1999 are doing Y. The transient data would have to be a tagged union of the types you are sending to the different reduces.

The easier approach is that you have the maps write a side file with the input for the second reduce. After your first job finishes, you launch a second job that processes the side files as the input.

-- Owen
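
The partition-id approach described above can be sketched roughly as follows. This is a framework-free simulation in plain Python, not Hadoop code: the names `map_record`, `partition`, `reduce_partition`, `run`, and `PER_KIND`, the two example indices, and the tiny partition count are all assumptions made for illustration. Each map output carries a tag ("X" or "Y"), the partitioner sends each tag to its own range of partition ids, and the reduce picks its behavior from the partition id it is handling.

```python
import zlib
from collections import defaultdict

# Hypothetical: 2 reduces per kind here; the example in the thread uses 1000.
PER_KIND = 2


def map_record(doc_id, text):
    """One map call emits tagged outputs for two different indices."""
    words = text.split()
    for word in words:
        # Tag "X": inverted index, word -> documents containing it.
        yield ("X", word, doc_id)
    # Tag "Y": per-document word count.
    yield ("Y", doc_id, len(words))


def partition(tag, key):
    """Send tag X to partitions [0, PER_KIND) and tag Y to [PER_KIND, 2*PER_KIND)."""
    base = 0 if tag == "X" else PER_KIND
    # crc32 is used instead of hash() so the routing is stable across runs.
    return base + zlib.crc32(str(key).encode("utf-8")) % PER_KIND


def reduce_partition(partition_id, grouped):
    """Choose the reduce behavior from the partition id."""
    if partition_id < PER_KIND:
        # "reduce1": de-duplicated, sorted posting list per word.
        return {key: sorted(set(values)) for key, values in grouped.items()}
    # "reduce2": total word count per document.
    return {key: sum(values) for key, values in grouped.items()}


def run(records):
    """Simulate map, shuffle by partition, and reduce in a single process."""
    shuffled = defaultdict(lambda: defaultdict(list))
    for doc_id, text in records:
        for tag, key, value in map_record(doc_id, text):
            shuffled[partition(tag, key)][key].append(value)
    return {pid: reduce_partition(pid, groups) for pid, groups in shuffled.items()}


result = run([("d1", "a b a"), ("d2", "b c")])
```

In this sketch the value types for the two tags differ (document ids versus counts), which is where the tagged union comes in: in a real job the intermediate value type would need to be able to carry either kind.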
