But OTOH, if I wanted my reducer to write binary output, I'd be screwed, especially so in the streaming world (where I'd like to stay for the moment).
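To make the streaming concern concrete, here is a minimal Python sketch. It assumes the default streaming convention: each output line is one record, and the text up to the first tab is the key. The helper names `split_key_value` and `frame_records` are hypothetical, and this is an illustration of the framing problem, not Hadoop's own code.

```python
# Sketch only: mimics line/tab record framing as used by default in streaming.
# split_key_value and frame_records are hypothetical helper names.

def split_key_value(line: bytes, sep: bytes = b"\t"):
    """Split one output line into (key, value) at the first separator."""
    key, found, value = line.partition(sep)
    # With no separator, the whole line is the key and the value is empty.
    return (key, value) if found else (line, b"")

def frame_records(stream: bytes):
    """Cut a byte stream into records at newlines, then split each record."""
    return [split_key_value(line) for line in stream.split(b"\n") if line]

# Text output survives the round trip as two key/value records.
text_output = b"apple\t3\nbanana\t5\n"
print(frame_records(text_output))

# One record whose 4-byte binary value happens to contain 0x0A (newline):
# the framing cuts it into two records and the value is corrupted.
binary_output = b"k1\t" + bytes([0x00, 0x0A, 0xFF, 0x01]) + b"\n"
print(frame_records(binary_output))
```

Under these assumptions, the binary value comes back as two records instead of one, which is why raw bytes would need some encoding before they could pass through line-oriented output.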
Actually, I don't think I understand your point: if the reducer's output is in a key/value format, you can still run another map or another reduce over it, can't you? If the output isn't, you can't; it's up to the user who coded up the Reducer. What am I missing?

Thanks,

  -Yuri

On Tue 12 2008, Miles Osborne wrote:
> You may well have another Map operation operate over the Reducer
> output, in which case you'd want key-value pairs.
>
> Miles
>
> On 12/02/2008, Yuri Pradkin <[EMAIL PROTECTED]> wrote:
> > Hi,
> >
> > I'm relatively new to Hadoop and I have what I hope is a simple
> > question:
> >
> > I don't understand why the key/value assumption is preserved AFTER
> > the reduce operation, in other words why the output of a reducer
> > is expected as <key,value> instead of arbitrary, possibly binary
> > bytes? Why can't OutputCollector just give those raw bytes to the
> > RecordWriter and have it make sense of them as it pleases, or just
> > dump them to a file?
> >
> > This seems like an unnecessary restriction to me, at least at the
> > first glance.
> >
> > Thanks,
> >
> > -Yuri
