using MultipleOutputFormat to ensure one output file per key

2014-11-25 Thread Arpan Ghosh

Hi,

How can I implement a custom MultipleOutputFormat and specify it as the output of my Spark job, so that there is a unique output file per key (instead of a unique output file per reducer)?

Thanks,
Arpan

Re: using MultipleOutputFormat to ensure one output file per key

2014-11-25 Thread Rafal Kwasny

Hi,

Arpan Ghosh wrote:
> Hi, How can I implement a custom MultipleOutputFormat and specify it as the output of my Spark job so that I can ensure that there is a unique output file per key (instead of a unique output file per reducer)?

I use something like this:

class KeyBasedOutput[T :
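
The KeyBasedOutput definition above is truncated, so the following is not a reconstruction of it. As a minimal sketch of the general technique, assuming Hadoop's old-API MultipleTextOutputFormat (org.apache.hadoop.mapred.lib) and Spark's saveAsHadoopFile, one way to get one file per key might look like this. The names PerKeyTextOutput and PerKeyOutputExample, the sample data, and the output path /tmp/per-key-output are hypothetical and chosen only for illustration.

```scala
import org.apache.hadoop.io.NullWritable
import org.apache.hadoop.mapred.lib.MultipleTextOutputFormat
import org.apache.spark.{HashPartitioner, SparkConf, SparkContext}
// Needed on Spark releases before 1.3 to get the pair-RDD implicits; harmless later.
import org.apache.spark.SparkContext._

// Hypothetical class name; not the KeyBasedOutput from the thread.
class PerKeyTextOutput extends MultipleTextOutputFormat[Any, Any] {
  // Name the output file after the record's key instead of the default part-NNNNN.
  override def generateFileNameForKeyValue(key: Any, value: Any, name: String): String =
    key.toString

  // Replace the key with NullWritable so only the value is written to each line.
  override def generateActualKey(key: Any, value: Any): Any =
    NullWritable.get()
}

object PerKeyOutputExample {
  def main(args: Array[String]): Unit = {
    // local[2] is an assumption so the sketch can run without a cluster.
    val conf = new SparkConf().setAppName("per-key-output").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Made-up sample data.
      val pairs = sc.parallelize(Seq(("a", "1"), ("b", "2"), ("a", "3")))

      pairs
        // Put all records of a key into one partition, so that only one task
        // ever writes the file named after that key.
        .partitionBy(new HashPartitioner(2))
        .saveAsHadoopFile(
          "/tmp/per-key-output", // hypothetical output directory
          classOf[String],
          classOf[String],
          classOf[PerKeyTextOutput])
    } finally {
      sc.stop()
    }
  }
}
```

With this sketch the output directory should contain one file per distinct key rather than one part file per reducer. The partitionBy step is a design choice of the sketch: without it, records for the same key can sit in several partitions, and several tasks would then try to produce a file with the same name.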