Hi Camusensei,

Thank you. That's very helpful!

Rex

On Thu, Jan 21, 2016 at 1:41 AM, Namikaze Minato <[email protected]> wrote:
> Hi Rex X,
>
> We are using the -outputFormat <classname> option of hadoop-streaming.
> Here is the detail: http://www.infoq.com/articles/HadoopOutputFormat
>
> Regards,
> Camusensei
>
> On 21 January 2016 at 07:21, Rex X <[email protected]> wrote:
> > Thank you, Rohit!
> >
> > Any multiple outputs sample code in python?
> >
> > Rex
> >
> >
> > On Wed, Jan 20, 2016 at 10:04 PM, rohit sarewar <[email protected]>
> > wrote:
> >>
> >> Hi Rex
> >>
> >> Please explore multiple outputs.
> >>
> >> Regards
> >> Rohit Sarewar
> >>
> >>
> >> On Thu, Jan 21, 2016 at 5:13 AM, Rex X <[email protected]> wrote:
> >>>
> >>> Dear all,
> >>>
> >>> To be specific, for example, given
> >>>
> >>>     hadoop jar hadoop-streaming.jar \
> >>>       -input myInputDirs \
> >>>       -output myOutputDir \
> >>>       -mapper /bin/cat \
> >>>       -reducer /usr/bin/wc
> >>>
> >>> Where myInputDirs has a dated subfolder structure of
> >>>
> >>>     /input_dir/yyyy/mm/dd/part-*
> >>>
> >>> I want myOutputDir has the same dated subfolder structure:
> >>>
> >>>     /output_dir/yyyy/mm/dd/part-*
> >>>
> >>> Guess there should be an option to do this. Can "-partitioner" or any
> >>> "-D" option make this?
> >>>
> >>>
> >>> Thanks & regards,
> >>> Rex
> >>
> >>
> >
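
For the Python side of the question, here is a minimal mapper sketch. It was not posted in the thread. It assumes Hadoop Streaming exports the current input file path in the environment variable mapreduce_map_input_file (map_input_file on older releases), and that input paths end in /yyyy/mm/dd/part-*. The mapper only tags each record with its yyyy/mm/dd key. Writing the records into matching subfolders would still need an output format class passed via -outputFormat, as described above. The helper name date_subdir and the "unknown" fallback key are illustrative, not part of any Hadoop API.

```python
#!/usr/bin/env python
import os
import re
import sys

# Matches the trailing /yyyy/mm/dd/<file> part of a path such as
# /input_dir/2016/01/21/part-00000 (assumed layout, taken from the question).
DATE_DIR = re.compile(r"/(\d{4})/(\d{2})/(\d{2})/[^/]+$")


def date_subdir(path):
    """Return 'yyyy/mm/dd' extracted from an input file path, or None."""
    match = DATE_DIR.search(path)
    return "/".join(match.groups()) if match else None


def main(stdin=sys.stdin, stdout=sys.stdout, environ=os.environ):
    # Assumption: streaming exposes the input file path under one of these
    # environment variable names, depending on the Hadoop version.
    path = environ.get("mapreduce_map_input_file") or environ.get("map_input_file", "")
    key = date_subdir(path) or "unknown"  # hypothetical fallback key
    for line in stdin:
        # Emit "<yyyy/mm/dd>\t<original record>" so the date travels as the key.
        stdout.write("%s\t%s\n" % (key, line.rstrip("\n")))


if __name__ == "__main__":
    main()
```

Saved as, say, date_mapper.py (a hypothetical file name), it would replace /bin/cat as the -mapper. The reducer and the output format class then see the date as the key and can decide which subfolder each record belongs to.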
