Code, input and output: http://pastebin.com/AG2DyZ22
Userlogs: http://pastebin.com/bLv2Ad3J, http://pastebin.com/RzzEre1R

On Thu, Apr 19, 2012 at 7:04 PM, Harsh J <ha...@cloudera.com> wrote:
> Can you pastebin and provide your specific mapper userlog
> (syslogs/stderr/stdout)?
>
> On Thu, Apr 19, 2012 at 6:05 PM, Sudip Sinha <sudipsinha.ba...@gmail.com> wrote:
> > Hi,
> >
> > I'm reposting this as I've not received any reply to my earlier post on the
> > same issue.
> >
> > I've read that the combiner only works if it is specified AND the sort
> > memory buffer overflows in the mapper.
> > http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-user/201107.mbox/%3c374d8f3f-b8b1-499f-bedb-bfee32190...@hortonworks.com%3E
> >
> > But when I run a Hadoop streaming job in R using RHadoop, the combiner
> > always runs when specified. This is on a very small dataset.
> >
> > Is this the desired behaviour?
> >
> > More on this: https://github.com/RevolutionAnalytics/RHadoop/issues/70
> >
> > Thanks,
> > Sudip Sinha
>
> --
> Harsh J