The dataset is only a few dozen MB.
But as I understand PFP,
more reduce tasks should give better performance, shouldn't they?

By the way, I can't find "setNumReduceTasks()" in the Mahout source code.
How does it work with multiple reduce tasks?
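
To illustrate how the -g groups relate to reduce tasks, here is a minimal sketch. It assumes the group id is the reduce key, that its hash code equals its integer value, and that Hadoop's default HashPartitioner is used; none of this is checked against the Mahout source. The class and method names are hypothetical and for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class GroupPartitionSketch {

    // Same formula as Hadoop's default HashPartitioner:
    // (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks.
    // Assumption: the group id hashes to its own integer value.
    static int partition(int groupId, int numReduceTasks) {
        return (Integer.hashCode(groupId) & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Maps each reduce task index to the list of group ids it would receive.
    static Map<Integer, List<Integer>> assign(int numGroups, int numReduceTasks) {
        Map<Integer, List<Integer>> byReducer = new TreeMap<>();
        for (int g = 0; g < numGroups; g++) {
            byReducer
                .computeIfAbsent(partition(g, numReduceTasks), k -> new ArrayList<>())
                .add(g);
        }
        return byReducer;
    }

    public static void main(String[] args) {
        // One reduce task: all 10 groups (10 reduce() calls) land on it.
        System.out.println(assign(10, 1));
        // Four reduce tasks: the same 10 groups are spread across them.
        System.out.println(assign(10, 4));
    }
}
```

Under these assumptions, -g only decides how many distinct keys (and so reduce() calls) exist, not how many reduce tasks run. In Hadoop 1.x the per-job reduce task count comes from the mapred.reduce.tasks property (default 1), while mapred.tasktracker.reduce.tasks.maximum only limits how many reduce tasks one TaskTracker runs at the same time.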




2012/4/30 戴清灏 <[email protected]>

> Then how big is your input data?
> For a rather small dataset, one reduce task is enough to process it.
>
> Regards,
> Q
>
>
>
> 2012/4/30 培宇 <[email protected]>
>
> > I set mapred.tasktracker.reduce.tasks.maximum to 1 in the conf on each node,
> >
> > but I have 4 nodes running Hadoop.
> >
> > Should I install Mahout on every node or only on the master node?
> >
> > Thanks for your help
> >
> >
> > 2012/4/30 戴清灏 <[email protected]>
> >
> > > Sorry for confusing you.
> > > I mean, if you have explicitly specified the number of reduce tasks in your
> > > Hadoop conf/mapred-site.xml or somewhere else,
> > > PFP would only execute one reduce task.
> > > Your parameter of 10 groups would only make PFP call the reduce method 10
> > > times.
> > > In fact, the reduce method was called 10 times by a single reduce
> > > task.
> > >
> > >
> > > Regards,
> > > Q
> > >
> > >
> > >
> > > 2012/4/30 培宇 <[email protected]>
> > >
> > > > Hello
> > > > I mean reduce tasks.
> > > > I set the parameter -g 10,
> > > > but there is still one reduce task in ParallelFPGrowth.
> > > >
> > > > How do I set parameter -g to change the number of reduce tasks?
> > > >
> > > > Thanks for your reply
> > > >
> > > >
> > > >
> > > > --
> > > > Best regards,
> > > > 培宇 Omi
> > > >
> > >
> >
> >
> >
> > --
> > Best regards,
> > 培宇 Omi
> >
>



-- 
Best regards,
培宇 Omi
