Great :) Do you have plans to integrate partitioning? Currently there is only a block assignment partitioning, hardcoded in the client. That won't be useful for PageRank and SSSP. Having real partitioning would help us in the Graph package as well for the next release.
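
To make the difference concrete, here is a minimal sketch in plain Java. It does not use any Hama API: the class PartitionSketch and both methods are hypothetical names made up for illustration, and it assumes vertices are identified by string IDs and that there is at least one record. As I understand it, PageRank and SSSP send messages to neighbours by vertex ID, so every peer has to be able to work out the owner of a vertex from the ID alone. Block assignment cannot do that, because the owner depends on where the record happens to sit in the input.

```java
// Hypothetical illustration only, not part of Hama.
class PartitionSketch {

    // Block assignment: contiguous ranges of record positions go to the same task.
    // The owner depends on the position in the input, not on the key.
    // Assumes numRecords > 0 and numTasks > 0.
    static int blockPartition(long recordIndex, long numRecords, int numTasks) {
        long blockSize = (numRecords + numTasks - 1) / numTasks; // ceiling division
        return (int) (recordIndex / blockSize);
    }

    // Hash partitioning: the owner is derived from the key alone, so any peer
    // can compute the destination of a message without knowing the input layout.
    static int hashPartition(String vertexId, int numTasks) {
        // floorMod keeps the result in [0, numTasks) even for negative hash codes
        return Math.floorMod(vertexId.hashCode(), numTasks);
    }

    public static void main(String[] args) {
        int numTasks = 3;
        String[] vertices = {"a", "b", "c", "d", "e"};
        for (int i = 0; i < vertices.length; i++) {
            System.out.println(vertices[i]
                + " block=" + blockPartition(i, vertices.length, numTasks)
                + " hash=" + hashPartition(vertices[i], numTasks));
        }
    }
}
```

With something like hashPartition, the sender of a message only needs the target vertex ID and the number of tasks. A pluggable partitioner would let a job pick either scheme instead of having one hardcoded in the client.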

2011/11/2 Edward J. Yoon <[email protected]>

> > For sure I agree we should allow the former programming model with no input
> > without explicitly instantiating dummy inputs/splits. What about providing
> > two basic (different) implementations?
>
> +1
>
> I was about to.
>
> On Wed, Nov 2, 2011 at 9:23 PM, Tommaso Teofili
> <[email protected]> wrote:
> > 2011/11/2 Thomas Jungblut <[email protected]>
> >
> >> Another point while fixing the local runner:
> >>
> >> Are we now input driven?
> >> I see in the code that the user defined task number is overriden by the
> >> number of splits.
> >> Was this your intention? This will actually make realtime processing with
> >> no static input a real pain.
> >> For example if you want a similar behaviour in Hadoop M/R you'll need to
> >> create dummy splits, and this is not what we should aim at.
> >>
> >> We could simply check if the user define the NullInputFormat or nothing and
> >> then use the number of tasks the user has configured.
> >>
> >
> > For sure I agree we should allow the former programming model with no input
> > without explicitly instantiating dummy inputs/splits. What about providing
> > two basic (different) implementations?
> > Tommaso
> >
> >>
> >> 2011/11/2 Tommaso Teofili <[email protected]>
> >>
> >> > 2011/11/2 Edward J. Yoon <[email protected]>
> >> >
> >> > > > I'm sure that not every job actually needs a cleanup or a setup.
> >> > >
> >> > > You're right. Almost BSP applications should override bsp() method
> >> > > but, setup() and cleaner() methods are not as you said. Let's fix
> >> > > them.
> >> > >
> >> >
> >> > Agreed +1
> >> >
> >> > > > Generally I would suggest to integrate the OutputCollector and the
> >> > > > RecordReader into the BSPPeerImpl.
> >> > > > So our peer is like the context in Hadoop.
> >> > >
> >> > > Good idea.
> >> > >
> >> >
> >> > +1 here too
> >> >
> >> > Tommaso
> >> >
> >> > > On Wed, Nov 2, 2011 at 9:03 PM, Thomas Jungblut
> >> > > <[email protected]> wrote:
> >> > > > Yes. When I reworked that API, I made a default implementation in our
> >> > > > abstract BSP class.
> >> > > > So the user has to override the methods for himself, if he needs to.
> >> > > > I'm sure that not every job actually needs a cleanup or a setup.
> >> > > >
> >> > > > Generally I would suggest to integrate the OutputCollector and the
> >> > > > RecordReader into the BSPPeerImpl.
> >> > > > So our peer is like the context in Hadoop.
> >> > > > But that is just a minor thing. It is a great improvement ;)
> >> > > >
> >> > > > 2011/11/2 Edward J. Yoon <[email protected]>
> >> > > >
> >> > > >> There're bsp(), setup() and cleaner() methods.
> >> > > >>
> >> > > >> What is you suggestion?
> >> > > >>
> >> > > >> On Wed, Nov 2, 2011 at 8:47 PM, Thomas Jungblut
> >> > > >> <[email protected]> wrote:
> >> > > >> > Have a look at the combiner class. I know that this is just a "test", but
> >> > > >> > it is really messy if the user does not use the methods, but is forced to
> >> > > >> > override them.
> >> > > >> >
> >> > > >> > 2011/11/2 Edward J. Yoon <[email protected]>
> >> > > >> >
> >> > > >> >> Why?
> >> > > >> >>
> >> > > >> >> On Wed, Nov 2, 2011 at 8:21 PM, Thomas Jungblut
> >> > > >> >> <[email protected]> wrote:
> >> > > >> >> > I totally dislike that BSP class now has abstract methods instead of
> >> > > >> >> > default implementations.
> >> > > >> >> >
> >> > > >> >> > 2011/11/2 Edward J. Yoon <[email protected]>
> >> > > >> >> >
> >> > > >> >> >> Hi all,
> >> > > >> >> >>
> >> > > >> >> >> As you know, recently combiners and IO are added.
> >> > > >> >> >>
> >> > > >> >> >> Please review them from user viewpoint.
> >> > > >> >> >>
> >> > > >> >> >> http://svn.apache.org/repos/asf/incubator/hama/trunk/examples/src/main/java/org/apache/hama/examples/PiEstimator.java
> >> > > >> >> >>
> >> > > >> >> >> I'm testing multiple tasks and IO features on 100 nodes cluster using
> >> > > >> >> >> 10 tasks per node. If there's no issue, I'll close HAMA-258.
> >> > > >> >> >>
> >> > > >> >> >> Thanks.
> >> > > >> >> >>
> >> > > >> >> >> --
> >> > > >> >> >> Best Regards, Edward J. Yoon
> >> > > >> >> >> @eddieyoon
> >> > > >> >> >
> >> > > >> >> > --
> >> > > >> >> > Thomas Jungblut
> >> > > >> >> > Berlin <[email protected]>
> >> > > >> >>
> >> > > >> >> --
> >> > > >> >> Best Regards, Edward J. Yoon
> >> > > >> >> @eddieyoon
> >> > > >> >
> >> > > >> > --
> >> > > >> > Thomas Jungblut
> >> > > >> > Berlin <[email protected]>
> >> > > >>
> >> > > >> --
> >> > > >> Best Regards, Edward J. Yoon
> >> > > >> @eddieyoon
> >> > > >
> >> > > > --
> >> > > > Thomas Jungblut
> >> > > > Berlin <[email protected]>
> >> > >
> >> > > --
> >> > > Best Regards, Edward J. Yoon
> >> > > @eddieyoon
> >>
> >> --
> >> Thomas Jungblut
> >> Berlin <[email protected]>
>
> --
> Best Regards, Edward J. Yoon
> @eddieyoon

--
Thomas Jungblut
Berlin <[email protected]>
