Yes, I'm sorry. The problem was that I thought we were going to be incompatible with Hadoop. But that is not correct ;)

2011/11/2 Edward J. Yoon <[email protected]>

> Just FYI, one reason is that there're a lot of KeyValue stores.
>
> On Wed, Nov 2, 2011 at 11:46 PM, Thomas Jungblut
> <[email protected]> wrote:
> > Ah okay I see why.
> > But I don't see that this is very good. BTW the classes you've added from
> > Hadoop are missing the Apache header.
> >
> > Sorry for spamming.
> >
> > 2011/11/2 Thomas Jungblut <[email protected]>
> >
> >> And what is the reason to implement our own Input/output format if you
> >> stick with key/value pairs.
> >> Let's be compatible to Hadoop and use theirs.
> >>
> >> And we should really stop copying hadoop stuff arround. It is already
> >> there.
> >>
> >> 2011/11/2 Thomas Jungblut <[email protected]>
> >>
> >>> Great :)
> >>>
> >>> Do you have plans to integrate a partitioning? Currently this is just a
> >>> block assignment partitioning, hardcoded in the client.
> >>> This won't be useful for PageRank and SSSP.
> >>> This would help us in Graph package as well for the next release.
> >>>
> >>> 2011/11/2 Edward J. Yoon <[email protected]>
> >>>
> >>>> > For sure I agree we should allow the former programming model with no
> >>>> > input without explicitly instantiating dummy inputs/splits. What about
> >>>> > providing two basic (different) implementations?
> >>>>
> >>>> +1
> >>>>
> >>>> I was about to.
> >>>>
> >>>> On Wed, Nov 2, 2011 at 9:23 PM, Tommaso Teofili
> >>>> <[email protected]> wrote:
> >>>> > 2011/11/2 Thomas Jungblut <[email protected]>
> >>>> >
> >>>> >> Another point while fixing the local runner:
> >>>> >>
> >>>> >> Are we now input driven?
> >>>> >> I see in the code that the user defined task number is overriden by the
> >>>> >> number of splits.
> >>>> >> Was this your intention? This will actually make realtime processing with
> >>>> >> no static input a real pain.
> >>>> >> For example if you want a similar behaviour in Hadoop M/R you'll need to
> >>>> >> create dummy splits, and this is not what we should aim at.
> >>>> >>
> >>>> >> We could simply check if the user define the NullInputFormat or nothing and
> >>>> >> then use the number of tasks the user has configured.
> >>>> >>
> >>>> >
> >>>> > For sure I agree we should allow the former programming model with no input
> >>>> > without explicitly instantiating dummy inputs/splits. What about providing
> >>>> > two basic (different) implementations?
> >>>> > Tommaso
> >>>> >
> >>>> >>
> >>>> >> 2011/11/2 Tommaso Teofili <[email protected]>
> >>>> >>
> >>>> >> > 2011/11/2 Edward J. Yoon <[email protected]>
> >>>> >> >
> >>>> >> > > > I'm sure that not every job actually needs a cleanup or a setup.
> >>>> >> > >
> >>>> >> > > You're right. Almost BSP applications should override bsp() method
> >>>> >> > > but, setup() and cleaner() methods are not as you said. Let's fix
> >>>> >> > > them.
> >>>> >> > >
> >>>> >> >
> >>>> >> > Agreed +1
> >>>> >> >
> >>>> >> > > > Generally I would suggest to integrate the OutputCollector and the
> >>>> >> > > > RecordReader into the BSPPeerImpl.
> >>>> >> > > > So our peer is like the context in Hadoop.
> >>>> >> > >
> >>>> >> > > Good idea.
> >>>> >> > >
> >>>> >> >
> >>>> >> > +1 here too
> >>>> >> >
> >>>> >> > Tommaso
> >>>> >> >
> >>>> >> > > On Wed, Nov 2, 2011 at 9:03 PM, Thomas Jungblut
> >>>> >> > > <[email protected]> wrote:
> >>>> >> > > > Yes. When I reworked that API, I made a default implementation in our
> >>>> >> > > > abstract BSP class.
> >>>> >> > > > So the user has to override the methods for himself, if he needs to.
> >>>> >> > > > I'm sure that not every job actually needs a cleanup or a setup.
> >>>> >> > > >
> >>>> >> > > > Generally I would suggest to integrate the OutputCollector and the
> >>>> >> > > > RecordReader into the BSPPeerImpl.
> >>>> >> > > > So our peer is like the context in Hadoop.
> >>>> >> > > > But that is just a minor thing. It is a great improvement ;)
> >>>> >> > > >
> >>>> >> > > > 2011/11/2 Edward J. Yoon <[email protected]>
> >>>> >> > > >
> >>>> >> > > >> There're bsp(), setup() and cleaner() methods.
> >>>> >> > > >>
> >>>> >> > > >> What is you suggestion?
> >>>> >> > > >>
> >>>> >> > > >> On Wed, Nov 2, 2011 at 8:47 PM, Thomas Jungblut
> >>>> >> > > >> <[email protected]> wrote:
> >>>> >> > > >> > Have a look at the combiner class. I know that this is just a "test", but
> >>>> >> > > >> > it is really messy if the user does not use the methods, but is forced to
> >>>> >> > > >> > override them.
> >>>> >> > > >> >
> >>>> >> > > >> > 2011/11/2 Edward J. Yoon <[email protected]>
> >>>> >> > > >> >
> >>>> >> > > >> >> Why?
> >>>> >> > > >> >>
> >>>> >> > > >> >> On Wed, Nov 2, 2011 at 8:21 PM, Thomas Jungblut
> >>>> >> > > >> >> <[email protected]> wrote:
> >>>> >> > > >> >> > I totally dislike that BSP class now has abstract methods instead of
> >>>> >> > > >> >> > default implementations.
> >>>> >> > > >> >> >
> >>>> >> > > >> >> > 2011/11/2 Edward J. Yoon <[email protected]>
> >>>> >> > > >> >> >
> >>>> >> > > >> >> >> Hi all,
> >>>> >> > > >> >> >>
> >>>> >> > > >> >> >> As you know, recently combiners and IO are added.
> >>>> >> > > >> >> >>
> >>>> >> > > >> >> >> Please review them from user viewpoint.
> >>>> >> > > >> >> >>
> >>>> >> > > >> >> >> http://svn.apache.org/repos/asf/incubator/hama/trunk/examples/src/main/java/org/apache/hama/examples/PiEstimator.java
> >>>> >> > > >> >> >>
> >>>> >> > > >> >> >> I'm testing multiple tasks and IO features on 100 nodes cluster using
> >>>> >> > > >> >> >> 10 tasks per node. If there's no issue, I'll close HAMA-258.
> >>>> >> > > >> >> >>
> >>>> >> > > >> >> >> Thanks.
> >>>> >> > > >> >> >>
> >>>> >> > > >> >> >> --
> >>>> >> > > >> >> >> Best Regards, Edward J. Yoon
> >>>> >> > > >> >> >> @eddieyoon

--
Thomas Jungblut
Berlin <[email protected]>
