Ah okay, I see why. But I don't think this is very good. BTW, the classes you've added from Hadoop are missing the Apache license header.
Sorry for spamming.

2011/11/2 Thomas Jungblut <[email protected]>

> And what is the reason to implement our own Input/output format if you
> stick with key/value pairs.
> Let's be compatible to Hadoop and use theirs.
>
> And we should really stop copying hadoop stuff around. It is already
> there.
>
>
> 2011/11/2 Thomas Jungblut <[email protected]>
>
>> Great :)
>>
>> Do you have plans to integrate a partitioning? Currently this is just a
>> block assignment partitioning, hardcoded in the client.
>> This won't be useful for PageRank and SSSP.
>> This would help us in Graph package as well for the next release.
>>
>> 2011/11/2 Edward J. Yoon <[email protected]>
>>
>>> > For sure I agree we should allow the former programming model with no input
>>> > without explicitly instantiating dummy inputs/splits. What about providing
>>> > two basic (different) implementations?
>>>
>>> +1
>>>
>>> I was about to.
>>> On Wed, Nov 2, 2011 at 9:23 PM, Tommaso Teofili
>>> <[email protected]> wrote:
>>> > 2011/11/2 Thomas Jungblut <[email protected]>
>>> >
>>> >> Another point while fixing the local runner:
>>> >>
>>> >> Are we now input driven?
>>> >> I see in the code that the user defined task number is overridden by the
>>> >> number of splits.
>>> >> Was this your intention? This will actually make realtime processing with
>>> >> no static input a real pain.
>>> >> For example if you want a similar behaviour in Hadoop M/R you'll need to
>>> >> create dummy splits, and this is not what we should aim at.
>>> >>
>>> >> We could simply check if the user defines the NullInputFormat or nothing and
>>> >> then use the number of tasks the user has configured.
>>> >>
>>> >
>>> > For sure I agree we should allow the former programming model with no input
>>> > without explicitly instantiating dummy inputs/splits. What about providing
>>> > two basic (different) implementations?
>>> > Tommaso
>>> >
>>> >
>>> >>
>>> >> 2011/11/2 Tommaso Teofili <[email protected]>
>>> >>
>>> >> > 2011/11/2 Edward J. Yoon <[email protected]>
>>> >> >
>>> >> > > > I'm sure that not every job actually needs a cleanup or a setup.
>>> >> > >
>>> >> > > You're right. Almost all BSP applications should override the bsp() method,
>>> >> > > but setup() and cleaner() are not like that, as you said. Let's fix
>>> >> > > them.
>>> >> > >
>>> >> >
>>> >> > Agreed +1
>>> >> >
>>> >> >
>>> >> > >
>>> >> > > > Generally I would suggest to integrate the OutputCollector and the
>>> >> > > > RecordReader into the BSPPeerImpl.
>>> >> > > > So our peer is like the context in Hadoop.
>>> >> > >
>>> >> > > Good idea.
>>> >> > >
>>> >> >
>>> >> > +1 here too
>>> >> >
>>> >> > Tommaso
>>> >> >
>>> >> >
>>> >> > >
>>> >> > > On Wed, Nov 2, 2011 at 9:03 PM, Thomas Jungblut
>>> >> > > <[email protected]> wrote:
>>> >> > > > Yes. When I reworked that API, I made a default implementation in our
>>> >> > > > abstract BSP class.
>>> >> > > > So the user has to override the methods for himself, if he needs to.
>>> >> > > > I'm sure that not every job actually needs a cleanup or a setup.
>>> >> > > >
>>> >> > > > Generally I would suggest to integrate the OutputCollector and the
>>> >> > > > RecordReader into the BSPPeerImpl.
>>> >> > > > So our peer is like the context in Hadoop.
>>> >> > > > But that is just a minor thing. It is a great improvement ;)
>>> >> > > >
>>> >> > > > 2011/11/2 Edward J. Yoon <[email protected]>
>>> >> > > >
>>> >> > > >> There're bsp(), setup() and cleaner() methods.
>>> >> > > >>
>>> >> > > >> What is your suggestion?
>>> >> > > >>
>>> >> > > >> On Wed, Nov 2, 2011 at 8:47 PM, Thomas Jungblut
>>> >> > > >> <[email protected]> wrote:
>>> >> > > >> > Have a look at the combiner class. I know that this is just a "test", but
>>> >> > > >> > it is really messy if the user does not use the methods, but is forced to
>>> >> > > >> > override them.
>>> >> > > >> >
>>> >> > > >> > 2011/11/2 Edward J. Yoon <[email protected]>
>>> >> > > >> >
>>> >> > > >> >> Why?
>>> >> > > >> >>
>>> >> > > >> >> On Wed, Nov 2, 2011 at 8:21 PM, Thomas Jungblut
>>> >> > > >> >> <[email protected]> wrote:
>>> >> > > >> >> > I totally dislike that BSP class now has abstract methods instead of
>>> >> > > >> >> > default implementations.
>>> >> > > >> >> >
>>> >> > > >> >> > 2011/11/2 Edward J. Yoon <[email protected]>
>>> >> > > >> >> >
>>> >> > > >> >> >> Hi all,
>>> >> > > >> >> >>
>>> >> > > >> >> >> As you know, combiners and IO were recently added.
>>> >> > > >> >> >>
>>> >> > > >> >> >> Please review them from a user's viewpoint.
>>> >> > > >> >> >>
>>> >> > > >> >> >> http://svn.apache.org/repos/asf/incubator/hama/trunk/examples/src/main/java/org/apache/hama/examples/PiEstimator.java
>>> >> > > >> >> >>
>>> >> > > >> >> >> I'm testing multiple tasks and IO features on a 100-node cluster using
>>> >> > > >> >> >> 10 tasks per node. If there's no issue, I'll close HAMA-258.
>>> >> > > >> >> >>
>>> >> > > >> >> >> Thanks.
>>> >> > > >> >> >>
>>> >> > > >> >> >> --
>>> >> > > >> >> >> Best Regards, Edward J. Yoon
>>> >> > > >> >> >> @eddieyoon
>>> >> > > >> >> >>
>>> >> > > >> >> >
>>> >> > > >> >> > --
>>> >> > > >> >> > Thomas Jungblut
>>> >> > > >> >> > Berlin <[email protected]>

--
Thomas Jungblut
Berlin <[email protected]>
