Hi,

we sorted out on chat why KEYIN and KEYOUT are there. However, I really like your two ideas:

- generic message types (to prevent casting, at least on the user's side)
- optional sorting of incoming messages

I see them integrated in two ways:

First, the "generic messages":
HAMA-503 <https://issues.apache.org/jira/browse/HAMA-503> is going to add a new way of writing a BSP. I'm pretty sure that this will sit on top of the BSP class. I can imagine that a computation unit will have this kind of <MESSAGEIN, MESSAGEOUT> interface that guarantees type safety at the user-API level. Internally this can be accomplished by casting the underlying Writable. Maybe I can get a prototype done over the next week, so you can have a look and tell me what you think.

Second, the "sorting of messages":
This is a fancy feature: consider SSSP, where you would simply get the messages in ascending order by cost. This would save a lot of looping ;) However, this has overhead and requires the Comparable interface. I see this integrated in the MessageService. In my opinion, especially once we start adding more RPC protocols, we have to make an abstract subclass and a more pluggable solution to support these kinds of mechanisms.

I'll keep both things in the back of my mind. Suraj, if you'd like the second idea, please file a new Jira. I think it is a great idea.

2012/2/4 Suraj Menon <[email protected]>

> Hi, I like this idea. But I want to explore one more step backwards on this
> :). I want to know what purpose does the restriction of having KEYIN and
> KEYOUT serve for user here? I feel they could be a field in user’s message
> class than Hama suggesting it to be there. We already have KeyValuePair
> defined. But the user’s bsp module should be able to send message with no
> keys.
>
> In Map-Reduce, this serves as a part of the programming model, where the
> entries read are aggregated based on keys and then in reduce again we
> process sorted records per key. But, in BSP model, it is the destination of
> each message that regulates where a particular piece of data is processed
> in the next superstep. Hence logically, the key on the message is the
> identity of destination peer(or group of peers).
> So why do we need KEYIN
> and KEYOUT? How is it different when the input and output format is
> expressed as KeyValuePair.
>
> Let’s consider an example where a user has written a BSP class named
> MyCoolClass that passes messages of type MyCoolMessage (extends
> BSPMessage).
>
> Today he would have to write the bsp function as :
> MyCoolClass extends BSP<in_tag_type, in_msg_type, out_msg_tag,
> out_msg_type>{
>
> bsp(peer<*in_tag_type, in_msg_type, out_key_type, out_msg_type>* ){
>
> }
>
>
> MyCoolClass extends BSP<in_msg_type, out_msg_type>{
>
> bsp(peer<? super Writable in_msg_type, ? extends Writable out_msg_type){
>
> }
> }
>
> There are other scenarios to consider too. What if a user wants the
> messages sent to his BSPPeer sorted. I think we should provide this flavor.
>
> bsp(peer<? super WritableComparable in_msg_type, ? extends
> WritableComparable out_msg_type)
>
>
> and Hama framework should support this.
>
> If the aforesaid doesn’t make sense please help in getting correct
> understanding. :)
>
> *Thanks,*
> *Suraj*
>
> On Fri, Feb 3, 2012 at 8:52 AM, Tommaso Teofili
> <[email protected]>wrote:
>
> > +1, nice API improvement.
> > Tommaso
> >
> > 2012/2/3 Thomas Jungblut <[email protected]>
> >
> > > Yes, this sounds to me reasonable as well.
> > > Other opinions? Otherwise I am filing a jira for that.
> > >
> > > 2012/2/3 Edward J. Yoon <[email protected]>
> > >
> > > > I think, we may want to change like <? extends Writable, ? extends
> > > > Writable>.
> > > >
> > > > On Fri, Feb 3, 2012 at 9:45 AM, Edward J. Yoon <[email protected]>
> > > > wrote:
> > > > > I prefer the Writable.
> > > > >
> > > > > On Thu, Feb 2, 2012 at 8:49 PM, Thomas Jungblut
> > > > > <[email protected]> wrote:
> > > > >> Hi all,
> > > > >>
> > > > >> I refactored the messaging in 0.3.0 and changed this from an inteface to an
> > > > >> abstract base class.
> > > > >> Currently it is fine, but I feel that the user is too restricted in using
> > > > >> messages.
> > > > >> You have this strict structure of tag and data. I think we should widen the
> > > > >> messages to just Messagable .
> > > > >> If we want to have the freedom to add additional things, we should extend
> > > > >> Messagable from Writable and use this for it.
> > > > >>
> > > > >> So send may look like this:
> > > > >>
> > > > >> public final void send(String peerName, Messagable msg)
> > > > >>
> > > > >>
> > > > >> and getCurrentMessage:
> > > > >>
> > > > >> public final Messagable getCurrentMessage()
> > > > >>
> > > > >>
> > > > >> However, I am not really happy that we return Messagable (requires casting
> > > > >> and stuff).
> > > > >> For the usecases of specific tagging we can add the getTag() method to the
> > > > >> Messagable interface.
> > > > >> What type should this be then? I mean, String would be quite a large
> > > > >> overhead. Integer might not be useful.
> > > > >>
> > > > >> Or should we widen this to Writable instead? So you can send things you've
> > > > >> read from sequencefiles directly to other tasks.
> > > > >>
> > > > >> What do you think? I am still not aware of how it should look like. Or are
> > > > >> you satisfied with the current messaging?
> > > > >>
> > > > >> --
> > > > >> Thomas Jungblut
> > > > >> Berlin <[email protected]>
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Best Regards, Edward J. Yoon
> > > > > @eddieyoon
> > > >
> > > >
> > > >
> > > > --
> > > > Best Regards, Edward J. Yoon
> > > > @eddieyoon
> > >
> > >
> > >
> > > --
> > > Thomas Jungblut
> > > Berlin <[email protected]>
> >
>

--
Thomas Jungblut
Berlin <[email protected]>
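
For illustration only, here is a rough sketch of how the two ideas discussed at the top of this mail (a typed <MESSAGEIN, MESSAGEOUT> layer that hides the cast, and incoming messages delivered in sorted order) might fit together. Nothing here is Hama API: Msg is a stand-in for Writable so the example compiles without Hadoop, and RawPeer, TypedPeer and CostMessage are hypothetical names. The peer only loops messages back to itself, and a java.util.PriorityQueue plays the role that a sorting MessageService might play.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Queue;

class TypedMessagingSketch {

    // Hypothetical stand-in for Hadoop's Writable.
    interface Msg {}

    // Hypothetical untyped peer, loosely modelled on the send /
    // getCurrentMessage pair quoted in the thread. Loopback only.
    static class RawPeer {
        private final Queue<Msg> inbox;
        RawPeer(Queue<Msg> inbox) { this.inbox = inbox; }
        void send(String peerName, Msg msg) { inbox.add(msg); } // peerName is ignored here
        Msg getCurrentMessage() { return inbox.poll(); }        // null when the inbox is empty
    }

    // Typed facade: the single unchecked cast lives here, not in user code.
    static class TypedPeer<IN extends Msg, OUT extends Msg> {
        private final RawPeer raw;
        TypedPeer(RawPeer raw) { this.raw = raw; }
        void send(String peerName, OUT msg) { raw.send(peerName, msg); }
        @SuppressWarnings("unchecked")
        IN getCurrentMessage() { return (IN) raw.getCurrentMessage(); }
    }

    // SSSP-style message: a vertex name plus a tentative cost, ordered by cost.
    static class CostMessage implements Msg, Comparable<CostMessage> {
        final String vertex;
        final int cost;
        CostMessage(String vertex, int cost) { this.vertex = vertex; this.cost = cost; }
        @Override
        public int compareTo(CostMessage other) { return Integer.compare(cost, other.cost); }
    }

    // Sends three messages out of order and returns the vertices in the
    // order they are received.
    static List<String> demo() {
        // A priority queue as inbox yields messages in ascending cost order.
        RawPeer raw = new RawPeer(new PriorityQueue<Msg>());
        TypedPeer<CostMessage, CostMessage> peer = new TypedPeer<>(raw);
        peer.send("peer0", new CostMessage("b", 7));
        peer.send("peer0", new CostMessage("a", 2));
        peer.send("peer0", new CostMessage("c", 5));

        List<String> received = new ArrayList<>();
        CostMessage m;
        while ((m = peer.getCurrentMessage()) != null) { // no cast at the call site
            received.add(m.vertex);
        }
        return received;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [a, c, b]
    }
}
```

The sketch keeps the untyped peer untouched and adds the generics as a thin layer on top, which matches the "casting the underlying Writable internally" approach described above. Sorting is chosen by whoever constructs the inbox, so users who do not need ordered messages would not pay for the comparison overhead.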
