Sounds good to me. The buffer per partition must be small.

On Fri, Nov 11, 2011 at 4:57 PM, Karthik Kambatla <[email protected]> wrote:
> Okay.
>
> I'll implement the following simple fix -
>
> 1. If connection not open, try connecting.
> 2. If can't connect, queue to the buffer. If buffer full, return false
> 3. Have another thread which works on re-connections.
> 4. If connection open, check if the buffer has any messages queued, send
>    them and then send the current one.
>
> Let me know if you think it is okay.
>
> Thanks
> Karthik
>
> On Fri, Nov 11, 2011 at 6:18 PM, kishore g <[email protected]> wrote:
>
> > It boils down to the simple question if one node(partition) is down do we
> > want the system to halt or continue sending data to other partition. Other
> > option is to provide zeromq implementation and it takes care of buffering
> > on the sender side per destination.
> >
> > We don't have to complicate the api, the api remains the same. This logic
> > is much simpler "if something fails while sending put in the retry buffer
> > and retry when we are able to connect again". Also this is only for the
> > netty implementation.
> >
> > thanks,
> > Kishore G
> >
> > On Fri, Nov 11, 2011 at 2:40 PM, Leo Neumeyer <[email protected]> wrote:
> >
> > > Not sure I understand how this would help. We already have a queue in
> > > the stream for events. Once the framework gives up trying, blocking the
> > > queue seems like a natural solution, the queue will fill up and block.
> > > The app will remain blocked until the underlying problem is resolved.
> > > We don't want to complicate the API unless, of course, there is a
> > > compelling reason. Let's wait until someone can make the case that this
> > > is required. More API-level features, means more complexity, more
> > > states, more potential bugs, more documentation, etc. What do you think?
> > >
> > > On Fri, Nov 11, 2011 at 2:04 PM, Karthik Kambatla <[email protected]> wrote:
> > >
> > > > Kishore,
> > > >
> > > > One buffer per destination node seems fair enough.
> > > > If we are maintaining buffers, how about just queue messages into
> > > > this buffer on call to send(), we can have another thread sending
> > > > messages on these buffers across.
> > > >
> > > > Leo,
> > > >
> > > > It is true that the application can't do much, however the
> > > > application can choose to forget the old messages and see if the new
> > > > messages can be sent across. The buffers that Kishore suggested can
> > > > be circular, indicating to the application a full buffer from time
> > > > to time.
> > > >
> > > > Thanks
> > > > Karthik
> > > >
> > > > On Fri, Nov 11, 2011 at 4:29 PM, Leo Neumeyer <[email protected]> wrote:
> > > >
> > > > > Seems to me that the application shouldn't handle the failure
> > > > > because it is pretty much a fatal error. The framework operator
> > > > > needs to do something while the apps are blocked, the failure
> > > > > would affect all loaded applications. Once the framework gives up,
> > > > > there is no more recovery, stream queues will fill up and block,
> > > > > and the whole system will stop until the operator resolves the
> > > > > problem. Is there any other possible scenario? What can an app do
> > > > > once it is informed other than log more errors?
> > > > >
> > > > > -leo
> > > > >
> > > > > On Fri, Nov 11, 2011 at 11:52 AM, Karthik Kambatla <[email protected]> wrote:
> > > > >
> > > > > > Yes, we have to retry connecting to the node but only a bounded
> > > > > > number of times, after which we give up. If someone decides to
> > > > > > use TCP, it is for guaranteed message delivery in which case the
> > > > > > application needs to be informed.
> > > > > >
> > > > > > What we need to decide is whether to let the application handle
> > > > > > an exception or a return value?
> > > > > >
> > > > > > Karthik
> > > > > >
> > > > > > On Fri, Nov 11, 2011 at 2:41 PM, Leo Neumeyer <[email protected]> wrote:
> > > > > >
> > > > > > > Presumably the scenario is that we will retry so I guess we
> > > > > > > should log each connection error when the exception is thrown.
> > > > > > >
> > > > > > > What can we do if we were to return false? We can always do
> > > > > > > this later once we decide how to handle the error, no?
> > > > > > >
> > > > > > > -leo
> > > > > > >
> > > > > > > On Fri, Nov 11, 2011 at 11:25 AM, Karthik Kambatla <[email protected]> wrote:
> > > > > > >
> > > > > > > > How should we go about handling connection failures (not
> > > > > > > > being able to connect to) for a send().
> > > > > > > >
> > > > > > > > We can either (1) throw java.net.ConnectException or (2)
> > > > > > > > return false. In the second case, we need to modify the
> > > > > > > > Emitter.send to return boolean. Comments?
> > > > > > > >
> > > > > > > > Thanks
> > > > > > > > Karthik
> > > > > > >
> > > > > > > --
> > > > > > > Leo Neumeyer (@leoneu)
> > > > >
> > > > > --
> > > > > Leo Neumeyer (@leoneu)
> > >
> > > --
> > > Leo Neumeyer (@leoneu)
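
For reference, the four-step fix Karthik outlines in the thread (try to connect, buffer on failure and return false when the buffer is full, reconnect from another thread, drain the buffer before sending the current message) could look roughly like the Java sketch below. This is only an illustration under stated assumptions: Connection, BufferedSender, retryPending(), pending() and the capacity parameter are hypothetical names invented here, not taken from the project or from netty, and send() returns a boolean as in option (2) of the original question.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical transport abstraction; stands in for the real netty channel. */
interface Connection {
    boolean isOpen();
    boolean tryConnect();        // assumed to return false if the node is unreachable
    void write(byte[] message);  // assumed to be called only on an open connection
}

/** Sketch of a small, bounded retry buffer for one destination node. */
class BufferedSender {
    private final Connection connection;
    private final Queue<byte[]> buffer = new ArrayDeque<>();
    private final int capacity;

    BufferedSender(Connection connection, int capacity) {
        this.connection = connection;
        this.capacity = capacity;
    }

    /** Returns false only when the message could be neither sent nor buffered. */
    synchronized boolean send(byte[] message) {
        // Step 1: if the connection is not open, try connecting.
        if (!connection.isOpen() && !connection.tryConnect()) {
            // Step 2: cannot connect, so queue; a full buffer is reported as false.
            if (buffer.size() >= capacity) {
                return false;
            }
            buffer.add(message);
            return true;
        }
        // Step 4: connection is open; send queued messages first to keep order.
        flush();
        connection.write(message);
        return true;
    }

    /** Step 3: meant to be called periodically from a separate reconnection thread. */
    synchronized boolean retryPending() {
        if (!connection.isOpen() && !connection.tryConnect()) {
            return false;
        }
        flush();
        return true;
    }

    /** Number of messages currently waiting in the buffer. */
    synchronized int pending() {
        return buffer.size();
    }

    private void flush() {
        byte[] queued;
        while ((queued = buffer.poll()) != null) {
            connection.write(queued);
        }
    }
}
```

In this sketch a caller would create one BufferedSender per destination with a small capacity, in line with the remark that the buffer per partition must be small, and treat a false return from send() as the signal that the node is down and the buffer is full. A circular buffer that overwrites the oldest message, as also suggested in the thread, would be a different policy from the bounded queue shown here.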
