2009/9/21 Rafael Schloming <rafa...@redhat.com>:
> Martin Ritchie wrote:
>>
>> Hi Rafi,
>>
>> I saw the syncDispatchQueue method but I don't think waiting for the
>> _queue to clear is not the right solution, IMO, for rollback, even on
>> 0-10. When rollback is called you don't want the dispatcher to process
>> any more messages. Your client may have MessageListeners setup that
>> will take a long time to process, so the Dispatcher should stop
>> processing messages and perform the Rollback.
>>
>> I've attached a new test for RollbackOrderTest that blocks because the
>> syncDispatchQueue waits for the Dispatcher to empty the _queue.
>> However, when called via the MessageListener.onMessage(), you end up
>> blocking the Dispatcher.
>
> That's a good point, however won't this also be an issue in the design
> you're proposing? Regardless of which thread actually does the rollback
> processing, rollback needs to block until that processing is complete, and
> if you're waiting for the "ServiceRequest" to be processed from the thread
> that is supposed to process it then you have essentially the same deadlock
> you've attached below.
>
>> I think extracting the Dispatcher will make it clearer to show that
>> the message processing varies in each protocol and will allow the
>> Session classes to focus on the creation and control of
>> Consumer/Producers. This will allow Dispatchers for each protocol to
>> be cleaner and highlight the protocol differences at failover; A 0-8
>> Dispatcher that simply drops all pending messages compared to the 0-10
>> Dispatcher that attempts to process all the messages it has received.
>
> I don't actually think any of this needs to be (or should be) protocol
> specific. AFAICT there's no relevant difference in the protocol semantics
> here, these issues are common to any protocol that does prefetch. For
> example, the choice of whether to drop or process pending messages in a
> given scenario could be made either way for any protocol version.
>
>> I think it is a significant change but I think it is worth it as it
>> will improve the readability of the Client code as well as improving
>> the testability. Currently AMQSession is not exactly easy to unit test
>> so splitting it in to more discrete components and unit testing them
>> will be beneficial.
>>
>> The change boils down to:
>> - Extract Dispatcher Logic to Dispatcher_0_8 , Dispatcher_0_10
>> - Update AMQSession to use new Dispatcher
>> - Update Dispatcher to be able to stop processing the _queue of
>> messages and perform rollback.
>
> I agree the client is badly in need of some improvements in maintainability
> and readability, however in this particular case I don't think moving the
> rollback processing from one thread to another actually improves the
> situation significantly. Fundamentally the rollback logic needs to be
> performed from at least two different threads, the dispatcher thread and the
> application thread. I suspect in order do this properly we really need to
> stop thinking in terms of code being associated with a given thread, and
> think instead about what locks we have, what data structures those locks
> protect, and which locks need to be held in order to execute a given piece
> of code.
>
> In my experience the number one issue with the client is deadlocks and race
> conditions stemming from the large number of haphazardly defined
> locks/conditions that are littered throughout the client code. I think
> before we can safely and productively move large chunks of code around we
> really need to have some sort plan for documenting and simplifying the
> locking situation. Really we need to be able to articulate exactly what
> locks the client has, what data structure(s) each lock protects, and what
> order should be used to acquire multiple locks when necessary.
>
> --Rafael
>
> ---------------------------------------------------------------------
> Apache Qpid - AMQP Messaging Implementation
> Project:      http://qpid.apache.org
> Use/Interact: mailto:dev-subscr...@qpid.apache.org

I had a brief call with Rob/Rafi to discuss the changes I had proposed
to the client and the concerns. To summarise the concerns, which I
think we all feel, the Java client is quite fragile and the number of
locks means that it is difficult to reason that changes will not
introduce a new race condition or subtly change an existing one.

Rafi's recent work on the Python client leads him to believe that we
can do message delivery in a much simpler way which will allow us to
reduce the number of locks we need in the client.

So with that in mind, spending the effort to refactor the current
client to make it more maintainable may not be justified. As a result
we discussed an alternative approach documented here:
http://cwiki.apache.org/confluence/display/qpid/0.6+Java+Client+Dispatcher+Changes+-+Alternative

This approach will increase code reused between existing protocol
versions. Reducing the instances were a protocol version does a
particular feature in its own way, when there is no need to have a
protocol specific version, will help reduce the complexity of the
code. It should also make it much clearer what is protocol specific so
when we come to add further protocol versions we have a solid
codebase.

If anyone has any comments on what I am proposing to do then we can
discuss it either on the Wiki or here. Either way I shall capture the
comments at the end of the wiki so we do not lose any contributions.

Cheers

Martin
-- 
Martin Ritchie

---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:dev-subscr...@qpid.apache.org

Reply via email to