OK, we'll try this out then.
One question about the regression: would it occur if the two nodes are in
different Solaris containers (each with its own IP) but on the same
physical host?

Thanks a lot!
Wong


On Wed, Aug 26, 2009 at 10:39 AM, Filip Hanik - Dev Lists <
devli...@hanik.com> wrote:

> Hi Wong, yes, that one does implement a higher level of thread safety, and
> most likely would resolve your problem.
> With 6.0.20, there is a regression where Tomcat nodes on the same host won't
> discover each other:
> https://issues.apache.org/bugzilla/show_bug.cgi?id=47308
>
> Filip
>
>
> On 08/25/2009 07:22 PM, CS Wong wrote:
>
>> A brief look through
>> "svn log http://svn.apache.org/repos/asf/tomcat/trunk/java/org/apache/catalina/ha/session/DeltaRequest.java"
>> turns up this:
>> ------------------------------------------------------------------------
>> r618823 | fhanik | 2008-02-06 07:29:56 +0800 (Wed, 06 Feb 2008) | 3 lines
>>
>> Remove synchronization on the DeltaRequest object, and let the object that
>> manages the delta request (session/manager) to handle the locking properly,
>> using the session lock
>> There is a case with a non sticky load balancer where using synchronized and
>> a lock (essentially two locks) can end up in a dead lock
>> ------------------------------------------------------------------------
>>
>> This is the only one where the commit comments seem to indicate anything
>> related to my issue. Given that 6.0.14 was released on 14 Aug 2007 (
>> http://www.mail-archive.com/annou...@apache.org/msg00386.html), it may be
>> applicable.
>>
>> Would just like to know your opinion, is it likely that this is the issue
>> I'm facing? Thanks!
>>
>> Wong
>>
>>
>> On Wed, Aug 26, 2009 at 8:48 AM, CS Wong<lilw...@gmail.com>  wrote:
>>
>>
>>
>>> Thanks, Filip.
>>> I'm running 6.0.14 right now. Would you have any idea whether any changes
>>> in the code since then would have fixed something like this? I can try to
>>> push for an upgrade to 6.0.20, but the app owners would probably want to
>>> know whether it would be fixed for sure, since they have to go through a
>>> rather troublesome round of testing that takes quite a bit of time. It
>>> would help if they knew that the problem won't recur once this is done.
>>>
>>> Thanks,
>>> Wong
>>>
>>>
>>> On Tue, Aug 25, 2009 at 11:35 PM, Filip Hanik - Dev Lists<
>>> devli...@hanik.com>  wrote:
>>>
>>>
>>>
>>>> I've taken a look at the code.
>>>> The fix for this is easy, but it doesn't explain why it happens. This is
>>>> a concurrency issue, but if you're not running the latest Tomcat version,
>>>> then it could already have been fixed.
>>>>
>>>> best
>>>> Filip
>>>>
>>>>
>>>> On 08/25/2009 01:55 AM, CS Wong wrote:
>>>>
>>>>
>>>>
>>>>> Hi Michael,
>>>>> The logs are the bit that went haywire. The applications still work at
>>>>> this point, but there's often not enough time to troubleshoot much else.
>>>>> The logs can grow by 5-6 GB in about an hour, so we often just kill the
>>>>> service in a panic (the normal shutdown.sh no longer responds at this
>>>>> point, so we have to kill -9 it) and delete the logs before the entire
>>>>> server goes kaboom. This time, I managed to tail some of the logs, and
>>>>> I've pasted an extract below (the same pattern of errors repeats):
>>>>>
>>>>> Aug 25, 2009 11:44:02 AM org.apache.catalina.ha.session.DeltaRequest reset
>>>>> SEVERE: Unable to remove element
>>>>> java.util.NoSuchElementException
>>>>>        at java.util.LinkedList.remove(LinkedList.java:788)
>>>>>        at java.util.LinkedList.removeFirst(LinkedList.java:134)
>>>>>        at org.apache.catalina.ha.session.DeltaRequest.reset(DeltaRequest.java:201)
>>>>>        at org.apache.catalina.ha.session.DeltaRequest.execute(DeltaRequest.java:195)
>>>>>        at org.apache.catalina.ha.session.DeltaManager.handleSESSION_DELTA(DeltaManager.java:1364)
>>>>>        at org.apache.catalina.ha.session.DeltaManager.messageReceived(DeltaManager.java:1320)
>>>>>        at org.apache.catalina.ha.session.DeltaManager.messageDataReceived(DeltaManager.java:1083)
>>>>>        at org.apache.catalina.ha.session.ClusterSessionListener.messageReceived(ClusterSessionListener.java:87)
>>>>>        at org.apache.catalina.ha.tcp.SimpleTcpCluster.messageReceived(SimpleTcpCluster.java:916)
>>>>>        at org.apache.catalina.ha.tcp.SimpleTcpCluster.messageReceived(SimpleTcpCluster.java:897)
>>>>>        at org.apache.catalina.tribes.group.GroupChannel.messageReceived(GroupChannel.java:264)
>>>>>        at org.apache.catalina.tribes.group.ChannelInterceptorBase.messageReceived(ChannelInterceptorBase.java:79)
>>>>>        at org.apache.catalina.tribes.group.interceptors.TcpFailureDetector.messageReceived(TcpFailureDetector.java:110)
>>>>>        at org.apache.catalina.tribes.group.ChannelInterceptorBase.messageReceived(ChannelInterceptorBase.java:79)
>>>>>        at org.apache.catalina.tribes.group.ChannelInterceptorBase.messageReceived(ChannelInterceptorBase.java:79)
>>>>>        at org.apache.catalina.tribes.group.ChannelInterceptorBase.messageReceived(ChannelInterceptorBase.java:79)
>>>>>        at org.apache.catalina.tribes.group.ChannelCoordinator.messageReceived(ChannelCoordinator.java:241)
>>>>>        at org.apache.catalina.tribes.transport.ReceiverBase.messageDataReceived(ReceiverBase.java:225)
>>>>>        at org.apache.catalina.tribes.transport.nio.NioReplicationTask.drainChannel(NioReplicationTask.java:188)
>>>>>        at org.apache.catalina.tribes.transport.nio.NioReplicationTask.run(NioReplicationTask.java:91)
>>>>>        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>>>>>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>>>>>        at java.lang.Thread.run(Thread.java:619)
>>>>>
>>>>> Wong
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Aug 25, 2009 at 3:36 PM, Michael Ludwig<m...@as-guides.com> wrote:
>>>>>
>>>>>> CS Wong wrote:
>>>>>>
>>>>>>> Periodically, I'm getting problems with my Tomcat 6 cluster (2 nodes).
>>>>>>> One of the nodes would just go haywire
>>>>>>
>>>>>> Could you elaborate on what "going haywire" means?
>>>>>>
>>>>>> Below, you write:
>>>>>>
>>>>>>> [The NoSuchElementException is] the only thing that it shows. The
>>>>>>> other node in the cluster is still active at this time. There's
>>>>>>> nothing to do but to restart. The large amount of logs has caused
>>>>>>> disk space issues more than a couple of times too.
>>>>>>
>>>>>> So is that server not active any more? Unresponsive? Hyperactive writing
>>>>>> to the log file? Looping?
>>>>>>
>>>>>>> and generate a ton of logs repeating the following:
>>>>>>>
>>>>>>> Aug 25, 2009 11:44:10 AM org.apache.catalina.ha.session.DeltaRequest reset
>>>>>>> SEVERE: Unable to remove element
>>>>>>> java.util.NoSuchElementException
>>>>>>>        at java.util.LinkedList.remove(LinkedList.java:788)
>>>>>>>        at java.util.LinkedList.removeFirst(LinkedList.java:134)
>>>>>>>        at org.apache.catalina.ha.session.DeltaRequest.reset(DeltaRequest.java:201)
>>>>>>>        at org.apache.catalina.ha.session.DeltaRequest.execute(DeltaRequest.java:195)
>>>>>>>        at org.apache.catalina.ha.session.DeltaManager.handleSESSION_DELTA(DeltaManager.java:1364)
>>>>>>>        at org.apache.catalina.ha.session.DeltaManager.messageReceived(DeltaManager.java:1320)
>>>>>>>        at org.apache.catalina.ha.session.DeltaManager.messageDataReceived(DeltaManager.java:1083)
>>>>>>>        at org.apache.catalina.ha.session.ClusterSessionListener.messageReceived(ClusterSessionListener.java:87)
>>>>>>
>>>>>> I only found this, which seems to have led you here:
>>>>>>
>>>>>> http://stackoverflow.com/questions/1326336/
>>>>>>
>>>>>> Maybe it is helpful to others who know about Tomcat internals.
>>>>>>
>>>>>> --
>>>>>> Michael Ludwig
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
>>>>>> For additional commands, e-mail: users-h...@tomcat.apache.org
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>
>
>
>
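
For readers following the stack trace and the r618823 commit message quoted
in this thread, here is a minimal sketch of the general idea, not Tomcat's
actual code. java.util.LinkedList is documented as not thread-safe, and
removeFirst() throws NoSuchElementException on an empty list. The class
PendingChanges and its methods are hypothetical names chosen for
illustration; the sketch assumes a single lock owned by the managing object
(the "session lock" in the commit message) guards every access, and that
queued entries are never null.

```java
import java.util.LinkedList;
import java.util.NoSuchElementException;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for a per-session queue of pending changes; not Tomcat's DeltaRequest. */
class PendingChanges {
    private final LinkedList<String> actions = new LinkedList<>();
    // A single lock guards all access, so no second monitor can be held in a conflicting order.
    private final ReentrantLock sessionLock = new ReentrantLock();

    /** Queues one change while holding the lock. Assumes action is non-null. */
    void add(String action) {
        sessionLock.lock();
        try {
            actions.addLast(action);
        } finally {
            sessionLock.unlock();
        }
    }

    /** Drains the queue under the lock and returns how many entries were removed. */
    int reset() {
        sessionLock.lock();
        try {
            int removed = 0;
            // pollFirst() returns null on an empty list instead of throwing.
            while (actions.pollFirst() != null) {
                removed++;
            }
            return removed;
        } finally {
            sessionLock.unlock();
        }
    }

    /** Reproduces the exception type from the stack trace: removeFirst() on an empty list. */
    static boolean removeFirstThrowsOnEmpty() {
        try {
            new LinkedList<String>().removeFirst();
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }
}
```

Whether this matches the cause of the errors reported in the thread is not
established here; Filip only describes it as a concurrency issue that a
later version most likely resolves.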
