Problems with /balancer-manager

2008-05-15 Thread Ahmed Musa
Hi,

I have the following situation.

Apache is balancing requests to backend JBoss servers. Everything (the 
balancing of requests to the web container (Tomcat) of JBoss) works fine, except 
that I cannot get the balancer-manager working.

The GUI does appear, but after clicking on a worker link nothing happens.

Apache 2.2.3 on SUSE Linux Enterprise version 10

<Proxy balancer://portal>
Order deny,allow
Allow from all
BalancerMember ajp://lx-tpor01..xxx.xx:8009/portal route=jboss11
BalancerMember ajp://lx-tpor01..xxx.xx:18009/portal route=jboss12
and so on...
</Proxy>

ProxyPass /portal balancer://portal stickysession=JSESSIONID lbmethod=byrequests nofailover=Off

ProxyPass /balancer-manager !
<Location /balancer-manager>
SetHandler balancer-manager
Order Deny,Allow
Deny from all
Allow from xx
</Location>

I get the following GUI:

LoadBalancer Status for balancer://portal

StickySession  Timeout  FailoverAttempts  Method
JSESSIONID     0        7                 byrequests

Worker URL                             Route    RouteRedir  Factor  Status
ajp://lx-tpor01..xxx.xx:8009/portal    jboss11              1       Ok
ajp://lx-tpor01..xxx.xx:18009/portal   jboss12              1       Ok

and so on

But if I disable one JBoss instance, the status remains Ok, and if I click on 
a worker URL I do not get the option to edit its attributes; only the URL 
changes, with no change in the GUI.
Also, when I click on the balancer, nothing happens.

I appreciate any help. Thanks in advance,
ahmed


-
To start a new topic, e-mail: users@tomcat.apache.org
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: Problems with /balancer-manager

2008-05-15 Thread Ahmed Musa
Hello Rainer,
Thanks for your quick answer. I will talk to the colleague responsible about 
upgrading Apache; that could be a problem because it is part of the SUSE bundle.
I have also posted the question to the Apache mailing list; maybe I will get a 
tip from there.
Thanks for your answer, and best regards to Bonn as well
ciao ahmed
 Original Message 
 Date: Thu, 15 May 2008 12:25:05 +0200
 From: Rainer Jung [EMAIL PROTECTED]
 To: Tomcat Users List users@tomcat.apache.org
 Subject: Re: Problems with /balancer-manager

 Hallo Ahmed,
 
 I just tried it with httpd 2.2.8 and it works for me. Although there 
 seems to be no matching entry in the httpd changelog, 2.2.3 is a little early in 
 the 2.2.x release cycle, and the balancer and balancer manager were new 
 in 2.2.x. So if nothing else helps, upgrading to a more recent 2.2.x 
 (like 2.2.8, or 2.2.9, expected in a few weeks) would be worth trying. I 
 think I remember having used it with 2.2.6, but I didn't try it with 
 2.2.3 or earlier.
 
 Others?
 
 Apart from that: the httpd users list might be a better place to ask, 
 because this does not seem to be related to some difficult AJP13 stuff; 
 instead it seems to be a more general httpd mod_proxy_* issue.
 
 Regards, and greetings to Vienna
 
 Rainer
 



Re: JkRequestLogFormat Options

2008-02-29 Thread Ahmed Musa
Hello Fred,

Ah, you're right, the missing letter was the fault. I checked this 
directive so many times but didn't see it. 
Thanks a lot
best
ahmed
 Original Message 
 Date: Thu, 28 Feb 2008 12:23:25 -0800 (PST)
 From: fredk2 [EMAIL PROTECTED]
 To: users@tomcat.apache.org
 Subject: Re: JkRequestLogFormat Options

 
 Hi,
 
 btw, in your log format line you have %{JK_REQUEST_DURATON}n instead of
 %{JK_REQUEST_DURATION}n see the missing I.
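
A misspelled note name like this is easy to miss by eye, since httpd simply logs "-" for a note that does not exist. The following is a minimal sketch of how such a typo could be caught automatically. The list of known names contains only the notes mentioned in this thread (it is not a complete list of mod_jk notes), and the helper `check_notes` is a hypothetical name introduced for illustration.

```python
import difflib
import re

# Note names taken from the LogFormat quoted in this thread, with
# JK_REQUEST_DURATION in its corrected spelling. Not a complete list.
KNOWN_NOTES = [
    "JK_WORKER_NAME", "JK_REQUEST_DURATION", "JK_WORKER_ROUTE",
    "JK_LB_FIRST_NAME", "JK_LB_FIRST_BUSY", "JK_LB_FIRST_VALUE",
    "JK_LB_FIRST_ACCESSED", "JK_LB_FIRST_READ", "JK_LB_FIRST_TRANSFERRED",
    "JK_LB_FIRST_ERRORS", "JK_LB_FIRST_ACTIVATION", "JK_LB_FIRST_STATE",
    "JK_LB_LAST_NAME",
]


def check_notes(log_format):
    """Map each unknown %{...}n note name to its closest known name (or None)."""
    problems = {}
    for name in re.findall(r"%\{([^}]+)\}n", log_format):
        if name not in KNOWN_NOTES:
            close = difflib.get_close_matches(name, KNOWN_NOTES, n=1)
            problems[name] = close[0] if close else None
    return problems


fmt = "%{JK_WORKER_NAME}n %{JK_REQUEST_DURATON}n %{JK_WORKER_ROUTE}n"
print(check_notes(fmt))
```

Only fields of the form `%{...}n` are inspected, so header fields such as `%{Referer}i` are ignored by the check.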
 
 I am using 1.2.25 and I get times like 0.0275 when using Apache 2.2
 
 Rgds, Fred
 
 



Re: JkRequestLogFormat Options

2008-02-29 Thread Ahmed Musa
Hello Rainer,

Thanks for your input.
Of course I have to change my FIRST and LAST variants (I will use FIRST_NAME 
to check whether the worker has changed), but you're right, I am more 
interested in the LAST values.

Changed %T to %D; works fine, thanks.

We upgraded to 1.2.26 last week, but the values for ROUTE and DURATION are the 
same as before (1.2.25), and we haven't set JkRequestLogFormat 
explicitly. (Of course I had written DURATION without the I; now it's OK.)

thanks
best ahmed



 Original Message 
 Date: Thu, 28 Feb 2008 23:52:40 +0100
 From: Rainer Jung [EMAIL PROTECTED]
 To: Tomcat Users List users@tomcat.apache.org
 Subject: Re: JkRequestLogFormat Options

 In addition to Fred's remark:
 
 Usually you want the LAST variant instead of the FIRST variant. The 
 two are the same if a load balancer only tries one worker, but in case 
 of an error and failover, FIRST will be the first worker tried (so the 
 failed one) and LAST the last one, so usually the successful one (unless 
 all workers fail).
 
 %T: response time in seconds, and I think it always gets rounded down. 
 So usually not very useful
 
 Instead you could use the httpd standard %D, which is response time in 
 microseconds.
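
To make the difference concrete, here is a small sketch of the arithmetic, assuming (as suggested above) that %T is the duration in whole seconds rounded down and %D is the same duration in microseconds. The function names are made up for this illustration and are not part of httpd.

```python
def percent_t(duration_us):
    """Whole seconds, rounded down, as %T is described above."""
    return duration_us // 1_000_000


def seconds_from_percent_d(duration_us):
    """Convert a %D value (microseconds) to fractional seconds."""
    return duration_us / 1_000_000


# A fast request, a request just under one second, and a slow request.
for us in (27_500, 950_000, 2_300_000):
    print(us, percent_t(us), seconds_from_percent_d(us))
```

Any request that takes less than one second shows up as 0 under this rounding, which would explain why %T is 0 for most lines of the log.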
 
 Last remark: until JK 1.2.25 the variables JK_WORKER_ROUTE and 
 JK_REQUEST_DURATION were only filled if some JkRequestLogFormat was 
 set. In your version 1.2.26 both of them should get set even without a 
 JkRequestLogFormat (but only if the request gets handled by mod_jk, so 
 not for static content that is returned by the web server without any 
 Tomcat interaction).
 
 Regards,
 
 Rainer
 



JkRequestLogFormat Options

2008-02-28 Thread Ahmed Musa
Hello,

I am logging the mod_jk output through the Apache access_log, as described in 
the reference at
http://tomcat.apache.org/connectors-doc/reference/apache.html

Because I want clarity about what exactly is going on in our system, I 
use the following LogFormat:

LogFormat "%h %l %u %t \"%r\" %s %b \"%{Referer}i\" \"%{User-Agent}i\" 
\"%{Cookie}i\" \"%{Set-Cookie}o\" %{pid}P %{tid}P%T 
%{JK_WORKER_NAME}n %{JK_REQUEST_DURATON}n %{JK_WORKER_ROUTE}n 
%{JK_LB_FIRST_NAME}n %{JK_LB_FIRST_BUSY}n %{JK_LB_FIRST_VALUE}n
%{JK_LB_FIRST_ACCESSED}n %{JK_LB_FIRST_READ}n %{JK_LB_FIRST_TRANSFERRED}n 
%{JK_LB_FIRST_ERRORS}n %{JK_LB_FIRST_ACTIVATION}n
%{JK_LB_FIRST_STATE}n %{JK_LB_LAST_NAME}n" mod_jk_log

...everything works fine except the options responsible for the request duration.

Mostly neither %T nor %{JK_REQUEST_DURATON}n has a value (%T is mostly 0 and 
the other parameter is -).
For some requests I found that %T has a value like 2 or 3 and 
JK_REQUEST_DURATION has -,
or %T is 0 and JK_REQUEST_DURATION has a value like 2 or 3.

First: why is there not a value for every request?
Second: I think both options measure the same value, so why are they not 
the same?
Third: why do they not show seconds.microseconds, as written in the 
reference, but only (I think) rounded seconds?

We use mod_jk 1.2.26

Thanks for help
Best 
ahmed



Questions to some mod_jk Options

2008-02-27 Thread Ahmed Musa
Hello,
I studied the mod_jk docs, and the following questions about mod_jk options are 
haunting me. I hope I have written them in an understandable form, and I would 
be glad to get hints and tips.

.) retries (for LB workers) 
- On Apache we use the prefork MPM. So how big is the connection pool?
According to the docs, a retry of an lb worker happens if the load balancer 
cannot get a free connection for a member worker from the pool.
Does it depend on the Apache prefork parameters MaxClients and 
MaxRequestsPerChild?
If so: we have MaxClients 500 and MaxRequestsPerChild 1. Does this 
mean the web server can send/handle 500 requests?
- Is this the size of our connection pool? I don't think so.
On the other side we have 36 Tomcat instances, each with 
maxThreads=300 on the AJP connector. This doesn't fit, does it?
(And 3 Apache servers as front end, all configured the same.)
In the worker MPM I think the number of threads must correspond to the max 
threads of the Tomcat, but how does it work in our prefork model?

.) Why does a load balancer retry to get a free connection for a member 
worker from the pool? Why doesn't it use another member worker?

.) reply_timeout: does it apply only between the request and the first response 
packet, or between each two response packets? Is a response packet an AJP packet 
with 8k default size?

.) What is the socket_timeout good for?
We configured a connect_timeout, a prepost_timeout and a reply_timeout, and I 
can't find a situation where I need an additional socket_timeout.
And when I want to know what happens in my system, I think I need a 
higher-level failure message to evaluate the situation, not one at socket level?

.) This question concerns the mod_jk option retries (for normal workers) in 
association with recovery_options. (Hint: it would be better to find another 
name; the same name for two different things causes problems when writing about 
them.)
=> When I use the value 7 for recovery_options (bits 1+2+4), I think a 
retry is only possible if the connect timeout applies,
- not on the prepost_timeout and not in the case of a reply_timeout. Is 
this right?

Another question on the same topic: I have a long-running sticky session, which 
means that this session contains many requests against the same Tomcat.
Will a new connection be established for each request, or will the 
established connection be used for all requests?

If the latter (the established connection is used for all requests of 
the session), then a retry will not happen if Tomcat causes problems during 
the session (with recovery_options 7). Is this 
right?

Version mod_jk 1.2.26 (upgraded recently)  

Here is my worker.properties

worker.properties

worker.list=ajp_bam,ajp_ggi,ajp_ad,ajp_svp,...,jkstatus

worker.template.type=ajp13
worker.template.lbfactor=5
worker.template.socket_keepalive=1
worker.template.connect_timeout=7000
worker.template.prepost_timeout=5000
worker.template.reply_timeout=18
worker.template.retries=20
worker.template.activation=Active
worker.template.recovery_options=7

worker.lbtemplate.type=lb
worker.lbtemplate.max_reply_timeouts=6
worker.lbtemplate.method=Session

# Production workers
# AS-INETP101 - 106 - 6/6 GGI
worker.INETP1011.host=AS-INETP101.AEAT.ALLIANZ.AT
worker.INETP1011.port=65001
worker.INETP1011.reference=worker.template
(many more of the same)
then
worker.ajp_ad.reference=worker.lbtemplate
worker.ajp_ad.balance_workers=INETP1032,INETP1062
(many more portals)
and finally jkstatus

The JkMount is very simple:
JkMount /* ajp_ad    (mostly the same for the other portals)

The portals are virtual hosts on the Apache.

Tomcat - server.xml
example
<Connector port="65001" maxThreads="300" protocol="AJP/1.3" />
 <Engine name="Catalina" jvmRoute="INETP5021"
defaultHost="default">
 ..
<Host name="slfinsol.com" appBase="webapps" unpackWARs="true"
 autoDeploy="false" deployOnStartup="false" xmlValidation="false"
 xmlNamespaceAware="false">
  <Alias>www.slfinsol.com</Alias>
  <Alias>web1.slfinsol.com</Alias>
  ...
  <Alias>testweb.slfinsol.com</Alias>
  .
  <Valve className="org.apache.catalina.valves.AccessLogValve"
   directory="logs" prefix="swl_access_log." suffix=".txt"
   pattern="common" resolveHosts="false" />
 <Valve className="at.allianz.tomcat.valve.RequestTimeValve"/>
 <Valve className="at.allianz.tomcat.valve.WebcollaborationWorkaroundValve"/>
   <Context path="" docBase="swl" />
   <Context path="/monitor5" docBase="monitor" />
   <Context path="/swl" docBase="swl" />
</Host>

Thanks for taking the time to read this and maybe give some tips.
With kind regards
ahmed musa



Re: RE: mod_jk Problems - - worker went to error state and dont recover

2008-02-21 Thread Ahmed Musa
Hello Luke,

Here is the information from tomcat.apache.org:

Unsubscription: Send a blank email to  [EMAIL PROTECTED]
Digest unsubscription:  Send a blank email to [EMAIL PROTECTED]

best ahmed

 Original Message 
 Date: Thu, 21 Feb 2008 09:27:31 -
 From: [EMAIL PROTECTED]
 To: users@tomcat.apache.org
 Subject: RE: mod_jk Problems - - worker went to error state and dont recover

 All
 
 Apologies, this is unrelated. How do I unsubscribe from this mailing
 list? I thought it would be useful and small, but it's overwhelming my
 inbox.
 
 Thanks in Advance.
 
 Luke Walshe
 BT Operate, HGIPCC Technical Specialist
 Telephone: +44 (0)1314483482, Email: [EMAIL PROTECTED] 
 

Re: mod_jk Problems - - worker went to error state and dont recover

2008-02-21 Thread Ahmed Musa
Hello Rainer,
Thanks for your information; the situation is getting clearer now.
I will read some docs again, following your links, and will run further tests, 
also with the improved logging.
Thanks a lot for your time
with best regards 
ahmed

 Original Message 
 Date: Wed, 20 Feb 2008 18:59:01 +0100
 From: Rainer Jung [EMAIL PROTECTED]
 To: Tomcat Users List users@tomcat.apache.org
 Subject: Re: mod_jk Problems - - worker went to error state and dont recover

 Ahmed Musa wrote:
  Hello,
  Wow, thank you very much, Rainer, for your very quick and informative
 answer.
  I will go to 1.2.26 and think about some smoother values for
 reply_timeout and max_reply_timeouts.
  I will search for the requests which cause the problems, because I
 still log the response time in the way you mentioned, but I am not sure that the
 user requests are responsible for the situation. 
 
 One note: for Apache httpd 2.x %D is microseconds (there is no format 
 for milliseconds), for Tomcat %D is milliseconds. As long as you are 
 searching for the root cause, it might make sense to have both access 
 logs active to check for duration differences.
 
  So one further question: does mod_jk itself check whether the backend is
 reachable, without user requests? 
 
 No. Everything only works on top of user requests.
 
  When there are connections to the backend, are they closed after the
 response or are they held open for further requests?
 
 In general they are held open. There are parameters for how long they are held 
 open without more requests before they get shut down, and also for how many 
 might be kept open even when no requests are coming in. Those are the 
 connection pool parameters, which you will find on
 
 http://tomcat.apache.org/connectors-doc/reference/workers.html
 
 Tomcat also has a connectionTimeout on the connector, which will shut 
 down a connection from the Tomcat side if it is idle for too long.
 
 If you don't want to reuse connections at all, there's also a setting (a 
 JkOption in Apache).
 
  Is it possible that the Checkpoint firewall in between is
 responsible for the connectivity problem?
 
 It can cut a connection that's idle for too long. Since you have 
 cping/cpong active via connect_timeout and prepost_timeout, you should 
 get a cping error message, if the connection was dropped by the firewall 
 during idle times and mod_jk tries to use it again. The reply timeout in 
 the error log indicates, that the backend isn't answering. Of course if 
 it takes *very* long to answer, it might be that the firewall dropped 
 the connection in between, but then the root cause would still be the 
 long response time of the backend.
 
  Another point is that the worker does not recover. Yes, you are right:
 in this situation I have many reply_timeouts, but these happen within a
 period of time, for example 30 minutes, and the worker is still dead even
 when there are no more reply_timeouts. It remains dead.
  It was necessary to restart it manually via jkstatus.
 
 I assume you are using stickiness, so when a session has started on a node, 
 it will stay there. So when a worker is in error for a long time, all 
 new sessions will start on other nodes. If the worker is ready for 
 recovery, it needs a request that doesn't carry a session in order to get probed 
 with this request.
 
 In jkstatus, the status of an error worker should switch to REC, when 
 mod_jk decides that it could send a non-sticky request there (to probe) 
 and to PRB, during the time this request is on the node, and finally 
 either to OK or back to ERR depending on the result of the request.
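
The state sequence described above can be written down as a small transition table. This is only a sketch of the description in this mail, not of the mod_jk implementation: the state names ERR, REC, PRB and OK come from the text, while the event names are hypothetical labels invented for the illustration.

```python
# States as named in the reply above (ERR, REC, PRB, OK). The event names
# are hypothetical labels for this sketch, not mod_jk identifiers.
TRANSITIONS = {
    ("ERR", "recovery_wait_over"): "REC",
    ("REC", "non_sticky_request"): "PRB",
    ("PRB", "request_succeeded"): "OK",
    ("PRB", "request_failed"): "ERR",
}


def next_state(state, event):
    """Return the follow-up state; unknown combinations leave it unchanged."""
    return TRANSITIONS.get((state, event), state)


def replay(events, state="ERR"):
    """Apply events in order and return every state visited."""
    visited = [state]
    for event in events:
        state = next_state(state, event)
        visited.append(state)
    return visited


# Only sticky traffic for other nodes arrives: the worker waits in REC.
print(replay(["recovery_wait_over", "sticky_request_elsewhere"]))
# A request without a session arrives and succeeds.
print(replay(["recovery_wait_over", "non_sticky_request", "request_succeeded"]))
```

The first replay shows the case discussed in this thread: with very few new (non-sticky) requests, a worker can sit in REC for a long time because nothing arrives that may be used as a probe.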
 
 You can log the number of errors (and accesses) that happened on the 
 node in the httpd access log. If you think that the node simply stays in 
 error for a long time, then the error count (and access count) should 
 stay constant. I would expect, that they do not.
 
 Have a look at how LogFormat in Apache httpd works, and then add some of 
 those documented in
 
 http://tomcat.apache.org/connectors-doc/reference/apache.html
 
 like:
 
 JK_LB_LAST_NAME
 JK_LB_LAST_ACCESSED
 JK_LB_LAST_ERRORS
 JK_LB_LAST_BUSY
 JK_LB_LAST_STATE
 
 using the syntax %{JK_LB_LAST_STATE}n etc.
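
Once those notes are in the access log, the check suggested above (does the error count of a node stay constant or not?) can be scripted. The sketch below assumes a hypothetical LogFormat in which the last five fields of every line are JK_LB_LAST_NAME, JK_LB_LAST_ACCESSED, JK_LB_LAST_ERRORS, JK_LB_LAST_BUSY and JK_LB_LAST_STATE, in that order; the sample lines and their values are invented for the illustration.

```python
from collections import defaultdict


def error_counts_by_worker(lines):
    """Collect the JK_LB_LAST_ERRORS values seen per worker, in log order."""
    seen = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) < 5:
            continue
        name, accessed, errors, busy, state = fields[-5:]
        if name == "-":  # request was not handled by mod_jk
            continue
        seen[name].append(int(errors))
    return dict(seen)


def still_failing(counts):
    """Workers whose error counter increased during the sample."""
    return sorted(w for w, c in counts.items() if c[-1] > c[0])


# Invented sample lines; only the last five fields matter for this sketch.
sample = [
    '10.0.0.1 - - "GET /a HTTP/1.1" 200 INETP1011 120 3 0 ERR',
    '10.0.0.2 - - "GET /b HTTP/1.1" 200 INETP1021 80 0 1 OK',
    '10.0.0.1 - - "GET /c HTTP/1.1" 504 INETP1011 121 4 0 ERR',
    '10.0.0.3 - - "GET /logo.png HTTP/1.1" 200 - - - - -',
]
counts = error_counts_by_worker(sample)
print(counts)
print(still_failing(counts))
```

If a worker's counters keep growing, it is being probed and failing again rather than simply staying in error untouched.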
 
  
  Another point is the learning. I read the docs (the info on the
 Apache website; I didn't find any others. Are there other ones?), and they
 do not go into depth. If you read the spec and watch the logs, it is, for me,
 very hard to match things up. mod_jk also has many ways to check
 whether there is a connection to the backend; I understand them,
 but checking the reality in an error situation is very hard. By matching I
 mean: which part of the communication sequence failed, why, and which
 error message it causes.
  But I will try, and study the mailing list as well.
 
 It's hard for us too (sometimes).
 
  Thank you for your time. Tomorrow we will have the new version and will
 see what happens.
  
  best
  ahmed
 
 
 Regards,
 
 Rainer

mod_jk Problems - worker went to error state and dont recover

2008-02-20 Thread Ahmed Musa
Hello to all,
After long unsuccessful research I hope someone can give me a hint about the
following problems.

Our Apache/mod_jk/Tomcat infrastructure ran without problems for
about one year; then, two months ago, mod_jk errors started to occur.
We upgraded the mod_jk version and made improvements in worker.properties.
The problems changed and became fewer, but they still appear sometimes.
 
It seems that the mod_jk workers lose the connection to their
Tomcat backend servers; there are messages in the mod_jk log files which
point in this direction.
Normally this does not seem to be a big problem, but under certain conditions
(which?) the worker goes into an error state and cannot recover by itself; this must
be done manually.

Problem 1: The Tomcats are reachable, so it is unknown why the workers think the
server is dead.
Problem 2: I have no idea why the worker goes into an error state and cannot
recover.
Problem 3: I am missing explanations of the logged messages. I read the messages
but cannot match them to the situation. When does a worker post these
messages?

[Wed Feb 20 10:04:01.889 2008] [19237:3086010048] [info] jk_handler::mod_jk.c (2270): Aborting connection for worker=ajp_ggi
[Wed Feb 20 10:04:39.799 2008] [19294:3086010048] [error] ajp_get_reply::jk_ajp_common.c (1623): (INETP1011) Timeout with waiting reply from tomcat. Tomcat is down, stopped or network problems (errno=110)
[Wed Feb 20 10:04:39.799 2008] [19294:3086010048] [error] ajp_service::jk_ajp_common.c (2034): (INETP1011) receiving reply from tomcat failed without recovery in send loop attempt=0
[Wed Feb 20 10:04:41.799 2008] [19294:3086010048] [error] service::jk_lb_worker.c (1105): unrecoverable error 504, request failed. Tomcat failed in the middle of request, we can't recover to another instance.
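
For sifting through many such lines, it can help to split them into fields first. The following is a minimal sketch based only on the layout of the lines quoted above; it assumes the second bracket holds process id and thread id, and the function name `parse_jk_line` is made up for this example.

```python
import re

# Layout inferred from the log lines quoted above:
# [timestamp] [pid:tid] [level] function::file (line): message
LINE_RE = re.compile(
    r"\[(?P<time>[^\]]+)\] "
    r"\[(?P<pid>\d+):(?P<tid>\d+)\] "
    r"\[(?P<level>\w+)\] "
    r"(?P<function>\w+)::(?P<file>[\w.]+) "
    r"\((?P<line>\d+)\): "
    r"(?P<message>.*)"
)


def parse_jk_line(text):
    """Return a dict of fields for one mod_jk log line, or None if it does not match."""
    match = LINE_RE.match(text)
    if match is None:
        return None
    fields = match.groupdict()
    fields["line"] = int(fields["line"])
    # Pull out the errno value when the message carries one.
    errno_match = re.search(r"errno=(\d+)", fields["message"])
    fields["errno"] = int(errno_match.group(1)) if errno_match else None
    return fields


entry = parse_jk_line(
    "[Wed Feb 20 10:04:39.799 2008] [19294:3086010048] [error] "
    "ajp_get_reply::jk_ajp_common.c (1623): (INETP1011) Timeout with waiting "
    "reply from tomcat. Tomcat is down, stopped or network problems (errno=110)"
)
print(entry["level"], entry["file"], entry["line"], entry["errno"])
```

The file name and line number point at the place in the mod_jk source that wrote the message. On Linux, errno 110 is ETIMEDOUT ("Connection timed out").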

- Which timeout? How does mod_jk conclude that Tomcat is down? Where can I find
details on errno=110?
- "receiving reply from tomcat failed without recovery in send loop
attempt=0": what does "without recovery in send loop" mean?
- "unrecoverable error 504": are there details on this error?

OK, I turned the logging level to debug. The course of events gets
clearer, but more questions appear as well. There are socket numbers: which
sockets, and what are these numbers? E.g.
"will be shutting down socket 35 for worker INETP1021". What are the sockets good
for? How many are there per worker? Can I configure them?

=> Generally: how can I solve such problems? I tried to look into the
mod_jk code, searching for error codes and error messages, but could not find
any relevant information.
- I am studying the log files but cannot find out what really happens.

So maybe someone has an idea why the worker thinks that the
corresponding Tomcat is dead, and why it will not recover by itself.

I am also looking for tips on how I can help myself, and where to
find something about the error codes and messages in mod_jk.

Thanks for your attention
Best
ahmed musa (writing from Vienna)
 
Current infrastructure
We have 3 Apache web servers (2.2.6) based on CentOS release 4.3 /
kernel version 2.6.9-34.
In front of the web servers there are two hardware load balancers (two locations),
but they play no role in this story.
The web servers are hosted at our ISP.
 
The web servers balance the requests via mod_jk (version 1.2.25) for approx.
10 webapps to 18 backend Tomcat servers (blade servers; because of underlying
application parts the OS is Windows 2003 Server, a long story not worth
explaining :-) ). The Tomcat servers get their data via requests against
DB2 servers/DB2 databases on the mainframe. The Tomcat servers are in-house
and are rebooted nightly because of automated deployment processes.

Between the web servers and the Tomcat servers is a Checkpoint firewall.
 
All webapps are deployed on all Tomcats; only mod_jk routes the requests
to certain Tomcat instances.
(On one blade server there are two identical Tomcat instances running.)
 
Versions: Tomcat 5.5.17_11, JDK 1.5.0_11-b03. The requests against the
public website(s) are normal short-lived requests, and not many.
Most webapps (portals) need a login and have a strong focus on business
logic, so the instances are big (many MBs in RAM), the sessions are sticky
and the session timeout is 20 minutes. But there are few requests here too. In
addition to the user requests, there are monitoring requests from our ISP.
The problems appear at servers/portals which have very few user requests.
worker.properties
worker.list=ajp_bam,ajp_ggi,ajp_ad,ajp_svp,...,jkstatus

worker.template.type=ajp13
worker.template.lbfactor=5
worker.template.socket_keepalive=1
worker.template.connect_timeout=7000
worker.template.prepost_timeout=5000
worker.template.reply_timeout=12
worker.template.retries=6
worker.template.activation=Active
worker.template.recovery_options=7

worker.lbtemplate.type=lb
worker.lbtemplate.max_reply_timeouts=6
worker.lbtemplate.method=Session

# Production workers
# AS-INETP101 - 106 - 6/6 GGI
worker.INETP1011.host=AS-INETP101.AEAT.ALLIANZ.AT
worker.INETP1011.port=65001
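
For reference, the three timeouts in the template can be converted to seconds with a small Python sketch. It assumes, as the mod_jk timeouts documentation describes, that connect_timeout, prepost_timeout and reply_timeout are given in milliseconds. The helper names are hypothetical, and only the keys shown above are read.

```python
# Assumption: these three mod_jk worker settings are expressed in milliseconds.
MILLISECOND_KEYS = ("connect_timeout", "prepost_timeout", "reply_timeout")

# Sample copied from the template section of the configuration above.
SAMPLE = """\
worker.template.connect_timeout=7000
worker.template.prepost_timeout=5000
worker.template.reply_timeout=12
"""


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def timeouts_in_seconds(props, worker="template"):
    """Return the known millisecond timeouts of one worker, in seconds."""
    result = {}
    for name in MILLISECOND_KEYS:
        value = props.get("worker.%s.%s" % (worker, name))
        if value is not None:
            result[name] = int(value) / 1000.0
    return result


if __name__ == "__main__":
    for name, seconds in timeouts_in_seconds(parse_properties(SAMPLE)).items():
        print("%s = %s s" % (name, seconds))
```

Read this way, reply_timeout=12 would be 0.012 seconds, provided the value in the posted file really is 12 and not a longer number that got cut off.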

Re: mod_jk Problems - - worker went to error state and dont recover

2008-02-20 Thread Ahmed Musa
 of max_reply_timeouts in 1.2.25 was 
 wrong, so you need to go to 1.2.26 to get it working right.
 
 See:
 
 http://issues.apache.org/bugzilla/show_bug.cgi?id=43229
 
 Caution: this does *not* explain why the backends are not automatically
 recovered after a minute of error condition. Maybe there are times when
 you get too many of those reply_timeouts (see log file), and although we
 recover after a minute the backend almost immediately goes back into
 error status.
 
  - Which timeout? How does mod_jk decide that Tomcat is down? Where can I
  find details on errno=110?
 
 reply_timeout, see above and also
 
 http://tomcat.apache.org/connectors-doc/generic_howto/timeouts.html
 
 errno: a standard unix feature. The numbers are platform dependent. I 
 would assume in your case
 
 ETIMEDOUT   110 /* Connection timed out */
 
 so no wonder, that's exactly what we expect (and doesn't tell us the 
 reason, i.e. what's wrong on the *backend* taking that long for a
 response).
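
To check what an errno number means on a particular machine, a minimal Python sketch can ask the platform itself. The helper name describe_errno is only for illustration. The mapping is platform dependent; on Linux, 110 is expected to resolve to ETIMEDOUT.

```python
import errno
import os


def describe_errno(code):
    """Return 'number NAME: message' for an errno value on this platform."""
    # errno.errorcode maps numbers to symbolic names known on this platform.
    name = errno.errorcode.get(code, "UNKNOWN")
    return "%d %s: %s" % (code, name, os.strerror(code))


if __name__ == "__main__":
    # 110 is the value from the mod_jk log; its meaning depends on the OS
    # where mod_jk runs, so this should be run on that web server.
    print(describe_errno(110))
```

Running it on the web server itself matters, because the same number can mean something else on another operating system.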
 
  - "receiving reply from tomcat failed with out recovery in send loop
  attempt=0": what does "with out recovery in send loop" mean?
 
 That your configuration doesn't allow us to send the request to another
 backend. recovery_options 7 includes: if mod_jk was able to send the
 request to a backend, do not try to send it to another backend in case
 of an error during the response handling. Even if you allowed
 sending to another backend, it would not help with *not* putting the
 worker into error state. More likely you would put all
 workers into error state, because all of them might run into the same
 timeout, one after the other.
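
As a sketch of how the recovery_options bit mask breaks down, the following Python snippet decodes a value into its parts. It assumes the bit meanings given in the mod_jk workers.properties reference; the descriptions are paraphrased and the function name is hypothetical.

```python
# Bit meanings paraphrased from the mod_jk workers.properties reference.
# Treat the wording as an approximation, not as official text.
RECOVERY_BITS = {
    1: "do not recover if Tomcat failed after the request was sent to it",
    2: "do not recover if Tomcat failed after headers were sent to the client",
    4: "close the connection to Tomcat on an error while writing to the client",
    8: "always recover requests with HTTP method HEAD",
    16: "always recover requests with HTTP method GET",
}


def decode_recovery_options(value):
    """Return the descriptions of all bits set in a recovery_options value."""
    return [text for bit, text in sorted(RECOVERY_BITS.items()) if value & bit]


if __name__ == "__main__":
    # 7 = 1 + 2 + 4, the value used in the posted worker template.
    for line in decode_recovery_options(7):
        print("-", line)
```

With 7, bits 1, 2 and 4 are set, which matches the behaviour described above: once the request has reached a backend, it is not retried on another one.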
 
  - "unrecoverable error 504": where are the details on this error?
 
 That's simply how we return the situation back to the client (browser).
 
  
  OK, I turned the logging level to debug. The course of events gets
  clearer, but more questions appear. There are socket numbers: which
  sockets are these, and what do the numbers mean? For example: will be
  shutting down socket 35 for worker INETP1021. What are the sockets good
  for? How many are there per worker? Can I configure them?
 
 Should not be the problem here. For apache httpd if you do *not* 
 configure anything, we automatically choose the number of httpd threads 
 as the maximum number of connections. No need to change anything here.
  
  = Generally: how can I solve such problems? I tried to look into the
  mod_jk code, searching for error codes and error messages, but could
  not find any relevant information. I am studying the log files, but
  cannot find out what really happens.
 
 Post to the list. Improve our docs.
 
 The error message contains the word timeout and reply and you have a 
 reply_timeout.
 
 Long-running requests are a frequent problem. If you want to get rid of
 them, start by adding response times to your httpd and your Tomcat
 access log format (%D). Then have a look at which URLs are producing
 long-running requests, at what time of day they happen, etc. This
 might give you a clue about the reasons.
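
Once %D is in the log format, the slow URLs can be pulled out with a small script. The following is a rough Python sketch, assuming %D was appended as the last field of a common-log-style line. httpd reports %D in microseconds, while the Tomcat 5.5 access log valve reports it in milliseconds, so the threshold would need adjusting there. The sample lines, the function name and the 10-second threshold are made up for illustration.

```python
import re
from collections import defaultdict

# Matches: "METHOD /url PROTOCOL" status bytes duration  (duration is last).
LINE_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<url>\S+)[^"]*" (?P<status>\d{3}) \S+ (?P<usec>\d+)$'
)

# Invented sample lines; real logs may use a different field layout.
SAMPLE_LINES = [
    '10.0.0.1 - - [20/Feb/2008:10:00:00 +0100] "GET /portal/login?x=1 HTTP/1.1" 200 512 15000000',
    '10.0.0.2 - - [20/Feb/2008:10:00:05 +0100] "GET /index.html HTTP/1.1" 200 1024 2300',
    '10.0.0.3 - - [20/Feb/2008:10:01:00 +0100] "POST /portal/login HTTP/1.1" 504 - 12000000',
]


def slow_requests(lines, threshold_usec):
    """Return (url, slow_count, max_usec) tuples, slowest URL first."""
    stats = defaultdict(lambda: [0, 0])  # url -> [slow count, max duration]
    for line in lines:
        match = LINE_RE.search(line.strip())
        if not match:
            continue  # skip lines that do not follow the assumed format
        usec = int(match.group("usec"))
        if usec >= threshold_usec:
            url = match.group("url").split("?", 1)[0]  # drop query string
            entry = stats[url]
            entry[0] += 1
            entry[1] = max(entry[1], usec)
    rows = [(url, count, worst) for url, (count, worst) in stats.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    for url, count, worst in slow_requests(SAMPLE_LINES, 10000000):
        print("%s: %d slow request(s), worst %.1f s" % (url, count, worst / 1e6))
```

Grouping by URL without the query string keeps the list short. Adding the timestamp to the grouping would show the time-of-day pattern as well.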
 
 And if they are very frequent: do Java Thread Dumps of your backends and 
 analyze them.
 
  So maybe someone has an idea why the worker thinks that the
  corresponding Tomcat is dead, and why it does not recover by itself.
 
 Tomcat is dead: from the point of view of mod_jk it simply means: we
 didn't get an answer when we expected one. Details depend on the
 additional log lines (could not connect, reply timeout etc.).
 
  I am also looking for tips on how I can help myself, and where to
  find something about the error codes and messages in mod_jk.
  
  thanks for your attention
  Best
  ahmed musa (writing from vienna)
 
 
 Regards,
 
 Rainer
 
  Current infrastructure
  We have 3 Apache web servers (2.2.6), based on CentOS release 4.3 /
  kernel version 2.6.9-34.
  In front of the web servers there are two hardware load balancers (at
  two locations), but they play no role in this story.
  The web servers are hosted at our ISP.

  The web servers balance the requests via mod_jk (version 1.2.25) for
  approx. 10 webapps to 18 backend Tomcat servers (blade servers; because
  of underlying application parts the OS is Windows 2003 Server, a long
  story not worth explaining :-) ). The Tomcat servers get their data via
  requests against DB2 servers/DB2 databases on the mainframe. The
  Tomcat servers are in-house and are rebooted nightly because of
  automated deployment processes.

  Between the web servers and the Tomcat servers is a Checkpoint firewall.
  All webapps are deployed on all Tomcats; only mod_jk routes the
  requests to certain Tomcat instances.
  (On one blade server there are two identical Tomcat instances
  running.)

  Versions: Tomcat 5.5.17_11, JDK 1.5.0_11-b03. The requests against
  the public website(s) are normal short-lived requests, and not many.
  Most webapps (portals) need a login and have a strong focus on business
  logic - so the instances are big (many MBs in RAM), the sessions