On 15/04/2012 23:53, Tom Eastep wrote:
>> I'm seeing a regression in stale lock handling.  If there is a stale
>> lock at boot it seems to deadlock forever (which is inconvenient...).
>> If I start it via the command line it seems to time out after some large
>> number of seconds and continue.  Old behaviour (4.5.1.1) was to somehow
>> immediately burst the lock if it was stale.
> Lock handling hasn't changed in years; so what you are seeing must be a
> side effect of something else.

Hmm, it's possible.  I'm also debugging another problem where ipset
takes many seconds to run if reverse DNS isn't available (e.g. with
iptables -P OUTPUT DROP).  The change on my side was that I tried to
lock down iptables at boot at about the same time I updated shorewall,
durr.  For example, this takes some tens of seconds in that state:
     ipset create cp1 bitmap:ip,mac range 192.168.111.0/24



> What are your settings for:
>
>       MUTEX_TIMEOUT
60


>       SUBSYSLOCK
/var/lock/subsys/shorewall

However, the message I get says something about "stale lock on 
/var/lib/shorewall/lock", so I think it's something different?


> What are the contents of your shorewallrc file (normally
> /usr/share/shorewall/shorewallrc)?

HOST=linux
PREFIX=/usr
SHAREDIR=/usr/share
LIBEXECDIR=${PREFIX}/share
PERLLIBDIR=${PREFIX}/share/shorewall
CONFDIR=/etc
SBINDIR=/sbin
MANDIR=/usr/share/man
INITDIR=etc/init.d
INITSOURCE=init.sh
INITFILE=$PRODUCT
AUXINITSOURCE=
AUXINITFILE=
SYSTEMD=
SYSCONFFILE=
SYSCONFDIR=
ANNOTATED=
VARDIR=/var/lib

Can you confirm this looks sensible?  (Gentoo based system, setting 
host=linux to build).

However, I'm sure you made a change for me a few versions back where
the lock file handling got smarter; I had assumed you checked for the
PID listed in the lock file?  What I'm seeing now (but perhaps it's the
same for 4.5.1.1) is that the lock timeout is quite long (presumably
the 60 seconds from MUTEX_TIMEOUT...).

I *think*, however, that I need to do some more testing.  I believe
that what I might be seeing is problems due to the ipset timeouts: these
have triggered some reboots to regain control, and that in turn may have
caused me to see some lock timeouts.  Let me just check that chain of
logic.  However, in any case, would it not be possible to check whether
a process with the PID shown in the lock file even exists, and break the
lock immediately if not?  This is a common algorithm (although I will
concede it can get racy in corner cases).
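
To make that concrete, here is a rough sketch of the kind of check I
mean.  It is not Shorewall's actual code: the function name
lock_is_stale is made up, and I am assuming the lock file simply holds
the owner's PID on its first line.

```shell
#!/bin/sh
# Hypothetical helper, NOT taken from Shorewall.
# Returns 0 (true) if the lock file names a PID that is no longer running.
# Assumption: the first line of the lock file is the owner's PID.
lock_is_stale() {
    lockfile=$1

    # No lock file at all: there is nothing to break.
    [ -f "$lockfile" ] || return 1

    pid=$(head -n 1 "$lockfile" 2>/dev/null)

    # An empty, zero, or non-numeric PID cannot identify a live owner.
    case $pid in
        ''|0|*[!0-9]*) return 0 ;;
    esac

    # kill -0 delivers no signal; it only tests whether the PID can be
    # signalled.  Run this as root (as a firewall script normally is),
    # otherwise a permission error looks the same as a dead process.
    if kill -0 "$pid" 2>/dev/null; then
        return 1    # owner is still alive, so the lock is valid
    fi
    return 0        # no such process, so the lock is stale
}

# Possible use before waiting on the mutex (path from my error message):
#   if lock_is_stale /var/lib/shorewall/lock; then
#       rm -f /var/lib/shorewall/lock
#   fi
```

The racy corner case is still there: the PID could be reused by an
unrelated process after a reboot, in which case this would wrongly treat
the lock as live and fall back to the normal timeout.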

Thanks for replying

Ed W

_______________________________________________
Shorewall-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/shorewall-users
