Re: [Shorewall-users] locking processes left behind
On Mon, 2019-02-25 at 07:14 -0500, Brian J. Murrell wrote:
>
> On the "lite" machine I have
> 5.2.0.4.

~sigh~ Which is one single bugfix release behind what I need.

Cheers,
b.

___
Shorewall-users mailing list
Shorewall-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/shorewall-users
Re: [Shorewall-users] locking processes left behind
On Wed, 2018-08-01 at 14:27 -0700, Tom Eastep wrote:
> So, getting back to this...
> Any results with this patch? I would like to include this fix in
> 5.2.0.5.

On the shorewall (i.e. not the "lite") machine, I have 5.2.0.5
installed that includes this patch. On the "lite" machine I have
5.2.0.4.

I am still seeing locking problems. For example:

 1084 ?  S  0:06 /usr/sbin/foolsm -c /etc/foolsm/foolsm.conf -p /var/run/foolsm.pid
 2332 ?  S  0:00  \_ /bin/sh /etc/foolsm/script up Cogeco 24.226.22.71 eth0.2 root 10 6 5 0 10 0 0 17690
 2362 ?  S  0:00      \_ /bin/sh /etc/shorewall-lite/state/firewall enable eth0.2
 2377 ?  S  0:00          \_ lock /etc/shorewall-lite/state/lock
 2928 ?  S  0:00 lock /etc/shorewall-lite/state/lock

And then once I kill the stale lock:

# kill 2928

the above blocked "firewall enable eth0.2" proceeds but then leaves
behind another stale lock:

root 8558 1 0 06:54 ? 00:00:00 lock /etc/shorewall-lite/state/lock

So something is still amiss here.

Cheers,
b.
Re: [Shorewall-users] locking processes left behind
On 07/31/2018 08:26 AM, Tom Eastep wrote:
>
> In addition to that issue, the preceding line was incorrect ('qt' was
> incorrect). Revised second patch attached.
>

Any results with this patch? I would like to include this fix in
5.2.0.5.

Thanks,
-Tom
-- 
Tom Eastep           \ Q: What do you get when you cross a mobster with
Shoreline,           \    an international standard?
Washington, USA      \ A: Someone who makes you an offer you can't
http://shorewall.org \    understand
                     \___
Re: [Shorewall-users] locking processes left behind
On 07/31/2018 05:19 AM, Brian J. Murrell wrote:
> On Tue, 2018-07-31 at 10:48 +0200, Matt Darfeuille wrote:
>>
>> The attached patch (MUTEX_ON_TAKE1.patch) includes:
>> - MUTEX_ON.patch
>> - MUTEX_ON1.patch
>> - The above correction (changing 'openwrt' to 'lockbin')
>> - My take on fixing the above error ( "/sbin/shorewall-lite: line 14:
>> -n: not found")
>>
>> Brian/Tom, thoughts?
>
> Yeah. The use of "test" in that patch was a new one on me so I had
> just assumed the use was correct. But even in a bash shell here, that
> syntax doesn't work:
>
> $ foo=[ -n "$foobar" ]
> bash: -n: command not found
>
> Nor do any more fully qualified uses of it (to eliminate shell
> interpretation as being the cause of the problem):
>
> $ foo=\[ -n "$foobar" ]
> bash: -n: command not found
> $ foo=/bin/[ -n "$foobar" ]
> bash: -n: command not found
> $ foo=/bin/test -n "$foobar"
> bash: -n: command not found
>
> What does work (doesn't produce a syntax error) is:
>
> $ foo=$([ -n "$foobar" ])
> $ echo $foo
> $ foobar=foobar
> $ foo=$([ -n "$foobar" ])
> $ echo $foo
>
> $
>
> but I can't see how that helps us since $foo is the same when the test
> passes and fails. The best approximation of what I think Tom was
> trying to achieve is:
>
> $ unset foobar
> $ [ -n "$foobar" ]
> $ echo $?
> 1
> $ foobar=foobar
> $ [ -n "$foobar" ]
> $ echo $?
> 0
>
> But that still doesn't give us a "boolean" type value in $openwrt that
> we can use in if statements. So, I think what we want is:
>
> if [ -n "$lockbin" -a -h "$lockbin" ]; then
>     openwrt=true
> else
>     openwrt=false
> fi
>

In addition to that issue, the preceding line was incorrect ('qt' was
incorrect). Revised second patch attached.

-Tom

diff --git a/Shorewall-core/lib.common b/Shorewall-core/lib.common
index 205fc705f..7df2879c7 100644
--- a/Shorewall-core/lib.common
+++ b/Shorewall-core/lib.common
@@ -751,6 +751,8 @@ mutex_on()
     lockf=${LOCKFILE:=${VARDIR}/lock}
     local lockpid
     local lockd
+    local lockbin
+    local openwrt
 
     MUTEX_TIMEOUT=${MUTEX_TIMEOUT:-60}
 
@@ -760,28 +762,33 @@ mutex_on()
 
     [ -d "$lockd" ] || mkdir -p "$lockd"
 
+    lockbin=$(mywhich lock)
+    [ -n "$lockbin" -a -h "$lockbin" ] && openwrt=Yes
+
     if [ -f $lockf ]; then
         lockpid=`cat ${lockf} 2> /dev/null`
         if [ -z "$lockpid" ] || [ $lockpid = 0 ]; then
             rm -f ${lockf}
             error_message "WARNING: Stale lockfile ${lockf} removed"
-        elif [ $lockpid -eq $$ ]; then
-            fatal_error "Mutex_on confusion"
-        elif ! qt ps --pid ${lockpid}; then
-            rm -f ${lockf}
-            error_message "WARNING: Stale lockfile ${lockf} from pid ${lockpid} removed"
+        elif [ -z "$openwrt" ]; then
+            if [ $lockpid -eq $$ ]; then
+                fatal_error "Mutex_on confusion"
+            elif ! qt ps --pid ${lockpid}; then
+                rm -f ${lockf}
+                error_message "WARNING: Stale lockfile ${lockf} from pid ${lockpid} removed"
+            fi
         fi
     fi
 
-    if qt mywhich lockfile; then
+    if [ -n "$openwrt" ]; then
+        lock ${lockf} || fatal_error "Can't lock ${lockf}"
+        g_havemutex="lock -u ${lockf}"
+    elif qt mywhich lockfile; then
         lockfile -${MUTEX_TIMEOUT} -r1 ${lockf} || fatal_error "Can't lock ${lockf}"
         g_havemutex="rm -f ${lockf}"
         chmod u+w ${lockf}
         echo $$ > ${lockf}
         chmod u-w ${lockf}
-    elif qt mywhich lock; then
-        lock ${lockf} || fatal_error "Can't lock ${lockf}"
-        g_havemutex="lock -u ${lockf}"
     else
         while [ -f ${lockf} -a ${try} -lt ${MUTEX_TIMEOUT} ] ; do
             sleep 1
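For anyone following the "'qt' was incorrect" remark, here is a minimal sketch of why wrapping the lookup in qt inside a command substitution leaves the variable empty. It assumes qt is a helper that runs its arguments with stdout and stderr discarded; the definition below is written for illustration and is not copied from lib.common. find_tool is a hypothetical stand-in for mywhich.

```shell
# Assumed definition: run a command quietly, keeping only its exit status.
qt() {
    "$@" >/dev/null 2>&1
}

# Hypothetical stand-in for mywhich: print the path of a command, if any.
find_tool() {
    command -v "$1"
}

quiet_result=$(qt find_tool sh)   # output was discarded, nothing to capture
plain_result=$(find_tool sh)      # the path of sh is captured

echo "with qt:    '${quiet_result}'"
echo "without qt: '${plain_result}'"
```

Under that assumption, a lookup of the form $(qt ...) would always yield an empty string, so any later test on the captured path could never succeed.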
Re: [Shorewall-users] locking processes left behind
On Tue, 2018-07-31 at 10:48 +0200, Matt Darfeuille wrote:
>
> The attached patch (MUTEX_ON_TAKE1.patch) includes:
> - MUTEX_ON.patch
> - MUTEX_ON1.patch
> - The above correction (changing 'openwrt' to 'lockbin')
> - My take on fixing the above error ( "/sbin/shorewall-lite: line 14:
> -n: not found")
>
> Brian/Tom, thoughts?

Yeah. The use of "test" in that patch was a new one on me so I had
just assumed the use was correct. But even in a bash shell here, that
syntax doesn't work:

$ foo=[ -n "$foobar" ]
bash: -n: command not found

Nor do any more fully qualified uses of it (to eliminate shell
interpretation as being the cause of the problem):

$ foo=\[ -n "$foobar" ]
bash: -n: command not found
$ foo=/bin/[ -n "$foobar" ]
bash: -n: command not found
$ foo=/bin/test -n "$foobar"
bash: -n: command not found

What does work (doesn't produce a syntax error) is:

$ foo=$([ -n "$foobar" ])
$ echo $foo
$ foobar=foobar
$ foo=$([ -n "$foobar" ])
$ echo $foo

$

but I can't see how that helps us since $foo is the same when the test
passes and fails. The best approximation of what I think Tom was
trying to achieve is:

$ unset foobar
$ [ -n "$foobar" ]
$ echo $?
1
$ foobar=foobar
$ [ -n "$foobar" ]
$ echo $?
0

But that still doesn't give us a "boolean" type value in $openwrt that
we can use in if statements. So, I think what we want is:

if [ -n "$lockbin" -a -h "$lockbin" ]; then
    openwrt=true
else
    openwrt=false
fi

Cheers,
b.
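The point above, that [ reports its result only through the exit status and prints nothing, can be sketched as below. This is an illustration, not the code that went into Shorewall: is_symlink_path and the temporary paths are hypothetical names, and the sketch uses two separate tests joined with && rather than -a (which POSIX marks as obsolescent), a substitution on my part.

```shell
# Return success only for a non-empty path that is a symbolic link.
is_symlink_path() {
    [ -n "$1" ] && [ -h "$1" ]
}

workdir=$(mktemp -d)
: > "$workdir/real"
ln -s "$workdir/real" "$workdir/link"

# Record the outcome as a non-empty/empty flag so that later code can
# test it with [ -n "$flag" ] or [ -z "$flag" ].
if is_symlink_path "$workdir/link"; then link_flag=Yes; else link_flag=; fi
if is_symlink_path "$workdir/real"; then real_flag=Yes; else real_flag=; fi
if is_symlink_path ""; then empty_flag=Yes; else empty_flag=; fi

echo "link: '${link_flag}' real: '${real_flag}' empty: '${empty_flag}'"
rm -rf "$workdir"
```

A flag that is either empty or non-empty works with [ -n ] and [ -z ]; the strings "true" and "false" would both be non-empty, so they would need a string comparison instead.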
[Shorewall-users] locking processes left behind
On 7/30/2018 6:11 PM, Matt Darfeuille wrote:
> On 7/28/2018 5:19 PM, Tom Eastep wrote:
>> On 07/28/2018 08:16 AM, Brian J. Murrell wrote:
>>> On Sat, 2018-07-28 at 08:03 -0700, Tom Eastep wrote:
>>>> diff --git a/Shorewall-core/lib.common b/Shorewall-core/lib.common
>>>> index 205fc705f..bbebf0936 100644
>>>> --- a/Shorewall-core/lib.common
>>>> +++ b/Shorewall-core/lib.common
>>>> @@ -751,6 +751,8 @@ mutex_on()
>>>>      lockf=${LOCKFILE:=${VARDIR}/lock}
>>>>      local lockpid
>>>>      local lockd
>>>> +    local lockbin
>>>> +    local openwrt
>>>>
>>>>      MUTEX_TIMEOUT=${MUTEX_TIMEOUT:-60}
>>>>
>>>> @@ -760,28 +762,33 @@ mutex_on()
>>>>
>>>>      [ -d "$lockd" ] || mkdir -p "$lockd"
>>>>
>>>> +    lockbin=$(qt mywhich lock)
>>>> +    openwrt=[ -n "$openwrt" -a -h "$openwrt" ]
>>>
>>> Did you mean:
>>>
>>> +    openwrt=[ -n "$lockbin" -a -h "$lockbin" ]
>>>
>>> here?
>>>
>>
>> Yes.
>>
>
> With both patches applied (MUTEX_ON.patch and MUTEX_ON1.patch) and the
> above correction I get:
>
> root@LEDE:~# shorewall-lite restart
> /sbin/shorewall-lite: line 14: -n: not found
>    WARNING: Stale lockfile /lib/shorewall-lite/lock from pid 854 removed
> Stopping Shorewall Lite
>
> What am I missing ?
>

Ok -- Here's my take on the above:

The attached patch (MUTEX_ON_TAKE1.patch) includes:
- MUTEX_ON.patch
- MUTEX_ON1.patch
- The above correction (changing 'openwrt' to 'lockbin')
- My take on fixing the above error ( "/sbin/shorewall-lite: line 14:
-n: not found")

Brian/Tom, thoughts?

-Matt
-- 
Matt Darfeuille

diff --git a/Shorewall-core/lib.common b/Shorewall-core/lib.common
index c373a31ad..36800c887 100644
--- a/Shorewall-core/lib.common
+++ b/Shorewall-core/lib.common
@@ -751,6 +751,8 @@ mutex_on()
     lockf=${LOCKFILE:=${VARDIR}/lock}
     local lockpid
     local lockd
+    local lockbin
+    local openwrt
 
     MUTEX_TIMEOUT=${MUTEX_TIMEOUT:-60}
 
@@ -760,29 +762,33 @@ mutex_on()
 
     [ -d "$lockd" ] || mkdir -p "$lockd"
 
+    lockbin=$(qt mywhich lock)
+    [ -n "$lockbin" -a -h "$lockbin" ] && openwrt=Yes || openwrt=
+
     if [ -f $lockf ]; then
         lockpid=`cat ${lockf} 2> /dev/null`
         if [ -z "$lockpid" ] || [ $lockpid = 0 ]; then
             rm -f ${lockf}
             error_message "WARNING: Stale lockfile ${lockf} removed"
-        elif [ $lockpid -eq $$ ]; then
-            return 0
-        elif ! ps | grep -v grep | qt grep ${lockpid}; then
-            rm -f ${lockf}
-            error_message "WARNING: Stale lockfile ${lockf} from pid ${lockpid} removed"
+        elif [ -z "$openwrt" ]; then
+            if [ $lockpid -eq $$ ]; then
+                fatal_error "Mutex_on confusion"
+            elif ! qt ps --pid ${lockpid}; then
+                rm -f ${lockf}
+                error_message "WARNING: Stale lockfile ${lockf} from pid ${lockpid} removed"
+            fi
         fi
     fi
 
-    if qt mywhich lockfile; then
-        lockfile -${MUTEX_TIMEOUT} -r1 ${lockf}
+    if [ -n "$openwrt" ]; then
+        lock ${lockf} || fatal_error "Can't lock ${lockf}"
+        g_havemutex="lock -u ${lockf}"
+    elif qt mywhich lockfile; then
+        lockfile -${MUTEX_TIMEOUT} -r1 ${lockf} || fatal_error "Can't lock ${lockf}"
         g_havemutex="rm -f ${lockf}"
         chmod u+w ${lockf}
         echo $$ > ${lockf}
         chmod u-w ${lockf}
-    elif qt mywhich lock; then
-        lock ${lockf}
-        g_havemutex="lock -u ${lockf} && rm -f ${lockf}"
-        chmod u=r ${lockf}
     else
         while [ -f ${lockf} -a ${try} -lt ${MUTEX_TIMEOUT} ] ; do
             sleep 1
Re: [Shorewall-users] locking processes left behind
On 7/28/2018 5:19 PM, Tom Eastep wrote:
> On 07/28/2018 08:16 AM, Brian J. Murrell wrote:
>> On Sat, 2018-07-28 at 08:03 -0700, Tom Eastep wrote:
>>> diff --git a/Shorewall-core/lib.common b/Shorewall-core/lib.common
>>> index 205fc705f..bbebf0936 100644
>>> --- a/Shorewall-core/lib.common
>>> +++ b/Shorewall-core/lib.common
>>> @@ -751,6 +751,8 @@ mutex_on()
>>>      lockf=${LOCKFILE:=${VARDIR}/lock}
>>>      local lockpid
>>>      local lockd
>>> +    local lockbin
>>> +    local openwrt
>>>
>>>      MUTEX_TIMEOUT=${MUTEX_TIMEOUT:-60}
>>>
>>> @@ -760,28 +762,33 @@ mutex_on()
>>>
>>>      [ -d "$lockd" ] || mkdir -p "$lockd"
>>>
>>> +    lockbin=$(qt mywhich lock)
>>> +    openwrt=[ -n "$openwrt" -a -h "$openwrt" ]
>>
>> Did you mean:
>>
>> +    openwrt=[ -n "$lockbin" -a -h "$lockbin" ]
>>
>> here?
>>
>
> Yes.
>

With both patches applied (MUTEX_ON.patch and MUTEX_ON1.patch) and the
above correction I get:

root@LEDE:~# shorewall-lite restart
/sbin/shorewall-lite: line 14: -n: not found
   WARNING: Stale lockfile /lib/shorewall-lite/lock from pid 854 removed
Stopping Shorewall Lite

What am I missing ?

-Matt
-- 
Matt Darfeuille
Re: [Shorewall-users] locking processes left behind
On 07/28/2018 08:40 AM, Matt Darfeuille wrote:
> On 7/28/2018 5:19 PM, Tom Eastep wrote:
>> On 07/28/2018 08:16 AM, Brian J. Murrell wrote:
>>> On Sat, 2018-07-28 at 08:03 -0700, Tom Eastep wrote:
>>>> diff --git a/Shorewall-core/lib.common b/Shorewall-core/lib.common
>>>> index 205fc705f..bbebf0936 100644
>>>> --- a/Shorewall-core/lib.common
>>>> +++ b/Shorewall-core/lib.common
>>>> @@ -751,6 +751,8 @@ mutex_on()
>>>>      lockf=${LOCKFILE:=${VARDIR}/lock}
>>>>      local lockpid
>>>>      local lockd
>>>> +    local lockbin
>>>> +    local openwrt
>>>>
>>>>      MUTEX_TIMEOUT=${MUTEX_TIMEOUT:-60}
>>>>
>>>> @@ -760,28 +762,33 @@ mutex_on()
>>>>
>>>>      [ -d "$lockd" ] || mkdir -p "$lockd"
>>>>
>>>> +    lockbin=$(qt mywhich lock)
>>>> +    openwrt=[ -n "$openwrt" -a -h "$openwrt" ]
>>>
>>> Did you mean:
>>>
>>> +    openwrt=[ -n "$lockbin" -a -h "$lockbin" ]
>>>
>>> here?
>>>
>>
>> Yes.
>>
>
> Tom, would you mind pushing the master branch of the trunk (code) repo?
>

This last change is not committed yet, and I don't want to push until
Brian is happy with the result.

-Tom
Re: [Shorewall-users] locking processes left behind
On 7/28/2018 5:19 PM, Tom Eastep wrote:
> On 07/28/2018 08:16 AM, Brian J. Murrell wrote:
>> On Sat, 2018-07-28 at 08:03 -0700, Tom Eastep wrote:
>>> diff --git a/Shorewall-core/lib.common b/Shorewall-core/lib.common
>>> index 205fc705f..bbebf0936 100644
>>> --- a/Shorewall-core/lib.common
>>> +++ b/Shorewall-core/lib.common
>>> @@ -751,6 +751,8 @@ mutex_on()
>>>      lockf=${LOCKFILE:=${VARDIR}/lock}
>>>      local lockpid
>>>      local lockd
>>> +    local lockbin
>>> +    local openwrt
>>>
>>>      MUTEX_TIMEOUT=${MUTEX_TIMEOUT:-60}
>>>
>>> @@ -760,28 +762,33 @@ mutex_on()
>>>
>>>      [ -d "$lockd" ] || mkdir -p "$lockd"
>>>
>>> +    lockbin=$(qt mywhich lock)
>>> +    openwrt=[ -n "$openwrt" -a -h "$openwrt" ]
>>
>> Did you mean:
>>
>> +    openwrt=[ -n "$lockbin" -a -h "$lockbin" ]
>>
>> here?
>>
>
> Yes.
>

Tom, would you mind pushing the master branch of the trunk (code) repo?

-Matt
-- 
Matt Darfeuille
Re: [Shorewall-users] locking processes left behind
On 07/28/2018 08:16 AM, Brian J. Murrell wrote:
> On Sat, 2018-07-28 at 08:03 -0700, Tom Eastep wrote:
>> diff --git a/Shorewall-core/lib.common b/Shorewall-core/lib.common
>> index 205fc705f..bbebf0936 100644
>> --- a/Shorewall-core/lib.common
>> +++ b/Shorewall-core/lib.common
>> @@ -751,6 +751,8 @@ mutex_on()
>>      lockf=${LOCKFILE:=${VARDIR}/lock}
>>      local lockpid
>>      local lockd
>> +    local lockbin
>> +    local openwrt
>>
>>      MUTEX_TIMEOUT=${MUTEX_TIMEOUT:-60}
>>
>> @@ -760,28 +762,33 @@ mutex_on()
>>
>>      [ -d "$lockd" ] || mkdir -p "$lockd"
>>
>> +    lockbin=$(qt mywhich lock)
>> +    openwrt=[ -n "$openwrt" -a -h "$openwrt" ]
>
> Did you mean:
>
> +    openwrt=[ -n "$lockbin" -a -h "$lockbin" ]
>
> here?
>

Yes.

-Tom
Re: [Shorewall-users] locking processes left behind
On Sat, 2018-07-28 at 08:03 -0700, Tom Eastep wrote:
> diff --git a/Shorewall-core/lib.common b/Shorewall-core/lib.common
> index 205fc705f..bbebf0936 100644
> --- a/Shorewall-core/lib.common
> +++ b/Shorewall-core/lib.common
> @@ -751,6 +751,8 @@ mutex_on()
>      lockf=${LOCKFILE:=${VARDIR}/lock}
>      local lockpid
>      local lockd
> +    local lockbin
> +    local openwrt
>
>      MUTEX_TIMEOUT=${MUTEX_TIMEOUT:-60}
>
> @@ -760,28 +762,33 @@ mutex_on()
>
>      [ -d "$lockd" ] || mkdir -p "$lockd"
>
> +    lockbin=$(qt mywhich lock)
> +    openwrt=[ -n "$openwrt" -a -h "$openwrt" ]

Did you mean:

+    openwrt=[ -n "$lockbin" -a -h "$lockbin" ]

here?

Cheers,
b.
Re: [Shorewall-users] locking processes left behind
On Sat, 2018-07-28 at 15:04 +0200, Matt Darfeuille wrote:
>
> Tom, with MUTEX_ON.patch applied, on LEDE '--pid' is not available or
> is it done on purpose?:
>
> root@LEDE:~# ps --pid
> ps: unrecognized option: pid
> BusyBox v1.25.1 () multi-call binary.
>
> Usage: ps
>
> Show list of processes
>
>     w   Wide output
> root@LEDE:~#

Yeah. The busybox ps is pretty dumb, er, I guess, "basic" is the PC
term. I didn't notice this in manually trying out the changes Tom's
patch is proposing because I typically install the better ps from
procps-ng-ps.

So, shorewall-lite on LEDE could require that package, but probably the
lighter-weight approach is to use [ -d /proc/$pid ] as the test for a
process existing, or even "kill -0 $pid".

> Brian, any thoughts on the patch?

I installed it but have not had opportunity to reboot my router to see
how well it operates.

Cheers,
b.
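The two lighter-weight liveness checks suggested above could be wrapped roughly as follows. This is only a sketch: pid_alive is a hypothetical helper name, not something in lib.common, and it assumes /proc is mounted where the first branch is taken.

```shell
# Succeed if a process with the given PID exists.
pid_alive() {
    # Prefer /proc where it is mounted; kill -0 sends no signal and only
    # checks that the PID exists and may be signalled by the caller.
    if [ -d /proc/self ]; then
        [ -d "/proc/$1" ]
    else
        kill -0 "$1" 2>/dev/null
    fi
}

sleep 30 &
child=$!

if pid_alive "$child"; then before=alive; else before=gone; fi

kill "$child"
wait "$child" 2>/dev/null || true   # reap the child so its PID disappears

if pid_alive "$child"; then after=alive; else after=gone; fi

echo "before: $before, after: $after"
```

One caveat with the kill -0 branch: it also fails when the caller lacks permission to signal the target, so it is only a reliable existence test when run as root or against one's own processes. Neither check needs the procps version of ps.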
Re: [Shorewall-users] locking processes left behind
On 07/28/2018 06:04 AM, Matt Darfeuille wrote:
> On 7/26/2018 8:41 PM, Tom Eastep wrote:
>> On 07/26/2018 09:54 AM, Brian J. Murrell wrote:
>>> On Thu, 2018-07-26 at 08:51 -0700, Tom Eastep wrote:
>>>>
>>>> Brian,
>>>
>>> Hi Tom,
>>>
>>>> Can you point me to online documentation that describes how this
>>>> 'lock' utility is supposed to work?
>>>
>>> It's a busybox applet added to busybox by OpenWRT. Here's the source
>>> for it:
>>>
>>> https://github.com/openwrt/openwrt/blob/154c0c4006daf41e2cbb6c8b7ad5557f83dfea3e/package/utils/busybox/patches/220-add_lock_util.patch
>>>
>>> There seems to be minimal documentation at:
>>>
>>> https://wiki.openwrt.org/inbox/script/manual_busybox_functions_openwrt
>>>
>>
>> Thanks Brian,
>>
>> Please see if this patch resolves the issue. The lib.common file has not
>> changed since 5.1.12.3 (other than the banner version) so you can apply
>> the patch on your admin system then copy lib.common to the router.
>>
>
> Tom, with MUTEX_ON.patch applied, on LEDE '--pid' is not available or is
> it done on purpose?:
>
> root@LEDE:~# ps --pid
> ps: unrecognized option: pid
> BusyBox v1.25.1 () multi-call binary.
>
> Usage: ps
>
> Show list of processes
>
>     w   Wide output
> root@LEDE:~#
>
> Brian, any thoughts on the patch?
>

This patch should resolve the 'ps' issue, as 'ps' is no longer invoked
on OpenWRT.

-Tom

diff --git a/Shorewall-core/lib.common b/Shorewall-core/lib.common
index 205fc705f..bbebf0936 100644
--- a/Shorewall-core/lib.common
+++ b/Shorewall-core/lib.common
@@ -751,6 +751,8 @@ mutex_on()
     lockf=${LOCKFILE:=${VARDIR}/lock}
     local lockpid
     local lockd
+    local lockbin
+    local openwrt
 
     MUTEX_TIMEOUT=${MUTEX_TIMEOUT:-60}
 
@@ -760,28 +762,33 @@ mutex_on()
 
     [ -d "$lockd" ] || mkdir -p "$lockd"
 
+    lockbin=$(qt mywhich lock)
+    openwrt=[ -n "$openwrt" -a -h "$openwrt" ]
+
     if [ -f $lockf ]; then
         lockpid=`cat ${lockf} 2> /dev/null`
         if [ -z "$lockpid" ] || [ $lockpid = 0 ]; then
             rm -f ${lockf}
             error_message "WARNING: Stale lockfile ${lockf} removed"
-        elif [ $lockpid -eq $$ ]; then
-            fatal_error "Mutex_on confusion"
-        elif ! qt ps --pid ${lockpid}; then
-            rm -f ${lockf}
-            error_message "WARNING: Stale lockfile ${lockf} from pid ${lockpid} removed"
+        elif [ -z "$openwrt" ]; then
+            if [ $lockpid -eq $$ ]; then
+                fatal_error "Mutex_on confusion"
+            elif ! qt ps --pid ${lockpid}; then
+                rm -f ${lockf}
+                error_message "WARNING: Stale lockfile ${lockf} from pid ${lockpid} removed"
+            fi
         fi
     fi
 
-    if qt mywhich lockfile; then
+    if [ -n "$openwrt" ]; then
+        lock ${lockf} || fatal_error "Can't lock ${lockf}"
+        g_havemutex="lock -u ${lockf}"
+    elif qt mywhich lockfile; then
         lockfile -${MUTEX_TIMEOUT} -r1 ${lockf} || fatal_error "Can't lock ${lockf}"
         g_havemutex="rm -f ${lockf}"
         chmod u+w ${lockf}
         echo $$ > ${lockf}
         chmod u-w ${lockf}
-    elif qt mywhich lock; then
-        lock ${lockf} || fatal_error "Can't lock ${lockf}"
-        g_havemutex="lock -u ${lockf}"
     else
         while [ -f ${lockf} -a ${try} -lt ${MUTEX_TIMEOUT} ] ; do
             sleep 1
Re: [Shorewall-users] locking processes left behind
On 7/26/2018 8:41 PM, Tom Eastep wrote:
> On 07/26/2018 09:54 AM, Brian J. Murrell wrote:
>> On Thu, 2018-07-26 at 08:51 -0700, Tom Eastep wrote:
>>>
>>> Brian,
>>
>> Hi Tom,
>>
>>> Can you point me to online documentation that describes how this
>>> 'lock' utility is supposed to work?
>>
>> It's a busybox applet added to busybox by OpenWRT. Here's the source
>> for it:
>>
>> https://github.com/openwrt/openwrt/blob/154c0c4006daf41e2cbb6c8b7ad5557f83dfea3e/package/utils/busybox/patches/220-add_lock_util.patch
>>
>> There seems to be minimal documentation at:
>>
>> https://wiki.openwrt.org/inbox/script/manual_busybox_functions_openwrt
>>
>
> Thanks Brian,
>
> Please see if this patch resolves the issue. The lib.common file has not
> changed since 5.1.12.3 (other than the banner version) so you can apply
> the patch on your admin system then copy lib.common to the router.
>

Tom, with MUTEX_ON.patch applied, on LEDE '--pid' is not available or is
it done on purpose?:

root@LEDE:~# ps --pid
ps: unrecognized option: pid
BusyBox v1.25.1 () multi-call binary.

Usage: ps

Show list of processes

    w   Wide output
root@LEDE:~#

Brian, any thoughts on the patch?

-Matt
-- 
Matt Darfeuille
Re: [Shorewall-users] locking processes left behind
On 07/26/2018 09:54 AM, Brian J. Murrell wrote:
> On Thu, 2018-07-26 at 08:51 -0700, Tom Eastep wrote:
>>
>> Brian,
>
> Hi Tom,
>
>> Can you point me to online documentation that describes how this
>> 'lock' utility is supposed to work?
>
> It's a busybox applet added to busybox by OpenWRT. Here's the source
> for it:
>
> https://github.com/openwrt/openwrt/blob/154c0c4006daf41e2cbb6c8b7ad5557f83dfea3e/package/utils/busybox/patches/220-add_lock_util.patch
>
> There seems to be minimal documentation at:
>
> https://wiki.openwrt.org/inbox/script/manual_busybox_functions_openwrt
>

Thanks Brian,

Please see if this patch resolves the issue. The lib.common file has not
changed since 5.1.12.3 (other than the banner version) so you can apply
the patch on your admin system then copy lib.common to the router.

Thanks,
-Tom

diff --git a/Shorewall-core/lib.common b/Shorewall-core/lib.common
index 7567123a7..205fc705f 100644
--- a/Shorewall-core/lib.common
+++ b/Shorewall-core/lib.common
@@ -766,23 +766,22 @@ mutex_on()
             rm -f ${lockf}
             error_message "WARNING: Stale lockfile ${lockf} removed"
         elif [ $lockpid -eq $$ ]; then
-            return 0
-        elif ! ps | grep -v grep | qt grep ${lockpid}; then
+            fatal_error "Mutex_on confusion"
+        elif ! qt ps --pid ${lockpid}; then
             rm -f ${lockf}
             error_message "WARNING: Stale lockfile ${lockf} from pid ${lockpid} removed"
         fi
     fi
 
     if qt mywhich lockfile; then
-        lockfile -${MUTEX_TIMEOUT} -r1 ${lockf}
+        lockfile -${MUTEX_TIMEOUT} -r1 ${lockf} || fatal_error "Can't lock ${lockf}"
         g_havemutex="rm -f ${lockf}"
         chmod u+w ${lockf}
         echo $$ > ${lockf}
         chmod u-w ${lockf}
     elif qt mywhich lock; then
-        lock ${lockf}
-        g_havemutex="lock -u ${lockf} && rm -f ${lockf}"
-        chmod u=r ${lockf}
+        lock ${lockf} || fatal_error "Can't lock ${lockf}"
+        g_havemutex="lock -u ${lockf}"
     else
         while [ -f ${lockf} -a ${try} -lt ${MUTEX_TIMEOUT} ] ; do
             sleep 1
Re: [Shorewall-users] locking processes left behind
On 7/26/2018 6:54 PM, Brian J. Murrell wrote:
> On Thu, 2018-07-26 at 08:51 -0700, Tom Eastep wrote:
>>
>> Brian,
>
> Hi Tom,
>
>> Can you point me to online documentation that describes how this
>> 'lock' utility is supposed to work?
>
> It's a busybox applet added to busybox by OpenWRT. Here's the source
> for it:
>
> https://github.com/openwrt/openwrt/blob/154c0c4006daf41e2cbb6c8b7ad5557f83dfea3e/package/utils/busybox/patches/220-add_lock_util.patch
>
> There seems to be minimal documentation at:
>
> https://wiki.openwrt.org/inbox/script/manual_busybox_functions_openwrt
>

root@LEDE:~# cat /etc/banner | grep -i reboot
 \ DE\ /Reboot (17.01.4, r3560-79f57e422d)
root@LEDE:~# lock -k
Usage: lock [-suw]
    -s  Use shared locking
    -u  Unlock
    -w  Wait for the lock to become free, don't acquire lock
    -n  Don't wait for the lock to become free. Fail with exit code
root@LEDE:~#

HTH

-Matt
-- 
Matt Darfeuille
Re: [Shorewall-users] locking processes left behind
On Thu, 2018-07-26 at 08:51 -0700, Tom Eastep wrote:
>
> Brian,

Hi Tom,

> Can you point me to online documentation that describes how this
> 'lock' utility is supposed to work?

It's a busybox applet added to busybox by OpenWRT. Here's the source
for it:

https://github.com/openwrt/openwrt/blob/154c0c4006daf41e2cbb6c8b7ad5557f83dfea3e/package/utils/busybox/patches/220-add_lock_util.patch

There seems to be minimal documentation at:

https://wiki.openwrt.org/inbox/script/manual_busybox_functions_openwrt

Cheers,
b.
Re: [Shorewall-users] locking processes left behind
On 07/26/2018 07:01 AM, Brian J. Murrell wrote:
> On Thu, 2018-07-26 at 05:48 +0200, Matt Darfeuille wrote:
>>
>> As illustrated by this lingering thread, issues that are only present
>> on one platform makes me moved away from OpenWRT/LEDE.
>
> The platform is not the problem. The platform is just providing the
> tools.
>
> Or are you suggesting that the "lock" tool on OpenWRT/LEDE is actually
> buggy? Given that it's just a wrapper around flock() that seems
> unlikely. But I'm happy to be proven wrong if you can provide a
> reproducer for the bug that I can submit upstream. As much testing as
> I have done with the "lock" tool it operates as expected when used as
> expected.
>
> Given the evidence, it seems like the file being locked is getting
> removed before the lock is released.
>
> A reboot of my router this morning has reproduced the situation and
> this is what I see:
>
> # ps -ef | grep lock
> root  2700  2666  0 07:13 ?  00:00:00 lock /etc/shorewall-lite/state/lock
> root  3234     1  0 07:13 ?  00:00:00 lock /etc/shorewall-lite/state/lock
>
> # lsof -n -p 3234
> COMMAND  PID USER  FD  TYPE DEVICE SIZE/OFF  NODE NAME
> lock    3234 root cwd   DIR   0,15      656   258 /
> lock    3234 root rtd   DIR   0,15      656   258 /
> lock    3234 root txt   REG  254,0   308533  1786 /bin/busybox
> lock    3234 root mem   REG  254,0    77040   213 /lib/libgcc_s.so.1
> lock    3234 root mem   REG  254,0   601968   402 /lib/libc.so
> lock    3234 root  0u   CHR    1,3      0t0   317 /dev/null
> lock    3234 root  1u   CHR    1,3      0t0   317 /dev/null
> lock    3234 root  2u   CHR    1,3      0t0   317 /dev/null
> lock    3234 root  3u   REG   0,14        5 61617 /etc/shorewall-lite/state/lock (deleted)
> lock    3234 root 13w  FIFO    0,8      0t0  1732 pipe
>
> # cat /proc/2700/fd/3
> 3234
>
> # strace -f -p 3234
> strace: Process 3234 attached
> restart_syscall(<... resuming interrupted syscall_516 ...>) = 0
> nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffcd900) = 0
> nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffcd900) = 0
> nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffcd900) = 0
> nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffcd900) = 0
> nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffcd900) = 0
> nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffcd900) = 0
> nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffcd900) = 0
> nanosleep({tv_sec=1, tv_nsec=0}, ^Cstrace: Process 3234 detached
>
> # strace -f -p 2700
> strace: Process 2700 attached
> flock(3, LOCK_EX^Cstrace: Process 2700 detached
>
> Hrm. Given:
>
> g_havemutex="lock -u ${lockf} && rm -f ${lockf}"
>
> Observe this particular set of operations:
>
> tty1# lock /tmp/mylockfile
> tty1# [has the lock and returns]
> tty2# lock /tmp/mylockfile
> [blocks waiting for locker1 to release the lock as we can see:]
> # lsof | grep /tmp/mylockfile
> lock  1249 root  3u  REG  0,13  5  352778 /tmp/mylockfile
> lock  1250 root  3u  REG  0,13  5  352778 /tmp/mylockfile
> tty1# lock -u /tmp/mylockfile && rm -f /tmp/mylockfile
> tty1# [returns, releasing the lock to tty2]
> tty2# [returns from blocked state, now holds the lock]
> # lsof | grep /tmp/mylockfile
> lock  1404 root  3u  REG  0,13  5  352778 /tmp/mylockfile (deleted)
> tty3# lock /tmp/mylockfile
> tty3# [wait, what? it returns even though tty2 has the lock!]
> # lsof | grep /tmp/mylockfile
> lock  1404 root  3u  REG  0,13  5  352778 /tmp/mylockfile (deleted)
> lock  1439 root  3u  REG  0,13  5  362181 /tmp/mylockfile
>
> So at this point both tty2 and tty3 believe they have the lock and have
> returned, allowing them to do their work on top of each other.
>
> I don't think a process can simply remove the lock file just because it
> has released its lock on it. It can only be removed if there are no
> more outstanding locks on it. Or just don't remove it. lock seems to
> function perfectly fine with the file pre-existing.
>
> I'm not sure I can draw a line from this problem to the stale locks
> problem, but it's probably a good thing to fix before continuing to try
> to debug the stale locks problem.
>

Brian,

Can you point me to online documentation that describes how this 'lock'
utility is supposed to work?

Thanks,
-Tom
Re: [Shorewall-users] locking processes left behind
On Thu, 2018-07-26 at 05:48 +0200, Matt Darfeuille wrote:
>
> As illustrated by this lingering thread, issues that are only present
> on one platform made me move away from OpenWRT/LEDE.

The platform is not the problem. The platform is just providing the
tools.

Or are you suggesting that the "lock" tool on OpenWRT/LEDE is actually
buggy? Given that it's just a wrapper around flock() that seems
unlikely. But I'm happy to be proven wrong if you can provide a
reproducer for the bug that I can submit upstream. As much testing as
I have done with the "lock" tool, it operates as expected when used as
expected.

Given the evidence, it seems like the file being locked is getting
removed before the lock is released.

A reboot of my router this morning has reproduced the situation and
this is what I see:

# ps -ef | grep lock
root      2700  2666  0 07:13 ?        00:00:00 lock /etc/shorewall-lite/state/lock
root      3234     1  0 07:13 ?        00:00:00 lock /etc/shorewall-lite/state/lock

# lsof -n -p 3234
COMMAND  PID USER   FD   TYPE DEVICE SIZE/OFF  NODE NAME
lock    3234 root  cwd    DIR   0,15      656   258 /
lock    3234 root  rtd    DIR   0,15      656   258 /
lock    3234 root  txt    REG  254,0   308533  1786 /bin/busybox
lock    3234 root  mem    REG  254,0    77040   213 /lib/libgcc_s.so.1
lock    3234 root  mem    REG  254,0   601968   402 /lib/libc.so
lock    3234 root    0u   CHR    1,3      0t0   317 /dev/null
lock    3234 root    1u   CHR    1,3      0t0   317 /dev/null
lock    3234 root    2u   CHR    1,3      0t0   317 /dev/null
lock    3234 root    3u   REG   0,14        5 61617 /etc/shorewall-lite/state/lock (deleted)
lock    3234 root   13w  FIFO    0,8      0t0  1732 pipe

# cat /proc/2700/fd/3
3234

# strace -f -p 3234
strace: Process 3234 attached
restart_syscall(<... resuming interrupted syscall_516 ...>) = 0
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffcd900) = 0
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffcd900) = 0
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffcd900) = 0
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffcd900) = 0
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffcd900) = 0
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffcd900) = 0
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffcd900) = 0
nanosleep({tv_sec=1, tv_nsec=0}, ^Cstrace: Process 3234 detached

# strace -f -p 2700
strace: Process 2700 attached
flock(3, LOCK_EX^Cstrace: Process 2700 detached

Hrm. Given:

g_havemutex="lock -u ${lockf} && rm -f ${lockf}"

Observe this particular set of operations:

tty1# lock /tmp/mylockfile
tty1# [has the lock and returns]
tty2# lock /tmp/mylockfile
[blocks waiting for tty1 to release the lock, as we can see:]
# lsof | grep /tmp/mylockfile
lock    1249 root    3u   REG   0,13        5 352778 /tmp/mylockfile
lock    1250 root    3u   REG   0,13        5 352778 /tmp/mylockfile
tty1# lock -u /tmp/mylockfile && rm -f /tmp/mylockfile
tty1# [returns, releasing the lock to tty2]
tty2# [returns from blocked state, now holds the lock]
# lsof | grep /tmp/mylockfile
lock    1404 root    3u   REG   0,13        5 352778 /tmp/mylockfile (deleted)
tty3# lock /tmp/mylockfile
tty3# [wait, what? it returns even though tty2 has the lock!]
# lsof | grep /tmp/mylockfile
lock    1404 root    3u   REG   0,13        5 352778 /tmp/mylockfile (deleted)
lock    1439 root    3u   REG   0,13        5 362181 /tmp/mylockfile

So at this point both tty2 and tty3 believe they have the lock and have
returned, allowing them to do their work on top of each other.

I don't think a process can simply remove the lock file just because it
has released its lock on it. It can only be removed if there are no
more outstanding locks on it. Or just don't remove it. lock seems to
function perfectly fine with the file pre-existing.

I'm not sure I can draw a line from this problem to the stale locks
problem, but it's probably a good thing to fix before continuing to try
to debug the stale locks problem.

Cheers,
b.
_______________________________________________
Shorewall-users mailing list
Shorewall-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/shorewall-users
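The tty1/tty2/tty3 sequence in this message can be reproduced away from the router. What follows is only a sketch under stated assumptions: it uses util-linux flock(1) as a stand-in for OpenWRT's lock, the scratch path comes from mktemp, and the three file descriptors play the roles of the three terminals. It is not how Shorewall itself takes the lock.

```shell
# Sketch: removing a lock file while another holder still has it open
# lets a newcomer lock a *different* inode under the same path.
# Assumes util-linux flock(1); the scratch path is hypothetical.
dir=$(mktemp -d)
lockf="$dir/mylockfile"

# "tty2": open the lock file on fd 8 and take an exclusive lock.
exec 8>"$lockf"
flock -n 8 && first=held || first=failed
inode_before=$(ls -i "$lockf" | awk '{print $1}')

# "tty1" cleaning up after itself, as in "lock -u ... && rm -f ...".
rm -f "$lockf"

# "tty3": opening the same path now creates a brand-new file.
exec 9>"$lockf"
flock -n 9 && second=held || second=failed
inode_after=$(ls -i "$lockf" | awk '{print $1}')

echo "first=$first second=$second"
[ "$inode_before" != "$inode_after" ] &&
    echo "different inodes: both holders believe they own the lock"

# Control: with the path left in place, another open is refused the lock.
exec 7>"$lockf"
flock -n 7 && third=held || third=refused
echo "third=$third"

exec 7>&- 8>&- 9>&-
rm -rf "$dir"
```

The control case at the end is the "just don't remove it" option from the message: as long as the path keeps pointing at the same inode, a second exclusive lock is refused.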
Re: [Shorewall-users] locking processes left behind
On 7/26/2018 12:56 AM, Tom Eastep wrote:
> On 07/25/2018 10:14 AM, Brian J. Murrell wrote:
>> On Mon, 2018-07-16 at 07:12 -0400, Brian J. Murrell wrote:
>>>
>>> I think I finally do have the required versions now, yes?
>>
>> Am I correct about having the required versions now?
>>
>>> However, as you can see above, we still have stale/orphan
>>> locks/processes hanging around.
>>
>> If so, any ideas why stale locks are still getting left behind?
>>
>
> It appears that you have the updated versions. But I have no idea why
> stale locks are still getting left behind, given that we don't seem to
> be seeing them on platforms other than OpenWRT. Matt - are you seeing
> this problem on your OpenWRT router(s)?
>

As illustrated by this lingering thread, issues that are only present on
one platform made me move away from OpenWRT/LEDE.

If I recall correctly, the fix did the job.

-Matt
-- 
Matt Darfeuille
Re: [Shorewall-users] locking processes left behind
On 07/25/2018 10:14 AM, Brian J. Murrell wrote:
> On Mon, 2018-07-16 at 07:12 -0400, Brian J. Murrell wrote:
>>
>> I think I finally do have the required versions now, yes?
>
> Am I correct about having the required versions now?
>
>> However, as you can see above, we still have stale/orphan
>> locks/processes hanging around.
>
> If so, any ideas why stale locks are still getting left behind?
>

It appears that you have the updated versions. But I have no idea why
stale locks are still getting left behind, given that we don't seem to
be seeing them on platforms other than OpenWRT. Matt - are you seeing
this problem on your OpenWRT router(s)?

-Tom
Re: [Shorewall-users] locking processes left behind
On Mon, 2018-07-16 at 07:12 -0400, Brian J. Murrell wrote:
>
> I think I finally do have the required versions now, yes?

Am I correct about having the required versions now?

> However, as you can see above, we still have stale/orphan
> locks/processes hanging around.

If so, any ideas why stale locks are still getting left behind?

Cheers,
b.
Re: [Shorewall-users] locking processes left behind
On Sat, 2018-06-30 at 08:25 -0700, Tom Eastep wrote:
>
> If 'shorewall show version' returns '5.2.0', then you do not have the
> fix on your administrative system. If it returns '5.2.0.1', then you
> do have the fix.

$ shorewall show version
   ERROR: Cannot read /etc/shorewall/shorewall.conf! (Hint: Are you root?)
$ sudo shorewall show version
   ERROR: Chain 'version' is not recognized by /sbin/iptables.
$ shorewall version
5.2.0.4
$ ssh gw shorewall-lite version\; ps -ef \| grep lock
5.1.12.3
root      3288     1  0 05:07 ?        00:00:00 lock /etc/shorewall-lite/state/lock
root      8106     1  0 05:09 ?        00:00:00 lock /etc/shorewall-lite/state/lock

I think I finally do have the required versions now, yes?

However, as you can see above, we still have stale/orphan
locks/processes hanging around.

> The script cannot ensure idempotency when it is interrupted at an
> arbitrary point. It writes into its 'undo' files after the successful
> completion of an 'ip' command, so a failure after the command and
> before the 'undo' record is written can cause incorrect behavior the
> next time that the script is run.

Pity. Although, I agree it's a difficult problem. I usually solve
those kinds of problems by growing the/an "undo" stack as I "do". That
is, for every action I take, I push the undo of that operation onto a
stack that I can execute if I get stopped at any point.

Cheers,
b.
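The "undo stack" idea in this message can be sketched in a few lines of shell. This is an illustration only, not Shorewall's undo-file mechanism: every function and path name is hypothetical, mkdir stands in for the 'ip' commands, and the undo entry is recorded before the action runs so that an interruption between the two cannot lose the record. That ordering assumes each undo command is harmless when its action never happened.

```shell
# Sketch of an undo stack kept in a file (all names hypothetical).
# The undo entry is written *before* the action runs; each undo command
# must therefore be harmless if its action never took effect.
work=$(mktemp -d)
undo_file="$work/undo"
: > "$undo_file"

do_with_undo() {   # usage: do_with_undo "<undo command>" <action...>
    printf '%s\n' "$1" >> "$undo_file"
    shift
    "$@"
}

run_undo() {
    # Execute recorded undo commands newest-first, then clear the stack.
    # awk reverses the file portably (tac is not available everywhere).
    awk '{ line[NR] = $0 } END { for (i = NR; i > 0; i--) print line[i] }' \
        "$undo_file" > "$undo_file.rev"
    while IFS= read -r cmd; do
        sh -c "$cmd"
    done < "$undo_file.rev"
    rm -f "$undo_file.rev"
    : > "$undo_file"
}

# Stand-ins for configuration steps: two succeed, the third fails
# because its parent directory does not exist.
do_with_undo "rmdir '$work/a' 2>/dev/null || true" mkdir "$work/a" &&
do_with_undo "rmdir '$work/a/b' 2>/dev/null || true" mkdir "$work/a/b" &&
do_with_undo "rmdir '$work/missing/c' 2>/dev/null || true" mkdir "$work/missing/c" 2>/dev/null ||
run_undo

[ -e "$work/a" ] && rolled_back=no || rolled_back=yes
echo "rolled back: $rolled_back"
rm -rf "$work"
```

Recording first and acting second trades one failure mode for another: the gap Tom describes (action done, record missing) disappears, but every undo command has to tolerate being run against a step that never completed.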
Re: [Shorewall-users] locking processes left behind
On 06/28/2018 04:06 AM, Brian J. Murrell wrote:
> On Thu, 2018-04-12 at 09:10 -0700, Tom Eastep wrote:
>>
>> No -- it requires the firewall script to be compiled with the fix, as
>> well as having the fix installed on the shorewall[6]-lite firewall.
>
> # rpm -q shorewall
> shorewall-5.2.0-0.01.fc28.noarch
>
> # opkg info shorewall-lite
> Package: shorewall-lite
> Version: 5.1.12.3-1
>
> So I should have the intended fix, yes?

If 'shorewall show version' returns '5.2.0', then you do not have the
fix on your administrative system. If it returns '5.2.0.1', then you do
have the fix.

>
> From a reboot of my router this morning:
>
> # ps -ef | grep lock
> root      3166     1  0 06:24 ?        00:00:00 lock /etc/shorewall-lite/state/lock
> root      7089     1  0 06:26 ?        00:00:00 lock /etc/shorewall-lite/state/lock
>
> So the locking appears to be still leaving orphans behind.
>
> I have been considering an alternative approach to this locking. When
> multiple shorewall invocations race, I really only likely care about
> the last one winning the race cleanly, since they are most likely
> racing just because of an interface status change and the last to enter
> the race will configure the firewall with the status of all interfaces
> (and other state) already known to him.
>
> So really, the last shorewall process to enter a race should just kill
> off its predecessors and continue on its way.
>
> That requires that the firewall installation script be able to deal
> with any kind of previous partial state though. Not sure how well
> shorewall is able to do that. It would require the ultimate of
> idempotency.
>

The script cannot ensure idempotency when it is interrupted at an
arbitrary point. It writes into its 'undo' files after the successful
completion of an 'ip' command, so a failure after the command and before
the 'undo' record is written can cause incorrect behavior the next time
that the script is run.

-Tom
Re: [Shorewall-users] locking processes left behind
On Thu, 2018-04-12 at 09:10 -0700, Tom Eastep wrote:
>
> No -- it requires the firewall script to be compiled with the fix, as
> well as having the fix installed on the shorewall[6]-lite firewall.

# rpm -q shorewall
shorewall-5.2.0-0.01.fc28.noarch

# opkg info shorewall-lite
Package: shorewall-lite
Version: 5.1.12.3-1

So I should have the intended fix, yes?

From a reboot of my router this morning:

# ps -ef | grep lock
root      3166     1  0 06:24 ?        00:00:00 lock /etc/shorewall-lite/state/lock
root      7089     1  0 06:26 ?        00:00:00 lock /etc/shorewall-lite/state/lock

So the locking appears to be still leaving orphans behind.

I have been considering an alternative approach to this locking. When
multiple shorewall invocations race, I really only likely care about
the last one winning the race cleanly, since they are most likely
racing just because of an interface status change and the last to enter
the race will configure the firewall with the status of all interfaces
(and other state) already known to him.

So really, the last shorewall process to enter a race should just kill
off its predecessors and continue on its way.

That requires that the firewall installation script be able to deal
with any kind of previous partial state though. Not sure how well
shorewall is able to do that. It would require the ultimate of
idempotency.

Cheers,
b.
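The "last one to enter the race wins" proposal could look roughly like the following. This is a sketch of the idea only, not anything Shorewall does: the pid file path and the supersede function are hypothetical, a background sleep stands in for an earlier invocation, and the read-then-rewrite of the pid file is itself not atomic, so the approach narrows the race rather than removing it.

```shell
# Sketch of "the newest invocation wins" (names and paths hypothetical).
# Caveat: reading and rewriting the pid file is not atomic, so this
# narrows the race rather than eliminating it.
pidfile="${TMPDIR:-/tmp}/demo-supersede.$$.pid"

supersede() {   # usage: supersede <pid of the new owner>
    if [ -s "$pidfile" ]; then
        old=$(cat "$pidfile")
        # kill -0 only tests that the process exists and can be signalled.
        if kill -0 "$old" 2>/dev/null; then
            kill "$old" 2>/dev/null
        fi
    fi
    echo "$1" > "$pidfile"
}

# A stand-in for an earlier, still-running invocation.
sleep 30 &
predecessor=$!
echo "$predecessor" > "$pidfile"

supersede "$$"
# Reap the predecessor so the liveness check below is reliable.
wait "$predecessor" 2>/dev/null || true

kill -0 "$predecessor" 2>/dev/null && predecessor_alive=yes || predecessor_alive=no
owner=$(cat "$pidfile")
echo "predecessor alive: $predecessor_alive"
rm -f "$pidfile"
```

As the message itself notes, killing a predecessor mid-run is only safe if the surviving invocation can cope with whatever partial state was left behind.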
Re: [Shorewall-users] locking processes left behind
On 04/11/2018 08:15 PM, Brian J. Murrell wrote:
> On Wed, 2018-04-11 at 19:09 -0700, Tom Eastep wrote:
>>
>> Did you read the 5.1.12.3 release notes?
>
> Ahhh. No. I read the 5.1.12 release notes but didn't go any further
> when I didn't see it. Should have really.
>
>> 1) Previously, the Shorewall[6][-lite] lock file was not always
>>    released when an error occurred. This resulted in:
>>
>>    - A warning message saying that a stale lock file has been
>>      removed
>>    - 'lock' processes remaining after shorewall[6][-lite] terminated
>>      (only reported on OpenWRT).
>>
>>    That has been corrected so that the lock file is released at exit
>>    if it hasn't been released already.
>
> Awesome. Building it right now. Which brings up the question... in a
> shorewall[6]-lite deployment the fix for this is 100% on the -lite
> side, yes?

No -- it requires the firewall script to be compiled with the fix, as
well as having the fix installed on the shorewall[6]-lite firewall.

>
> As an aside, is shorewall maintained in any public git (or other SCM)
> repo?
>

I see that Matt answered that in an earlier post.

-Tom
Re: [Shorewall-users] locking processes left behind
On 4/12/2018 5:15 AM, Brian J. Murrell wrote:
> On Wed, 2018-04-11 at 19:09 -0700, Tom Eastep wrote:
>>
>> Did you read the 5.1.12.3 release notes?
>
> Ahhh. No. I read the 5.1.12 release notes but didn't go any further
> when I didn't see it. Should have really.
>
>> 1) Previously, the Shorewall[6][-lite] lock file was not always
>>    released when an error occurred. This resulted in:
>>
>>    - A warning message saying that a stale lock file has been
>>      removed
>>    - 'lock' processes remaining after shorewall[6][-lite] terminated
>>      (only reported on OpenWRT).
>>
>>    That has been corrected so that the lock file is released at exit
>>    if it hasn't been released already.
>
> Awesome. Building it right now. Which brings up the question... in a
> shorewall[6]-lite deployment the fix for this is 100% on the -lite
> side, yes?
>
> As an aside, is shorewall maintained in any public git (or other SCM)
> repo?
>

http://shorewall.org/Build.html
https://sourceforge.net/p/shorewall/_list/git?source=navbar
https://sourceforge.net/p/shorewall/mailman/message/36243307/

-Matt
Re: [Shorewall-users] locking processes left behind
On Wed, 2018-04-11 at 19:09 -0700, Tom Eastep wrote:
>
> Did you read the 5.1.12.3 release notes?

Ahhh. No. I read the 5.1.12 release notes but didn't go any further
when I didn't see it. Should have really.

> 1) Previously, the Shorewall[6][-lite] lock file was not always
>    released when an error occurred. This resulted in:
>
>    - A warning message saying that a stale lock file has been
>      removed
>    - 'lock' processes remaining after shorewall[6][-lite] terminated
>      (only reported on OpenWRT).
>
>    That has been corrected so that the lock file is released at exit
>    if it hasn't been released already.

Awesome. Building it right now. Which brings up the question... in a
shorewall[6]-lite deployment the fix for this is 100% on the -lite
side, yes?

As an aside, is shorewall maintained in any public git (or other SCM)
repo?

Cheers,
b.
Re: [Shorewall-users] locking processes left behind
On 04/11/2018 06:30 PM, Brian J. Murrell wrote:
> On Wed, 2018-02-28 at 08:51 -0800, Tom Eastep wrote:
>>
>> There are quite a few, but they are only an issue for people who have
>> to rely on the obscure 'lock' utility. The rest just get a 'stale
>> lock file removed' message the next time that they run
>> shorewall[6][-lite]. I'll try to come up with a fix against 5.1.12...
>
> Did any solutions for this end up making it into 5.1.12?
>

Did you read the 5.1.12.3 release notes?

Problems corrected:

1) Previously, the Shorewall[6][-lite] lock file was not always
   released when an error occurred. This resulted in:

   - A warning message saying that a stale lock file has been
     removed
   - 'lock' processes remaining after shorewall[6][-lite] terminated
     (only reported on OpenWRT).

   That has been corrected so that the lock file is released at exit
   if it hasn't been released already.

-Tom
Re: [Shorewall-users] locking processes left behind
On Wed, 2018-02-28 at 08:51 -0800, Tom Eastep wrote:
>
> There are quite a few, but they are only an issue for people who have
> to rely on the obscure 'lock' utility. The rest just get a 'stale lock
> file removed' message the next time that they run shorewall[6][-lite].
> I'll try to come up with a fix against 5.1.12...

Did any solutions for this end up making it into 5.1.12?

Cheers,
b.
Re: [Shorewall-users] locking processes left behind
On Wed, 2018-02-28 at 08:51 -0800, Tom Eastep wrote:
>
> There are quite a few, but they are only an issue for people who have
> to rely on the obscure 'lock' utility.

It's from busybox, FWIW.

> The rest just get a 'stale lock file
> removed' message the next time that they run shorewall[6][-lite].

Sure, but really, code should not be written to depend on the staleness
of locks being able to be determined. Any code that takes out a lock
should release it when it's done. I realize I am preaching to the
choir, yes. :-)

So, to that end, when I use something like lock in a shell script I do
it like this:

lock /tmp/foo
trap 'lock -u /tmp/foo' EXIT

so that I *know* the lock will be released on every/any code path to
script exit.

Not sure how feasible that is for you to do even with mutex_on() being
shell script but just thought I would share.

> I'll
> try to come up with a fix against 5.1.12...

Awesome. I wonder how many other issues that will resolve.

Cheers,
b.
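The trap-on-EXIT pattern from this message can be tried on any system. The following is a minimal sketch, assuming mkdir as a portable stand-in for OpenWRT's lock utility; the function names and the lock path are hypothetical and are not Shorewall's own mutex functions. It shows the lock being released even when the guarded command bails out early with an error, which is the kind of non-happy path discussed in this thread.

```shell
# Sketch: release a lock on every exit path with an EXIT trap.
# mkdir is used as a portable stand-in for OpenWRT's "lock"; the
# function names and the lock path are hypothetical.
lockdir="${TMPDIR:-/tmp}/demo-mutex.$$"

demo_lock()   { mkdir "$1" 2>/dev/null; }   # atomic: fails if already held
demo_unlock() { rmdir "$1" 2>/dev/null; }

run_guarded() (
    # Subshell body: the trap fires when this subshell exits, whether it
    # falls off the end or a command inside calls "exit 1".
    demo_lock "$lockdir" || exit 2
    trap 'demo_unlock "$lockdir"' EXIT
    "$@"
)

# Stand-in for an error path that exits without any explicit cleanup.
fail_early() { echo "ERROR: command not supported" >&2; exit 1; }

run_guarded fail_early && status=0 || status=$?
[ -d "$lockdir" ] && leaked=yes || leaked=no
echo "exit status $status, lock leaked: $leaked"
```

Installing the trap immediately after the lock is taken keeps the window in which an exit could leak the lock as small as the shell allows.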
Re: [Shorewall-users] locking processes left behind
On 02/28/2018 04:04 AM, Brian J. Murrell wrote:
> On Fri, 2018-01-12 at 07:09 -0500, Brian J. Murrell wrote:
>> I frequently get the following situation on my shorewall-lite
>> machine, typically right after boot, where "shorewall-lite restart"
>> has been run many times, overlapping even, I am sure as interfaces
>> are brought up, etc.:
>>
>> # ps -ef | grep shorewall
>> root      1094     1  0 Jan11 ?        00:00:01 lock /etc/shorewall-lite/state/lock
>> root      2507     1  0 Jan11 ?        00:00:01 lock /etc/shorewall-lite/state/lock
>> root      3124     1  0 Jan11 ?        00:00:00 lock /etc/shorewall-lite/state/lock
>> root      7608  6935  0 06:29 pts/1    00:00:00 grep shorewall
>> root     11770     1  0 Jan11 ?        00:00:00 lock /etc/shorewall-lite/state/lock
> ...
>> I wonder if anyone has any theories on what is going on here?
>
> Here's one case where it happens:
>
> # ps -ef | grep \ lock | grep -v grep; /usr/sbin/shorewall-lite blacklist 185.170.42.18; ps -ef | grep \ lock | grep -v grep
> [notice there are no lock processes from the first ps | grep ]
>    ERROR: The blacklist command is not supported in the current Shorewall Lite configuration
> root     31693     1  0 07:00 pts/1    00:00:00 lock /etc/shorewall-lite/state/lock
> # sleep 5
> # ps -ef | grep \ lock | grep -v grep
> root     31693     1  0 07:00 pts/1    00:00:00 lock /etc/shorewall-lite/state/lock
>
> Not really sure why shorewall thinks the blacklist command is not
> available, but that is orthogonal. The issue here is clearly there is
> at least one code path where shorewall exits without cleaning up its
> lock file. I wonder how many other non-happy-path cases there are like
> this.

There are quite a few, but they are only an issue for people who have to
rely on the obscure 'lock' utility. The rest just get a 'stale lock file
removed' message the next time that they run shorewall[6][-lite]. I'll
try to come up with a fix against 5.1.12...

-Tom
Re: [Shorewall-users] locking processes left behind
On Fri, 2018-01-12 at 07:09 -0500, Brian J. Murrell wrote:
> I frequently get the following situation on my shorewall-lite
> machine, typically right after boot, where "shorewall-lite restart"
> has been run many times, overlapping even, I am sure as interfaces
> are brought up, etc.:
>
> # ps -ef | grep shorewall
> root      1094     1  0 Jan11 ?        00:00:01 lock /etc/shorewall-lite/state/lock
> root      2507     1  0 Jan11 ?        00:00:01 lock /etc/shorewall-lite/state/lock
> root      3124     1  0 Jan11 ?        00:00:00 lock /etc/shorewall-lite/state/lock
> root      7608  6935  0 06:29 pts/1    00:00:00 grep shorewall
> root     11770     1  0 Jan11 ?        00:00:00 lock /etc/shorewall-lite/state/lock
...
> I wonder if anyone has any theories on what is going on here?

Here's one case where it happens:

# ps -ef | grep \ lock | grep -v grep; /usr/sbin/shorewall-lite blacklist 185.170.42.18; ps -ef | grep \ lock | grep -v grep
[notice there are no lock processes from the first ps | grep ]
   ERROR: The blacklist command is not supported in the current Shorewall Lite configuration
root     31693     1  0 07:00 pts/1    00:00:00 lock /etc/shorewall-lite/state/lock
# sleep 5
# ps -ef | grep \ lock | grep -v grep
root     31693     1  0 07:00 pts/1    00:00:00 lock /etc/shorewall-lite/state/lock

Not really sure why shorewall thinks the blacklist command is not
available, but that is orthogonal. The issue here is clearly there is
at least one code path where shorewall exits without cleaning up its
lock file. I wonder how many other non-happy-path cases there are like
this.

Cheers,
b.
Re: [Shorewall-users] locking processes left behind
On 1/17/2018 6:40 PM, Matt Darfeuille wrote:
> On 1/17/2018 5:10 PM, Brian J. Murrell wrote:
>> On Mon, 2018-01-15 at 06:52 +0100, Matt Darfeuille wrote:
>
>>
>>> 1) I don't get that behavior on OpenWRT 15.05.1 with shorewall-lite
>>> 5.1.10.2.
>>
>> Are you building your OpenWRT packages yourself or using something
>> built upstream/elsewhere?
>>
>
> I build Shorewall from git:
>
> http://shorewall.org/Build.html
>
>>> Note that Shorewall is patched:
>>> 1a68d87c9
>>> c518cfaa4
>>> 09980cc75
>>> e0a757ea0
>>> 550003f0f
>>
>> What is the reference point for those sha1s? Which git repository are
>> they referring to?
>>
>
> I cherry-picked those commits on top of the tag 5.1.10.2.
> See also the above link.
>

I just upgraded to 5.1.11 so the git step is not required anymore.

-Matt
Re: [Shorewall-users] locking processes left behind
On 1/17/2018 5:10 PM, Brian J. Murrell wrote:
> On Mon, 2018-01-15 at 06:52 +0100, Matt Darfeuille wrote:
>>
>> Ok -- You seem to be fixated on not restricting the use of the lock
>> utility.
>
> Please don't be so defensive. That's not it at all. I am trying to
> debug a problem here and in trying to do that I am trying to understand
> the nature of the problem and the tool(s) involved in the problem.
>
> I'm just trying to understand why you think "lock" should be exclusive
> to OpenWRT and that you suggested previously that even though I am on
> OpenWRT that I should "pass other options to lock". It seems like you
> might know something I don't. I'm just trying to discover what that
> is.
>

Actually, in your first e-mail you never mentioned the platform you were
using:

https://sourceforge.net/p/shorewall/mailman/message/36189733/

My understanding is that lock is only used on OpenWRT.

Given that I wasn't aware of which platform you were on at the time, I
thought that the patch might be useful.

>
>> 1) I don't get that behavior on OpenWRT 15.05.1 with shorewall-lite
>> 5.1.10.2.
>
> Are you building your OpenWRT packages yourself or using something
> built upstream/elsewhere?
>

I build Shorewall from git:

http://shorewall.org/Build.html

>> Note that Shorewall is patched:
>> 1a68d87c9
>> c518cfaa4
>> 09980cc75
>> e0a757ea0
>> 550003f0f
>
> What is the reference point for those sha1s? Which git repository are
> they referring to?
>

I cherry-picked those commits on top of the tag 5.1.10.2.
See also the above link.

>> 2) Using /etc for temporary files.
>
> Why /etc/ and not /tmp?
>

What I meant was: why are the lock files created in
/etc/shorewall-lite/* and not in /var? You seem to be using an old
version of OpenWRT/shorewall-lite?

-Matt
Re: [Shorewall-users] locking processes left behind
On Mon, 2018-01-15 at 06:52 +0100, Matt Darfeuille wrote:
>
> Ok -- You seem to be fixated on not restricting the use of the lock
> utility.

Please don't be so defensive. That's not it at all. I am trying to
debug a problem here and in trying to do that I am trying to understand
the nature of the problem and the tool(s) involved in the problem.

I'm just trying to understand why you think "lock" should be exclusive
to OpenWRT and that you suggested previously that even though I am on
OpenWRT that I should "pass other options to lock". It seems like you
might know something I don't. I'm just trying to discover what that
is.

> As I don't see this conversation going anywhere

I guess it will only not go anywhere if everyone chooses to not
participate in trying to shine a light on a potential bug. That would
be a pity.

> 1) I don't get that behavior on OpenWRT 15.05.1 with shorewall-lite
> 5.1.10.2.

Are you building your OpenWRT packages yourself or using something
built upstream/elsewhere?

> Note that Shorewall is patched:
> 1a68d87c9
> c518cfaa4
> 09980cc75
> e0a757ea0
> 550003f0f

What is the reference point for those sha1s? Which git repository are
they referring to?

> 2) Using /etc for temporary files.

Why /etc/ and not /tmp?

Cheers,
b.
Re: [Shorewall-users] locking processes left behind
On 1/14/2018 6:47 PM, Brian J. Murrell wrote:
> On Sun, 2018-01-14 at 17:06 +0100, Matt Darfeuille wrote:
>>
>> The code is only working and tested on OpenWRT.
>
> Which is my platform.
>
>> Lock on OpenWRT has limited functionalities.
>
> Such as what? And why would its being limited (only) on OpenWRT mean
> that you make its use exclusive to OpenWRT?
>
> Do any of the limited functionalities shed any light on how a lock
> can be running without the file it's trying to lock being present?

Ok -- You seem to be fixated on not restricting the use of the lock
utility.

> To be clear, the file is still there, it's just that its directory entry
> has been removed. So something removed it without unlocking it?
>

As I don't see this conversation going anywhere I'll just say the
following:

1) I don't get that behavior on OpenWRT 15.05.1 with shorewall-lite
5.1.10.2.
Note that Shorewall is patched:
1a68d87c9
c518cfaa4
09980cc75
e0a757ea0
550003f0f

2) Using /etc for temporary files.

-Matt
Re: [Shorewall-users] locking processes left behind
On Sun, 2018-01-14 at 17:06 +0100, Matt Darfeuille wrote:
>
> The code is only working and tested on OpenWRT.

Which is my platform.

> Lock on OpenWRT has limited functionalities.

Such as what? And why would it being limited (only) on OpenWRT mean
that you make its use exclusive to OpenWRT?

Does any of the limited functionalities shed any light on how a lock
can be running without the file it's trying to lock being present?

To be clear, the file is still there, it's just that its directory entry
has been removed. So something removed it without unlocking it?

Cheers,
b.
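The point about the file still existing without a directory entry can be reproduced in isolation. The following is a minimal sketch in plain POSIX shell, not anything taken from Shorewall or from the OpenWRT lock utility; the function name and the use of file descriptor 9 are arbitrary choices for the demonstration. It opens a temporary file, removes its name, and then reads the contents back through the still-open descriptor.

```sh
#!/bin/sh
# Hypothetical demonstration helper (not part of Shorewall or OpenWRT).
# Removing a file only removes its directory entry; the data stays
# reachable for any process that still has the file open.
demo_unlinked_open_file() {
    f=$(mktemp) || return 1
    echo "held" > "$f"

    exec 9< "$f"        # keep the file open on descriptor 9
    rm -f "$f"          # drop the directory entry

    if [ -e "$f" ]; then
        echo "path still exists"
    else
        echo "path gone"
    fi

    # The contents are still readable through the open descriptor.
    read -r content <&9
    echo "content via fd: $content"

    exec 9<&-           # closing the last descriptor releases the file
}

demo_unlinked_open_file
```

On Linux, a descriptor in this state is what shows up under /proc/PID/fd with a "(deleted)" suffix, which matches the listings posted at the start of this thread.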
Re: [Shorewall-users] locking processes left behind
On 1/14/2018 4:54 PM, Brian J. Murrell wrote:
> On Sun, 2018-01-14 at 16:46 +0100, Matt Darfeuille wrote:
>>
>> If you are not on OpenWRT you may want to apply the attached patch.
>
> So, don't use "lock" on platforms other than OpenWRT? But it's OK to
> use any of the other locking methods on non-OpenWRT machines?
>

The code is only working and tested on OpenWRT.
Given your issue you may want to pass other option to lock.

> Why are you promoting the use of "lock" exclusive to OpenWRT?
>

Lock on OpenWRT has limited functionalities.

-Matt
--
Matt Darfeuille
Re: [Shorewall-users] locking processes left behind
On Sun, 2018-01-14 at 16:46 +0100, Matt Darfeuille wrote:
>
> If you are not on OpenWRT you may want to apply the attached patch.

So, don't use "lock" on platforms other than OpenWRT? But it's OK to
use any of the other locking methods on non-OpenWRT machines?

Why are you promoting the use of "lock" exclusive to OpenWRT?

Cheers,
b.
Re: [Shorewall-users] locking processes left behind
On 1/12/2018 7:33 PM, Tom Eastep wrote:
> On 01/12/2018 04:09 AM, Brian J. Murrell wrote:
>> I frequently get the following situation on my shorewall-lite machine,
>> typically right after boot, where "shorewall-lite restart" has been run
>> many times, overlapping even, I am sure as interfaces are brought up,
>> etc.:
>>
>> [...]
>>
>> I wonder if anyone has any theories on what is going on here?
>>
>
> I do not -- here is the only code in Shorewall that invoked 'lock' (one
> line might appear folded by my mailer):
>
> mutex_on()
> [...]
>
> The part that invoked 'lock' was contributed, as I recall.
>

If you are not on
Re: [Shorewall-users] locking processes left behind
On 01/12/2018 04:09 AM, Brian J. Murrell wrote:
> I frequently get the following situation on my shorewall-lite machine,
> typically right after boot, where "shorewall-lite restart" has been run
> many times, overlapping even, I am sure as interfaces are brought up,
> etc.:
>
> [...]
>
> I wonder if anyone has any theories on what is going on here?
>

I do not -- here is the only code in Shorewall that invoked 'lock' (one
line might appear folded by my mailer):

mutex_on()
{
    local try
    try=0
    local lockf
    lockf=${LOCKFILE:=${VARDIR}/lock}
    local lockpid
    local lockd

    MUTEX_TIMEOUT=${MUTEX_TIMEOUT:-60}

    if [ $MUTEX_TIMEOUT -gt 0 ]; then

        lockd=$(dirname $LOCKFILE)

        [ -d "$lockd" ] || mkdir -p "$lockd"

        if [ -f $lockf ]; then
            lockpid=`cat ${lockf} 2> /dev/null`
            if [ -z "$lockpid" -o $lockpid = 0 ]; then
                rm -f ${lockf}
                error_message "WARNING: Stale lockfile ${lockf} removed"
            elif [ $lockpid -eq $$ ]; then
                return 0
            elif ! ps | grep -v grep | qt grep ${lockpid}; then
                rm -f ${lockf}
                error_message "WARNING: Stale lockfile ${lockf} from pid ${lockpid} removed"
            fi
        fi

        if qt mywhich lockfile; then
            lockfile -${MUTEX_TIMEOUT} -r1 ${lockf}
            chmod u+w ${lockf}
            echo $$ > ${lockf}
            chmod u-w ${lockf}
        elif qt mywhich lock; then
            lock ${lockf}
            chmod u=r ${lockf}
        else
            while [ -f ${lockf} -a ${try} -lt ${MUTEX_TIMEOUT} ] ; do
                sleep 1
                try=$((${try} + 1))
            done

            if [ ${try} -lt ${MUTEX_TIMEOUT} ] ; then
                # Create the lockfile
                echo $$ > ${lockf}
            else
                echo "Giving up on lock file ${lockf}" >&2
            fi
        fi
    fi
}

The part that invoked 'lock' was contributed, as I recall.

-Tom
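One detail of the stale-lock test in mutex_on() is worth illustrating. Piping ps through grep ${lockpid} matches the PID as a substring anywhere on a line, so a recorded PID of 109 would also match a line for process 1094. Whether that plays any part in the problem reported in this thread is not established here. The sketch below shows one alternative liveness check using kill -0. The helper name pid_alive is hypothetical and is not part of Shorewall, and it assumes the shell's kill accepts signal 0 and that the caller is allowed to signal the target (for example, running as root).

```sh
#!/bin/sh
# Hypothetical helper (not part of Shorewall): succeed only if the given
# PID names a process that currently exists and that we may signal.
pid_alive() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;    # empty or non-numeric input
    esac
    [ "$1" -gt 0 ] || return 1      # PID 0 would address the process group
    kill -0 "$1" 2>/dev/null        # signal 0: existence check, nothing is sent
}

# The substring problem with the grep-based test, on a made-up ps line:
if echo " 1094 root  lock /etc/shorewall-lite/state/lock" | grep -q 109; then
    echo "grep 109 also matches the line for PID 1094"
fi

# The exact-PID check on the current shell:
if pid_alive "$$"; then
    echo "pid $$ is alive"
fi
```

A non-root caller gets a failure from kill -0 for processes owned by other users, so this check is only a drop-in candidate where the script runs with sufficient privileges.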
[Shorewall-users] locking processes left behind
I frequently get the following situation on my shorewall-lite machine,
typically right after boot, where "shorewall-lite restart" has been run
many times, overlapping even, I am sure, as interfaces are brought up,
etc.:

# ps -ef | grep shorewall
root      1094     1  0 Jan11 ?        00:00:01 lock /etc/shorewall-lite/state/lock
root      2507     1  0 Jan11 ?        00:00:01 lock /etc/shorewall-lite/state/lock
root      3124     1  0 Jan11 ?        00:00:00 lock /etc/shorewall-lite/state/lock
root      7608  6935  0 06:29 pts/1    00:00:00 grep shorewall
root     11770     1  0 Jan11 ?        00:00:00 lock /etc/shorewall-lite/state/lock

# ls -l /etc/shorewall-lite/state/lock
ls: /etc/shorewall-lite/state/lock: No such file or directory
# ls -l /proc/{1094,2507,3124,11770}/fd/
/proc/1094/fd/:
lr-x------    1 root     root            64 Jan 12 06:26 0 -> /dev/null
l-wx------    1 root     root            64 Jan 12 06:26 1 -> pipe:[1896]
l-wx------    1 root     root            64 Jan 12 06:26 12 -> pipe:[1896]
l-wx------    1 root     root            64 Jan 12 06:26 2 -> pipe:[1896]
lrwx------    1 root     root            64 Jan 12 06:26 3 -> /etc/shorewall-lite/state/lock (deleted)

/proc/11770/fd/:
lrwx------    1 root     root            64 Jan 12 06:30 0 -> /dev/null
lrwx------    1 root     root            64 Jan 12 06:30 1 -> /dev/null
l-wx------    1 root     root            64 Jan 12 06:30 13 -> pipe:[1718]
lrwx------    1 root     root            64 Jan 12 06:30 2 -> /dev/null
lrwx------    1 root     root            64 Jan 12 06:30 3 -> /etc/shorewall-lite/state/lock (deleted)

/proc/2507/fd/:
lrwx------    1 root     root            64 Jan 12 06:30 0 -> /dev/null
lrwx------    1 root     root            64 Jan 12 06:30 1 -> /dev/null
l-wx------    1 root     root            64 Jan 12 06:30 13 -> pipe:[1718]
lrwx------    1 root     root            64 Jan 12 06:30 2 -> /dev/null
lrwx------    1 root     root            64 Jan 12 06:30 3 -> /etc/shorewall-lite/state/lock (deleted)

/proc/3124/fd/:
lrwx------    1 root     root            64 Jan 12 06:30 0 -> /dev/null
lrwx------    1 root     root            64 Jan 12 06:30 1 -> /dev/null
l-wx------    1 root     root            64 Jan 12 06:30 13 -> pipe:[1718]
lrwx------    1 root     root            64 Jan 12 06:30 2 -> /dev/null
lrwx------    1 root     root            64 Jan 12 06:30 3 -> /etc/shorewall-lite/state/lock (deleted)
# lsof -n | grep state/lock
lock       1094 root    3u   REG  0,145 13407 /etc/shorewall-lite/state/lock (deleted)
lock       2507 root    3u   REG  0,145 13415 /etc/shorewall-lite/state/lock (deleted)
lock       3124 root    3u   REG  0,145 13448 /etc/shorewall-lite/state/lock (deleted)
lock      11770 root    3u   REG  0,146 13663 /etc/shorewall-lite/state/lock (deleted)

I wonder if anyone has any theories on what is going on here?

Cheers,
b.
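On a machine without lsof, the same information can be gathered by walking /proc directly. The following is a rough sketch under a few assumptions: a Linux-style /proc layout, a readlink utility, and enough privilege to read other processes' fd directories. The function name find_deleted_holders and its optional second argument (an alternative proc root, useful for trying it on a copy) are made up for illustration.

```sh
#!/bin/sh
# Hypothetical helper: list processes that hold an open descriptor on a
# deleted file whose path contains PATTERN.
# Usage: find_deleted_holders PATTERN [PROC_ROOT]
# Prints one line per match: "<pid> <fd> <link target>".
find_deleted_holders() {
    pattern=$1
    proc_root=${2:-/proc}

    for fd in "$proc_root"/[0-9]*/fd/*; do
        [ -L "$fd" ] || continue                    # skip an unmatched glob
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            *"$pattern"*" (deleted)")
                pid=${fd#"$proc_root"/}             # strip the proc root
                pid=${pid%%/*}                      # keep only the PID part
                echo "$pid ${fd##*/} $target"
                ;;
        esac
    done
}

# Example invocation (left commented out so the sketch has no side effects):
# find_deleted_holders state/lock
```

For the situation shown above, this would be expected to report descriptor 3 of each leftover lock process, in the same spirit as the lsof output.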