Re: [Shorewall-users] locking processes left behind

2019-03-02 Thread Brian J. Murrell
On Mon, 2019-02-25 at 07:14 -0500, Brian J. Murrell wrote:
> 
> On the "lite" machine I have
> 5.2.0.4.

~sigh~ Which is one single bugfix release behind what I need.

Cheers,
b.



___
Shorewall-users mailing list
Shorewall-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/shorewall-users


Re: [Shorewall-users] locking processes left behind

2019-02-25 Thread Brian J. Murrell
On Wed, 2018-08-01 at 14:27 -0700, Tom Eastep wrote:
> 

So, getting back to this...

> Any results with this patch? I would like to include this fix in
> 5.2.0.5.

On the shorewall (i.e. not the "lite") machine, I have 5.2.0.5
installed that includes this patch.  On the "lite" machine I have
5.2.0.4.

I am still seeing locking problems.  For example:

 1084 ?        S      0:06 /usr/sbin/foolsm -c /etc/foolsm/foolsm.conf -p /var/run/foolsm.pid
 2332 ?        S      0:00  \_ /bin/sh /etc/foolsm/script up Cogeco 24.226.22.71 eth0.2 root 10 6 5 0 10 0 0 17690
 2362 ?        S      0:00      \_ /bin/sh /etc/shorewall-lite/state/firewall enable eth0.2
 2377 ?        S      0:00          \_ lock /etc/shorewall-lite/state/lock
 2928 ?        S      0:00 lock /etc/shorewall-lite/state/lock

And then once I kill the stale lock:

# kill 2928

the above blocked "firewall enable eth0.2" proceeds but then leaves
behind another stale lock:

root      8558     1  0 06:54 ?        00:00:00 lock /etc/shorewall-lite/state/lock

So something is still amiss here.
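
One way to spot such leftovers is to look for "lock" processes whose parent is PID 1, as in the listings above. The following is only a sketch under assumptions: it relies on a Linux /proc, and status_field and list_orphans are hypothetical helper names, not part of Shorewall or OpenWRT. A process reparented to PID 1 is not necessarily stale, so the output is a list of candidates to inspect, not to kill blindly.

```shell
#!/bin/sh
# Sketch only: list processes with a given name whose parent is PID 1.
# Assumes a Linux /proc; both function names are hypothetical.

status_field() {
    # $1: pid, $2: field name in /proc/<pid>/status (for example Name or PPid)
    awk -v key="$2:" '$1 == key { print $2; exit }' "/proc/$1/status" 2>/dev/null
}

list_orphans() {
    # $1: process name to look for (for example lock)
    for dir in /proc/[0-9]*; do
        pid=${dir#/proc/}
        [ "$(status_field "$pid" Name)" = "$1" ] || continue
        [ "$(status_field "$pid" PPid)" = 1 ] || continue
        echo "$pid"
    done
}

# Print the PIDs of candidate leftover lock processes, one per line.
list_orphans lock
```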

Cheers,
b.





Re: [Shorewall-users] locking processes left behind

2018-08-01 Thread Tom Eastep
On 07/31/2018 08:26 AM, Tom Eastep wrote:

> 
> In addition to that issue, the preceding line was incorrect ('qt' was
> incorrect). Revised second patch attached.
> 

Any results with this patch? I would like to include this fix in 5.2.0.5.

Thanks,
-Tom
-- 
Tom Eastep\   Q: What do you get when you cross a mobster with
Shoreline, \ an international standard?
Washington, USA \ A: Someone who makes you an offer you can't
http://shorewall.org \   understand
  \___





Re: [Shorewall-users] locking processes left behind

2018-07-31 Thread Tom Eastep
On 07/31/2018 05:19 AM, Brian J. Murrell wrote:
> On Tue, 2018-07-31 at 10:48 +0200, Matt Darfeuille wrote:
>>
>> The attached patch (MUTEX_ON_TAKE1.patch) includes:
>> - MUTEX_ON.patch
>> - MUTEX_ON1.patch
>> - The above correction (changing 'openwrt' to 'lockbin')
>> - My take on fixing the above error ( "/sbin/shorewall-lite: line 14:
>> -n: not found")
>>
>> Brian/Tom, thoughts?
> 
> Yeah.  The use of "test" in that patch was a new one on me so I had
> just assumed the use was correct.  But even in a bash shell here, that
> syntax doesn't work:
> 
> $ foo=[ -n "$foobar" ]
> bash: -n: command not found
> 
> Nor do any more fully qualified uses of it (to eliminate shell
> interpretation as being the cause of the problem):
> 
> $ foo=\[ -n "$foobar" ]
> bash: -n: command not found
> $ foo=/bin/[ -n "$foobar" ]
> bash: -n: command not found
> $ foo=/bin/test -n "$foobar"
> bash: -n: command not found
> 
> What does work (doesn't produce a syntax error) is:
> 
> $ foo=$([ -n "$foobar" ])
> $ echo $foo
> $ foobar=foobar
> $ foo=$([ -n "$foobar" ])
> $ echo $foo
> 
> $
> 
> but I can't see how that helps us since $foo is the same when the test
> passes and fails.  The best approximation of what I think Tom was
> trying to achieve is:
> 
> $ unset foobar
> $ [ -n "$foobar" ]
> $ echo $?
> 1
> $ foobar=foobar
> $ [ -n "$foobar" ]
> $ echo $?
> 0
> 
> But that still doesn't give us a "boolean" type value in $openwrt that
> we can use in if statements.  So, I think what we want is:
> 
> if [ -n "$lockbin" -a -h "$lockbin" ]; then
> openwrt=true
> else
> openwrt=false
> fi
> 

In addition to that issue, the preceding line was incorrect ('qt' was
incorrect). Revised second patch attached.

-Tom
-- 
Tom Eastep\   Q: What do you get when you cross a mobster with
Shoreline, \ an international standard?
Washington, USA \ A: Someone who makes you an offer you can't
http://shorewall.org \   understand
  \___
diff --git a/Shorewall-core/lib.common b/Shorewall-core/lib.common
index 205fc705f..7df2879c7 100644
--- a/Shorewall-core/lib.common
+++ b/Shorewall-core/lib.common
@@ -751,6 +751,8 @@ mutex_on()
 lockf=${LOCKFILE:=${VARDIR}/lock}
 local lockpid
 local lockd
+local lockbin
+local openwrt
 
 MUTEX_TIMEOUT=${MUTEX_TIMEOUT:-60}
 
@@ -760,28 +762,33 @@ mutex_on()
 
 	[ -d "$lockd" ] || mkdir -p "$lockd"
 
+	lockbin=$(mywhich lock)
+	[ -n "$lockbin" -a -h "$lockbin" ] && openwrt=Yes
+
 	if [ -f $lockf ]; then
 	lockpid=`cat ${lockf} 2> /dev/null`
 	if [ -z "$lockpid" ] || [ $lockpid = 0 ]; then
 		rm -f ${lockf}
 		error_message "WARNING: Stale lockfile ${lockf} removed"
-	elif [ $lockpid -eq $$ ]; then
-fatal_error "Mutex_on confusion"
-	elif ! qt ps --pid ${lockpid}; then
-		rm -f ${lockf}
-		error_message "WARNING: Stale lockfile ${lockf} from pid ${lockpid} removed"
+	elif [ -z "$openwrt" ]; then
+		if [ $lockpid -eq $$ ]; then
+fatal_error "Mutex_on confusion"
+		elif ! qt ps --pid ${lockpid}; then
+		rm -f ${lockf}
+		error_message "WARNING: Stale lockfile ${lockf} from pid ${lockpid} removed"
+		fi
 	fi
 	fi
 
-	if qt mywhich lockfile; then
+	if [ -n "$openwrt" ]; then
+	lock ${lockf} || fatal_error "Can't lock ${lockf}"
+	g_havemutex="lock -u ${lockf}"
+	elif qt mywhich lockfile; then
 	lockfile -${MUTEX_TIMEOUT} -r1 ${lockf} || fatal_error "Can't lock ${lockf}"
 	g_havemutex="rm -f ${lockf}"
 	chmod u+w ${lockf}
 	echo $$ > ${lockf}
 	chmod u-w ${lockf}
-	elif qt mywhich lock; then
-	lock ${lockf} || fatal_error "Can't lock ${lockf}"
-	g_havemutex="lock -u ${lockf}"
 	else
 	while [ -f ${lockf} -a ${try} -lt ${MUTEX_TIMEOUT} ] ; do
 		sleep 1




Re: [Shorewall-users] locking processes left behind

2018-07-31 Thread Brian J. Murrell
On Tue, 2018-07-31 at 10:48 +0200, Matt Darfeuille wrote:
> 
> The attached patch (MUTEX_ON_TAKE1.patch) includes:
> - MUTEX_ON.patch
> - MUTEX_ON1.patch
> - The above correction (changing 'openwrt' to 'lockbin')
> - My take on fixing the above error ( "/sbin/shorewall-lite: line 14:
> -n: not found")
> 
> Brian/Tom, thoughts?

Yeah.  The use of "test" in that patch was a new one on me so I had
just assumed the use was correct.  But even in a bash shell here, that
syntax doesn't work:

$ foo=[ -n "$foobar" ]
bash: -n: command not found

Nor do any more fully qualified uses of it (to eliminate shell
interpretation as being the cause of the problem):

$ foo=\[ -n "$foobar" ]
bash: -n: command not found
$ foo=/bin/[ -n "$foobar" ]
bash: -n: command not found
$ foo=/bin/test -n "$foobar"
bash: -n: command not found

What does work (doesn't produce a syntax error) is:

$ foo=$([ -n "$foobar" ])
$ echo $foo
$ foobar=foobar
$ foo=$([ -n "$foobar" ])
$ echo $foo

$

but I can't see how that helps us since $foo is the same when the test
passes and fails.  The best approximation of what I think Tom was
trying to achieve is:

$ unset foobar
$ [ -n "$foobar" ]
$ echo $?
1
$ foobar=foobar
$ [ -n "$foobar" ]
$ echo $?
0

But that still doesn't give us a "boolean" type value in $openwrt that
we can use in if statements.  So, I think what we want is:

if [ -n "$lockbin" -a -h "$lockbin" ]; then
openwrt=true
else
openwrt=false
fi
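
For what it's worth, the reason all of the foo=... attempts fail the same way is that the shell parses foo=[ -n "$foobar" ] as a temporary assignment (foo is set to "[") followed by a command named -n, hence "-n: command not found". A minimal sketch of the working pattern follows, written for POSIX sh. is_symlink is a hypothetical helper name, and the temporary symlink only stands in for a lock applet link; neither comes from Shorewall.

```shell
#!/bin/sh
# Sketch: turn a test into a flag variable in POSIX sh.
# is_symlink is a hypothetical helper, not a Shorewall function.
is_symlink() {
    # Succeeds only when $1 is non-empty and names a symbolic link.
    [ -n "$1" ] && [ -h "$1" ]
}

tmpdir=$(mktemp -d)
ln -s /bin/busybox "$tmpdir/lock"    # stand-in for a lock applet symlink

# Case 1: the path is a symlink, so the flag is set.
lockbin="$tmpdir/lock"
if is_symlink "$lockbin"; then openwrt=Yes; else openwrt=; fi
flag_for_symlink=$openwrt

# Case 2: nothing was found (empty path), so the flag stays empty.
lockbin=
if is_symlink "$lockbin"; then openwrt=Yes; else openwrt=; fi
flag_for_empty=$openwrt

rm -rf "$tmpdir"
echo "symlink: '${flag_for_symlink}' empty: '${flag_for_empty}'"
```

An empty/non-empty flag like this can be tested later with [ -n "$openwrt" ] or [ -z "$openwrt" ], whereas true/false strings would have to be compared explicitly or executed as commands.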

Cheers,
b.





[Shorewall-users] locking processes left behind

2018-07-31 Thread Matt Darfeuille
On 7/30/2018 6:11 PM, Matt Darfeuille wrote:
> On 7/28/2018 5:19 PM, Tom Eastep wrote:
>> On 07/28/2018 08:16 AM, Brian J. Murrell wrote:
>>> On Sat, 2018-07-28 at 08:03 -0700, Tom Eastep wrote:
>>>> diff --git a/Shorewall-core/lib.common b/Shorewall-core/lib.common
>>>> index 205fc705f..bbebf0936 100644
>>>> --- a/Shorewall-core/lib.common
>>>> +++ b/Shorewall-core/lib.common
>>>> @@ -751,6 +751,8 @@ mutex_on()
>>>>  lockf=${LOCKFILE:=${VARDIR}/lock}
>>>>  local lockpid
>>>>  local lockd
>>>> +local lockbin
>>>> +local openwrt
>>>>  
>>>>  MUTEX_TIMEOUT=${MUTEX_TIMEOUT:-60}
>>>>  
>>>> @@ -760,28 +762,33 @@ mutex_on()
>>>>  
>>>>    [ -d "$lockd" ] || mkdir -p "$lockd"
>>>>  
>>>> +  lockbin=$(qt mywhich lock)
>>>> +  openwrt=[ -n "$openwrt" -a -h "$openwrt" ]
>>>
>>> Did you mean:
>>>
>>> +   openwrt=[ -n "$lockbin" -a -h "$lockbin" ]
>>>
>>> here?
>>>
>>
>> Yes.
>>
> 
> With both patches applied (MUTEX_ON.patch and MUTEX_ON1.patch) and the
> above correction I get:
> 
> root@LEDE:~# shorewall-lite restart
> /sbin/shorewall-lite: line 14: -n: not found
>WARNING: Stale lockfile /lib/shorewall-lite/lock from pid 854 removed
> Stopping Shorewall Lite
> 
> What am I missing ?
> 

Ok -- Here's my take on the above:

The attached patch (MUTEX_ON_TAKE1.patch) includes:
- MUTEX_ON.patch
- MUTEX_ON1.patch
- The above correction (changing 'openwrt' to 'lockbin')
- My take on fixing the above error ( "/sbin/shorewall-lite: line 14:
-n: not found")

Brian/Tom, thoughts?

-Matt
-- 
Matt Darfeuille

diff --git a/Shorewall-core/lib.common b/Shorewall-core/lib.common
index c373a31ad..36800c887 100644
--- a/Shorewall-core/lib.common
+++ b/Shorewall-core/lib.common
@@ -751,6 +751,8 @@ mutex_on()
 lockf=${LOCKFILE:=${VARDIR}/lock}
 local lockpid
 local lockd
+local lockbin
+local openwrt
 
 MUTEX_TIMEOUT=${MUTEX_TIMEOUT:-60}
 
@@ -760,29 +762,33 @@ mutex_on()
 
[ -d "$lockd" ] || mkdir -p "$lockd"
 
+   lockbin=$(qt mywhich lock)
+   [ -n "$lockbin" -a -h "$lockbin" ] && openwrt=Yes || openwrt=
+
if [ -f $lockf ]; then
lockpid=`cat ${lockf} 2> /dev/null`
if [ -z "$lockpid" ] || [ $lockpid = 0 ]; then
rm -f ${lockf}
error_message "WARNING: Stale lockfile ${lockf} removed"
-   elif [ $lockpid -eq $$ ]; then
-return 0
-   elif ! ps | grep -v grep | qt grep ${lockpid}; then
-   rm -f ${lockf}
-   error_message "WARNING: Stale lockfile ${lockf} from pid ${lockpid} removed"
+   elif [ -z "$openwrt" ]; then
+   if [ $lockpid -eq $$ ]; then
+fatal_error "Mutex_on confusion"
+   elif ! qt ps --pid ${lockpid}; then
+   rm -f ${lockf}
+   error_message "WARNING: Stale lockfile ${lockf} from pid ${lockpid} removed"
+   fi
fi
fi
 
-   if qt mywhich lockfile; then
-   lockfile -${MUTEX_TIMEOUT} -r1 ${lockf}
+   if [ -n "$openwrt" ]; then
+   lock ${lockf} || fatal_error "Can't lock ${lockf}"
+   g_havemutex="lock -u ${lockf}"
+   elif qt mywhich lockfile; then
+   lockfile -${MUTEX_TIMEOUT} -r1 ${lockf} || fatal_error "Can't lock ${lockf}"
g_havemutex="rm -f ${lockf}"
chmod u+w ${lockf}
echo $$ > ${lockf}
chmod u-w ${lockf}
-   elif qt mywhich lock; then
-   lock ${lockf}
-   g_havemutex="lock -u ${lockf} && rm -f ${lockf}"
-   chmod u=r ${lockf}
else
while [ -f ${lockf} -a ${try} -lt ${MUTEX_TIMEOUT} ] ; do
sleep 1


Re: [Shorewall-users] locking processes left behind

2018-07-30 Thread Matt Darfeuille
On 7/28/2018 5:19 PM, Tom Eastep wrote:
> On 07/28/2018 08:16 AM, Brian J. Murrell wrote:
>> On Sat, 2018-07-28 at 08:03 -0700, Tom Eastep wrote:
>>> diff --git a/Shorewall-core/lib.common b/Shorewall-core/lib.common
>>> index 205fc705f..bbebf0936 100644
>>> --- a/Shorewall-core/lib.common
>>> +++ b/Shorewall-core/lib.common
>>> @@ -751,6 +751,8 @@ mutex_on()
>>>  lockf=${LOCKFILE:=${VARDIR}/lock}
>>>  local lockpid
>>>  local lockd
>>> +local lockbin
>>> +local openwrt
>>>  
>>>  MUTEX_TIMEOUT=${MUTEX_TIMEOUT:-60}
>>>  
>>> @@ -760,28 +762,33 @@ mutex_on()
>>>  
>>> [ -d "$lockd" ] || mkdir -p "$lockd"
>>>  
>>> +   lockbin=$(qt mywhich lock)
>>> +   openwrt=[ -n "$openwrt" -a -h "$openwrt" ]
>>
>> Did you mean:
>>
>> +openwrt=[ -n "$lockbin" -a -h "$lockbin" ]
>>
>> here?
>>
> 
> Yes.
> 

With both patches applied (MUTEX_ON.patch and MUTEX_ON1.patch) and the
above correction I get:

root@LEDE:~# shorewall-lite restart
/sbin/shorewall-lite: line 14: -n: not found
   WARNING: Stale lockfile /lib/shorewall-lite/lock from pid 854 removed
Stopping Shorewall Lite

What am I missing ?

-Matt
-- 
Matt Darfeuille



Re: [Shorewall-users] locking processes left behind

2018-07-29 Thread Tom Eastep
On 07/28/2018 08:40 AM, Matt Darfeuille wrote:
> On 7/28/2018 5:19 PM, Tom Eastep wrote:
>> On 07/28/2018 08:16 AM, Brian J. Murrell wrote:
>>> On Sat, 2018-07-28 at 08:03 -0700, Tom Eastep wrote:
>>>> diff --git a/Shorewall-core/lib.common b/Shorewall-core/lib.common
>>>> index 205fc705f..bbebf0936 100644
>>>> --- a/Shorewall-core/lib.common
>>>> +++ b/Shorewall-core/lib.common
>>>> @@ -751,6 +751,8 @@ mutex_on()
>>>>  lockf=${LOCKFILE:=${VARDIR}/lock}
>>>>  local lockpid
>>>>  local lockd
>>>> +local lockbin
>>>> +local openwrt
>>>>  
>>>>  MUTEX_TIMEOUT=${MUTEX_TIMEOUT:-60}
>>>>  
>>>> @@ -760,28 +762,33 @@ mutex_on()
>>>>  
>>>>    [ -d "$lockd" ] || mkdir -p "$lockd"
>>>>  
>>>> +  lockbin=$(qt mywhich lock)
>>>> +  openwrt=[ -n "$openwrt" -a -h "$openwrt" ]
>>>
>>> Did you mean:
>>>
>>> +   openwrt=[ -n "$lockbin" -a -h "$lockbin" ]
>>>
>>> here?
>>>
>>
>> Yes.
>>
> 
> Tom, would you mind pushing the master branch of the trunk (code) repo?
> 

This last change is not committed yet, and I don't want to push until
Brian is happy with the result.

-Tom
-- 
Tom Eastep\   Q: What do you get when you cross a mobster with
Shoreline, \ an international standard?
Washington, USA \ A: Someone who makes you an offer you can't
http://shorewall.org \   understand
  \___





Re: [Shorewall-users] locking processes left behind

2018-07-28 Thread Matt Darfeuille
On 7/28/2018 5:19 PM, Tom Eastep wrote:
> On 07/28/2018 08:16 AM, Brian J. Murrell wrote:
>> On Sat, 2018-07-28 at 08:03 -0700, Tom Eastep wrote:
>>> diff --git a/Shorewall-core/lib.common b/Shorewall-core/lib.common
>>> index 205fc705f..bbebf0936 100644
>>> --- a/Shorewall-core/lib.common
>>> +++ b/Shorewall-core/lib.common
>>> @@ -751,6 +751,8 @@ mutex_on()
>>>  lockf=${LOCKFILE:=${VARDIR}/lock}
>>>  local lockpid
>>>  local lockd
>>> +local lockbin
>>> +local openwrt
>>>  
>>>  MUTEX_TIMEOUT=${MUTEX_TIMEOUT:-60}
>>>  
>>> @@ -760,28 +762,33 @@ mutex_on()
>>>  
>>> [ -d "$lockd" ] || mkdir -p "$lockd"
>>>  
>>> +   lockbin=$(qt mywhich lock)
>>> +   openwrt=[ -n "$openwrt" -a -h "$openwrt" ]
>>
>> Did you mean:
>>
>> +openwrt=[ -n "$lockbin" -a -h "$lockbin" ]
>>
>> here?
>>
> 
> Yes.
> 

Tom, would you mind pushing the master branch of the trunk (code) repo?

-Matt
-- 
Matt Darfeuille



Re: [Shorewall-users] locking processes left behind

2018-07-28 Thread Tom Eastep
On 07/28/2018 08:16 AM, Brian J. Murrell wrote:
> On Sat, 2018-07-28 at 08:03 -0700, Tom Eastep wrote:
>> diff --git a/Shorewall-core/lib.common b/Shorewall-core/lib.common
>> index 205fc705f..bbebf0936 100644
>> --- a/Shorewall-core/lib.common
>> +++ b/Shorewall-core/lib.common
>> @@ -751,6 +751,8 @@ mutex_on()
>>  lockf=${LOCKFILE:=${VARDIR}/lock}
>>  local lockpid
>>  local lockd
>> +local lockbin
>> +local openwrt
>>  
>>  MUTEX_TIMEOUT=${MUTEX_TIMEOUT:-60}
>>  
>> @@ -760,28 +762,33 @@ mutex_on()
>>  
>>  [ -d "$lockd" ] || mkdir -p "$lockd"
>>  
>> +lockbin=$(qt mywhich lock)
>> +openwrt=[ -n "$openwrt" -a -h "$openwrt" ]
> 
> Did you mean:
> 
> + openwrt=[ -n "$lockbin" -a -h "$lockbin" ]
> 
> here?
> 

Yes.

-Tom
-- 
Tom Eastep\   Q: What do you get when you cross a mobster with
Shoreline, \ an international standard?
Washington, USA \ A: Someone who makes you an offer you can't
http://shorewall.org \   understand
  \___





Re: [Shorewall-users] locking processes left behind

2018-07-28 Thread Brian J. Murrell
On Sat, 2018-07-28 at 08:03 -0700, Tom Eastep wrote:
> diff --git a/Shorewall-core/lib.common b/Shorewall-core/lib.common
> index 205fc705f..bbebf0936 100644
> --- a/Shorewall-core/lib.common
> +++ b/Shorewall-core/lib.common
> @@ -751,6 +751,8 @@ mutex_on()
>  lockf=${LOCKFILE:=${VARDIR}/lock}
>  local lockpid
>  local lockd
> +local lockbin
> +local openwrt
>  
>  MUTEX_TIMEOUT=${MUTEX_TIMEOUT:-60}
>  
> @@ -760,28 +762,33 @@ mutex_on()
>  
>   [ -d "$lockd" ] || mkdir -p "$lockd"
>  
> + lockbin=$(qt mywhich lock)
> + openwrt=[ -n "$openwrt" -a -h "$openwrt" ]

Did you mean:

+   openwrt=[ -n "$lockbin" -a -h "$lockbin" ]

here?

Cheers,
b.




Re: [Shorewall-users] locking processes left behind

2018-07-28 Thread Brian J. Murrell
On Sat, 2018-07-28 at 15:04 +0200, Matt Darfeuille wrote:
> 
> Tom, with MUTEX_ON.patch applied, on LEDE '--pid' is not available or is
> it done on purpose?:
> 
> root@LEDE:~# ps --pid
> ps: unrecognized option: pid
> BusyBox v1.25.1 () multi-call binary.
> 
> Usage: ps
> 
> Show list of processes
> 
> w   Wide output
> root@LEDE:~#

Yeah.  The busybox ps is pretty dumb, er, I guess, "basic" is the PC
term.  I didn't notice this in manually trying out the changes Tom's
patch is proposing because I typically install the better ps from
procps-ng-ps.

So, shorewall-lite on LEDE could require that package, but probably the
lighter-weight approach is to use [ -d /proc/$pid ] as the test for a
process existing, or even "kill -0 $pid".
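
A rough sketch of such a check, assuming nothing beyond POSIX sh and, for the fallback, a Linux /proc. pid_alive is a hypothetical helper name, not something in lib.common:

```shell
#!/bin/sh
# Sketch: check whether a PID is alive without relying on 'ps --pid'.
# pid_alive is a hypothetical helper, not part of Shorewall.
pid_alive() {
    [ -n "$1" ] || return 1
    # kill -0 sends no signal; it only reports whether the PID can be signalled.
    kill -0 "$1" 2>/dev/null && return 0
    # Fallback for Linux: kill -0 also fails on a live PID we may not signal.
    [ -d "/proc/$1" ]
}

# A short-lived child that has already exited and been reaped.
sh -c 'exit 0' &
dead_pid=$!
wait "$dead_pid"

if pid_alive "$$"; then self=alive; else self=dead; fi
if pid_alive "$dead_pid"; then child=alive; else child=dead; fi
echo "self: $self, child: $child"
```

Either test avoids depending on which ps happens to be installed. Neither protects against the PID having been reused by an unrelated process.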

> Brian, any thoughts on the patch?

I installed it but have not had opportunity to reboot my router to see
how well it operates.

Cheers,
b.




Re: [Shorewall-users] locking processes left behind

2018-07-28 Thread Tom Eastep
On 07/28/2018 06:04 AM, Matt Darfeuille wrote:
> On 7/26/2018 8:41 PM, Tom Eastep wrote:
>> On 07/26/2018 09:54 AM, Brian J. Murrell wrote:
>>> On Thu, 2018-07-26 at 08:51 -0700, Tom Eastep wrote:

>>>> Brian,
>>>
>>> Hi Tom,
>>>
>>>> Can you point me to online documentation that describes how this
>>>> 'lock'
>>>> utility is supposed to work?
>>>
>>> It's a busybox applet added to busybox by OpenWRT.  Here's the source
>>> for it:
>>>
>>> https://github.com/openwrt/openwrt/blob/154c0c4006daf41e2cbb6c8b7ad5557f83dfea3e/package/utils/busybox/patches/220-add_lock_util.patch
>>>
>>> There seems to be minimal documentation at:
>>>
>>> https://wiki.openwrt.org/inbox/script/manual_busybox_functions_openwrt
>>>
>>
>> Thanks Brian,
>>
>> Please see if this patch resolves the issue. The lib.common file has not
>> changed since 5.1.12.3 (other than the banner version) so you can apply
>> the patch on your admin system then copy lib.common to the router.
>>
> 
> Tom, with MUTEX_ON.patch applied, on LEDE '--pid' is not available or is
> it done on purpose?:
> 
> root@LEDE:~# ps --pid
> ps: unrecognized option: pid
> BusyBox v1.25.1 () multi-call binary.
> 
> Usage: ps
> 
> Show list of processes
> 
> w   Wide output
> root@LEDE:~#
> 
> Brian, any thoughts on the patch?
> 

This patch should resolve the 'ps' issue, as 'ps' is no longer invoked
on OpenWRT.

-Tom
-- 
Tom Eastep\   Q: What do you get when you cross a mobster with
Shoreline, \ an international standard?
Washington, USA \ A: Someone who makes you an offer you can't
http://shorewall.org \   understand
  \___
diff --git a/Shorewall-core/lib.common b/Shorewall-core/lib.common
index 205fc705f..bbebf0936 100644
--- a/Shorewall-core/lib.common
+++ b/Shorewall-core/lib.common
@@ -751,6 +751,8 @@ mutex_on()
 lockf=${LOCKFILE:=${VARDIR}/lock}
 local lockpid
 local lockd
+local lockbin
+local openwrt
 
 MUTEX_TIMEOUT=${MUTEX_TIMEOUT:-60}
 
@@ -760,28 +762,33 @@ mutex_on()
 
 	[ -d "$lockd" ] || mkdir -p "$lockd"
 
+	lockbin=$(qt mywhich lock)
+	openwrt=[ -n "$openwrt" -a -h "$openwrt" ]
+
 	if [ -f $lockf ]; then
 	lockpid=`cat ${lockf} 2> /dev/null`
 	if [ -z "$lockpid" ] || [ $lockpid = 0 ]; then
 		rm -f ${lockf}
 		error_message "WARNING: Stale lockfile ${lockf} removed"
-	elif [ $lockpid -eq $$ ]; then
-fatal_error "Mutex_on confusion"
-	elif ! qt ps --pid ${lockpid}; then
-		rm -f ${lockf}
-		error_message "WARNING: Stale lockfile ${lockf} from pid ${lockpid} removed"
+	elif [ -z "$openwrt" ]; then
+		if [ $lockpid -eq $$ ]; then
+fatal_error "Mutex_on confusion"
+		elif ! qt ps --pid ${lockpid}; then
+		rm -f ${lockf}
+		error_message "WARNING: Stale lockfile ${lockf} from pid ${lockpid} removed"
+		fi
 	fi
 	fi
 
-	if qt mywhich lockfile; then
+	if [ -n "$openwrt" ]; then
+	lock ${lockf} || fatal_error "Can't lock ${lockf}"
+	g_havemutex="lock -u ${lockf}"
+	elif qt mywhich lockfile; then
 	lockfile -${MUTEX_TIMEOUT} -r1 ${lockf} || fatal_error "Can't lock ${lockf}"
 	g_havemutex="rm -f ${lockf}"
 	chmod u+w ${lockf}
 	echo $$ > ${lockf}
 	chmod u-w ${lockf}
-	elif qt mywhich lock; then
-	lock ${lockf} || fatal_error "Can't lock ${lockf}"
-	g_havemutex="lock -u ${lockf}"
 	else
 	while [ -f ${lockf} -a ${try} -lt ${MUTEX_TIMEOUT} ] ; do
 		sleep 1




Re: [Shorewall-users] locking processes left behind

2018-07-28 Thread Matt Darfeuille
On 7/26/2018 8:41 PM, Tom Eastep wrote:
> On 07/26/2018 09:54 AM, Brian J. Murrell wrote:
>> On Thu, 2018-07-26 at 08:51 -0700, Tom Eastep wrote:
>>>
>>> Brian,
>>
>> Hi Tom,
>>
>>> Can you point me to online documentation that describes how this
>>> 'lock'
>>> utility is supposed to work?
>>
>> It's a busybox applet added to busybox by OpenWRT.  Here's the source
>> for it:
>>
>> https://github.com/openwrt/openwrt/blob/154c0c4006daf41e2cbb6c8b7ad5557f83dfea3e/package/utils/busybox/patches/220-add_lock_util.patch
>>
>> There seems to be minimal documentation at:
>>
>> https://wiki.openwrt.org/inbox/script/manual_busybox_functions_openwrt
>>
> 
> Thanks Brian,
> 
> Please see if this patch resolves the issue. The lib.common file has not
> changed since 5.1.12.3 (other than the banner version) so you can apply
> the patch on your admin system then copy lib.common to the router.
> 

Tom, with MUTEX_ON.patch applied, '--pid' is not available on LEDE. Or is
it done on purpose?

root@LEDE:~# ps --pid
ps: unrecognized option: pid
BusyBox v1.25.1 () multi-call binary.

Usage: ps

Show list of processes

w   Wide output
root@LEDE:~#

Brian, any thoughts on the patch?

-Matt
-- 
Matt Darfeuille



Re: [Shorewall-users] locking processes left behind

2018-07-26 Thread Tom Eastep
On 07/26/2018 09:54 AM, Brian J. Murrell wrote:
> On Thu, 2018-07-26 at 08:51 -0700, Tom Eastep wrote:
>>
>> Brian,
> 
> Hi Tom,
> 
>> Can you point me to online documentation that describes how this
>> 'lock'
>> utility is supposed to work?
> 
> It's a busybox applet added to busybox by OpenWRT.  Here's the source
> for it:
> 
> https://github.com/openwrt/openwrt/blob/154c0c4006daf41e2cbb6c8b7ad5557f83dfea3e/package/utils/busybox/patches/220-add_lock_util.patch
> 
> There seems to be minimal documentation at:
> 
> https://wiki.openwrt.org/inbox/script/manual_busybox_functions_openwrt
> 

Thanks Brian,

Please see if this patch resolves the issue. The lib.common file has not
changed since 5.1.12.3 (other than the banner version) so you can apply
the patch on your admin system then copy lib.common to the router.

Thanks,
-Tom
-- 
Tom Eastep\   Q: What do you get when you cross a mobster with
Shoreline, \ an international standard?
Washington, USA \ A: Someone who makes you an offer you can't
http://shorewall.org \   understand
  \___
diff --git a/Shorewall-core/lib.common b/Shorewall-core/lib.common
index 7567123a7..205fc705f 100644
--- a/Shorewall-core/lib.common
+++ b/Shorewall-core/lib.common
@@ -766,23 +766,22 @@ mutex_on()
 		rm -f ${lockf}
 		error_message "WARNING: Stale lockfile ${lockf} removed"
 	elif [ $lockpid -eq $$ ]; then
-return 0
-	elif ! ps | grep -v grep | qt grep ${lockpid}; then
+fatal_error "Mutex_on confusion"
+	elif ! qt ps --pid ${lockpid}; then
 		rm -f ${lockf}
 		error_message "WARNING: Stale lockfile ${lockf} from pid ${lockpid} removed"
 	fi
 	fi
 
 	if qt mywhich lockfile; then
-	lockfile -${MUTEX_TIMEOUT} -r1 ${lockf}
+	lockfile -${MUTEX_TIMEOUT} -r1 ${lockf} || fatal_error "Can't lock ${lockf}"
 	g_havemutex="rm -f ${lockf}"
 	chmod u+w ${lockf}
 	echo $$ > ${lockf}
 	chmod u-w ${lockf}
 	elif qt mywhich lock; then
-	lock ${lockf}
-	g_havemutex="lock -u ${lockf} && rm -f ${lockf}"
-	chmod u=r ${lockf}
+	lock ${lockf} || fatal_error "Can't lock ${lockf}"
+	g_havemutex="lock -u ${lockf}"
 	else
 	while [ -f ${lockf} -a ${try} -lt ${MUTEX_TIMEOUT} ] ; do
 		sleep 1




Re: [Shorewall-users] locking processes left behind

2018-07-26 Thread Matt Darfeuille
On 7/26/2018 6:54 PM, Brian J. Murrell wrote:
> On Thu, 2018-07-26 at 08:51 -0700, Tom Eastep wrote:
>>
>> Brian,
> 
> Hi Tom,
> 
>> Can you point me to online documentation that describes how this
>> 'lock'
>> utility is supposed to work?
> 
> It's a busybox applet added to busybox by OpenWRT.  Here's the source
> for it:
> 
> https://github.com/openwrt/openwrt/blob/154c0c4006daf41e2cbb6c8b7ad5557f83dfea3e/package/utils/busybox/patches/220-add_lock_util.patch
> 
> There seems to be minimal documentation at:
> 
> https://wiki.openwrt.org/inbox/script/manual_busybox_functions_openwrt
> 

root@LEDE:~# cat /etc/banner | grep -i reboot
   \  DE\  /Reboot (17.01.4, r3560-79f57e422d)
root@LEDE:~# lock -k
Usage: lock [-suw] 
-s  Use shared locking
-u  Unlock
-w  Wait for the lock to become free, don't acquire lock
-n  Don't wait for the lock to become free. Fail with exit code

root@LEDE:~#

HTH

-Matt
-- 
Matt Darfeuille



Re: [Shorewall-users] locking processes left behind

2018-07-26 Thread Brian J. Murrell
On Thu, 2018-07-26 at 08:51 -0700, Tom Eastep wrote:
> 
> Brian,

Hi Tom,

> Can you point me to online documentation that describes how this
> 'lock'
> utility is supposed to work?

It's a busybox applet added to busybox by OpenWRT.  Here's the source
for it:

https://github.com/openwrt/openwrt/blob/154c0c4006daf41e2cbb6c8b7ad5557f83dfea3e/package/utils/busybox/patches/220-add_lock_util.patch

There seems to be minimal documentation at:

https://wiki.openwrt.org/inbox/script/manual_busybox_functions_openwrt

Cheers,
b.




Re: [Shorewall-users] locking processes left behind

2018-07-26 Thread Tom Eastep
On 07/26/2018 07:01 AM, Brian J. Murrell wrote:
> On Thu, 2018-07-26 at 05:48 +0200, Matt Darfeuille wrote:
>>
>> As illustrated by this lingering thread, issues that are only present on
>> one platform made me move away from OpenWRT/LEDE.
> 
> The platform is not the problem.  The platform is just providing the
> tools.
> 
> Or are you suggesting that the "lock" tool on OpenWRT/LEDE is actually
> buggy?  Given that it's just a wrapper around flock() that seems
> unlikely.  But I'm happy to be proven wrong if you can provide a
> reproducer for the bug that I can submit upstream.  As much testing as
> I have done with the "lock" tool it operates as expected when used as
> expected.
> 
> Given the evidence, it seems like the file being locked is getting
> removed before the lock is released.
> 
> A reboot of my router this morning has reproduced the situation and
> this is what I see:
> 
> # ps -ef | grep lock
> root      2700  2666  0 07:13 ?        00:00:00 lock /etc/shorewall-lite/state/lock
> root      3234     1  0 07:13 ?        00:00:00 lock /etc/shorewall-lite/state/lock
> 
> # lsof -n -p 3234
> COMMAND  PID USER   FD   TYPE DEVICE SIZE/OFF  NODE NAME
> lock    3234 root  cwd    DIR   0,15      656   258 /
> lock    3234 root  rtd    DIR   0,15      656   258 /
> lock    3234 root  txt    REG  254,0   308533  1786 /bin/busybox
> lock    3234 root  mem    REG  254,0    77040   213 /lib/libgcc_s.so.1
> lock    3234 root  mem    REG  254,0   601968   402 /lib/libc.so
> lock    3234 root    0u   CHR    1,3      0t0   317 /dev/null
> lock    3234 root    1u   CHR    1,3      0t0   317 /dev/null
> lock    3234 root    2u   CHR    1,3      0t0   317 /dev/null
> lock    3234 root    3u   REG   0,14        5 61617 /etc/shorewall-lite/state/lock (deleted)
> lock    3234 root   13w  FIFO    0,8      0t0  1732 pipe
> 
> # cat /proc/2700/fd/3
> 3234
> 
> # strace -f -p 3234
> strace: Process 3234 attached
> restart_syscall(<... resuming interrupted syscall_516 ...>) = 0
> nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffcd900) = 0
> nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffcd900) = 0
> nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffcd900) = 0
> nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffcd900) = 0
> nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffcd900) = 0
> nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffcd900) = 0
> nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffcd900) = 0
> nanosleep({tv_sec=1, tv_nsec=0}, ^Cstrace: Process 3234 detached
>  
> 
> # strace -f -p 2700
> strace: Process 2700 attached
> flock(3, LOCK_EX^Cstrace: Process 2700 detached
>  
> 
> Hrm.  Given:
> 
>   g_havemutex="lock -u ${lockf} && rm -f ${lockf}"
> 
> Observe this particular set of operations:
> 
> tty1# lock /tmp/mylockfile
> tty1# [has the lock and returns]
> tty2# lock /tmp/mylockfile
> [blocks waiting for locker1 to release the lock as we can see:]
> # lsof | grep /tmp/mylockfile 
> lock   1249root3u  REG   0,135 352778 
> /tmp/mylockfile
> lock   1250root3u  REG   0,135 352778 
> /tmp/mylockfile
> tty1# lock -u /tmp/mylockfile && rm -f /tmp/mylockfile
> tty1# [returns, releasing the lock to tty2]
> tty2# [returns from blocked state, now holds the lock]
> # lsof | grep /tmp/mylockfile 
> lock   1404root3u  REG   0,135 352778 
> /tmp/mylockfile (deleted)
> tty3# lock /tmp/mylockfile 
> tty3# [wait, what?  it returns even though tty2 has the lock!]
> # lsof | grep /tmp/mylockfile 
> lock   1404root3u  REG   0,135 352778 
> /tmp/mylockfile (deleted)
> lock   1439root3u  REG   0,135 362181 
> /tmp/mylockfile
> 
> So at this point both tty2 and tty3 believe they have the lock and have
> returned, allowing them to do their work on top of each other.
> 
> I don't think a process can simply remove the lock file just because it
> has released it's lock on it.  It can only be removed if there are no
> more outstanding locks on it.  Or just don't remove it.  lock seems to
> function perfectly fine with the file pre-existing.
> 
> I'm not sure I can draw a line from this problem to the stale locks
> problem, but it's probably a good thing to fix before continuing to try
> to debug the stale locks problem.
> 

Brian,

Can you point me to online documentation that describes how this 'lock'
utility is supposed to work?

Thanks,
-Tom

-- 
Tom Eastep\   Q: What do you get when you cross a mobster with
Shoreline, \ an international standard?
Washington, USA \ A: Someone who makes you an offer you can't
http://shorewall.org \   understand
  \___



signature.asc
Description: OpenPGP digital signature
--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! 

Re: [Shorewall-users] locking processes left behind

2018-07-26 Thread Brian J. Murrell
On Thu, 2018-07-26 at 05:48 +0200, Matt Darfeuille wrote:
> 
> As illustrated by this lingering thread, issues that are only present
> on
> one platform makes me moved away from OpenWRT/LEDE.

The platform is not the problem.  The platform is just providing the
tools.

Or are you suggesting that the "lock" tool on OpenWRT/LEDE is actually
buggy?  Given that it's just a wrapper around flock() that seems
unlikely.  But I'm happy to be proven wrong if you can provide a
reproducer for the bug that I can submit upstream.  In all the testing
I have done with the "lock" tool, it has operated as expected when used as
expected.

Given the evidence, it seems like the file being locked is getting
removed before the lock is released.

A reboot of my router this morning has reproduced the situation and
this is what I see:

# ps -ef | grep lock
root      2700  2666  0 07:13 ?        00:00:00 lock /etc/shorewall-lite/state/lock
root      3234     1  0 07:13 ?        00:00:00 lock /etc/shorewall-lite/state/lock

# lsof -n -p 3234
COMMAND  PID USER   FD   TYPE DEVICE SIZE/OFF  NODE NAME
lock    3234 root  cwd    DIR   0,15      656   258 /
lock    3234 root  rtd    DIR   0,15      656   258 /
lock    3234 root  txt    REG  254,0   308533  1786 /bin/busybox
lock    3234 root  mem    REG  254,0    77040   213 /lib/libgcc_s.so.1
lock    3234 root  mem    REG  254,0   601968   402 /lib/libc.so
lock    3234 root    0u   CHR    1,3      0t0   317 /dev/null
lock    3234 root    1u   CHR    1,3      0t0   317 /dev/null
lock    3234 root    2u   CHR    1,3      0t0   317 /dev/null
lock    3234 root    3u   REG   0,14        5 61617 /etc/shorewall-lite/state/lock (deleted)
lock    3234 root   13w  FIFO    0,8      0t0  1732 pipe

# cat /proc/2700/fd/3
3234

# strace -f -p 3234
strace: Process 3234 attached
restart_syscall(<... resuming interrupted syscall_516 ...>) = 0
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffcd900) = 0
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffcd900) = 0
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffcd900) = 0
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffcd900) = 0
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffcd900) = 0
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffcd900) = 0
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffcd900) = 0
nanosleep({tv_sec=1, tv_nsec=0}, ^Cstrace: Process 3234 detached
 

# strace -f -p 2700
strace: Process 2700 attached
flock(3, LOCK_EX^Cstrace: Process 2700 detached
 

Hrm.  Given:

g_havemutex="lock -u ${lockf} && rm -f ${lockf}"

Observe this particular set of operations:

tty1# lock /tmp/mylockfile
tty1# [has the lock and returns]
tty2# lock /tmp/mylockfile
[blocks waiting for locker1 to release the lock as we can see:]
# lsof | grep /tmp/mylockfile
lock   1249 root    3u  REG   0,13        5 352778 /tmp/mylockfile
lock   1250 root    3u  REG   0,13        5 352778 /tmp/mylockfile
tty1# lock -u /tmp/mylockfile && rm -f /tmp/mylockfile
tty1# [returns, releasing the lock to tty2]
tty2# [returns from blocked state, now holds the lock]
# lsof | grep /tmp/mylockfile
lock   1404 root    3u  REG   0,13        5 352778 /tmp/mylockfile (deleted)
tty3# lock /tmp/mylockfile
tty3# [wait, what?  it returns even though tty2 has the lock!]
# lsof | grep /tmp/mylockfile
lock   1404 root    3u  REG   0,13        5 352778 /tmp/mylockfile (deleted)
lock   1439 root    3u  REG   0,13        5 362181 /tmp/mylockfile

So at this point both tty2 and tty3 believe they have the lock and have
returned, allowing them to do their work on top of each other.

I don't think a process can simply remove the lock file just because it
has released its lock on it.  It can only be removed if there are no
more outstanding locks on it.  Or just don't remove it.  lock seems to
function perfectly fine with the file pre-existing.

I'm not sure I can draw a line from this problem to the stale locks
problem, but it's probably a good thing to fix before continuing to try
to debug the stale locks problem.
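
To make the inode swap in the transcript above concrete, here is a minimal sketch that needs neither "lock" nor flock(). It only shows that once the path has been removed and recreated, the name refers to a different inode from the one a waiter still has open. The temp path and variable names are made up for illustration; they are not taken from the shorewall scripts.

```shell
#!/bin/sh
# Sketch: after "unlock && rm", the lock path can name a different file
# from the one a blocked waiter still holds open.
lockf=$(mktemp)                      # hypothetical stand-in for ${lockf}

exec 3<>"$lockf"                     # the waiter's open descriptor (like fd 3 in lsof)
old_inode=$(ls -i "$lockf" | awk '{print $1}')

rm -f "$lockf"                       # the releaser removes the path
: > "$lockf"                         # a newcomer recreates it

new_inode=$(ls -i "$lockf" | awk '{print $1}')
echo "waiter holds inode $old_inode, path now names inode $new_inode"

exec 3>&-                            # close the waiter's descriptor
rm -f "$lockf"
```

Since flock() locks belong to the open file rather than to its name, a lock taken on the old inode and a lock taken on the new one never conflict, which matches what tty2 and tty3 saw. Leaving the file in place sidesteps this.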

Cheers,
b.


signature.asc
Description: This is a digitally signed message part
--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Shorewall-users mailing list
Shorewall-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/shorewall-users


Re: [Shorewall-users] locking processes left behind

2018-07-25 Thread Matt Darfeuille



On 7/26/2018 12:56 AM, Tom Eastep wrote:
> On 07/25/2018 10:14 AM, Brian J. Murrell wrote:
>> On Mon, 2018-07-16 at 07:12 -0400, Brian J. Murrell wrote:
>>>
>>> I think I finally do have the required versions now, yes?
>>
>> Am I correct about having the required versions now?
>>
>>> However, as you can see above, we still have stale/orphan
>>> locks/processes hanging around.
>>
>> If so, any ideas why stale locks are still getting left behind?
>>
> 
> It appears that you have the updated versions. But I have no idea why
> stale locks are still getting left behind, given that we don't seem to
> be seeing them on platforms other than OpenWRT. Matt - are you seeing
> this problem on your OpenWRT router(s)?
> 

As illustrated by this lingering thread, issues that are only present on
one platform made me move away from OpenWRT/LEDE.

If I recall correctly, the fix did the job.

-Matt
-- 
Matt Darfeuille



Re: [Shorewall-users] locking processes left behind

2018-07-25 Thread Tom Eastep
On 07/25/2018 10:14 AM, Brian J. Murrell wrote:
> On Mon, 2018-07-16 at 07:12 -0400, Brian J. Murrell wrote:
>>
>> I think I finally do have the required versions now, yes?
> 
> Am I correct about having the required versions now?
> 
>> However, as you can see above, we still have stale/orphan
>> locks/processes hanging around.
> 
> If so, any ideas why stale locks are still getting left behind?
> 

It appears that you have the updated versions. But I have no idea why
stale locks are still getting left behind, given that we don't seem to
be seeing them on platforms other than OpenWRT. Matt - are you seeing
this problem on your OpenWRT router(s)?

-Tom
-- 
Tom Eastep\   Q: What do you get when you cross a mobster with
Shoreline, \ an international standard?
Washington, USA \ A: Someone who makes you an offer you can't
http://shorewall.org \   understand
  \___





Re: [Shorewall-users] locking processes left behind

2018-07-25 Thread Brian J. Murrell
On Mon, 2018-07-16 at 07:12 -0400, Brian J. Murrell wrote:
> 
> I think I finally do have the required versions now, yes?

Am I correct about having the required versions now?

> However, as you can see above, we still have stale/orphan
> locks/processes hanging around.

If so, any ideas why stale locks are still getting left behind?

Cheers,
b.




Re: [Shorewall-users] locking processes left behind

2018-07-16 Thread Brian J. Murrell
On Sat, 2018-06-30 at 08:25 -0700, Tom Eastep wrote:
> 
> If 'shorewall show version' returns '5.2.0', then you do not have the
> fix on your administrative system. If it returns '5.2.0.1', then you
> do
> have the fix.

$ shorewall show version
   ERROR: Cannot read /etc/shorewall/shorewall.conf! (Hint: Are you root?)
$ sudo shorewall show version
   ERROR: Chain 'version' is not recognized by /sbin/iptables.
$ shorewall version
5.2.0.4
$ ssh gw shorewall-lite version\; ps -ef \| grep lock
5.1.12.3
root      3288     1  0 05:07 ?        00:00:00 lock /etc/shorewall-lite/state/lock
root      8106     1  0 05:09 ?        00:00:00 lock /etc/shorewall-lite/state/lock

I think I finally do have the required versions now, yes?

However, as you can see above, we still have stale/orphan
locks/processes hanging around.

> The script cannot insure idempotency when it is interrupted at an
> arbitrary point. It writes into its 'undo' files after the successful
> completion of an 'ip' command, so a failure after the command and
> before
> the 'undo' record is written can cause incorrect behavior the next
> time
> that the script is run.

Pity.  Although, I agree it's a difficult problem.  I usually solve
those kinds of problems by growing the/an "undo" stack as I "do".  That
is for every action I take, I push the undo of that operation onto a
stack that I can execute if I get stopped at any point.
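
A rough sketch of that undo-stack idea in plain sh, in case it helps the discussion. Everything here (push_undo, run_undo, step, the temp directory) is hypothetical and not taken from the shorewall scripts. One assumption beyond what I described above: the undo is recorded before the action rather than after, and each undo is chosen so that it is harmless if the action never happened.

```shell
#!/bin/sh
# Sketch of an in-memory undo stack; all names here are hypothetical.
workdir=$(mktemp -d)
undo_stack=""
undo_log=""

push_undo() {
    # Prepend, so the most recent action is undone first.
    undo_stack="$1
$undo_stack"
}

run_undo() {
    # One command per line, newest first.
    eval "$undo_stack"
    undo_stack=""
}

step() {
    # Record the undo before acting; "rm -f" is harmless if the file
    # was never created, which covers an interruption between the two.
    push_undo "rm -f '$workdir/$1'; undo_log=\"\$undo_log $1\""
    : > "$workdir/$1"
}

step a
step b
run_undo                      # pretend a later step failed
remaining=$(ls -A "$workdir")
rmdir "$workdir"
echo "undone in order:$undo_log"
```

An in-memory stack like this does not survive the shell being killed outright, so a real script would presumably still need to persist it somewhere, much as the 'undo' files do today.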

Cheers,
b.




Re: [Shorewall-users] locking processes left behind

2018-06-30 Thread Tom Eastep
On 06/28/2018 04:06 AM, Brian J. Murrell wrote:
> On Thu, 2018-04-12 at 09:10 -0700, Tom Eastep wrote:
>>
>> No -- it requires the firewall script to be compiled with the fix, as
>> well as having the fix installed on the shorewall[6]-lite firewall.
> 
> # rpm -q shorewall
> shorewall-5.2.0-0.01.fc28.noarch
> 
> # opkg info shorewall-lite
> Package: shorewall-lite
> Version: 5.1.12.3-1
> 
> So I should have the intended fix, yes?

If 'shorewall show version' returns '5.2.0', then you do not have the
fix on your administrative system. If it returns '5.2.0.1', then you do
have the fix.

> 
> From a reboot of my router this morning:
> 
> # ps -ef | grep lock
> root  3166 1  0 06:24 ?00:00:00 lock 
> /etc/shorewall-lite/state/lock
> root  7089 1  0 06:26 ?00:00:00 lock 
> /etc/shorewall-lite/state/lock
> 
> So the locking appears to be still leaving orphans behind.
> 
> I have been considering an alternative approach to this locking.  When
> multiple shorewall invocations race, I really only likely care about
> the last one winning the race cleanly, since they are most likely
> racing just because of an interface status change and the last to enter
> the race will configure the firewall with the status of all interfaces
> (and other state) already known to him.
> 
> So really, the last shorewall process to enter a race should just kill
> off it's predecessors and continue on it's way.
> 
> That requires that the firewall installation script be able to deal
> with any kind of previous partial state though.  Not sure how well
> shorewall is able to do that.  It would require the ultimate of
> idempotency.
> 

The script cannot ensure idempotency when it is interrupted at an
arbitrary point. It writes into its 'undo' files after the successful
completion of an 'ip' command, so a failure after the command and before
the 'undo' record is written can cause incorrect behavior the next time
that the script is run.

-Tom
-- 
Tom Eastep\   Q: What do you get when you cross a mobster with
Shoreline, \ an international standard?
Washington, USA \ A: Someone who makes you an offer you can't
http://shorewall.org \   understand
  \___





Re: [Shorewall-users] locking processes left behind

2018-06-28 Thread Brian J. Murrell
On Thu, 2018-04-12 at 09:10 -0700, Tom Eastep wrote:
> 
> No -- it requires the firewall script to be compiled with the fix, as
> well as having the fix installed on the shorewall[6]-lite firewall.

# rpm -q shorewall
shorewall-5.2.0-0.01.fc28.noarch

# opkg info shorewall-lite
Package: shorewall-lite
Version: 5.1.12.3-1

So I should have the intended fix, yes?

From a reboot of my router this morning:

# ps -ef | grep lock
root      3166     1  0 06:24 ?        00:00:00 lock /etc/shorewall-lite/state/lock
root      7089     1  0 06:26 ?        00:00:00 lock /etc/shorewall-lite/state/lock

So the locking appears to be still leaving orphans behind.

I have been considering an alternative approach to this locking.  When
multiple shorewall invocations race, I most likely only care about
the last one winning the race cleanly, since they are most likely
racing just because of an interface status change and the last to enter
the race will configure the firewall with the status of all interfaces
(and other state) already known to it.

So really, the last shorewall process to enter a race should just kill
off its predecessors and continue on its way.

That requires that the firewall installation script be able to deal
with any kind of previous partial state though.  Not sure how well
shorewall is able to do that.  It would require the ultimate in
idempotency.
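
For illustration only, the "last one in wins" idea might look something like the sketch below. It uses a PID file instead of a lock; the pidfile path and the supersede function are invented for this example, and a background sleep stands in for an earlier invocation that is still running.

```shell
#!/bin/sh
# Sketch only: the newest invocation kills its predecessor and takes over.
pidfile=$(mktemp)                     # hypothetical path
me=$$

# Stand-in for an earlier, still-running invocation.
sleep 30 &
old_pid=$!
echo "$old_pid" > "$pidfile"

supersede() {
    prev=$(cat "$pidfile" 2>/dev/null)
    if [ -n "$prev" ] && [ "$prev" != "$me" ] && kill -0 "$prev" 2>/dev/null; then
        kill "$prev"                  # ask the predecessor to stop
    fi
    echo "$me" > "$pidfile"           # we are now the current owner
}

supersede
wait "$old_pid" 2>/dev/null || true   # reap the stand-in before checking on it
if kill -0 "$old_pid" 2>/dev/null; then predecessor=alive; else predecessor=gone; fi
owner=$(cat "$pidfile")
rm -f "$pidfile"
echo "predecessor is $predecessor, owner is $owner"
```

Reading the PID file and then killing is itself not atomic, so two newcomers arriving together could still step on each other. This only shows the shape of the idea, and it does nothing about the partial-state problem.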

Cheers,
b.




Re: [Shorewall-users] locking processes left behind

2018-04-12 Thread Tom Eastep
On 04/11/2018 08:15 PM, Brian J. Murrell wrote:
> On Wed, 2018-04-11 at 19:09 -0700, Tom Eastep wrote:
>>
>> Did you read the 5.1.12.3 release notes?
> 
> Ahhh.  No.  I read the 5.1.12 release notes but didn't go any further
> when I didn't see it.  Should have really.
> 
>> 1)  Previously, the Shorewall[6][-lite] lock file was not always
>> released when an error occurred. This resulted in:
>>
>> - A warning message saying that a stale lock file has been
>> removed
>> - 'lock' processes remaining after shorewall[6][-lite] terminated
>>   (only reported on OpenWRT).
>>
>> That has been corrected so that the lock file is released at exit
>> if it hasn't been released already.
> 
> Awesome.  Building it right now.  Which brings up the question... in a
> shorewall[6]-lite deployment the fix for this is 100% on the -lite
> side, yes?

No -- it requires the firewall script to be compiled with the fix, as
well as having the fix installed on the shorewall[6]-lite firewall.

> 
> As an aside, is shorewall maintained in any public git (or other SCM)
> repo?
> 

I see that Matt answered that in an earlier post.

-Tom
-- 
Tom Eastep\   Q: What do you get when you cross a mobster with
Shoreline, \ an international standard?
Washington, USA \ A: Someone who makes you an offer you can't
http://shorewall.org \   understand
  \___





Re: [Shorewall-users] locking processes left behind

2018-04-11 Thread Matt Darfeuille
On 4/12/2018 5:15 AM, Brian J. Murrell wrote:
> On Wed, 2018-04-11 at 19:09 -0700, Tom Eastep wrote:
>>
>> Did you read the 5.1.12.3 release notes?
> 
> Ahhh.  No.  I read the 5.1.12 release notes but didn't go any further
> when I didn't see it.  Should have really.
> 
>> 1)  Previously, the Shorewall[6][-lite] lock file was not always
>> released when an error occurred. This resulted in:
>>
>> - A warning message saying that a stale lock file has been
>> removed
>> - 'lock' processes remaining after shorewall[6][-lite] terminated
>>   (only reported on OpenWRT).
>>
>> That has been corrected so that the lock file is released at exit
>> if it hasn't been released already.
> 
> Awesome.  Building it right now.  Which brings up the question... in a
> shorewall[6]-lite deployment the fix for this is 100% on the -lite
> side, yes?
> 
> As an aside, is shorewall maintained in any public git (or other SCM)
> repo?
> 

http://shorewall.org/Build.html
https://sourceforge.net/p/shorewall/_list/git?source=navbar

https://sourceforge.net/p/shorewall/mailman/message/36243307/

-Matt
-- 
Matt Darfeuille



Re: [Shorewall-users] locking processes left behind

2018-04-11 Thread Brian J. Murrell
On Wed, 2018-04-11 at 19:09 -0700, Tom Eastep wrote:
> 
> Did you read the 5.1.12.3 release notes?

Ahhh.  No.  I read the 5.1.12 release notes but didn't go any further
when I didn't see it.  Should have really.

> 1)  Previously, the Shorewall[6][-lite] lock file was not always
> released when an error occurred. This resulted in:
> 
> - A warning message saying that a stale lock file has been
> removed
> - 'lock' processes remaining after shorewall[6][-lite] terminated
>   (only reported on OpenWRT).
> 
> That has been corrected so that the lock file is released at exit
> if it hasn't been released already.

Awesome.  Building it right now.  Which brings up the question... in a
shorewall[6]-lite deployment the fix for this is 100% on the -lite
side, yes?

As an aside, is shorewall maintained in any public git (or other SCM)
repo?

Cheers,
b.




Re: [Shorewall-users] locking processes left behind

2018-04-11 Thread Tom Eastep
On 04/11/2018 06:30 PM, Brian J. Murrell wrote:
> On Wed, 2018-02-28 at 08:51 -0800, Tom Eastep wrote:
>>
>> There are quite a few, but they are only an issue for people who have
>> to
>> rely on the obscure 'lock' utility. The rest just get a 'stale lock
>> file
>> removed' message the next time that they run shorewall[6][-lite].
>> I'll
>> try to come up with a fix against 5.1.12...
> 
> Did any solutions for this end up making it into 5.1.12?
> 

Did you read the 5.1.12.3 release notes?


Problems corrected:

1)  Previously, the Shorewall[6][-lite] lock file was not always
released when an error occurred. This resulted in:

- A warning message saying that a stale lock file has been removed
- 'lock' processes remaining after shorewall[6][-lite] terminated
  (only reported on OpenWRT).

That has been corrected so that the lock file is released at exit
if it hasn't been released already.


-Tom
-- 
Tom Eastep\   Q: What do you get when you cross a mobster with
Shoreline, \ an international standard?
Washington, USA \ A: Someone who makes you an offer you can't
http://shorewall.org \   understand
  \___





Re: [Shorewall-users] locking processes left behind

2018-04-11 Thread Brian J. Murrell
On Wed, 2018-02-28 at 08:51 -0800, Tom Eastep wrote:
> 
> There are quite a few, but they are only an issue for people who have
> to
> rely on the obscure 'lock' utility. The rest just get a 'stale lock
> file
> removed' message the next time that they run shorewall[6][-lite].
> I'll
> try to come up with a fix against 5.1.12...

Did any solutions for this end up making it into 5.1.12?

Cheers,
b.




Re: [Shorewall-users] locking processes left behind

2018-02-28 Thread Brian J. Murrell
On Wed, 2018-02-28 at 08:51 -0800, Tom Eastep wrote:
> 
> There are quite a few, but they are only an issue for people who have
> to
> rely on the obscure 'lock' utility.

It's from busybox, FWIW.

> The rest just get a 'stale lock file
> removed' message the next time that they run shorewall[6][-lite]. 

Sure, but really, code should not be written to depend on the staleness
of locks being able to be determined.  Any code that takes out a lock
should release it when it's done.  I realize I am preaching to the
choir, yes.  :-)

So, to that end, when I use something like lock in a shell script I do
it like this:

lock /tmp/foo
trap 'lock -u /tmp/foo' EXIT

so that I *know* the lock will be released on every/any code path to
script exit.
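
A slightly fuller sketch of the same pattern, showing that the release still happens when the script bails out on an error path. Since busybox "lock" may not be available everywhere, mkdir/rmdir stand in for "lock"/"lock -u" here; the lock path and function name are made up for the example.

```shell
#!/bin/sh
# Sketch: release a lock on every exit path. mkdir stands in for
# "lock /tmp/foo" so this runs without busybox; the path is made up.
lockdir="${TMPDIR:-/tmp}/demo-lock.$$"

run_job() (
    mkdir "$lockdir" || exit 1          # acquire
    trap 'rmdir "$lockdir"' EXIT        # release however we leave
    echo "working under the lock"
    exit 3                              # simulated error path
)

status=0
run_job || status=$?
if [ -d "$lockdir" ]; then leftover=yes; else leftover=no; fi
echo "exit status $status, lock left behind: $leftover"
```

The job runs in a subshell so that the EXIT trap fires when the job ends rather than when the calling script does.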

Not sure how feasible that is for you to do even with mutex_on() being
shell script but just thought I would share.

> I'll
> try to come up with a fix against 5.1.12...

Awesome.  I wonder how many other issues that will resolve.

Cheers,
b.




Re: [Shorewall-users] locking processes left behind

2018-02-28 Thread Tom Eastep
On 02/28/2018 04:04 AM, Brian J. Murrell wrote:
> On Fri, 2018-01-12 at 07:09 -0500, Brian J. Murrell wrote:
>> I frequently get the following situation on my shorewall-lite
>> machine,
>> typically right after boot, where "shorewall-lite restart" has been
>> run
>> many times, overlapping even, I am sure as interfaces are brought up,
>> etc.:
>>
>> # ps -ef | grep shorewall
>> root  1094 1  0 Jan11 ?00:00:01 lock /etc/shorewall-
>> lite/state/lock
>> root  2507 1  0 Jan11 ?00:00:01 lock /etc/shorewall-
>> lite/state/lock
>> root  3124 1  0 Jan11 ?00:00:00 lock /etc/shorewall-
>> lite/state/lock
>> root  7608  6935  0 06:29 pts/100:00:00 grep shorewall
>> root 11770 1  0 Jan11 ?00:00:00 lock /etc/shorewall-
>> lite/state/lock
> ...
>> I wonder if anyone has any theories on what is going on here?
> 
> Here's one case where it happens:
> 
> # ps -ef | grep \ lock | grep -v grep; /usr/sbin/shorewall-lite blacklist 
> 185.170.42.18; ps -ef | grep \ lock | grep -v grep
> [notice there are no lock processes from the first ps | grep ]
>ERROR: The blacklist command is not supported in the current Shorewall 
> Lite configuration
> root 31693 1  0 07:00 pts/100:00:00 lock 
> /etc/shorewall-lite/state/lock
> # sleep 5
> # ps -ef | grep \ lock | grep -v grep
> root 31693 1  0 07:00 pts/100:00:00 lock 
> /etc/shorewall-lite/state/lock
> 
> Not really sure why shorewall thinks the blacklist command is not
> available, but that is orthogonal.  The issue here is clearly there is
> at least one code path where shorewall exits without cleaning up it's
> lock file.  I wonder how many other non-happy-path cases there are like
> this.

There are quite a few, but they are only an issue for people who have to
rely on the obscure 'lock' utility. The rest just get a 'stale lock file
removed' message the next time that they run shorewall[6][-lite]. I'll
try to come up with a fix against 5.1.12...

-Tom
-- 
Tom Eastep\   Q: What do you get when you cross a mobster with
Shoreline, \ an international standard?
Washington, USA \ A: Someone who makes you an offer you can't
http://shorewall.org \   understand
  \___





Re: [Shorewall-users] locking processes left behind

2018-02-28 Thread Brian J. Murrell
On Fri, 2018-01-12 at 07:09 -0500, Brian J. Murrell wrote:
> I frequently get the following situation on my shorewall-lite
> machine,
> typically right after boot, where "shorewall-lite restart" has been
> run
> many times, overlapping even, I am sure as interfaces are brought up,
> etc.:
> 
> # ps -ef | grep shorewall
> root  1094 1  0 Jan11 ?00:00:01 lock /etc/shorewall-
> lite/state/lock
> root  2507 1  0 Jan11 ?00:00:01 lock /etc/shorewall-
> lite/state/lock
> root  3124 1  0 Jan11 ?00:00:00 lock /etc/shorewall-
> lite/state/lock
> root  7608  6935  0 06:29 pts/100:00:00 grep shorewall
> root 11770 1  0 Jan11 ?00:00:00 lock /etc/shorewall-
> lite/state/lock
...
> I wonder if anyone has any theories on what is going on here?

Here's one case where it happens:

# ps -ef | grep \ lock | grep -v grep; /usr/sbin/shorewall-lite blacklist 185.170.42.18; ps -ef | grep \ lock | grep -v grep
[notice there are no lock processes from the first ps | grep ]
   ERROR: The blacklist command is not supported in the current Shorewall Lite configuration
root     31693     1  0 07:00 pts/1    00:00:00 lock /etc/shorewall-lite/state/lock
# sleep 5
# ps -ef | grep \ lock | grep -v grep
root     31693     1  0 07:00 pts/1    00:00:00 lock /etc/shorewall-lite/state/lock

Not really sure why shorewall thinks the blacklist command is not
available, but that is orthogonal.  The issue here is that there is
clearly at least one code path where shorewall exits without cleaning up its
lock file.  I wonder how many other non-happy-path cases there are like
this.

Cheers,
b.




Re: [Shorewall-users] locking processes left behind

2018-01-18 Thread Matt Darfeuille
On 1/17/2018 6:40 PM, Matt Darfeuille wrote:
> On 1/17/2018 5:10 PM, Brian J. Murrell wrote:
>> On Mon, 2018-01-15 at 06:52 +0100, Matt Darfeuille wrote:
> 
>>
>>> 1)  I don't get that behavier on OpenWRT 15.05.1 with shorewall-lite
>>> 5.1.10.2.
>>
>> Are you building your OpenWRT packages yourself or using something
>> built upstream/elsewhere?
>>
> 
> I build Shorewall from git:
> 
> http://shorewall.org/Build.html
> 
>>> Note that Shorewall is patched:
>>> 1a68d87c9
>>> c518cfaa4
>>> 09980cc75
>>> e0a757ea0
>>> 550003f0f
>>
>> What is the reference point for those sha1s?  Which git repository are
>> they referring to?
>>
> 
> I cherry-picked those commits on top of the tag 5.1.10.2.
> See also the above link.
> 

I just upgraded to 5.1.11 so the git step is not required anymore.

-Matt
-- 
Matt Darfeuille



Re: [Shorewall-users] locking processes left behind

2018-01-17 Thread Matt Darfeuille
On 1/17/2018 5:10 PM, Brian J. Murrell wrote:
> On Mon, 2018-01-15 at 06:52 +0100, Matt Darfeuille wrote:
>>
>> Ok -- You seem to be fixated on not restricting the use of the lock
>> utility.
> 
> Please don't be so defensive.  That's not it at all.  I am trying to
> debug a problem here and in trying to do that I am trying to understand
> the nature of the problem and the tool(s) involved in the problem.
> 
> I'm just trying to understand why you think "lock" should be exclusive
> to OpenWRT and that you suggested previously that even though I am on
> OpenWRT that I should "pass other option to lock".  It seems like you
> might know something I don't.  I'm just trying to discover what that
> is.
> 

Actually, in your first e-mail you never mentioned the platform you were
using:

https://sourceforge.net/p/shorewall/mailman/message/36189733/

My understanding is that lock is only used on OpenWRT.
Given that I wasn't aware of which platform you were on at the time, I
thought that the patch might be useful.

> 
>> 1)  I don't get that behavier on OpenWRT 15.05.1 with shorewall-lite
>> 5.1.10.2.
> 
> Are you building your OpenWRT packages yourself or using something
> built upstream/elsewhere?
> 

I build Shorewall from git:

http://shorewall.org/Build.html

>> Note that Shorewall is patched:
>> 1a68d87c9
>> c518cfaa4
>> 09980cc75
>> e0a757ea0
>> 550003f0f
> 
> What is the reference point for those sha1s?  Which git repository are
> they referring to?
> 

I cherry-picked those commits on top of the tag 5.1.10.2.
See also the above link.

>> 2) Using /etc for temporary files.
> 
> Why /etc/ and not /tmp?
> 

What I meant was:
Why are the lock files created in /etc/shorewall-lite/* and not in /var?

You seem to be using an old version of OpenWRT/shorewall-lite?

-Matt
-- 
Matt Darfeuille



Re: [Shorewall-users] locking processes left behind

2018-01-17 Thread Brian J. Murrell
On Mon, 2018-01-15 at 06:52 +0100, Matt Darfeuille wrote:
> 
> Ok -- You seem to be fixated on not restricting the use of the lock
> utility.

Please don't be so defensive.  That's not it at all.  I am trying to
debug a problem here and in trying to do that I am trying to understand
the nature of the problem and the tool(s) involved in the problem.

I'm just trying to understand why you think "lock" should be exclusive
to OpenWRT and that you suggested previously that even though I am on
OpenWRT that I should "pass other option to lock".  It seems like you
might know something I don't.  I'm just trying to discover what that
is.

> As I don't see this conversation going anywhere

I guess it will only not go anywhere if everyone chooses to not
participate in trying to shine a light on a potential bug.  That would
be a pity.

> 1)  I don't get that behavior on OpenWRT 15.05.1 with shorewall-lite
> 5.1.10.2.

Are you building your OpenWRT packages yourself or using something
built upstream/elsewhere?

> Note that Shorewall is patched:
> 1a68d87c9
> c518cfaa4
> 09980cc75
> e0a757ea0
> 550003f0f

What is the reference point for those sha1s?  Which git repository are
they referring to?

> 2) Using /etc for temporary files.

Why /etc/ and not /tmp?

Cheers,
b.




Re: [Shorewall-users] locking processes left behind

2018-01-14 Thread Matt Darfeuille
On 1/14/2018 6:47 PM, Brian J. Murrell wrote:
> On Sun, 2018-01-14 at 17:06 +0100, Matt Darfeuille wrote:
>>
>> The code is only working and tested on OpenWRT.
> 
> Which is my platform.
> 
>> Lock on OpenWRT has limited functionalities.
> 
> Such as what?  And why would it being limited (only) on OpenWRT mean
> that you make its use exclusive to OpenWRT?
> 
> Does any of the limited functionalities shed any light on how a lock
> can be running without the file it's trying to lock being present?

Ok -- You seem to be fixated on not restricting the use of the lock utility.

> To be clear, the file is still there, it's just that its directory entry
> has been removed.  So something removed it without unlocking it?
> 

As I don't see this conversation going anywhere I'll just say the following:

1)  I don't get that behavior on OpenWRT 15.05.1 with shorewall-lite
5.1.10.2.
Note that Shorewall is patched:
1a68d87c9
c518cfaa4
09980cc75
e0a757ea0
550003f0f

2) Using /etc for temporary files.

-Matt
-- 
Matt Darfeuille



Re: [Shorewall-users] locking processes left behind

2018-01-14 Thread Brian J. Murrell
On Sun, 2018-01-14 at 17:06 +0100, Matt Darfeuille wrote:
> 
> The code is only working and tested on OpenWRT.

Which is my platform.

> Lock on OpenWRT has limited functionalities.

Such as what?  And why would it being limited (only) on OpenWRT mean
that you make its use exclusive to OpenWRT?

Does any of the limited functionalities shed any light on how a lock
can be running without the file it's trying to lock being present?  To
be clear, the file is still there, it's just that its directory entry
has been removed.  So something removed it without unlocking it?
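
To illustrate what "the directory entry has been removed" means, here is a minimal sketch, assuming a POSIX shell and `mktemp -d`. The function name and temporary path are made up for the demonstration. It holds a file open, removes its name, and shows that the open descriptor still reaches the data:

```shell
#!/bin/sh
# Demonstrates that removing a file's directory entry does not close
# descriptors that already refer to it. Names here are illustrative only.
demo_unlinked_fd() {
    tmpdir=$(mktemp -d) || return 1
    f="$tmpdir/lock"
    echo "$$" > "$f"        # write the shell's PID as sample content
    exec 9< "$f"            # hold the file open on descriptor 9
    rm -f "$f"              # remove the directory entry only
    if [ -e "$f" ]; then echo "path: present"; else echo "path: gone"; fi
    read -r held <&9        # the data is still reachable through the fd
    [ "$held" = "$$" ] && echo "fd: still readable"
    exec 9<&-               # closing the last descriptor frees the inode
    rmdir "$tmpdir"
}

demo_unlinked_fd
```

This matches the "(deleted)" entries in the /proc and lsof output: each lock process still holds its own descriptor on an inode that no longer has a name.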

Cheers,
b.




Re: [Shorewall-users] locking processes left behind

2018-01-14 Thread Matt Darfeuille


On 1/14/2018 4:54 PM, Brian J. Murrell wrote:
> On Sun, 2018-01-14 at 16:46 +0100, Matt Darfeuille wrote:
>>
>> If you are not on OpenWRT you may want to apply the attached patch.
> 
> So, don't use "lock" on platforms other than OpenWRT?  But it's Ok to
> use any of the other locking methods on non-OpenWRT machines?
> 

The code is only working and tested on OpenWRT.
Given your issue you may want to pass other option to lock.

> Why are you promoting the use of "lock" exclusive to OpenWRT?
> 

Lock on OpenWRT has limited functionalities.

-Matt
-- 
Matt Darfeuille



Re: [Shorewall-users] locking processes left behind

2018-01-14 Thread Brian J. Murrell
On Sun, 2018-01-14 at 16:46 +0100, Matt Darfeuille wrote:
> 
> If you are not on OpenWRT you may want to apply the attached patch.

So, don't use "lock" on platforms other than OpenWRT?  But it's Ok to
use any of the other locking methods on non-OpenWRT machines?

Why are you promoting the use of "lock" exclusive to OpenWRT?

Cheers,
b.





Re: [Shorewall-users] locking processes left behind

2018-01-14 Thread Matt Darfeuille
On 1/12/2018 7:33 PM, Tom Eastep wrote:
> On 01/12/2018 04:09 AM, Brian J. Murrell wrote:
>> I frequently get the following situation on my shorewall-lite machine,
>> typically right after boot, where "shorewall-lite restart" has been run
>> many times, overlapping even, I am sure as interfaces are brought up,
>> etc.:
>>
>> # ps -ef | grep shorewall
>> root      1094     1  0 Jan11 ?        00:00:01 lock /etc/shorewall-lite/state/lock
>> root      2507     1  0 Jan11 ?        00:00:01 lock /etc/shorewall-lite/state/lock
>> root      3124     1  0 Jan11 ?        00:00:00 lock /etc/shorewall-lite/state/lock
>> root      7608  6935  0 06:29 pts/1    00:00:00 grep shorewall
>> root     11770     1  0 Jan11 ?        00:00:00 lock /etc/shorewall-lite/state/lock
>>
>> # ls -l /etc/shorewall-lite/state/lock
>> ls: /etc/shorewall-lite/state/lock: No such file or directory
>> # ls -l /proc/{1094,2507,3124,11770}/fd/
>> /proc/1094/fd/:
>> lr-x------ 1 root root 64 Jan 12 06:26 0 -> /dev/null
>> l-wx------ 1 root root 64 Jan 12 06:26 1 -> pipe:[1896]
>> l-wx------ 1 root root 64 Jan 12 06:26 12 -> pipe:[1896]
>> l-wx------ 1 root root 64 Jan 12 06:26 2 -> pipe:[1896]
>> lrwx------ 1 root root 64 Jan 12 06:26 3 -> /etc/shorewall-lite/state/lock (deleted)
>>
>> /proc/11770/fd/:
>> lrwx------ 1 root root 64 Jan 12 06:30 0 -> /dev/null
>> lrwx------ 1 root root 64 Jan 12 06:30 1 -> /dev/null
>> l-wx------ 1 root root 64 Jan 12 06:30 13 -> pipe:[1718]
>> lrwx------ 1 root root 64 Jan 12 06:30 2 -> /dev/null
>> lrwx------ 1 root root 64 Jan 12 06:30 3 -> /etc/shorewall-lite/state/lock (deleted)
>>
>> /proc/2507/fd/:
>> lrwx------ 1 root root 64 Jan 12 06:30 0 -> /dev/null
>> lrwx------ 1 root root 64 Jan 12 06:30 1 -> /dev/null
>> l-wx------ 1 root root 64 Jan 12 06:30 13 -> pipe:[1718]
>> lrwx------ 1 root root 64 Jan 12 06:30 2 -> /dev/null
>> lrwx------ 1 root root 64 Jan 12 06:30 3 -> /etc/shorewall-lite/state/lock (deleted)
>>
>> /proc/3124/fd/:
>> lrwx------ 1 root root 64 Jan 12 06:30 0 -> /dev/null
>> lrwx------ 1 root root 64 Jan 12 06:30 1 -> /dev/null
>> l-wx------ 1 root root 64 Jan 12 06:30 13 -> pipe:[1718]
>> lrwx------ 1 root root 64 Jan 12 06:30 2 -> /dev/null
>> lrwx------ 1 root root 64 Jan 12 06:30 3 -> /etc/shorewall-lite/state/lock (deleted)
>> # lsof -n | grep state/lock
>> lock   1094 root  3u  REG  0,14  5  13407 /etc/shorewall-lite/state/lock (deleted)
>> lock   2507 root  3u  REG  0,14  5  13415 /etc/shorewall-lite/state/lock (deleted)
>> lock   3124 root  3u  REG  0,14  5  13448 /etc/shorewall-lite/state/lock (deleted)
>> lock  11770 root  3u  REG  0,14  6  13663 /etc/shorewall-lite/state/lock (deleted)
>>
>> I wonder if anyone has any theories on what is going on here?
>>
> 
> I do not -- here is the only code in Shorewall that invoked 'lock' (one
> line might appear folded by my mailer):
> 
> mutex_on()
> {
>     local try
>     try=0
>     local lockf
>     lockf=${LOCKFILE:=${VARDIR}/lock}
>     local lockpid
>     local lockd
>
>     MUTEX_TIMEOUT=${MUTEX_TIMEOUT:-60}
>
>     if [ $MUTEX_TIMEOUT -gt 0 ]; then
>
>         lockd=$(dirname $LOCKFILE)
>
>         [ -d "$lockd" ] || mkdir -p "$lockd"
>
>         if [ -f $lockf ]; then
>             lockpid=`cat ${lockf} 2> /dev/null`
>             if [ -z "$lockpid" -o $lockpid = 0 ]; then
>                 rm -f ${lockf}
>                 error_message "WARNING: Stale lockfile ${lockf} removed"
>             elif [ $lockpid -eq $$ ]; then
>                 return 0
>             elif ! ps | grep -v grep | qt grep ${lockpid}; then
>                 rm -f ${lockf}
>                 error_message "WARNING: Stale lockfile ${lockf} from pid ${lockpid} removed"
>             fi
>         fi
>
>         if qt mywhich lockfile; then
>             lockfile -${MUTEX_TIMEOUT} -r1 ${lockf}
>             chmod u+w ${lockf}
>             echo $$ > ${lockf}
>             chmod u-w ${lockf}
>         elif qt mywhich lock; then
>             lock ${lockf}
>             chmod u=r ${lockf}
>         else
>             while [ -f ${lockf} -a ${try} -lt ${MUTEX_TIMEOUT} ] ; do
>                 sleep 1
>                 try=$((${try} + 1))
>             done
>
>             if  [ ${try} -lt ${MUTEX_TIMEOUT} ] ; then
>                 # Create the lockfile
>                 echo $$ > ${lockf}
>             else
>                 echo "Giving up on lock file ${lockf}" >&2
>             fi
>         fi
>     fi
> }
> 
> The part that invoked 'lock' was contributed, as I recall.
> 

If you are not on 

Re: [Shorewall-users] locking processes left behind

2018-01-12 Thread Tom Eastep
On 01/12/2018 04:09 AM, Brian J. Murrell wrote:
> I frequently get the following situation on my shorewall-lite machine,
> typically right after boot, where "shorewall-lite restart" has been run
> many times, overlapping even, I am sure as interfaces are brought up,
> etc.:
> 
> # ps -ef | grep shorewall
> root      1094     1  0 Jan11 ?        00:00:01 lock /etc/shorewall-lite/state/lock
> root      2507     1  0 Jan11 ?        00:00:01 lock /etc/shorewall-lite/state/lock
> root      3124     1  0 Jan11 ?        00:00:00 lock /etc/shorewall-lite/state/lock
> root      7608  6935  0 06:29 pts/1    00:00:00 grep shorewall
> root     11770     1  0 Jan11 ?        00:00:00 lock /etc/shorewall-lite/state/lock
>
> # ls -l /etc/shorewall-lite/state/lock
> ls: /etc/shorewall-lite/state/lock: No such file or directory
> # ls -l /proc/{1094,2507,3124,11770}/fd/
> /proc/1094/fd/:
> lr-x------ 1 root root 64 Jan 12 06:26 0 -> /dev/null
> l-wx------ 1 root root 64 Jan 12 06:26 1 -> pipe:[1896]
> l-wx------ 1 root root 64 Jan 12 06:26 12 -> pipe:[1896]
> l-wx------ 1 root root 64 Jan 12 06:26 2 -> pipe:[1896]
> lrwx------ 1 root root 64 Jan 12 06:26 3 -> /etc/shorewall-lite/state/lock (deleted)
>
> /proc/11770/fd/:
> lrwx------ 1 root root 64 Jan 12 06:30 0 -> /dev/null
> lrwx------ 1 root root 64 Jan 12 06:30 1 -> /dev/null
> l-wx------ 1 root root 64 Jan 12 06:30 13 -> pipe:[1718]
> lrwx------ 1 root root 64 Jan 12 06:30 2 -> /dev/null
> lrwx------ 1 root root 64 Jan 12 06:30 3 -> /etc/shorewall-lite/state/lock (deleted)
>
> /proc/2507/fd/:
> lrwx------ 1 root root 64 Jan 12 06:30 0 -> /dev/null
> lrwx------ 1 root root 64 Jan 12 06:30 1 -> /dev/null
> l-wx------ 1 root root 64 Jan 12 06:30 13 -> pipe:[1718]
> lrwx------ 1 root root 64 Jan 12 06:30 2 -> /dev/null
> lrwx------ 1 root root 64 Jan 12 06:30 3 -> /etc/shorewall-lite/state/lock (deleted)
>
> /proc/3124/fd/:
> lrwx------ 1 root root 64 Jan 12 06:30 0 -> /dev/null
> lrwx------ 1 root root 64 Jan 12 06:30 1 -> /dev/null
> l-wx------ 1 root root 64 Jan 12 06:30 13 -> pipe:[1718]
> lrwx------ 1 root root 64 Jan 12 06:30 2 -> /dev/null
> lrwx------ 1 root root 64 Jan 12 06:30 3 -> /etc/shorewall-lite/state/lock (deleted)
> # lsof -n | grep state/lock
> lock   1094 root  3u  REG  0,14  5  13407 /etc/shorewall-lite/state/lock (deleted)
> lock   2507 root  3u  REG  0,14  5  13415 /etc/shorewall-lite/state/lock (deleted)
> lock   3124 root  3u  REG  0,14  5  13448 /etc/shorewall-lite/state/lock (deleted)
> lock  11770 root  3u  REG  0,14  6  13663 /etc/shorewall-lite/state/lock (deleted)
> 
> I wonder if anyone has any theories on what is going on here?
> 

I do not -- here is the only code in Shorewall that invoked 'lock' (one
line might appear folded by my mailer):

mutex_on()
{
    local try
    try=0
    local lockf
    lockf=${LOCKFILE:=${VARDIR}/lock}
    local lockpid
    local lockd

    MUTEX_TIMEOUT=${MUTEX_TIMEOUT:-60}

    if [ $MUTEX_TIMEOUT -gt 0 ]; then

        lockd=$(dirname $LOCKFILE)

        [ -d "$lockd" ] || mkdir -p "$lockd"

        if [ -f $lockf ]; then
            lockpid=`cat ${lockf} 2> /dev/null`
            if [ -z "$lockpid" -o $lockpid = 0 ]; then
                rm -f ${lockf}
                error_message "WARNING: Stale lockfile ${lockf} removed"
            elif [ $lockpid -eq $$ ]; then
                return 0
            elif ! ps | grep -v grep | qt grep ${lockpid}; then
                rm -f ${lockf}
                error_message "WARNING: Stale lockfile ${lockf} from pid ${lockpid} removed"
            fi
        fi

        if qt mywhich lockfile; then
            lockfile -${MUTEX_TIMEOUT} -r1 ${lockf}
            chmod u+w ${lockf}
            echo $$ > ${lockf}
            chmod u-w ${lockf}
        elif qt mywhich lock; then
            lock ${lockf}
            chmod u=r ${lockf}
        else
            while [ -f ${lockf} -a ${try} -lt ${MUTEX_TIMEOUT} ] ; do
                sleep 1
                try=$((${try} + 1))
            done

            if  [ ${try} -lt ${MUTEX_TIMEOUT} ] ; then
                # Create the lockfile
                echo $$ > ${lockf}
            else
                echo "Giving up on lock file ${lockf}" >&2
            fi
        fi
    fi
}

The part that invoked 'lock' was contributed, as I recall.
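
One detail of the stale-lock check above: `ps | grep -v grep | qt grep ${lockpid}` matches the PID as a substring anywhere in the ps output, so a lock file holding 109 would look alive whenever a process 1094 exists. The sketch below shows that pitfall and an alternative liveness test based on `kill -0`. This is a swapped-in technique for illustration, not what Shorewall does and not the patch discussed in this thread. pid_is_alive is a hypothetical helper, and the sample ps line is made up:

```shell
#!/bin/sh
# pid_is_alive is a hypothetical helper, not part of Shorewall.
# It uses 'kill -0', which delivers no signal and only reports whether the
# PID can be signalled, instead of grepping the output of ps.
pid_is_alive() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
    esac
    [ "$1" -gt 0 ] || return 1     # PID 0 would address the process group
    kill -0 "$1" 2>/dev/null
}

# Substring pitfall of the grep approach, using a made-up ps line:
# searching for PID 109 "matches" the line that belongs to PID 1094.
if echo " 1094 root lock /etc/shorewall-lite/state/lock" | grep -q 109; then
    echo "grep: false positive for pid 109"
fi

if pid_is_alive "$$"; then echo "kill -0: current shell is alive"; fi
```

Note that `kill -0` also fails when the caller lacks permission to signal the target, so this sketch assumes the check runs as root, as the firewall scripts in this thread appear to.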

-Tom
-- 
Tom Eastep\   Q: What do you get when you cross a mobster with
Shoreline, \ an international standard?
Washington, USA \ A: Someone who makes you an 

[Shorewall-users] locking processes left behind

2018-01-12 Thread Brian J. Murrell
I frequently get the following situation on my shorewall-lite machine,
typically right after boot, where "shorewall-lite restart" has been run
many times, overlapping even, I am sure as interfaces are brought up,
etc.:

# ps -ef | grep shorewall
root      1094     1  0 Jan11 ?        00:00:01 lock /etc/shorewall-lite/state/lock
root      2507     1  0 Jan11 ?        00:00:01 lock /etc/shorewall-lite/state/lock
root      3124     1  0 Jan11 ?        00:00:00 lock /etc/shorewall-lite/state/lock
root      7608  6935  0 06:29 pts/1    00:00:00 grep shorewall
root     11770     1  0 Jan11 ?        00:00:00 lock /etc/shorewall-lite/state/lock

# ls -l /etc/shorewall-lite/state/lock
ls: /etc/shorewall-lite/state/lock: No such file or directory
# ls -l /proc/{1094,2507,3124,11770}/fd/
/proc/1094/fd/:
lr-x------ 1 root root 64 Jan 12 06:26 0 -> /dev/null
l-wx------ 1 root root 64 Jan 12 06:26 1 -> pipe:[1896]
l-wx------ 1 root root 64 Jan 12 06:26 12 -> pipe:[1896]
l-wx------ 1 root root 64 Jan 12 06:26 2 -> pipe:[1896]
lrwx------ 1 root root 64 Jan 12 06:26 3 -> /etc/shorewall-lite/state/lock (deleted)

/proc/11770/fd/:
lrwx------ 1 root root 64 Jan 12 06:30 0 -> /dev/null
lrwx------ 1 root root 64 Jan 12 06:30 1 -> /dev/null
l-wx------ 1 root root 64 Jan 12 06:30 13 -> pipe:[1718]
lrwx------ 1 root root 64 Jan 12 06:30 2 -> /dev/null
lrwx------ 1 root root 64 Jan 12 06:30 3 -> /etc/shorewall-lite/state/lock (deleted)

/proc/2507/fd/:
lrwx------ 1 root root 64 Jan 12 06:30 0 -> /dev/null
lrwx------ 1 root root 64 Jan 12 06:30 1 -> /dev/null
l-wx------ 1 root root 64 Jan 12 06:30 13 -> pipe:[1718]
lrwx------ 1 root root 64 Jan 12 06:30 2 -> /dev/null
lrwx------ 1 root root 64 Jan 12 06:30 3 -> /etc/shorewall-lite/state/lock (deleted)

/proc/3124/fd/:
lrwx------ 1 root root 64 Jan 12 06:30 0 -> /dev/null
lrwx------ 1 root root 64 Jan 12 06:30 1 -> /dev/null
l-wx------ 1 root root 64 Jan 12 06:30 13 -> pipe:[1718]
lrwx------ 1 root root 64 Jan 12 06:30 2 -> /dev/null
lrwx------ 1 root root 64 Jan 12 06:30 3 -> /etc/shorewall-lite/state/lock (deleted)
# lsof -n | grep state/lock
lock   1094 root  3u  REG  0,14  5  13407 /etc/shorewall-lite/state/lock (deleted)
lock   2507 root  3u  REG  0,14  5  13415 /etc/shorewall-lite/state/lock (deleted)
lock   3124 root  3u  REG  0,14  5  13448 /etc/shorewall-lite/state/lock (deleted)
lock  11770 root  3u  REG  0,14  6  13663 /etc/shorewall-lite/state/lock (deleted)

I wonder if anyone has any theories on what is going on here?

Cheers,
b.

