Re: /lib/lsb/init-functions on LXC servers
Hi Reco,

On 02/14/18 14:55, Reco wrote:
> True. There's one tiny bit though - try
>
> pidof -o %PPID -x /usr/sbin/sshd
>
> and watch it output several pids as well.

Yes, indeed. If pidofproc relied upon the pidfile only, there would be
no reason to call pidof.

> And you don't have to spawn yet another sshd, a simple ssh login will
> suffice.

This was just an example to show how easy it is to break the pgrep
approach.

> This particular part of pidofproc does not use pidof to get pid.
> It uses pidof to guess the status of the process.

Indeed. /bin/pidof returns the process ID of the service running in a
container if the local service is not running. pidofproc then assumes
the local service is running, even though there is no local pidfile.
Two bad things can happen here:

- the service in the container is killed
- the service on the host is not started

Using "/bin/pidof -c" the "foreign" root directory in the container is
detected and pidofproc returns fewer false positives, AFAICT. I tried
this on my servers in the office.

RedHat EL uses "pidof -c" as well, check /etc/rc.d/init.d/functions.
I didn't examine other Linux distros.

> In the case of sshd pidofproc can break in a funny way indeed.
>
>> IMHO pidofproc (and hence the startup scripts) should rely upon not
>> losing important information (the pid file).
>
> True, but that particular part hasn't written itself, someone did it
> on purpose.

I don't want to blame anybody, but I would like to get this fixed.
"pidof -c" does an additional check to verify the PID to return, so
this seems to be a safe workaround without introducing pgrep.

Regards
Harri
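To make the "additional check" concrete, here is a minimal sketch of a root-directory comparison done by hand through /proc. It is only an illustration of the idea and is not claimed to be how pidof implements "-c". The function name same_root is hypothetical, and the sketch assumes a Linux /proc and a stat that understands -L and -c.

```shell
#!/bin/sh
# Hypothetical helper for illustration; not taken from pidof or init-functions.
# Compares the root directory of process $1 with the caller's own root
# by device and inode number, read through /proc/<pid>/root.
# Returns 0 if they match, 1 if they differ, 2 if the check is not possible
# (for example, insufficient permissions or no such process).
same_root() {
    pid="$1"
    mine=$(stat -L -c '%d:%i' "/proc/$$/root" 2>/dev/null) || return 2
    theirs=$(stat -L -c '%d:%i' "/proc/$pid/root" 2>/dev/null) || return 2
    [ "$mine" = "$theirs" ]
}
```

Run as root on the host, something like `for p in $(pidof -x /usr/sbin/sshd); do same_root "$p" && echo "$p"; done` would then list only the PIDs that share the host's root directory. Without root privileges the other processes' root links are usually unreadable, so the function reports 2 rather than guessing.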
Re: /lib/lsb/init-functions on LXC servers
Hi.

On Wed, Feb 14, 2018 at 02:33:16PM +0100, Harald Dunkel wrote:
> Hi Reco,
>
> wrt "pgrep --ns 1 -f /usr/sbin/sshd":
>
> The executable path simply doesn't tell if this is the right service
> to stop. If I run 2 services in parallel (e.g. for different network
> interfaces), then this approach is already broken. Sample:
>
> # pgrep --ns 1 -f /usr/sbin/sshd
> 12602
> # ps -ef | grep ssh[d]
> root     12602     1  0 Feb02 ?        00:00:20 /usr/sbin/sshd -D
> # /usr/sbin/sshd -p
> # pgrep --ns 1 -f /usr/sbin/sshd
> 1933
> 12602

True. There's one tiny bit though - try

pidof -o %PPID -x /usr/sbin/sshd

and watch it output several pids as well.
And you don't have to spawn yet another sshd, a simple ssh login will
suffice.

This particular part of pidofproc does not use pidof to get pid.
It uses pidof to guess the status of the process.

In the case of sshd pidofproc can break in a funny way indeed.

> IMHO pidofproc (and hence the startup scripts) should rely upon not
> losing important information (the pid file).

True, but that particular part hasn't written itself, someone did it on
purpose.

Reco
Re: /lib/lsb/init-functions on LXC servers
Hi Reco,

wrt "pgrep --ns 1 -f /usr/sbin/sshd":

The executable path simply doesn't tell if this is the right service
to stop. If I run 2 services in parallel (e.g. for different network
interfaces), then this approach is already broken. Sample:

# pgrep --ns 1 -f /usr/sbin/sshd
12602
# ps -ef | grep ssh[d]
root     12602     1  0 Feb02 ?        00:00:20 /usr/sbin/sshd -D
# /usr/sbin/sshd -p
# pgrep --ns 1 -f /usr/sbin/sshd
1933
12602

IMHO pidofproc (and hence the startup scripts) should rely upon not
losing important information (the pid file).

Regards
Harri
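A minimal sketch of what relying on the pid file alone could look like, with no fallback to a name-based search. This is not the real pidofproc; the function name status_from_pidfile is hypothetical, and the exit codes are assumed to follow the LSB status convention (0 running, 1 dead but pid file exists, 3 not running).

```shell
#!/bin/sh
# Hypothetical helper for illustration; not the real pidofproc.
# Reads a PID from the pidfile "$1" and reports whether that process exists.
# Exit codes (assumed LSB "status" convention):
#   0 = running, 1 = pidfile exists but process is gone, 3 = no pidfile.
status_from_pidfile() {
    pidfile="$1"
    [ -r "$pidfile" ] || return 3
    # read fails on a file without trailing newline but still sets $pid
    read -r pid < "$pidfile" || [ -n "$pid" ] || return 1
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;   # empty or not a number
    esac
    # kill -0 sends no signal; it only checks that the PID can be signalled.
    if kill -0 "$pid" 2>/dev/null; then
        return 0
    fi
    # The process may exist but belong to another user; fall back to /proc.
    [ -d "/proc/$pid" ] && return 0
    return 1
}
```

With this approach a missing pid file simply means "not running" for the local service, so a daemon of the same name inside a container is never mistaken for it. The price is that a daemon started without a pid file is not detected at all.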
Re: /lib/lsb/init-functions on LXC servers
Hi.

On Mon, Feb 05, 2018 at 04:04:37PM +0100, Harald Dunkel wrote:
> Hi Reco,
>
> you mean this is a known issue???

Well, it's known to me (since then) at least, as I merely read the
contents of /lib/lsb/init-functions on my Debian system.

Pinpointing the problem is easy: anyone who has access to the manpages
and a general understanding of the way LXC works can do it.
Fixing the problem is harder - I wrote that I'm unsure of the solutions
proposed.

Reco
Re: /lib/lsb/init-functions on LXC servers
Hi Reco,

you mean this is a known issue???

Harri
Re: /lib/lsb/init-functions on LXC servers
Hi.

On Fri, Feb 02, 2018 at 11:35:04AM +0100, Harald Dunkel wrote:
> Hi folks,
>
> I see a weird effect of pidofproc (defined in /lib/lsb/init-functions):
> If there is no local daemon with a given search path running, then it
> returns the PIDs of the daemons running in the LXC containers. AFAICT
> this affects the startup scripts of
>
> apache2
> opensmtpd
> rpcbind
>
> and maybe others. #888743
>
> Is this just me? Can anybody reproduce?

No, it's everyone. That's the problematic part of this script:

    # pid file doesn't exist, try to find the pid nevertheless
    if [ -x /bin/pidof ] && [ ! "$specified" ]; then
        status="0"
        /bin/pidof -o %PPID -x $1 || status="$?"

With those arguments pidof finds each and every process regardless of
which mount namespace (aka container) they belong to.

In the case of LXC, adding the "-c" switch to pidof should solve this
issue. Maybe. I'm unsure. I did not consider all the corner cases.

The way I see it, a correct way of solving this is to rewrite the
problematic part altogether:

    # pid file doesn't exist, try to find the pid nevertheless
    if [ -x /usr/bin/pgrep ] && [ ! "$specified" ]; then
        status="0"
        /usr/bin/pgrep --ns 1 -f $1 || status="$?"

But that opens several cans of worms at once: extra dependencies, a
binary in /usr, etc.

Reco
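For comparison, a minimal sketch of the first suggestion (adding "-c" to the existing pidof call), wrapped in a function so it can be tried on its own. This is not the actual init-functions code: the name guess_status is hypothetical, and the return codes 3 and 4 are assumed to mean "not running" and "unable to determine status" as in the LSB convention. Per the pidof manpage, "-c" is ignored for non-root users, so the check only has an effect when run as root.

```shell
#!/bin/sh
# Hypothetical wrapper for illustration; not the actual init-functions code.
# Same fallback as the quoted fragment, with "-c" added so that pidof only
# reports processes sharing the caller's root directory.
guess_status() {
    prog="$1"
    command -v pidof >/dev/null 2>&1 || return 4   # unable to determine status
    status=0
    pidof -c -o %PPID -x "$prog" >/dev/null 2>&1 || status="$?"
    [ "$status" = 0 ] && return 0                  # a matching process exists
    return 3                                       # program is not running
}
```

Called as `guess_status /usr/sbin/sshd` on the host, this should no longer report a daemon that runs only inside a container with a different root directory. It still cannot tell two local instances of the same binary apart, which is the limitation the pid file is meant to cover.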