Hi,

I've upgraded lprng from 3.6.13 to 3.7.4 on a Sun box with Solaris
7.0. Several times a day we experience "hanging lpq" and at the same
time one lpd process is starting to consume a lot of cpu. The only
solution I've found is to kill the offending process.

It appears that two lpd-processes is causing a "dead lock" by setting
advisory locks on /var/spool/lpd/<bounce queue>/lpq and 
/var/spool/lpd/<real queue>/lpq.0. Yes, I'm using bounce queues for 
filtering.

The print server is supporting 30 printers and 600-700 clients.

Has anyone seen this behavior before?


Below is the output from top, truss, pfiles and ls.


(1) A "hanging lpq". The output from top shows that process 25884
    (lpd) consumes a lot of cpu (one of two cpu:s)

# top

last pid: 27159;  load averages:  1.08,  1.09,  1.11 
64 processes:  61 sleeping, 1 stopped, 2 on cpu
CPU states: 32.9% idle, 17.8% user, 47.5% kernel,  1.8% iowait,  0.0%
swap
Memory: 224M real, 19M free, 32M swap in use, 523M swap free

  PID USERNAME THR PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
25884 root       1   0    0 2316K 1648K cpu/0   70:21 47.62% lpd
27159 daemon     1  60    0 2268K 1624K sleep    0:00  0.72% ifhp
 1444 root       1  58    0 1588K 1288K cpu/2   11:48  0.36%
top-sun4m-5.7
...
...


(2) Output from truss reveals that the process is trying very hard to
    obtain a lock on the file "lpq".

# truss -p 25884

open("lpq", O_RDWR|O_APPEND|O_CREAT|O_NOCTTY, 0600) = 113
umask(077)                                      = 0
fstat(113, 0xEFFFBEA0)                          = 0
fcntl(113, F_SETLK, 0xEFFFBD04)                 Err#11 EAGAIN
fcntl(113, F_SETLKW, 0xEFFFBD04)                Err#45 EDEADLK
close(113)                                      = 0
umask(0)                                        = 077
open("lpq", O_RDWR|O_APPEND|O_CREAT|O_NOCTTY, 0600) = 113
umask(077)                                      = 0
fstat(113, 0xEFFFBEA0)                          = 0
fcntl(113, F_SETLK, 0xEFFFBD04)                 Err#11 EAGAIN
fcntl(113, F_SETLKW, 0xEFFFBD04)                Err#45 EDEADLK
close(113)                                      = 0
umask(0)                                        = 077
open("lpq", O_RDWR|O_APPEND|O_CREAT|O_NOCTTY, 0600) = 113
umask(077)                                      = 0
fstat(113, 0xEFFFBEA0)                          = 0
fcntl(113, F_SETLK, 0xEFFFBD04)                 Err#11 EAGAIN
fcntl(113, F_SETLKW, 0xEFFFBD04)                Err#45 EDEADLK
close(113)                                      = 0
umask(0)                                        = 077
open("lpq", O_RDWR|O_APPEND|O_CREAT|O_NOCTTY, 0600) = 113
umask(077)                                      = 0

(3) Running pfiles on the process gives fstat and fcntl information
    for all open files. Note, file descriptor 113 is locked by process
    25885 (inode=132494).

# /usr/proc/bin/pfiles 25884
25884:  /opt/fw/LPRng/current/sbin/lpd
  Current rlimit: 1024 file descriptors 
...
  23: S_IFREG mode:0600 dev:32,136 ino:29459 uid:1 gid:12 size:68
      O_RDWR|O_APPEND
  24: S_IFREG mode:0600 dev:32,136 ino:67728 uid:1 gid:12 size:68
      O_RDWR|O_APPEND
  25: S_IFREG mode:0600 dev:32,136 ino:8847 uid:1 gid:12 size:68
      O_RDWR|O_APPEND
...
 111: S_IFREG mode:0600 dev:32,136 ino:53010 uid:1 gid:12 size:68
      O_RDWR|O_APPEND
 112: S_IFREG mode:0600 dev:32,136 ino:55955 uid:1 gid:12 size:68
      O_RDWR|O_APPEND
 113: S_IFREG mode:0600 dev:32,136 ino:132494 uid:1 gid:12 size:68
      O_RDWR|O_APPEND
      advisory write lock set by process 25885


(4) Running truss on pid 25885 shows that the process is put to
    sleep while waiting for a lock on file descriptor 7.

# truss -p 25885 
fcntl(7, F_SETLKW, 0xEFFF8C54)  (sleeping...)

(5) Running pfiles on the same process shows that file descriptor 7 is
    locked by pid 25884 (inode=67728). Hence, pid 25884 and 25885 is
    causing a dead lock. One process is sleeping while the other
    is looping forever.

# /usr/proc/bin/pfiles 25885
25885:  /opt/fw/LPRng/current/sbin/lpd
  Current rlimit: 1024 file descriptors
   0: S_IFCHR mode:0666 dev:85,10 ino:91367 uid:0 gid:3 rdev:13,2
      O_RDONLY
   1: S_IFCHR mode:0666 dev:85,10 ino:91367 uid:0 gid:3 rdev:13,2
      O_WRONLY
   2: S_IFCHR mode:0666 dev:85,10 ino:91367 uid:0 gid:3 rdev:13,2
      O_WRONLY
   3: S_IFIFO mode:0000 dev:170,0 ino:128364 uid:1 gid:12 size:0
      O_RDWR
   4: S_IFSOCK mode:0666 dev:169,0 ino:24393 uid:0 gid:0 size:0
      O_RDWR
   5: S_IFDOOR mode:0444 dev:174,0 ino:36473 uid:0 gid:0 size:0
      O_RDONLY|O_LARGEFILE FD_CLOEXEC  door to nscd[3614]
   6: S_IFREG mode:0600 dev:32,136 ino:132494 uid:1 gid:12 size:68
      O_RDWR|O_APPEND
   7: S_IFREG mode:0600 dev:32,136 ino:67728 uid:1 gid:12 size:68
      O_RDWR|O_APPEND
      advisory write lock set by process 25884


(6) The file names corresponding to inodes 132494 and 67728 are

(bounce queue)
# ls -i /var/spool/lpd/tommy_hqoh/lpq

    132494 /var/spool/lpd/tommy_hqoh/lpq

(real queue)
# ls -i /var/spool/lpd/tommy_real/lpq.0

    67728 /var/spool/lpd/tommy_real/lpq.0
 

(7) Printcap entries

    tommy_real:cm=tek560
        :ifhp=model=phaser,dev=tommy%9100
        :lp=/dev/null
        :rw
        :sd=/var/spool/lpd/%P
        :lf=log
        :ps=status
        :if=/opt/fw/LPRng/current/libexec/filters/ifhp
        :server

    tommy_hqoh:cm=tek560
        :lp=tommy_real@localhost
        :lpd_bounce
        :ab
        :ifhp=model=phaser,appsocket@
        :if=/opt/fw/LPRng/current/libexec/filters/tek560_hq
        :bp=/opt/fw/LPRng/current/libexec/filters/psbanner.tek560
        :sd=/var/spool/lpd/%P
        :server

    

(8) Misc

    LPRng 3.7.4 was compiled using

    # ./configure --prefix=/opt/fw/LPRng/3.7.4 --disable-gdbm

    and gcc 2.95.2



-Anders

--
Anders Sundin                         phone:    +46 920 202020
EPL/I/I                               fax:      +46 920 202906
Ericsson Erisoft AB

-----------------------------------------------------------------------------
YOU MUST BE A LIST MEMBER IN ORDER TO POST TO THE LPRNG MAILING LIST
The address you post from MUST be your subscription address

If you need help, send email to [EMAIL PROTECTED] (or lprng-requests
or lprng-digest-requests) with the word 'help' in the body.  For the impatient,
to subscribe to a list with name LIST,  send mail to [EMAIL PROTECTED]
with:                           | example:
subscribe LIST <mailaddr>       |  subscribe lprng-digest [EMAIL PROTECTED]
unsubscribe LIST <mailaddr>     |  unsubscribe lprng [EMAIL PROTECTED]

If you have major problems,  send email to [EMAIL PROTECTED] with the word
LPRNGLIST in the SUBJECT line.
-----------------------------------------------------------------------------

Reply via email to