Hi,

Following on from the previous post.  Everything has been pretty stable it's 
just the pop3d locking up which just happens randomly.  I've not ran under gdb 
yet but I did get some debug output and a few other bits.

Under normal load dbmail-pop3d is doing around 5-6% CPU and is accepting 250 
connections per minute   (grep 'connection accepted' over 1 minute at various 
times today)
53559 nobody        2  20    0   192M   112M uwait  11  40:40  4.88% 
dbmail-pop3d

At completely random times (once thus far today - UK time at 12:20~) pop3d 
locks up in that it answers no new queries but continues to run and the CPU 
spikes up to 50% or more 
  PID USERNAME    THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
53559 nobody        2  20    0   172M 95588K uwait  11  37:25 50.88% 
dbmail-pop3d

On freebsd I ran truss against the process which is just a constant scroll of 
the following:-

write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource 
temporarily unavailable'
write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource 
temporarily unavailable'
write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource 
temporarily unavailable'
write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource 
temporarily unavailable'
write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource 
temporarily unavailable'
write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource 
temporarily unavailable'
write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource 
temporarily unavailable'
write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource 
temporarily unavailable'
write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource 
temporarily unavailable'
write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource 
temporarily unavailable'
write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource 
temporarily unavailable'
write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource 
temporarily unavailable'
write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource 
temporarily unavailable'
write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource 
temporarily unavailable'
write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource 
temporarily unavailable'
write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource 
temporarily unavailable'

mytop at the time shows little to no activity:-

MySQL on localhost (5.6.12-log)                                                 
                                                                                
                                                                        up 
0+22:09:06 [12:30:15]
 Queries: 121.6M qps: 1599 Slow:    22.0         Se/In/Up/De(%):    05/01/00/00
             qps now:   56 Slow qps: 0.0  Threads:   16 (   1/   6) 07/00/00/00
 Key Efficiency: 100.0%  Bps in/out: 133.3k/305.1k   Now in/out: 812.2/16.8k
 Master: mysql-bin.003692/80665838 do:  ign:

  Id      User         Host/IP         DB      Time    Cmd Query or State
  --      ----         -------         --      ----    --- ----------
   1    dbmail       localhost     dbmail         0  Sleep
1397      root       localhost       test         0  Query show full processlist
   2    dbmail       localhost     dbmail         2  Sleep
   3    dbmail       localhost     dbmail         6  Sleep
   4    dbmail       localhost     dbmail        18  Sleep
1393      root       localhost                   69  Sleep
   5    dbmail       localhost     dbmail        87  Sleep
   6    dbmail       localhost     dbmail        87  Sleep
 514    dbmail       localhost     dbmail       203  Sleep
 515    dbmail       localhost     dbmail       207  Sleep
   7    dbmail       localhost     dbmail       356  Sleep
   8    dbmail       localhost     dbmail       667  Sleep
   9    dbmail       localhost     dbmail       667  Sleep
 766    dbmail       localhost     dbmail      2087  Sleep
  10    dbmail       localhost     dbmail      2156  Sleep
  11    dbmail       localhost     dbmail      2156  Sleep


To put it into a little more perspective with debugging enabled at the moment 
it normally logs between 50000 to 80000 lines 'per minute' in the log file but 
when pop3d locks up it's logging around 20 lines a minute.

[root@redacted-4a ~]# cat dbmail | grep " 12:20" | wc -l
   80757
[root@ redacted -4a ~]# cat dbmail | grep " 12:24" | wc -l
   61002
[root@ redacted -4a ~]# cat dbmail | grep " 12:25" | wc -l
   56135
[root@ redacted -4a ~]# cat dbmail | grep " 12:26" | wc -l
   63500
[root@ redacted -4a ~]# cat dbmail | grep " 12:27" | wc -l
      20

The debug output for pop3 during the issue (the 20 lines from 12:27) is below 
and it just seems like it's spending a whole minute dealing with 1 users 
mailbox:-

[root@mail4-db-4a ~]# cat dbmail | grep " 12:27"
Mar 14 12:27:00 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] 
ci_write(+343): [0x805334000] S > [16040/1222118:gwQyZ31AUn^M 
RxR4h6+QsBAMUZnWBV/7gdwugzE]
Mar 14 12:27:03 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] 
ci_write(+343): [0x805334000] S > [16040/1206078:EGzXAi+Xx4tL4PofLBol72tlxXk/^M 
uIcxMLvT4]
Mar 14 12:27:05 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] 
ci_write(+343): [0x805334000] S > 
[17448/1190038:puZkW65uMzYTpQK5Mx1HPY4jjLPwu2ajaDrQ2Ee]
Mar 14 12:27:08 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] 
ci_write(+343): [0x805334000] S > 
[16040/1172590:w8L0ZpbHRlci9GbGF0ZURlY29kZS9MZW5ndGggM]
Mar 14 12:27:11 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] 
ci_write(+343): [0x805334000] S > [17448/1156550:WMYMB9^M 
gNLuXhXn3sTLjaRfsZ5jh2vWdvXsnlC]
Mar 14 12:27:14 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] 
ci_write(+343): [0x805334000] S > [16424/1139102:tY1FPeh2Xl/a9ffIuZPHZX^M 
IRqKnCTz2dz5Jbt]
Mar 14 12:27:18 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] 
ci_write(+343): [0x805334000] S > [19112/1122678:WUYRjHUbm72iU5Ll5GUAqOyKaj^M 
d3Vsq3IqZtv]
Mar 14 12:27:21 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] 
ci_write(+343): [0x805334000] S > [16424/1103566:Ljs040^M 
WrbdLM6r9sDW7PxK/hXX9Om3293pbzQ]
Mar 14 12:27:24 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] 
ci_write(+343): [0x805334000] S > [17832/1087142:E3VqVKfeU8^M 
E9DXq4mE6kVCC3v+h6tqcqMoz2N]
Mar 14 12:27:26 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] 
ci_write(+343): [0x805334000] S > [17832/1069310:OBk0DWpptxhT^M 
JOTtbHPXvTooZI5FchSO65rpo]
Mar 14 12:27:30 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] 
ci_write(+343): [0x805334000] S > [16424/1051478:8A7PU/8FT/APkC^M 
rNZ38ik/8IjqQLEls6NN/wD]
Mar 14 12:27:33 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] 
ci_write(+343): [0x805334000] S > [16424/1035054:qOHpypuS3jGU8RSk43^M 
+1yRT+zofHVfFWHO/ZY]
Mar 14 12:27:36 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] 
ci_write(+343): [0x805334000] S > [16424/1018630:iORk2Ixxn7ua0jmsbFWlfU^M 
KfH4iMbEq5UEetX]
Mar 14 12:27:39 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] 
ci_write(+343): [0x805334000] S > [16424/1002206:8AwViP/moP7Ip/zfgFH/ESV+3j^M 
/wBEv+EX/hO]
Mar 14 12:27:42 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] 
ci_write(+343): [0x805334000] S > 
[19112/985782:kTWEUkp1Kzg1I6jBcfaInRnW20uCL7^M Q8Uhh2R]
Mar 14 12:27:45 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] 
ci_write(+343): [0x805334000] S > [16424/966670:KKACigAooA^M 
KKACigAooAKKACigAooAKKACigA]
Mar 14 12:27:49 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] 
ci_write(+343): [0x805334000] S > [16424/950246:v/AA/+Oq+Gf2cP^M 
+CeP7Udp8AvEVv4WNlq3wL8]
Mar 14 12:27:51 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] 
ci_write(+343): [0x805334000] S > [16424/933822:gFH/D9/wD6w0fsFf8A^M 
iO//AN3Uf8TR/wDVD8O]
Mar 14 12:27:54 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] 
ci_write(+343): [0x805334000] S > [16424/917398:xGVrDPFUaFGtXq0vrXvULU^M 
8PSq1KjlFOTVGE3]
Mar 14 12:27:58 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] 
ci_write(+343): [0x805334000] S > [16424/900974:pXnhfZRpYyTj7DDUKvPRot1HH2^M 
sav1/iVg+HO]


I'm happy to send a copy of the debug off list including the section from 12:26 
if it helps at all.  Not sure if it will.

Best regards, 
 
James Greig

-----Original Message-----
From: dbmail-boun...@dbmail.org [mailto:dbmail-boun...@dbmail.org] On Behalf Of 
Paul J Stevens
Sent: 13 March 2014 07:51
To: DBMail mailinglist
Subject: Re: [Dbmail] Iphone HTML emails

On 12-03-14 22:05, James Greig wrote:
> That should read 'no seg faults' at all :)

Most excellent!

> Should also point out dbmail-imapd and dbmail-lmtpd both running on 
> the same box continue to run fine and accept connections whilst pop3d
> is unavailable.   The machine itself is pretty busy but at the time
> it happens there's no sudden rush of connections or anything that 
> stands out, but like I say it's a pretty busy box anyway.

Next time that happens, you might do:

lsof -np `pidof dbmail-imapd`

perhaps there's a large amount of connections in WAIT_STATE or a similar 
suspicious state.

a short strace might be helpful, or even a gdb backtrace. Anything that might 
help discover what pop3d is *doing*. Is it blocking, stuck in a loop, suffering 
some kind of resource depletion, anything.


--
________________________________________________________________
Paul J Stevens       pjstevns @ gmail, twitter, github, linkedin
           www.nfg.nl/i...@nfg.nl/+31.85.877.99.97
_______________________________________________
DBmail mailing list
DBmail@dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail
_______________________________________________
DBmail mailing list
DBmail@dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail

Reply via email to