Hi, Following on from the previous post. Everything has been pretty stable it's just the pop3d locking up which just happens randomly. I've not ran under gdb yet but I did get some debug output and a few other bits.
Under normal load dbmail-pop3d is doing around 5-6% CPU and is accepting 250 connections per minute (grep 'connection accepted' over 1 minute at various times today) 53559 nobody 2 20 0 192M 112M uwait 11 40:40 4.88% dbmail-pop3d At completely random times (once thus far today - UK time at 12:20~) pop3d locks up in that it answers no new queries but continues to run and the CPU spikes up to 50% or more PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND 53559 nobody 2 20 0 172M 95588K uwait 11 37:25 50.88% dbmail-pop3d On freebsd I ran truss against the process which is just a constant scroll of the following:- write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource temporarily unavailable' write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource temporarily unavailable' write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource temporarily unavailable' write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource temporarily unavailable' write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource temporarily unavailable' write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource temporarily unavailable' write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource temporarily unavailable' write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource temporarily unavailable' write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource temporarily unavailable' write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource temporarily unavailable' write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource temporarily unavailable' write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource temporarily unavailable' write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource temporarily unavailable' write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource temporarily unavailable' write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource temporarily unavailable' write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource temporarily unavailable' mytop at the time shows little to no activity:- MySQL on localhost (5.6.12-log) up 0+22:09:06 [12:30:15] Queries: 121.6M qps: 1599 Slow: 22.0 Se/In/Up/De(%): 05/01/00/00 qps now: 56 Slow qps: 0.0 Threads: 16 ( 1/ 6) 07/00/00/00 Key Efficiency: 100.0% Bps in/out: 133.3k/305.1k Now in/out: 812.2/16.8k Master: mysql-bin.003692/80665838 do: ign: Id User Host/IP DB Time Cmd Query or State -- ---- ------- -- ---- --- ---------- 1 dbmail localhost dbmail 0 Sleep 1397 root localhost test 0 Query show full processlist 2 dbmail localhost dbmail 2 Sleep 3 dbmail localhost dbmail 6 Sleep 4 dbmail localhost dbmail 18 Sleep 1393 root localhost 69 Sleep 5 dbmail localhost dbmail 87 Sleep 6 dbmail localhost dbmail 87 Sleep 514 dbmail localhost dbmail 203 Sleep 515 dbmail localhost dbmail 207 Sleep 7 dbmail localhost dbmail 356 Sleep 8 dbmail localhost dbmail 667 Sleep 9 dbmail localhost dbmail 667 Sleep 766 dbmail localhost dbmail 2087 Sleep 10 dbmail localhost dbmail 2156 Sleep 11 dbmail localhost dbmail 2156 Sleep To put it into a little more perspective with debugging enabled at the moment it normally logs between 50000 to 80000 lines 'per minute' in the log file but when pop3d locks up it's logging around 20 lines a minute. [root@redacted-4a ~]# cat dbmail | grep " 12:20" | wc -l 80757 [root@ redacted -4a ~]# cat dbmail | grep " 12:24" | wc -l 61002 [root@ redacted -4a ~]# cat dbmail | grep " 12:25" | wc -l 56135 [root@ redacted -4a ~]# cat dbmail | grep " 12:26" | wc -l 63500 [root@ redacted -4a ~]# cat dbmail | grep " 12:27" | wc -l 20 The debug output for pop3 during the issue (the 20 lines from 12:27) is below and it just seems like it's spending a whole minute dealing with 1 users mailbox:- [root@mail4-db-4a ~]# cat dbmail | grep " 12:27" Mar 14 12:27:00 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] ci_write(+343): [0x805334000] S > [16040/1222118:gwQyZ31AUn^M RxR4h6+QsBAMUZnWBV/7gdwugzE] Mar 14 12:27:03 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] ci_write(+343): [0x805334000] S > [16040/1206078:EGzXAi+Xx4tL4PofLBol72tlxXk/^M uIcxMLvT4] Mar 14 12:27:05 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] ci_write(+343): [0x805334000] S > [17448/1190038:puZkW65uMzYTpQK5Mx1HPY4jjLPwu2ajaDrQ2Ee] Mar 14 12:27:08 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] ci_write(+343): [0x805334000] S > [16040/1172590:w8L0ZpbHRlci9GbGF0ZURlY29kZS9MZW5ndGggM] Mar 14 12:27:11 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] ci_write(+343): [0x805334000] S > [17448/1156550:WMYMB9^M gNLuXhXn3sTLjaRfsZ5jh2vWdvXsnlC] Mar 14 12:27:14 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] ci_write(+343): [0x805334000] S > [16424/1139102:tY1FPeh2Xl/a9ffIuZPHZX^M IRqKnCTz2dz5Jbt] Mar 14 12:27:18 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] ci_write(+343): [0x805334000] S > [19112/1122678:WUYRjHUbm72iU5Ll5GUAqOyKaj^M d3Vsq3IqZtv] Mar 14 12:27:21 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] ci_write(+343): [0x805334000] S > [16424/1103566:Ljs040^M WrbdLM6r9sDW7PxK/hXX9Om3293pbzQ] Mar 14 12:27:24 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] ci_write(+343): [0x805334000] S > [17832/1087142:E3VqVKfeU8^M E9DXq4mE6kVCC3v+h6tqcqMoz2N] Mar 14 12:27:26 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] ci_write(+343): [0x805334000] S > [17832/1069310:OBk0DWpptxhT^M JOTtbHPXvTooZI5FchSO65rpo] Mar 14 12:27:30 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] ci_write(+343): [0x805334000] S > [16424/1051478:8A7PU/8FT/APkC^M rNZ38ik/8IjqQLEls6NN/wD] Mar 14 12:27:33 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] ci_write(+343): [0x805334000] S > [16424/1035054:qOHpypuS3jGU8RSk43^M +1yRT+zofHVfFWHO/ZY] Mar 14 12:27:36 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] ci_write(+343): [0x805334000] S > [16424/1018630:iORk2Ixxn7ua0jmsbFWlfU^M KfH4iMbEq5UEetX] Mar 14 12:27:39 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] ci_write(+343): [0x805334000] S > [16424/1002206:8AwViP/moP7Ip/zfgFH/ESV+3j^M /wBEv+EX/hO] Mar 14 12:27:42 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] ci_write(+343): [0x805334000] S > [19112/985782:kTWEUkp1Kzg1I6jBcfaInRnW20uCL7^M Q8Uhh2R] Mar 14 12:27:45 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] ci_write(+343): [0x805334000] S > [16424/966670:KKACigAooA^M KKACigAooAKKACigAooAKKACigA] Mar 14 12:27:49 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] ci_write(+343): [0x805334000] S > [16424/950246:v/AA/+Oq+Gf2cP^M +CeP7Udp8AvEVv4WNlq3wL8] Mar 14 12:27:51 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] ci_write(+343): [0x805334000] S > [16424/933822:gFH/D9/wD6w0fsFf8A^M iO//AN3Uf8TR/wDVD8O] Mar 14 12:27:54 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] ci_write(+343): [0x805334000] S > [16424/917398:xGVrDPFUaFGtXq0vrXvULU^M 8PSq1KjlFOTVGE3] Mar 14 12:27:58 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase] ci_write(+343): [0x805334000] S > [16424/900974:pXnhfZRpYyTj7DDUKvPRot1HH2^M sav1/iVg+HO] I'm happy to send a copy of the debug off list including the section from 12:26 if it helps at all. Not sure if it will. Best regards, James Greig -----Original Message----- From: dbmail-boun...@dbmail.org [mailto:dbmail-boun...@dbmail.org] On Behalf Of Paul J Stevens Sent: 13 March 2014 07:51 To: DBMail mailinglist Subject: Re: [Dbmail] Iphone HTML emails On 12-03-14 22:05, James Greig wrote: > That should read 'no seg faults' at all :) Most excellent! > Should also point out dbmail-imapd and dbmail-lmtpd both running on > the same box continue to run fine and accept connections whilst pop3d > is unavailable. The machine itself is pretty busy but at the time > it happens there's no sudden rush of connections or anything that > stands out, but like I say it's a pretty busy box anyway. Next time that happens, you might do: lsof -np `pidof dbmail-imapd` perhaps there's a large amount of connections in WAIT_STATE or a similar suspicious state. a short strace might be helpful, or even a gdb backtrace. Anything that might help discover what pop3d is *doing*. Is it blocking, stuck in a loop, suffering some kind of resource depletion, anything. -- ________________________________________________________________ Paul J Stevens pjstevns @ gmail, twitter, github, linkedin www.nfg.nl/i...@nfg.nl/+31.85.877.99.97 _______________________________________________ DBmail mailing list DBmail@dbmail.org http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail _______________________________________________ DBmail mailing list DBmail@dbmail.org http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail