You can do the same thing I have listed in reverse (I do that also). Change the count to find out whether there have been any inbound messages in the last x minutes and, if not, send an email.
Fred

From: Action Request System discussion list(ARSList) [mailto:[email protected]] On Behalf Of Rick Westbrock
Sent: Thursday, August 28, 2014 5:39 PM
To: [email protected]
Subject: Re: E-mail engine not getting POP3 messages (Linux) but not logging errors

Thanks for all the helpful suggestions. Our outbound e-mail is SMTP, so it is decoupled from the inbound mail, which is where I’m having the problem.

I guess what I am looking for is really something I can run from a different server that can check the number of messages in the mailbox via POP3 without pulling any of them down. If the check only runs every 10 minutes then the Remedy e-mail engine will have had four chances to pick up the messages and I could trigger an alert on that. I’ll have to exercise some good old-fashioned Google-fu to see what I can come up with.

The frustrating part is that there was nothing logged at all; the engine just failed to grab the messages. I had already thought of calling an external script to use sendmail to send the alert for queued-up outbound messages, but fortunately that isn’t a problem.

-Rick

From: Action Request System discussion list(ARSList) [mailto:[email protected]] On Behalf Of Thad Esser
Sent: Thursday, August 28, 2014 10:51 AM
To: [email protected]
Subject: Re: E-mail engine not getting POP3 messages (Linux) but not logging errors

I essentially do the same thing as Fred, but since we have Windows servers, I use PowerShell to send the email, if it helps anyone:

    powershell -command Send-MailMessage -smtpServer <smtpserver> -to $Char Param 01$ -from $SERVER$@<companyname> -subject \"=== ALERT === AR System Email Messages Count: $z1D Integer 01$ ($SERVER$)\" -body \"$z1D Char 01$\"

A Set Fields action before this Run Process sets the Char field for the body. I also have similar filters to check for Application Pending records that might not be processing and a few other things.
It's all attached to an "Automation" form, with separate records and filters for the things I'm checking.

Thad

On Thu, Aug 28, 2014 at 10:30 AM, Grooms, Frederick W wrote:

This is how I do it…

First, I have a form with only a single record in it that we use to hold configuration info. I have an escalation that runs against this “Config” form and sets a Display Only field to trigger filter workflow.

The filter workflow does:

Filter 1    xxxyyyzzz-1 Check_Counts
    Set Fields
        zTmp_Integer_1 = $SERVERTIMESTAMP$
    SQL Set Fields
        zTmp_Integer_2 = SELECT COUNT(*) from AR_SYSTEM_EMAIL_MESSAGES WHERE MESSAGE_TYPE = 1 AND SEND_MESSAGE = 1 AND CREATE_DATE <= ($zTmp_Integer_1$ - 150)

Filter 2    xxxyyyzzz-2 SendEmail
    Run-If 'zTmp_Integer_2' > 25
    Set Fields
        zTmp_String_1 = $PROCESS$ echo "Subject: Email Count Error Server $SERVER$ has $zTmp_Integer_2$ messages waiting to send . " | /usr/lib/sendmail [email protected],[email protected]

Filter 1 gets me a count of emails waiting to send that are at least 2 1/2 minutes old.
Filter 2 says that if there are more than 25 emails waiting to send, alert people using sendmail.

This is the same basic logic I use to monitor other parts of the system, except that instead of using sendmail I push to the email messages form.

Fred

From: Action Request System discussion list(ARSList) [mailto:[email protected]] On Behalf Of William Rentfrow
Sent: Thursday, August 28, 2014 11:32 AM
To: [email protected]
Subject: Re: E-mail engine not getting POP3 messages (Linux) but not logging errors

I've seen this in Linux from time to time as well. It's not really frequent but it does happen. We're on SUSE Linux running 7.6.04 SP5. Another environment is on SUSE with 8.1, and it's happened to both.

There's not a great way to test it, honestly, since when it dies this way it doesn't appear to do anything bad. There's nothing in the log files for the monitoring tools to grab.
In fact, a couple of weeks ago this died on a Saturday and for some reason no one noticed until Tuesday morning. Then I fixed it....and it sent 200,000+ emails out. I was *very* popular that day....

We've kicked around a couple of ideas, like writing workflow to notify us of this, but the problem there is that everyone wants to get notified by email...so....that's not going to work. It turns out a broken email process won't send email either :)

I think long term the best solution would be for BMC to separate the email process completely from the AR server and do a check-in like it does for the server group. Right now, in a server group, if email dies but the AR server itself stays up, the email process won't hop to another machine. It's annoying and completely fixable, but BMC has not yet chosen to do that. If it did have a check-in, then armonitor could kill it when it wasn't responding, regardless of whether you were in a server group or not.

Right now we just check it intermittently and hope for the best. Fortunately our email volume is high enough that our customers usually notice within an hour or two.

From: Action Request System discussion list(ARSList) [mailto:[email protected]] On Behalf Of Rick Westbrock
Sent: Thursday, August 28, 2014 10:25 AM
To: [email protected]
Subject: E-mail engine not getting POP3 messages (Linux) but not logging errors

Hi all-

I had an interesting issue today and wondered if someone else had run into it before. I am running my e-mail engine (7.1) on a Linux server (RHEL 5.10) and using POP3 to get messages from a remote mail server. Normally if there’s a problem the Email Error form fills up with connection errors, but this time it failed to pull down messages for over 24 hours and never logged an error.

I used the emaild.sh script with the stop parameter to kill the process. Normally that stops it immediately, then a monitoring script sees that it isn’t running and starts it up again.
However, today the stop script appeared to hang, and after five minutes I finally did a kill -9 on the PID to kill the process. The monitoring script started it back up immediately with a new PID and it processed the 124 waiting messages via POP3 within 30 seconds.

Any ideas what would cause the engine to hang without logging an error? Any suggestions on how to monitor and alert on this situation? To date I have just been visually looking at the Inbox via Outlook on my local machine to make sure there are no messages waiting (the e-mail engine polls every two minutes), but that is obviously not an optimal solution. Apparently I forgot to check it yesterday, hence the 24-hour backup of messages.

Thanks in advance,
Rick

_________________________
Rick Westbrock
AppOps Engineer | IT Department
24 Hour Fitness USA, Inc.

_ARSlist: "Where the Answers Are" and have been for 20 years_
UNSUBSCRIBE or access ARSlist Archives at www.arslist.org
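For the check Rick describes above (counting the messages waiting in the mailbox over POP3 without pulling any of them down), here is a minimal sketch in Python using the standard-library poplib module. It is not anything posted in the thread. The POP3 STAT command reports only the message count and total mailbox size, so nothing is retrieved or deleted. The function names, the `factory` parameter (there only so the function can be exercised without a real server), and the "non-empty on two consecutive checks" alert rule are all assumptions made for illustration.

```python
import poplib


def mailbox_backlog(host, user, password, port=110, use_ssl=False, factory=None):
    """Return (message_count, mailbox_bytes) using POP3 STAT only.

    No RETR or DELE is issued, so the messages stay on the server.
    `factory` is a hypothetical hook for substituting a test double.
    """
    if factory is None:
        factory = poplib.POP3_SSL if use_ssl else poplib.POP3
    conn = factory(host, port)
    try:
        conn.user(user)
        conn.pass_(password)
        count, size = conn.stat()  # STAT: totals only, nothing downloaded
    finally:
        conn.quit()
    return count, size


def should_alert(previous_count, current_count):
    """Assumed rule: alert when the mailbox is non-empty on two checks in a row.

    A single non-empty check may just be mail that arrived after the
    engine's last poll, so one sample alone is treated as inconclusive.
    """
    return previous_count > 0 and current_count > 0
```

Run from cron on a different server every 10 minutes, a wrapper could call mailbox_backlog with the real mail host and credentials, remember the previous count in a small file, and send the alert through a path that does not depend on the Remedy e-mail engine when should_alert returns True.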

