Re: PATCH: Possible reasons for qmgr loading the system?
> To apply this patch, cd into the Postfix-2.5.* top-level source
> directory and execute:
>
> $ patch < thismessage
>
> We were able to reproduce the scheduler looping problem, and it
> does not recur with the patched version.

A question: what's the way to get this patch included in the Ubuntu
Server postfix packages?

I mean, should I submit your message and patch to the package
maintainers of Ubuntu / Debian / Redhat so that new postfix packages
with the bug corrected are released as updates for users?

Or do you just publish the patch / bug somewhere, and the package
maintainers then update their sources automatically without us or you
needing to contact them?

I can patch postfix's sources, but then I lose the Ubuntu package
security updates and would have to maintain postfix from source from
now on.

The best way would be for your patch to be integrated into postfix and
for new postfix packages to be released by the package maintainers,
but I don't know how to make that happen.

Thanks.

--
Santiago Romero
Re: PATCH: Possible reasons for qmgr loading the system?
On Fri, 06 Mar 2009 10:07:26 +0100
Santiago Romero srom...@servicom2000.com wrote:

> A question: what's the way to get this patch included in the Ubuntu
> Server postfix packages?
>
> I mean, should I submit your message and patch to the package
> maintainers of Ubuntu / Debian / Redhat so that new postfix packages
> with the bug corrected are released as updates for users?
>
> Or do you just publish the patch / bug somewhere, and the package
> maintainers then update their sources automatically without us or
> you needing to contact them?
>
> I can patch postfix's sources, but then I lose the Ubuntu package
> security updates and would have to maintain postfix from source from
> now on.
>
> The best way would be for your patch to be integrated into postfix
> and for new postfix packages to be released by the package
> maintainers, but I don't know how to make that happen.

In a perfect world, the program maintainers would know about the patch
and take steps to correct their package/port or whatever. You might
want to contact the maintainer of Postfix for your distro and see if
they are planning on updating the package/port. Usually, they do get a
little annoyed if you start bugging them 5 seconds after the patch is
released. Some of them actually have day jobs.

--
Gerard
postfix.u...@yahoo.com

TO REPORT A PROBLEM see http://www.postfix.org/DEBUG_README.html#mail
TO (UN)SUBSCRIBE see http://www.postfix.org/lists.html

Cheese -- milk's leap toward immortality.
	Clifton Fadiman, "Any Number Can Play"
Re: PATCH: Possible reasons for qmgr loading the system?
Gerard wrote:

> In a perfect world, the program maintainers would know about the
> patch and take steps to correct their package/port or whatever. You
> might want to contact the maintainer of Postfix for your distro and
> see if they are planning on updating the package/port. Usually, they
> do get a little annoyed if you start bugging them 5 seconds after
> the patch is released. Some of them actually have day jobs.

Well, I'm not planning to bug them about the patch. I just don't know
whether patches get integrated into the current package versions
automatically, or whether the author or the people who found the bug
are expected to notify the package maintainers.

That's what I was asking: whether the process is automatic, or whether
I should notify them or help in any way.

--
Santiago Romero
Re: PATCH: Possible reasons for qmgr loading the system?
Santiago Romero:
>> To apply this patch, cd into the Postfix-2.5.* top-level source
>> directory and execute:
>>
>> $ patch < thismessage
>>
>> We were able to reproduce the scheduler looping problem, and it
>> does not recur with the patched version.
>
> A question: what's the way to get this patch included in the Ubuntu
> Server postfix packages?

I will release this as part of Postfix 2.5.7. Meanwhile, you can
use oqmgr and it will in all likelihood perform just as well.

> I mean, should I submit your message and patch to the package
> maintainers of Ubuntu / Debian / Redhat so that new postfix packages
> with the bug corrected are released as updates for users?
>
> Or do you just publish the patch / bug somewhere, and the package
> maintainers then update their sources automatically without us or
> you needing to contact them?
>
> I can patch postfix's sources, but then I lose the Ubuntu package
> security updates and would have to maintain postfix from source from
> now on.
>
> The best way would be for your patch to be integrated into postfix
> and for new postfix packages to be released by the package
> maintainers, but I don't know how to make that happen.

I have no control over vendors and distributors.

	Wietse
Re: PATCH: Possible reasons for qmgr loading the system?
Wietse Venema wrote:
>>> You might want to repeat your precise Postfix version at this
>>> point, and which queue manager version is configured in your
>>> master.cf. Current Postfix versions have (qmgr=new, oqmgr=old)
>>> in master.cf. Older Postfix versions have (nqmgr=new, qmgr=old)
>>> instead. The programs are the same except for the job selection
>>> algorithm.
>>
>> r...@egeo:~# postconf mail_version
>> mail_version = 2.5.1
>>
>> r...@egeo:~# grep -i qmgr /etc/postfix/master.cf
>> qmgr      fifo  n       -       n       300     1       qmgr
>> #qmgr     fifo  n       -       -       300     1       oqmgr
>>
>>> If you are using the new queue manager, it is worthwhile to see
>>> if the problem persists when you switch to the old queue manager.
>>
>> It seems I'm using the new one...
>
> OK, leave the above settings and see if this helps (Postfix 2.5 or
> later). I have not been able to reproduce the problem, but there
> was some bogosity with the handling of _destination_rate_delay.
>
> diff --exclude=man --exclude=html --exclude=README_FILES --exclude=.indent.pro --exclude=Makefile.in -cr src/qmgr/qmgr_entry.c- src/qmgr/qmgr_entry.c

Well, I'm using Ubuntu's postfix package, so it's not compiled from
source code, because I need all my ~100 Linux machines to be easily
updatable (apt-get update && apt-get upgrade).

In this case, I'm going to rebuild the .deb source package with your
patch included to see if that solves the problem.

Please allow me a couple of days to rebuild and install it (it's a
production system, so I need to find a maintenance window with the
customers). I'll report back on this list whether the problem happens
again or the patch seems to fix it.

Do you want any additional change / logging / config to make the
problem easier to trigger? (I mean, setting the rate_ values higher or
lower so that the problem reproduces faster, because 5 days passed
between the last 2 times qmgr ate the CPU...)

Thanks.

--
Santiago Romero
Re: PATCH: Possible reasons for qmgr loading the system?
Santiago Romero:
> (I mean, setting the rate_ values higher or lower so that the
> problem reproduces faster, because 5 days passed between the last
> 2 times qmgr ate the CPU...)

Just run the same test. Thanks,

	Wietse
Re: PATCH: Possible reasons for qmgr loading the system?
On Thu, Mar 05, 2009 at 12:20:06PM +0100, Santiago Romero wrote:

> Well, I'm using Ubuntu's postfix package, so it's not compiled from
> source code, because I need all my ~100 Linux machines to be easily
> updatable (apt-get update && apt-get upgrade).
>
> In this case, I'm going to rebuild the .deb source package with your
> patch included to see if that solves the problem.
>
> Please allow me a couple of days to rebuild and install it (it's a
> production system, so I need to find a maintenance window with the
> customers). I'll report back on this list whether the problem
> happens again or the patch seems to fix it.
>
> Do you want any additional change / logging / config to make the
> problem easier to trigger?

Please wait for an updated patch, we believe we have identified the
cause and reproduced the symptoms (in that order). I have a candidate
patch, but I expect Wietse will send an updated, more polished version
in the not too distant future.

The issue found applies only to rate-limited transports; if you are
not using such transports, you don't need the patch. The patch ensures
that work done at the completion of a delivery with a normal transport
is correctly split between before suspend and after resume.

The original 2.5.x code is correct for oqmgr, but not for qmgr (aka
nqmgr), which requires additional internal state adjustments when
destinations are blocked and unblocked.

--
	Viktor.

Disclaimer: off-list followups get on-list replies or get ignored.
Please do not ignore the "Reply-To" header.

To unsubscribe from the postfix-users list, visit
http://www.postfix.org/lists.html or click the link below:
mailto:majord...@postfix.org?body=unsubscribe%20postfix-users

If my response solves your problem, the best way to thank me is to not
send an "it worked, thanks" follow-up. If you must respond, please put
"It worked, thanks" in the "Subject" so I can delete these quickly.
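[Editorial illustration, not part of the original message.] The "split between before suspend and after resume" described above can be pictured with a toy model. The sketch below is a deliberately simplified illustration in C: the `toy_*` types and functions are hypothetical stand-ins rather than the real Postfix structures, and the assumption that the resume path re-checks the blocker tag is inferred from the description above, not taken from the Postfix source.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins; NOT the real Postfix structures. */
typedef struct {
    int blocker_tag;   /* kept odd, so it never equals the 0 "not a blocker" mark */
    int updates;       /* number of scheduling updates run (illustration only) */
} toy_transport;

typedef struct {
    toy_transport *transport;
    int blocker_tag;   /* 0 = this queue is not marked as a blocker */
    bool suspended;    /* true while the artificial rate delay is running */
} toy_queue;

/* Toy version of the scheduling decision (the real one has more conditions). */
static void toy_blocker_update(toy_queue *q)
{
    /* Adding 2 keeps the tag odd and unmarks all blockers at once. */
    q->transport->blocker_tag += 2;
    q->transport->updates++;
    q->blocker_tag = 0;
}

/* Called when a delivery completes. */
static void toy_delivery_done(toy_queue *q, bool rate_limited)
{
    if (rate_limited)
        q->suspended = true;          /* emulate a slow delivery channel */
    /* Decide now only if the queue is not suspended. */
    if (!q->suspended && q->blocker_tag == q->transport->blocker_tag)
        toy_blocker_update(q);
}

/* Called when the rate delay expires (assumed behavior, see lead-in). */
static void toy_queue_resume(toy_queue *q)
{
    q->suspended = false;
    /* Make the decision that was postponed while suspended. */
    if (q->blocker_tag == q->transport->blocker_tag)
        toy_blocker_update(q);
}
```

In this model, a rate-limited queue that is marked as a blocker stays marked while it is suspended, and the scheduling update runs exactly once, at resume time.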
Re: PATCH: Possible reasons for qmgr loading the system?
> Please wait for an updated patch, we believe we have identified the
> cause and reproduced the symptoms (in that order). I have a candidate
> patch, but I expect Wietse will send an updated, more polished
> version in the not too distant future.

OK, I'll wait for it. I'm going to roll back to the Ubuntu packages (I
had already applied the patch and was testing it).

> The original 2.5.x code is correct for oqmgr, but not for qmgr (aka
> nqmgr), which requires additional internal state adjustments when
> destinations are blocked and unblocked.

I've switched to oqmgr in master.cf on the machine that uses that
special slow transport.

Would I notice any difference in postfix's behaviour from using oqmgr
instead of qmgr (lower performance or something like that)?

Thanks.

--
Santiago Romero
Re: PATCH: Possible reasons for qmgr loading the system?
On Thu, Mar 05, 2009 at 04:21:01PM +0100, Santiago Romero wrote:

>> Please wait for an updated patch, we believe we have identified the
>> cause and reproduced the symptoms (in that order). I have a
>> candidate patch, but I expect Wietse will send an updated, more
>> polished version in the not too distant future.
>
> OK, I'll wait for it. I'm going to roll back to the Ubuntu packages
> (I had already applied the patch and was testing it).
>
>> The original 2.5.x code is correct for oqmgr, but not for qmgr (aka
>> nqmgr), which requires additional internal state adjustments when
>> destinations are blocked and unblocked.
>
> I've switched to oqmgr in master.cf on the machine that uses that
> special slow transport.
>
> Would I notice any difference in postfix's behaviour from using
> oqmgr instead of qmgr (lower performance or something like that)?

With oqmgr, list messages with a lot (multiple thousands to perhaps
hundreds of thousands) of recipients can dominate the queue, and delay
small messages. Also, if you don't define relay_domains correctly, on a
high-volume border gateway outbound smtp traffic can starve inbound
smtp traffic when both use the same transport, especially if outbound
traffic exhibits high latency.

- Avoid mixing (very large) list mail with regular traffic in the
  same queue with oqmgr.

- Avoid delivering inbound/outbound traffic via the same transport.

- Avoid outbound congestion caused by lack of recipient validation.

--
	Viktor.
Re: PATCH: Possible reasons for qmgr loading the system?
Santiago Romero:
>> Please wait for an updated patch, we believe we have identified the
>> cause and reproduced the symptoms (in that order). I have a
>> candidate patch, but I expect Wietse will send an updated, more
>> polished version in the not too distant future.
>
> OK, I'll wait for it. I'm going to roll back to the Ubuntu packages
> (I had already applied the patch and was testing it).

It will be later today. I don't have much time so I want to have
it really right the first time. Code that is right takes more work
than code that works.

	Wietse
Re: PATCH: Possible reasons for qmgr loading the system?
On Thu, 5 Mar 2009 13:03:11 -0500 (EST)
wie...@porcupine.org (Wietse Venema) wrote:

> It will be later today. I don't have much time so I want to have
> it really right the first time. Code that is right takes more work
> than code that works.

Reminds me of a plaque I have in my office:

	There is never enough time to do it right; however, there is
	always enough time to do it over.
		anonymous

--
Gerard
postfix.u...@yahoo.com

BYTE editors are people who separate the wheat from the chaff, and
then carefully print the chaff.
Re: PATCH: Possible reasons for qmgr loading the system?
Wietse Venema:
> Santiago Romero:
>>> Please wait for an updated patch, we believe we have identified
>>> the cause and reproduced the symptoms (in that order). I have a
>>> candidate patch, but I expect Wietse will send an updated, more
>>> polished version in the not too distant future.
>>
>> OK, I'll wait for it. I'm going to roll back to the Ubuntu packages
>> (I had already applied the patch and was testing it).
>
> It will be later today. I don't have much time so I want to have
> it really right the first time. Code that is right takes more work
> than code that works.

To apply this patch, cd into the Postfix-2.5.* top-level source
directory and execute:

$ patch < thismessage

We were able to reproduce the scheduler looping problem, and it
does not recur with the patched version.

	Wietse

diff -cr /var/tmp/postfix-2.5.6/src/oqmgr/qmgr_transport.c src/oqmgr/qmgr_transport.c
*** /var/tmp/postfix-2.5.6/src/oqmgr/qmgr_transport.c Sun Dec 2 13:13:26 2007
--- src/oqmgr/qmgr_transport.c Thu Mar 5 16:06:43 2009
***************
*** 286,291 ****
--- 286,293 ----
              continue;
          need = xport->pending + 1;
          for (queue = xport->queue_list.next; queue; queue = queue->peers.next) {
+             if (QMGR_QUEUE_READY(queue) == 0)
+                 continue;
              if ((need -= MIN(queue->window - queue->busy_refcount,
                               queue->todo_refcount)) <= 0) {
                  QMGR_LIST_ROTATE(qmgr_transport_list, xport);
diff -cr /var/tmp/postfix-2.5.6/src/qmgr/qmgr.h src/qmgr/qmgr.h
*** /var/tmp/postfix-2.5.6/src/qmgr/qmgr.h Sat Dec 8 11:01:59 2007
--- src/qmgr/qmgr.h Thu Mar 5 16:36:32 2009
***************
*** 436,441 ****
--- 436,442 ----
  extern QMGR_ENTRY *qmgr_job_entry_select(QMGR_TRANSPORT *);
  extern QMGR_PEER *qmgr_peer_select(QMGR_JOB *);
+ extern void qmgr_job_blocker_update(QMGR_QUEUE *);
  extern QMGR_JOB *qmgr_job_obtain(QMGR_MESSAGE *, QMGR_TRANSPORT *);
  extern void qmgr_job_free(QMGR_JOB *);
diff -cr /var/tmp/postfix-2.5.6/src/qmgr/qmgr_entry.c src/qmgr/qmgr_entry.c
*** /var/tmp/postfix-2.5.6/src/qmgr/qmgr_entry.c Fri Dec 14 17:47:21 2007
--- src/qmgr/qmgr_entry.c Thu Mar 5 16:29:46 2009
***************
*** 299,327 ****
      }

      /*
!      * If the queue was blocking some of the jobs on the job list, check if
!      * the concurrency limit has lifted. If there are still some pending
!      * deliveries, give it a try and unmark all transport blockers at once.
!      * The qmgr_job_entry_select() will do the rest. In either case make sure
!      * the queue is not marked as a blocker anymore, with extra handling of
!      * queues which were declared dead.
       *
!      * Note that changing the blocker status also affects the candidate cache.
!      * Most of the cases would be automatically recognized by the current job
!      * change, but we play safe and reset the cache explicitly below.
!      *
!      * Keeping the transport blocker tag odd is an easy way to make sure the tag
!      * never matches jobs that are not explicitly marked as blockers.
       */
!     if (queue->blocker_tag == transport->blocker_tag) {
!         if (queue->window > queue->busy_refcount && queue->todo.next != 0) {
!             transport->blocker_tag += 2;
!             transport->job_current = transport->job_list.next;
!             transport->candidate_cache_current = 0;
!         }
!         if (queue->window > queue->busy_refcount || QMGR_QUEUE_THROTTLED(queue))
!             queue->blocker_tag = 0;
      }

      /*
       * When there are no more entries for this peer, discard the peer
--- 299,323 ----
      }

      /*
!      * We implement a rate-limited queue by emulating a slow delivery
!      * channel. We insert the artificial delays with qmgr_queue_suspend().
       *
!      * When a queue is suspended, we must postpone any job scheduling decisions
!      * until the queue is resumed. Otherwise, we make those decisions now.
!      * The job scheduling decisions are made by qmgr_job_blocker_update().
       */
!     if (which == QMGR_QUEUE_BUSY && transport->rate_delay > 0) {
!         if (queue->window > 1)
!             msg_panic("%s: queue %s/%s: window %d > 1 on rate-limited service",
!                       myname, transport->name, queue->name, queue->window);
!         if (QMGR_QUEUE_THROTTLED(queue))    /* XXX */
!             qmgr_queue_unthrottle(queue);
!         if (QMGR_QUEUE_READY(queue))
!             qmgr_queue_suspend(queue, transport->rate_delay);
      }
+     if (!QMGR_QUEUE_SUSPENDED(queue)
+         && queue->blocker_tag == transport->blocker_tag)
+         qmgr_job_blocker_update(queue);

      /*
       * When there are no more entries for this peer, discard the peer
***************
*** 336,354 ****
       */
      if (which == QMGR_QUEUE_BUSY)
          queue->last_done = event_time();
-
-     /*
-      * Suspend a rate-limited queue, so that mail trickles out.
-      */
-     if (which == QMGR_QUEUE_BUSY