Re: postfix causing very high load average
On Thu, 14 Nov 2002, Sagi Bashari wrote:

> Hi,
>
> I just set up a new server. It is only running postfix at this time, relaying mail from another server. The distribution is RedHat 7.3 with all of the updates.
>
> There is a large amount of mail in the queue (about 17k mails). The load average goes up to 8.x. If I kill postfix, it goes back down to 0.x.
>
> The strange thing is that top shows that the CPU usage is pretty low:
>
> CPU0 states: 1.2% user, 3.2% system, 0.4% nice, 95.1% idle
> CPU1 states: 1.4% user, 2.3% system, 0.4% nice, 95.3% idle

Looks like the machine is over-swapping.

> Hardware is not the problem. The machine is a dual Athlon MP 2000 with 1GB of DDR RAM and 2 IDE hdds with 8MB cache in RAID1. It should be able to handle that kind of work without any problem.
>
> I'm using ext3.
>
> Any ideas/suggestions?

Maybe limit the number of postfix processes (of some kind?)

Use postfix's various concurrency limitations. Have a look at http://postfix.openu.ac.il/rate.html

--
Tzafrir Cohen
mailto:tzafrir;technion.ac.il
http://www.technion.ac.il/~tzafrir

=
To unsubscribe, send mail to [EMAIL PROTECTED] with the word unsubscribe in the message body, e.g., run the command
echo unsubscribe | mail [EMAIL PROTECTED]
Re: postfix causing very high load average
I/O bound? Being killed by the journalling overhead of ext3? Insufficient RAM to cache the files being accessed on the disk (improbable)?

My first guess is that this has to do with the interaction of postfix with ext3 journalling.

Things to check/try:
- Is the system actually I/O bound?
- What happens if you put the E-mail files in an ext2 partition?
- How about putting the ext3 journalling file on another disk drive?

--- Omer
WARNING TO SPAMMERS: at http://www.zak.co.il/spamwarning.html
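One way to approach the first item on that list ("is the system actually I/O bound?") is to compare two samples of the aggregate `cpu` line in /proc/stat and see what share of the elapsed time went to iowait. The sketch below is only an illustration in Python, not something anyone in the thread ran. It assumes a kernel that reports an iowait field (per proc(5) that field appeared in Linux 2.5.41, so the 2.4 kernel shipped with RedHat 7.3 would not have it), and the function names are made up for this example.

```python
def read_cpu_times(stat_text):
    """Return the numeric fields of the aggregate 'cpu' line of /proc/stat."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":  # 'cpu0', 'cpu1', ... are per-CPU lines
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate cpu line found")


def iowait_fraction(before, after):
    """Share of elapsed CPU time spent in iowait between two samples.

    Field order per proc(5): user nice system idle iowait irq softirq steal ...
    Only the first eight fields are summed, because the guest fields that
    follow are already included in user and nice.
    """
    if len(before) < 5 or len(after) < 5:
        raise ValueError("this kernel does not report an iowait field")
    deltas = [b - a for a, b in zip(after[:8], before[:8])]
    deltas = [-d for d in deltas]  # after minus before
    total = sum(deltas)
    return deltas[4] / total if total else 0.0


def sample_iowait(interval=5.0, path="/proc/stat"):
    """Take two samples `interval` seconds apart (not called on import)."""
    import time
    with open(path) as f:
        first = read_cpu_times(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = read_cpu_times(f.read())
    return iowait_fraction(first, second)
```

Calling `sample_iowait()` on a Linux box returns a number between 0 and 1. A high value together with mostly idle CPUs would support the I/O-bound guess; a low value would point elsewhere.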
Re: postfix causing very high load average
Hello Sagi,

>> Maybe limit the number of postfix processes (of some kind?)
>
> No, it's not that:
>
> [sagi@black sagi]$ ps auxww|grep -ic postfix
> 77
> [sagi@black sagi]$

The load average reported by w or uptime counts the processes that are waiting for the CPU AND the processes that are stuck, for one reason or another, in kernel space (uninterruptible sleep). This is the figure that uptime shows:

[~]$ uptime
 14:55:25 up 31 days, 20:47, 38 users, load average: 0.02, 0.01, 0.00
[~]$

This is why the load average can jump while the CPUs stay idle, for example when a remote NFS server is disconnected and you try to access a file on the imported file system. In your case (I think) it is the postfix processes that are contending for file locks using system calls. I think the solution is to reduce the number of pollers or whatever postfix has.

>> Use postfix's various concurrency limitations. Have a look at http://postfix.openu.ac.il/rate.html

---
Bye,  | Fax: (972)-2-6796453
Arieh | Phone: (972)-6795364
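To see which of the two groups described above is behind a high load average, one can tally process states from /proc. The following Python sketch is an assumption-laden illustration, not a tool from the thread: it reads the state letter from each /proc/<pid>/stat file, where R means runnable and D means uninterruptible sleep. It counts processes rather than individual threads, so the total is only an approximation of what the kernel feeds into the load average. The function names are invented for this example.

```python
import os
from collections import Counter


def parse_state(stat_line):
    """Extract the one-letter state from a /proc/<pid>/stat line.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split after the LAST ')'.
    """
    return stat_line.rsplit(")", 1)[1].split()[0]


def count_states(proc_root="/proc"):
    """Count processes by state letter (R, S, D, Z, ...)."""
    counts = Counter()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a PID directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                counts[parse_state(f.read())] += 1
        except OSError:
            continue  # the process exited while we were looking
    return counts


def load_contributors(counts):
    """Processes that count toward the load average: runnable plus D state."""
    return counts.get("R", 0) + counts.get("D", 0)
```

If `count_states()` shows many processes in D and few in R while the load is high, the processes are waiting on the kernel (typically disk or locks) rather than on the CPU.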
Re: postfix causing very high load average
Take a look here: http://www.stahl.bau.tu-bs.de/~hildeb/postfix/ext3.shtml

Cheers,
Henry

Sagi Bashari wrote:

> On 14/11/2002 13:51, Tzafrir Cohen wrote:
>
>> On Thu, 14 Nov 2002, Sagi Bashari wrote:
>>
>>> Hi,
>>>
>>> I just set up a new server. It is only running postfix at this time, relaying mail from another server. The distribution is RedHat 7.3 with all of the updates.
>>>
>>> There is a large amount of mail in the queue (about 17k mails). The load average goes up to 8.x. If I kill postfix, it goes back down to 0.x.
>>>
>>> The strange thing is that top shows that the CPU usage is pretty low:
>>>
>>> CPU0 states: 1.2% user, 3.2% system, 0.4% nice, 95.1% idle
>>> CPU1 states: 1.4% user, 2.3% system, 0.4% nice, 95.3% idle
>>
>> Looks like the machine is over-swapping.
>
> Actually, it's not swapping at all:
>
> [sagi@black sagi]$ free -m
>              total       used       free     shared    buffers     cached
> Mem:          1006        991         14          0        151        547
> -/+ buffers/cache:        293        712
> Swap:         1992          0       1992
> [sagi@black sagi]$
>
>>> Hardware is not the problem. The machine is a dual Athlon MP 2000 with 1GB of DDR RAM and 2 IDE hdds with 8MB cache in RAID1. It should be able to handle that kind of work without any problem.
>>>
>>> I'm using ext3.
>>>
>>> Any ideas/suggestions?
>>
>> Maybe limit the number of postfix processes (of some kind?)
>
> No, it's not that:
>
> [sagi@black sagi]$ ps auxww|grep -ic postfix
> 77
> [sagi@black sagi]$
>
>> Use postfix's various concurrency limitations. Have a look at http://postfix.openu.ac.il/rate.html
>
> After looking in the postfix users archive, I suspect that it is a filesystem issue. I ran 'chattr -R -S +A' on the queue spool dir like someone suggested there (to disable atime/sync updates). It did reduce the load a bit, but it is still too high.
>
> Sagi
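The `free -m` figures quoted above are internally consistent, and the arithmetic shows why Sagi can say the box is not short of memory: the "-/+ buffers/cache" row is just the Mem row with buffers and page cache moved from "used" to "free". A small Python check of that relationship, using the numbers from the quoted output (the function name is made up for this illustration):

```python
def buffers_cache_row(used, free, buffers, cached):
    """Recompute the '-/+ buffers/cache' row of free(1) from the Mem row.

    Buffers and page cache are reclaimable, so they are subtracted from
    'used' and added to 'free'.
    """
    reclaimable = buffers + cached
    return used - reclaimable, free + reclaimable


# Values in MB, taken from the Mem row of the quoted `free -m` output.
app_used, app_free = buffers_cache_row(used=991, free=14, buffers=151, cached=547)
print(app_used, app_free)  # 293 712
```

So only about 293 MB is held by processes, roughly 712 MB is available on demand, and the Swap row shows 0 MB used.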
Re: postfix causing very high load average
That's where I took the original command from.

I can't change the partition settings or repartition the hard disk, because /var is a very big partition that is also used for data (database, web).

However, I have an empty 6GB partition on the hard disk. I don't need that much for the spool directory. Is it possible to repartition the hard disk while the RAID (Linux MD RAID) is working? The Linux Software RAID HOWTO says that you shouldn't. Is there any reason for this (after you turn off the RAID on the partition that you want to delete, of course)?

Is it possible to create a virtual partition on a file (using another filesystem format)?

Sagi

On 14/11/2002 15:15, Henry Ficher wrote:

> Take a look here: http://www.stahl.bau.tu-bs.de/~hildeb/postfix/ext3.shtml
>
> Cheers,
> Henry
Re: postfix causing very high load average
Quoting Sagi Bashari, from the post of Thu, 14 Nov:

> I can't change the partition settings or repartition the hard disk, because /var is a very big partition that is also used for data (database, web).

time to split it up. worth a few minutes of downtime to improve reliability and performance.

> However, I have an empty 6GB partition on the hard disk. I don't need that much for the spool directory. Is it possible to repartition the hard disk while the RAID (Linux MD RAID) is working? The Linux Software RAID HOWTO says that you shouldn't. Is there any reason for this (after you turn off the RAID on the partition that you want to delete, of course)?

look at EVMS, it will make many of those steps easier.

> Is it possible to create a virtual partition on a file (using another filesystem format)?

ugly, but possible. do it on the clean partition though, not in /var itself, since it's probably VERY fragmented.

--
In full Effect mode
Ira Abramov
http://ira.abramov.org/email/
This post is encrypted twice with ROT-13. Documenting or attempting to crack this encryption is illegal.
Re: postfix causing very high load average
On 14/11/2002 16:50, Ira Abramov wrote:

> Quoting Sagi Bashari, from the post of Thu, 14 Nov:
>
>> I can't change the partition settings or repartition the hard disk, because /var is a very big partition that is also used for data (database, web).
>
> time to split it up. worth a few minutes of downtime to improve reliability and performance.

I only have remote access to the server (it is colocated). I asked here a few weeks ago whether there is a reason to put /var/www somewhere else (like /home), and the answer I received was that it is probably better to leave it where it is (because I also have dynamic data there).

I can move /var to / and repartition /var. But I have software RAID running on this drive. Is it safe to do, remotely, when software RAID is activated on / and /home?

>> However, I have an empty 6GB partition on the hard disk. I don't need that much for the spool directory. Is it possible to repartition the hard disk while the RAID (Linux MD RAID) is working? The Linux Software RAID HOWTO says that you shouldn't. Is there any reason for this (after you turn off the RAID on the partition that you want to delete, of course)?
>
> look at EVMS, it will make many of those steps easier.

Yes, you suggested that a few weeks ago. But RedHat does not offer it as part of their official kernels, and I'd rather not compile it myself, because I don't have physical access and kernel security updates are much easier with the official RPMs.

>> Is it possible to create a virtual partition on a file (using another filesystem format)?
>
> ugly, but possible. do it on the clean partition though, not in /var itself, since it's probably VERY fragmented.

I might just allocate the free 6GB for the spool directory. Which filesystem should I use for it? reiserfs? It is full of tiny files and directories:

[root@black postfix]# find . -type d |wc -l
1323
[root@black postfix]# find . -type f |wc -l
25083
[root@black postfix]# du -sh
156M    .
[root@black postfix]#

Sagi
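The three commands at the end of that message (directory count, file count, total size) work out to roughly 6 KB per file, which is what makes the "tiny files" point relevant to the filesystem choice. The same survey can be done in one pass; the Python sketch below is an illustration only, with a made-up function name, and it sums apparent file sizes, whereas `du` reports allocated disk blocks, so the totals will not match `du -sh` exactly.

```python
import os


def spool_stats(root):
    """Count directories and regular files under `root` and sum file sizes.

    Mirrors `find . -type d | wc -l` (the root itself is counted) and
    `find . -type f | wc -l` (symlinks are skipped).
    """
    dirs = files = total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        dirs += 1
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.islink(path) or not os.path.isfile(path):
                    continue
                size = os.lstat(path).st_size
            except OSError:
                continue  # queue files come and go on a live mail server
            files += 1
            total_bytes += size
    avg = total_bytes / files if files else 0.0
    return {"dirs": dirs, "files": files, "bytes": total_bytes, "avg_bytes": avg}
```

For example, `spool_stats("/var/spool/postfix")` (the path is an assumption; the thread only shows a directory named postfix) returns the counts along with the average file size in bytes.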
Re: postfix causing very high load average
Quoting Sagi Bashari, from the post of Thu, 14 Nov:

>> time to split it up. worth a few minutes of downtime to improve reliability and performance.
>
> I only have remote access to the server (it is colocated). I asked here a few weeks ago whether there is a reason to put /var/www somewhere else (like /home), and the answer I received was that it is probably better to leave it where it is (because I also have dynamic data there).

you never mentioned the volume... like we said, you could put your html pages in /tmp for all apache cares. it's a question of usage patterns.

> I can move /var to / and repartition /var. But I have software RAID running on this drive. Is it safe to do, remotely, when software RAID is activated on / and /home?

probably OK, but you won't be able to see that directory till you reboot. mixing MD and non-MD on the same drive makes little sense to me though.

>> look at EVMS, it will make many of those steps easier.
>
> Yes, you suggested that a few weeks ago. But RedHat does not offer it as part of their official kernels, and I'd rather not compile it myself, because I don't have physical access and kernel security updates are much easier with the official RPMs.

humpf. no solutions without reboots then. try and find a clean hour at 3am when the server can be taken down for a while.

> I might just allocate the free 6GB for the spool directory. Which filesystem should I use for it? reiserfs? It is full of tiny files and directories:

I'd say Reiser, yup. definitely.

--
Now playing for the Denver Broncos
Ira Abramov
http://ira.abramov.org/email/
This post is encrypted twice with ROT-13. Documenting or attempting to crack this encryption is illegal.
Re: postfix causing very high load average
On 14/11/2002 17:30, Ira Abramov wrote:

> Quoting Sagi Bashari, from the post of Thu, 14 Nov:
>
>> I can move /var to / and repartition /var. But I have software RAID running on this drive. Is it safe to do, remotely, when software RAID is activated on / and /home?
>
> probably OK, but you won't be able to see that directory till you reboot. mixing MD and non-MD on the same drive makes little sense to me though.

I do not intend to mix MD and non-MD. I just need to delete an existing MD partition and create several small partitions from it.

>>> look at EVMS, it will make many of those steps easier.
>>
>> Yes, you suggested that a few weeks ago. But RedHat does not offer it as part of their official kernels, and I'd rather not compile it myself, because I don't have physical access and kernel security updates are much easier with the official RPMs.
>
> humpf. no solutions without reboots then. try and find a clean hour at 3am when the server can be taken down for a while.

I have no problem with rebooting. It just has to be done remotely (and I have the knowledge and the resources to do it remotely).

My question is just whether it is safe to repartition free space on a hard disk that is currently in use. Quoting http://iglu.org.il/LDP/HOWTO/Software-RAID-HOWTO-4.html:

    Never NEVER *never* re-partition disks that are part of a running RAID. If you must alter the partition table on a disk which is a part of a RAID, stop the array first, then repartition.

It just sounds strange, if Linux treats the RAID members as normal drives.

Sagi
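Whatever the answer to that question, the HOWTO warning implies a first step that is easy to automate: find out which partitions are currently members of an active array before touching the partition table. On Linux MD this information is in /proc/mdstat. The Python sketch below parses that file's `mdN : active ...` lines; it is an illustration under assumptions, the function names are invented, and the device names in the sample text (hda1, hdc1 and so on) are hypothetical, not taken from Sagi's machine.

```python
import re


def active_md_members(mdstat_text):
    """Map each active md array to the list of its member devices.

    Expects lines such as 'md0 : active raid1 hdc1[1] hda1[0]'.
    """
    members = {}
    for line in mdstat_text.splitlines():
        match = re.match(r"^(md\w+) : active\b(.*)$", line)
        if not match:
            continue
        # Member devices are the tokens followed by a [slot] number.
        members[match.group(1)] = re.findall(r"(\S+)\[\d+\]", match.group(2))
    return members


def array_using(partition, members):
    """Return the array that contains `partition`, or None if it is free."""
    for array, devices in members.items():
        if partition in devices:
            return array
    return None


# Hypothetical sample in the style of /proc/mdstat.
SAMPLE = """Personalities : [raid1]
md0 : active raid1 hdc1[1] hda1[0]
      104320 blocks [2/2] [UU]

md1 : active raid1 hdc3[1] hda3[0]
      8193024 blocks [2/2] [UU]

unused devices: <none>
"""

print(active_md_members(SAMPLE))
```

On a real system the text would come from `open("/proc/mdstat").read()`. A partition for which `array_using` returns an array name belongs to a running array, which is the case the HOWTO says to stop first.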
Re: postfix causing very high load average
There is no problem in breaking RAID (mirror or otherwise); you do not lose information.

So: back up, break the RAID, repartition, rebuild the RAID and restore.

I did it, and it's very simple.

----- Original Message -----
From: Sagi Bashari [EMAIL PROTECTED]
To: Ira Abramov [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Thursday, November 14, 2002 5:56 PM
Subject: Re: postfix causing very high load average

> My question is just whether it is safe to repartition free space on a hard disk that is currently in use. Quoting http://iglu.org.il/LDP/HOWTO/Software-RAID-HOWTO-4.html:
>
>     Never NEVER *never* re-partition disks that are part of a running RAID. If you must alter the partition table on a disk which is a part of a RAID, stop the array first, then repartition.
>
> It just sounds strange, if Linux treats the RAID members as normal drives.
>
> Sagi