Re: Heavy I/O blocks FreeBSD box for several seconds
Hi,

On Thu, Jul 7, 2011 at 7:45 PM, Adrian Chadd adr...@freebsd.org wrote:
> (OT, yes, but I'd like to take a stab at explaining why these things fall by the wayside..)
>
> On 7 July 2011 12:08, Arnaud Lacombe lacom...@gmail.com wrote:
>> What would be the point to even start looking at an issue? You guys (by you, I mean official committers on public list) don't care
>
> When someone who has an active interest takes ownership of the problem.
>
>> about people providing patches, be it for trivial, obvious fixes. I'm not even talking about complex patches ... When you eventually end up providing a patch, you end up having the door slammed on you by maintainers asserting their code is perfect, until logic and user complaints prove them wrong. That said, this comment is off-topic, but I will certainly re-state this next month when I'll be ping'ing trivial patches.
>
> The problem is that someone doesn't own the problem. If I commit someone's fix to the tree without really understanding what's going on, I take ownership of that change and any issues/breakages/changes that it creates.
>
> The people responsible for these areas are likely very busy with other things. It's not that they don't want to help! It's much more likely that they don't have the time.
>
> Trivial patches aren't always so trivial. You can change the behaviour of something subtle which works great for you and not for others. This is very likely what's going on with IO/CPU scheduling. It's a tricky area. A simple fix isn't always so simple.
>
> So if there's a diagnosed problem, with reproducible test cases and some patches which fix it, I suggest doing something like the following:
>
> * create a webpage, even if it's a wiki somewhere (even wiki.freebsd.org if you ask someone nicely)
> * dump all the information you can in there. Having stuff in emails is great - but it's only really helpful for tracking the 'flow' of a discussion. Having a summarised analysis of all of that on a webpage is much more helpful.
> * Add the patches there.
> * Encourage people who aren't in your immediate community to try them too - to try and find if your changes mess up other configurations somehow.
> * Be persistent trying to get your changes in. If you've done the background research, done some wide-spread testing and shown you've not caused any obvious regressions, you're much more likely to get your changes in.

For the record, I would like to see enforced public review for _every_ patch *before* it is checked in, as a strong rule. The gcc system is particularly interesting. But it is not likely to happen in FreeBSD, where committers are clearly more free than others at checking in un-publicly-reviewed stuff (especially _bad_ stuff). This would of course apply even to long-time committers, no matter how it hurts their ego (which I definitely do not care about).

 - Arnaud

> With all of that done, you can likely find a committer who will help you get your fixes into the tree. Please just try not to interpret a lack of response as a lack of interest. There's only so much time in the day and committers tend to be a busy bunch, with day jobs that may in no way reflect their FreeBSD interests.
>
> Finally, if people do enough of the above and begin to take ownership of parts of the tree, you'll find someone will likely sponsor you for a commit bit.
>
> HTH,
>
> Adrian

___ freebsd-questions@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org
Re: Heavy I/O blocks FreeBSD box for several seconds
On Mon, Jul 11, 2011 at 04:33:44PM -0400, Arnaud Lacombe wrote:
> For the record, I would like to see enforced public review for _every_ patch *before* it is checked in, as a strong rule. The gcc system is particularly interesting. But it is not likely to happen in FreeBSD, where committers are clearly more free than others at checking in un-publicly-reviewed stuff (especially _bad_ stuff). This would of course apply even to long-time committers, no matter how it hurts their ego (which I definitely do not care about).

As a long time GCC committer, I think that you have grossly over-simplified the GCC review process and how a submitted patch is approved for committing.

-- Steve
Re: Heavy I/O blocks FreeBSD box for several seconds
Hi,

On Mon, Jul 11, 2011 at 4:40 PM, Steve Kargl s...@troutmask.apl.washington.edu wrote:
> On Mon, Jul 11, 2011 at 04:33:44PM -0400, Arnaud Lacombe wrote:
>> For the record, I would like to see enforced public review for _every_ patch *before* it is checked in, as a strong rule. The gcc system is particularly interesting. But it is not likely to happen in FreeBSD, where committers are clearly more free than others at checking in un-publicly-reviewed stuff (especially _bad_ stuff). This would of course apply even to long-time committers, no matter how it hurts their ego (which I definitely do not care about).
>
> As a long time GCC committer, I think that you have grossly over-simplified the GCC review process and how a submitted patch is approved for committing.

Yes.

 - Arnaud
Re: Heavy I/O blocks FreeBSD box for several seconds
Hi,

[re-sent publicly, I did not reply-to-all :)]

On Mon, Jul 11, 2011 at 5:38 PM, Arnaud Lacombe lacom...@gmail.com wrote:
> Hi,
>
> On Mon, Jul 11, 2011 at 4:40 PM, Steve Kargl s...@troutmask.apl.washington.edu wrote:
>> On Mon, Jul 11, 2011 at 04:33:44PM -0400, Arnaud Lacombe wrote:
>>> For the record, I would like to see enforced public review for _every_ patch *before* it is checked in, as a strong rule. The gcc system is particularly interesting. But it is not likely to happen in FreeBSD, where committers are clearly more free than others at checking in un-publicly-reviewed stuff (especially _bad_ stuff). This would of course apply even to long-time committers, no matter how it hurts their ego (which I definitely do not care about).
>>
>> As a long time GCC committer, I think that you have grossly over-simplified the GCC review process and how a submitted patch is approved for committing.
>
> Yes.

Just to provide more information than these sterile mails, here are the gcc contribution guidelines: http://gcc.gnu.org/contribute.html

 - Arnaud
Re: Heavy I/O blocks FreeBSD box for several seconds
On Mon, Jul 11, 2011 at 05:50:44PM -0400, Arnaud Lacombe wrote:
> On Mon, Jul 11, 2011 at 5:38 PM, Arnaud Lacombe lacom...@gmail.com wrote:
>> Hi,
>>
>> On Mon, Jul 11, 2011 at 4:40 PM, Steve Kargl s...@troutmask.apl.washington.edu wrote:
>>> On Mon, Jul 11, 2011 at 04:33:44PM -0400, Arnaud Lacombe wrote:
>>>> For the record, I would like to see enforced public review for _every_ patch *before* it is checked in, as a strong rule. The gcc system is particularly interesting. But it is not likely to happen in FreeBSD, where committers are clearly more free than others at checking in un-publicly-reviewed stuff (especially _bad_ stuff). This would of course apply even to long-time committers, no matter how it hurts their ego (which I definitely do not care about).
>>>
>>> As a long time GCC committer, I think that you have grossly over-simplified the GCC review process and how a submitted patch is approved for committing.
>>
>> Yes.
>
> Just to provide more information than these sterile mails, here are the gcc contribution guidelines: http://gcc.gnu.org/contribute.html

Which, if one reads, one finds http://gcc.gnu.org/svnwrite.html#policies

  Localized write permission. This is for people who have primary responsibility for ports, front ends, or other specific aspects of the compiler. These folks are allowed to make changes to areas they maintain and related documentation, web pages, and test cases without approval from anyone else, and approve other people's changes in those areas. They must get approval for changes elsewhere in the compiler.

-- Steve
Re: Heavy I/O blocks FreeBSD box for several seconds
On 07/06/11 23:49, Oliver Pinter wrote: On 7/6/11, Hartmann, O.ohart...@zedat.fu-berlin.de wrote: On 07/06/11 21:36, Steve Kargl wrote: On Wed, Jul 06, 2011 at 03:18:35PM -0400, Arnaud Lacombe wrote: Hi, On Wed, Jul 6, 2011 at 12:28 PM, Steve Kargl s...@troutmask.apl.washington.edu wrote: On Wed, Jul 06, 2011 at 05:29:24PM +0200, O. Hartmann wrote: I use SCHED_ULE on all machines, since it is supposed to be performing better on multicore boxes, but there are lots of suggestions switching back to the old SCHED_4BSD scheduler. If you are using MPI in numerical codes, then you want to use SCHED_4BSD. ?I've posted numerous times about ULE and its very poor performance when using MPI. http://lists.freebsd.org/pipermail/freebsd-hackers/2008-October/026375.html [sarcasm] It is rather funny to see that the post you point out has generated exactly 0 meaningful follow-up then and as you mention later in this thread, the issue still remains today :-) [/sarcasm] Apparently, you are privy to my private email exchanges with jeffr. I'm also not sure why you're being sarcastic here. The issue was and AFAIK still is a problem for anyone using FreeBSD in a HPC cluster. ULE simply performs worse than 4BSD. Well, I know only very little people using FreeBSD within a HPC cluster or even for scientific purposes, except myself and some people around here. ___ freebsd-curr...@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/thread.html#58537 The problem is not only related to desktop boxes, it involves servers with big hardware as well. ___ freebsd-questions@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org
Re: Heavy I/O blocks FreeBSD box for several seconds
Steve Kargl s...@troutmask.apl.washington.edu wrote:
> Let's face it, ULE is not a silver bullet.

Or perhaps it is, but this particular problem is so heavily armored as to demand depleted uranium :)
Re: Heavy I/O blocks FreeBSD box for several seconds
On 06/07/2011 20:11, Nathan Whitehorn wrote:
> I've seen exactly this problem with multi-threaded math libraries, as well. Using parallel GotoBLAS on FreeBSD gives terrible performance because the threads keep migrating between CPUs, causing frequent cache misses.

On both schedulers?
Re: Heavy I/O blocks FreeBSD box for several seconds
(OT, yes, but I'd like to take a stab at explaining why these things fall to the wayside..) On 7 July 2011 12:08, Arnaud Lacombe lacom...@gmail.com wrote: What would be the point to even start looking at an issue? You guys (by you, I mean official committers on public list) don't care When someone who has an active interest takes ownership of the problem. about people providing patches, might it be for trivial, obvious, fixes. I'm not even talking about complex patches ... When you eventually ends up providing a patch, you ends up being slammed a door at by maintainers asserting their code is perfect, until logic and user complaints prove them wrong. That said, this comment is off-topic, but I will certainly re-state this next month when I'll be ping'ing trivial patches. The problem is that someone doesn't own the problem. If I commit someone's fix to the tree without really understanding what's going on, I take ownership of that change and any issues/breakages/changes that it creates. The people responsible for these areas are likely very busy with other things. It's not that they don't want to help! It's much more likely that they don't have the time. Trivial patches aren't always so trivial. You can change the behaviour of something subtle which works great for you and not for others. This is very likely what's going on with IO/CPU scheduling. It's a tricky area. A simple fix isn't always as simple. So if there's a diagnosed problem, with reproducable test cases and some patches which fix it, I suggest doing something like the following: * create a webpage, even if it's a wiki somewhere (even wiki.freebsd.org if you ask someone nicely) * dump all the information you can in there. Having stuff in emails is great - but it's only really helpful for tracking the 'flow' of a discussion. Having a summarised analysis of all of that on a webpage is much more helpful. * Add the patches there. 
* Encourage people who aren't in your immediate community to try them too - to try and find if your changes mess up other configurations somehow. * Be persistent trying to get your changes in. If you've done the background research, done some wide-spread testing and show you've not caused any obvious regressions, you're much more likely to get your changes in. With all of that done, you can likely find a committer who will help you get your fixes into the tree. Please just try not to interpret a lack of response as a lack of interest. There's only so much time in the day and committers tend to be a busy bunch, with day jobs that may in no way reflect their FreeBSD interests. Finally, if people do enough of the above and begin to take ownership of parts of the tree, you'll find someone will likely sponsor you for a commit bit. HTH, Adrian ___ freebsd-questions@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org
Re: Heavy I/O blocks FreeBSD box for several seconds
2011/7/6 O. Hartmann ohart...@zedat.fu-berlin.de:
> having performance issues

Could you post /etc/sysctl.conf and /boot/loader.conf? Also, the output of uname -a on all machines would be nice. And since you don't use GENERIC, could you also tell us how your setup differs from a GENERIC kernel?

-- chs,
Re: Heavy I/O blocks FreeBSD box for several seconds
2011/7/6 O. Hartmann ohart...@zedat.fu-berlin.de
> When performing an update on the ports tree via portsnap fetch update or when checking out (or) large Subversion repositories or when copying large data files (~50 to 250 GB in size, results from numerical modelings) or when compiling world, FreeBSD 9.0 and FreeBSD 8.2-STABLE tend to freeze for several seconds or drop overall performance dramatically for seconds. On boxes with only console or terminal access (no GUI) a running 'vi' gets stuck for seconds while one of the processes producing heavy I/O is running, or the output of a 'cat' of a large file stops for several seconds. Using X11, this phenomenon gets even worse and the 'freezing' sometimes persists for more than 10 or 15 seconds.

I've also had (and am still having) this problem on FreeBSD 7.2-RELEASE and 8-STABLE with both UFS and ZFS. Though, I've been running FreeBSD not on powerful servers, but on laptops (2-core CPUs, 2 GB of RAM). But still, KDE4 on Linux performs much better during high disk I/O.
Re: Heavy I/O blocks FreeBSD box for several seconds
On Wed Jul 6 11, arrowdodger wrote:
> 2011/7/6 O. Hartmann ohart...@zedat.fu-berlin.de
>> When performing an update on the ports tree via portsnap fetch update or when checking out (or) large Subversion repositories or when copying large data files (~50 to 250 GB in size, results from numerical modelings) or when compiling world, FreeBSD 9.0 and FreeBSD 8.2-STABLE tend to freeze for several seconds or drop overall performance dramatically for seconds. On boxes with only console or terminal access (no GUI) a running 'vi' gets stuck for seconds while one of the processes producing heavy I/O is running, or the output of a 'cat' of a large file stops for several seconds.

this might be a scheduling issue. iirc i/o intensive tasks have higher priority than cpu intensive tasks, because they are expected to only issue an i/o request and then free the processor, while cpu intensive tasks occupy the cpu a lot longer. so maybe a process with cyclic i/o requests blocks processes which aren't doing i/o. maybe playing with ULE's options can improve the situation.

since you're running GENERIC, preemption *should* be enabled. however you should double check. i once tried running ULE without preemption and experienced exactly the same situation you described in your mail. for ULE, preemption is pretty much mandatory. for the old 4bsd scheduler, running without preemption doesn't really make that much of a difference compared to running with preemption.

you might also want to try enabling options IPI_PREEMPTION. no idea if this improves your situation, though.

cheers.
alex

>> Using X11, this phenomenon gets even worse and the 'freezing' sometimes persists for more than 10 or 15 seconds.
>
> I've also had (and am still having) this problem on FreeBSD 7.2-RELEASE and 8-STABLE with both UFS and ZFS. Though, I've been running FreeBSD not on powerful servers, but on laptops (2-core CPUs, 2 GB of RAM). But still, KDE4 on Linux performs much better during high disk I/O.
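A minimal sketch of the "double check" suggested above, assuming you have the text of your custom kernel configuration file at hand. It only scans `options` lines for SCHED_ULE, SCHED_4BSD, PREEMPTION and IPI_PREEMPTION; the function names and the sample text are made up for illustration, and it does not follow `include` directives, so a config that pulls in GENERIC needs that file checked as well.

```python
import re


def kernel_options(config_text):
    """Return the set of names found on 'options' lines of a kernel config."""
    found = set()
    for line in config_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments, incl. commented-out options
        match = re.match(r"options\s+(\S+)", line)
        if match:
            # 'options FOO=bar' -> keep only the option name
            found.add(match.group(1).split("=", 1)[0])
    return found


def preemption_report(config_text):
    """Summarise the scheduler and preemption options discussed in the thread."""
    opts = kernel_options(config_text)
    if "SCHED_ULE" in opts:
        scheduler = "ULE"
    elif "SCHED_4BSD" in opts:
        scheduler = "4BSD"
    else:
        scheduler = "unknown"
    return {
        "scheduler": scheduler,
        "PREEMPTION": "PREEMPTION" in opts,
        "IPI_PREEMPTION": "IPI_PREEMPTION" in opts,
    }


# Hypothetical excerpt of a custom kernel config, for illustration only.
SAMPLE = """
options 	SCHED_ULE		# ULE scheduler
options 	PREEMPTION		# Enable kernel thread preemption
#options 	IPI_PREEMPTION
"""

if __name__ == "__main__":
    print(preemption_report(SAMPLE))
```

A report showing ULE with PREEMPTION false would match the situation alex describes running into.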
Re: Heavy I/O blocks FreeBSD box for several seconds
On 07/06/11 12:37, arrowdodger wrote:
> 2011/7/6 O. Hartmann ohart...@zedat.fu-berlin.de
>> When performing an update on the ports tree via portsnap fetch update or when checking out (or) large Subversion repositories or when copying large data files (~50 to 250 GB in size, results from numerical modelings) or when compiling world, FreeBSD 9.0 and FreeBSD 8.2-STABLE tend to freeze for several seconds or drop overall performance dramatically for seconds. On boxes with only console or terminal access (no GUI) a running 'vi' gets stuck for seconds while one of the processes producing heavy I/O is running, or the output of a 'cat' of a large file stops for several seconds. Using X11, this phenomenon gets even worse and the 'freezing' sometimes persists for more than 10 or 15 seconds.
>
> I've also had (and am still having) this problem on FreeBSD 7.2-RELEASE and 8-STABLE with both UFS and ZFS. Though, I've been running FreeBSD not on powerful servers, but on laptops (2-core CPUs, 2 GB of RAM). But still, KDE4 on Linux performs much better during high disk I/O.

I read about issues with the old X11 codebase used in FreeBSD's ports, which could be the cause of some performance problems, but I wouldn't expect those I/O-triggered stalls on boxes without any GUI.

I have very often seen Linux perform tremendously better when used as a workstation or desktop, but this is often gained at the cost of other subsystems. I followed a very hard-to-understand discussion about grouping threads related to ttys, which seem to get prioritized in Linux to make the GUI more fluid, but this definitely comes at the cost of other subsystems, which in consequence get lower priority. But even without a GUI, Linux seems to perform I/O much better on multicore/multiprocessor boxes than FreeBSD (8.X and 9.X).

Today I looked at some benchmarks performed by Phoronix/openbenchmark.org (http://www.phoronix.com/scan.php?page=article&item=freebsd8_ubuntu910&num=9) and it seems that threaded I/O is an issue in FreeBSD (compared to Linux). I have no clue how to tune those bottlenecks away in FBSD.

I use SCHED_ULE on all machines, since it is supposed to perform better on multicore boxes, but there are lots of suggestions to switch back to the old SCHED_4BSD scheduler.

Oliver
Re: Heavy I/O blocks FreeBSD box for several seconds
On Wed, Jul 06, 2011 at 05:29:24PM +0200, O. Hartmann wrote:
> I use SCHED_ULE on all machines, since it is supposed to perform better on multicore boxes, but there are lots of suggestions to switch back to the old SCHED_4BSD scheduler.

If you are using MPI in numerical codes, then you want to use SCHED_4BSD. I've posted numerous times about ULE and its very poor performance when using MPI. http://lists.freebsd.org/pipermail/freebsd-hackers/2008-October/026375.html

-- Steve
Re: Heavy I/O blocks FreeBSD box for several seconds
On 07/06/11 18:28, Steve Kargl wrote:
> On Wed, Jul 06, 2011 at 05:29:24PM +0200, O. Hartmann wrote:
>> I use SCHED_ULE on all machines, since it is supposed to perform better on multicore boxes, but there are lots of suggestions to switch back to the old SCHED_4BSD scheduler.
>
> If you are using MPI in numerical codes, then you want to use SCHED_4BSD. I've posted numerous times about ULE and its very poor performance when using MPI. http://lists.freebsd.org/pipermail/freebsd-hackers/2008-October/026375.html

Worth a try, but most of the code I use is OpenMP, not MPI. The post is from 2008; that's three years ago, and 9.0 is on the brink of being released ...
Re: Heavy I/O blocks FreeBSD box for several seconds
Has anyone re-run those IO benchmarks? Something smells fishy there.. (with the benchmarking.) adrian 2011/7/6 O. Hartmann ohart...@zedat.fu-berlin.de: On 07/06/11 12:37, arrowdodger wrote: 2011/7/6 O. Hartmannohart...@zedat.fu-berlin.de When performing an update on the ports tree via portsnap fetch update or when checking out (or) large Subversion repositories or when copying large data files (~ 50 to 250 GB in size, results from numerical modelings) or when compiling world, FreeBD 9.0 and FreeBSD 8.2-STABLE tend to freeze for several seconds or drop overall performance dramatically for seconds. On boxes with only console- or terminal access (no GUI) a running 'vi' gets stuck for seconds while one of the processes producing heavy I/O is running, or the output of a 'cat' of a large file stops for several seconds. Using X11, this phenomenon gets even worse and the 'freezing' tends to persist sometimes for more than 10 or 15 seconds. I've also had (and still having) this problem on FreeBSD 7.2-RELEASE and 8-STABLE with both UFS and ZFS. Though, i've been running FreeBSD not on powerful servers, but on laptops (2-core CPU's, 2 GB of RAM). But still, KDE4 on Linux performs much better during high disk IO. I read about issues with the old codebase of X11 in FreeBSD's ports used, which could be the cause of some performance problems, but I wouldn't expect those I/O-triggered blockings on boxes without any GUI. I saw Linux very often performing tremendously better when used as a workstation or desktop, but this is often gained on the costs of other subsystems. I followed a very hard-to-understand discussion about grouping threads related to ttys which seems to get higher priorized in Linux to make the GUI more fluent, but this is definitely on cost of other subsystems, which in consequence gets less priorized. But even without GUI, Linux seems to perform I/O much better on multicore-/multiprocessor boxes than FreeBSD *.X and 9.X). 
Today I looked at some benchmarks performed by Phoronix/openbenchmark.org (http://www.phoronix.com/scan.php?page=articleitem=freebsd8_ubuntu910num=9) and it seems that threaded I/O is an issue in FreeBSD (compared to Linux). I have no glue how to tune those bottlenecks away in FBSD. I use SCHED_ULE on all machines, since it is supposed to be performing better on multicore boxes, but there are lots of suggestions switching back to the old SCHED_4BSD scheduler. Oliver ___ freebsd-curr...@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org ___ freebsd-questions@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org
Re: Heavy I/O blocks FreeBSD box for several seconds
On Wed, Jul 06, 2011 at 06:38:23PM +0200, Hartmann, O. wrote:
> On 07/06/11 18:28, Steve Kargl wrote:
>> On Wed, Jul 06, 2011 at 05:29:24PM +0200, O. Hartmann wrote:
>>> I use SCHED_ULE on all machines, since it is supposed to perform better on multicore boxes, but there are lots of suggestions to switch back to the old SCHED_4BSD scheduler.
>>
>> If you are using MPI in numerical codes, then you want to use SCHED_4BSD. I've posted numerous times about ULE and its very poor performance when using MPI. http://lists.freebsd.org/pipermail/freebsd-hackers/2008-October/026375.html
>
> Worth a try, but most of the code I use is OpenMP, not MPI.

It may impact OpenMP. I don't have any OpenMP code to test. But if OpenMP is spawning as many or more threads than the number of available processors/cores, then I think you will have problems.

> The post is from 2008; that's three years ago, and 9.0 is on the brink of being released ...

I have periodically run the same type of test as in the 2008 post over the last three years. Nothing has changed. I even set up an account on one node in my cluster for jeffr to use. He was too busy to investigate at that time.

-- Steve
Re: Heavy I/O blocks FreeBSD box for several seconds
In message 20110706170132.ga68...@troutmask.apl.washington.edu, Steve Kargl writes:
> I have periodically run the same type of test as in the 2008 post over the last three years. Nothing has changed. I even set up an account on one node in my cluster for jeffr to use. He was too busy to investigate at that time.

Isn't this just the lemming-syncer hurling every dirty block over the cliff at the same time?

To find out: run gstat and keep an eye on the leftmost column.

The road map for fixing that has been known for years...

--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
p...@freebsd.org | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
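A rough sketch of how the gstat check above could be scripted, assuming text captured from gstat's batch mode (`gstat -b`) in which each data row starts with the queue length and ends with the device name. The threshold, the function names and the sample numbers are invented for illustration and are not measurements from any machine in this thread.

```python
def queue_lengths(gstat_text):
    """Map device name -> queue length (leftmost column) from gstat-style text."""
    result = {}
    for line in gstat_text.splitlines():
        fields = line.split()
        # Data rows start with an integer queue length and end with the device name;
        # the timing line and the column header fail the isdigit() check.
        if len(fields) >= 2 and fields[0].isdigit():
            result[fields[-1]] = int(fields[0])
    return result


def backed_up(gstat_text, threshold=10):
    """Return devices whose queue length is at or above an arbitrary threshold."""
    lengths = queue_lengths(gstat_text)
    return sorted(name for name, depth in lengths.items() if depth >= threshold)


# Made-up sample in gstat's column layout, for illustration only.
SAMPLE = """\
dT: 1.001s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    0      3      3     48    0.4      0      0    0.0    0.1| ada0
   87    412      2     32    5.1    410  52480  198.3  100.0| ada1
"""

if __name__ == "__main__":
    print(backed_up(SAMPLE))
```

If the queue length on a disk jumps at the same moments the interactive stalls occur, that would be consistent with a burst of dirty blocks being flushed at once rather than with a CPU scheduling problem.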
Re: Heavy I/O blocks FreeBSD box for several seconds
On Wed, Jul 06, 2011 at 05:05:41PM +, Poul-Henning Kamp wrote:
> In message 20110706170132.ga68...@troutmask.apl.washington.edu, Steve Kargl writes:
>> I have periodically run the same type of test as in the 2008 post over the last three years. Nothing has changed. I even set up an account on one node in my cluster for jeffr to use. He was too busy to investigate at that time.
>
> Isn't this just the lemming-syncer hurling every dirty block over the cliff at the same time?

I don't know the answer. Of course, having no experience in process scheduling, I don't understand the question either ;-)

AFAICT, it is a CPU affinity issue. If I launch n+1 MPI images on a system with n CPUs/cores, then 2 (and sometimes 3) images are stuck on one CPU and those 2 (or 3) images ping-pong on that CPU. I recall trying to use renice(8) to force some load balancing, but vaguely remember that it did not help.

> To find out: run gstat and keep an eye on the leftmost column.
>
> The road map for fixing that has been known for years...

I'll keep this in mind the next time I upgrade the cluster. It's currently running a Feb 10th vintage kernel, and is under fairly heavy use at the moment.

-- Steve
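The n+1 images on n CPUs case can be put into numbers with a small idealised model. This is a sketch under stated assumptions, not a measurement: every image is purely CPU-bound, a CPU is split evenly among the images sitting on it, and the "balanced" case assumes a scheduler that rotates images evenly over all CPUs. The function names are invented for illustration.

```python
from fractions import Fraction


def pinned_shares(tasks_per_cpu):
    """CPU share of each task when tasks never migrate.

    tasks_per_cpu[i] is the number of tasks stuck on CPU i; each CPU's time
    is assumed to be split evenly among its tasks.
    """
    shares = []
    for count in tasks_per_cpu:
        if count:  # skip idle CPUs
            shares.extend([Fraction(1, count)] * count)
    return shares


def balanced_share(n_cpus, n_tasks):
    """Share per task if tasks were rotated evenly over all CPUs (capped at 1)."""
    return min(Fraction(1), Fraction(n_cpus, n_tasks))


if __name__ == "__main__":
    n = 8
    # n+1 images on n CPUs: two images stuck on CPU 0, one on each other CPU.
    stuck = pinned_shares([2] + [1] * (n - 1))
    print("slowest image, stuck case:   ", min(stuck))
    print("every image, balanced case:  ", balanced_share(n, n + 1))
```

Under these assumptions the two images sharing a CPU each get 1/2 of it while an evenly balanced run would give every image 8/9. For a job in which all images have to wait for the slowest one, that difference is what shows up as wall-clock time.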
Re: Heavy I/O blocks FreeBSD box for several seconds
On Wed, Jul 06, 2011 at 03:18:35PM -0400, Arnaud Lacombe wrote:
> Hi,
>
> On Wed, Jul 6, 2011 at 12:28 PM, Steve Kargl s...@troutmask.apl.washington.edu wrote:
>> On Wed, Jul 06, 2011 at 05:29:24PM +0200, O. Hartmann wrote:
>>> I use SCHED_ULE on all machines, since it is supposed to perform better on multicore boxes, but there are lots of suggestions to switch back to the old SCHED_4BSD scheduler.
>>
>> If you are using MPI in numerical codes, then you want to use SCHED_4BSD. I've posted numerous times about ULE and its very poor performance when using MPI. http://lists.freebsd.org/pipermail/freebsd-hackers/2008-October/026375.html
>
> [sarcasm] It is rather funny to see that the post you point out has generated exactly 0 meaningful follow-ups since then and, as you mention later in this thread, the issue still remains today :-) [/sarcasm]

Apparently, you are privy to my private email exchanges with jeffr. I'm also not sure why you're being sarcastic here. The issue was and AFAIK still is a problem for anyone using FreeBSD in an HPC cluster. ULE simply performs worse than 4BSD.

-- Steve
Re: Heavy I/O blocks FreeBSD box for several seconds
Hi, On Wed, Jul 6, 2011 at 12:28 PM, Steve Kargl s...@troutmask.apl.washington.edu wrote: On Wed, Jul 06, 2011 at 05:29:24PM +0200, O. Hartmann wrote: I use SCHED_ULE on all machines, since it is supposed to be performing better on multicore boxes, but there are lots of suggestions switching back to the old SCHED_4BSD scheduler. If you are using MPI in numerical codes, then you want to use SCHED_4BSD. I've posted numerous times about ULE and its very poor performance when using MPI. http://lists.freebsd.org/pipermail/freebsd-hackers/2008-October/026375.html [sarcasm] It is rather funny to see that the post you point out has generated exactly 0 meaningful follow-up then and as you mention later in this thread, the issue still remains today :-) [/sarcasm] - Arnaud ___ freebsd-questions@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org
Re: Heavy I/O blocks FreeBSD box for several seconds
On 07/06/11 21:36, Steve Kargl wrote:
> On Wed, Jul 06, 2011 at 03:18:35PM -0400, Arnaud Lacombe wrote:
>> Hi,
>>
>> On Wed, Jul 6, 2011 at 12:28 PM, Steve Kargl s...@troutmask.apl.washington.edu wrote:
>>> On Wed, Jul 06, 2011 at 05:29:24PM +0200, O. Hartmann wrote:
>>>> I use SCHED_ULE on all machines, since it is supposed to perform better on multicore boxes, but there are lots of suggestions to switch back to the old SCHED_4BSD scheduler.
>>>
>>> If you are using MPI in numerical codes, then you want to use SCHED_4BSD. I've posted numerous times about ULE and its very poor performance when using MPI. http://lists.freebsd.org/pipermail/freebsd-hackers/2008-October/026375.html
>>
>> [sarcasm] It is rather funny to see that the post you point out has generated exactly 0 meaningful follow-ups since then and, as you mention later in this thread, the issue still remains today :-) [/sarcasm]
>
> Apparently, you are privy to my private email exchanges with jeffr. I'm also not sure why you're being sarcastic here. The issue was and AFAIK still is a problem for anyone using FreeBSD in an HPC cluster. ULE simply performs worse than 4BSD.

Well, I know only very few people using FreeBSD within an HPC cluster or even for scientific purposes, except myself and some people around here.
Re: Heavy I/O blocks FreeBSD box for several seconds
On 07/06/11 13:00, Steve Kargl wrote:
> On Wed, Jul 06, 2011 at 05:05:41PM +, Poul-Henning Kamp wrote:
>> In message 20110706170132.ga68...@troutmask.apl.washington.edu, Steve Kargl writes:
>>> I periodically ran the same type test in the 2008 post over the last three years. Nothing has changed. I even set up an account on one node in my cluster for jeffr to use. He was too busy to investigate at that time.
>>
>> Isn't this just the lemming-syncer hurling every dirty block over the cliff at the same time ?
>
> I don't know the answer. Of course, having no experience in processing scheduling, I don't understand the question either ;-)
>
> AFAICT, it is a cpu affinity issue. If I launch n+1 MPI images on a system with n cpus/cores, then 2 (and sometimes 3) images are stuck on a cpu and those 2 (or 3) images ping-pong on that cpu. I recall trying to use renice(8) to force some load balancing, but vaguely remember that it did not help.

I've seen exactly this problem with multi-threaded math libraries, as well. Using parallel GotoBLAS on FreeBSD gives terrible performance because the threads keep migrating between CPUs, causing frequent cache misses.

-Nathan
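The n+1-images-on-n-cores case described above can be put into rough numbers with a toy model. This is only a sketch under stated assumptions: all images are compute-bound and do the same amount of work, an image never leaves the CPU it was first placed on, and migration and cache costs are ignored. The function names are hypothetical, and nothing here models the actual ULE or 4BSD code.

```python
from fractions import Fraction


def stuck_makespan(images: int, cpus: int, work: int = 1) -> Fraction:
    """Wall time when images never migrate.

    Images are placed round-robin and then stay put, so the CPU that
    holds the most images determines when the whole job finishes.
    """
    # Number of images sharing the busiest CPU (ceiling division).
    busiest = -(-images // cpus)
    return Fraction(busiest * work)


def balanced_makespan(images: int, cpus: int, work: int = 1) -> Fraction:
    """Wall time under idealized fair sharing.

    Total work is spread evenly over all CPUs; migration and cache
    costs are ignored, so this is a lower bound, not a prediction.
    """
    if images <= cpus:
        return Fraction(work)
    return Fraction(images * work, cpus)


if __name__ == "__main__":
    n = 8
    print("stuck:   ", stuck_makespan(n + 1, n))
    print("balanced:", balanced_makespan(n + 1, n))
```

With 9 images on 8 cores the model gives a wall time of 2 units when two images stay stuck on one core, against 9/8 units if the load were shared evenly. If the images also wait on each other, as MPI codes often do, the two slow images would hold back the rest as well.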
Re: Heavy I/O blocks FreeBSD box for several seconds
On 7/6/11, Hartmann, O. ohart...@zedat.fu-berlin.de wrote:
> On 07/06/11 21:36, Steve Kargl wrote:
>> On Wed, Jul 06, 2011 at 03:18:35PM -0400, Arnaud Lacombe wrote:
>>> Hi,
>>>
>>> On Wed, Jul 6, 2011 at 12:28 PM, Steve Kargl s...@troutmask.apl.washington.edu wrote:
>>>> On Wed, Jul 06, 2011 at 05:29:24PM +0200, O. Hartmann wrote:
>>>>> I use SCHED_ULE on all machines, since it is supposed to be performing better on multicore boxes, but there are lots of suggestions switching back to the old SCHED_4BSD scheduler.
>>>>
>>>> If you are using MPI in numerical codes, then you want to use SCHED_4BSD. I've posted numerous times about ULE and its very poor performance when using MPI.
>>>>
>>>> http://lists.freebsd.org/pipermail/freebsd-hackers/2008-October/026375.html
>>>
>>> [sarcasm] It is rather funny to see that the post you point out has generated exactly 0 meaningful follow-up then and as you mention later in this thread, the issue still remains today :-) [/sarcasm]
>>
>> Apparently, you are privy to my private email exchanges with jeffr. I'm also not sure why you're being sarcastic here. The issue was and AFAIK still is a problem for anyone using FreeBSD in a HPC cluster. ULE simply performs worse than 4BSD.
>
> Well, I know only very little people using FreeBSD within a HPC cluster or even for scientific purposes, except myself and some people around here.

http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/thread.html#58537
Re: Heavy I/O blocks FreeBSD box for several seconds
Offer a bounty for getting it fixed?

thanks,

Adrian

On 7 July 2011 05:00, Hartmann, O. ohart...@zedat.fu-berlin.de wrote:
> On 07/06/11 21:36, Steve Kargl wrote:
>> On Wed, Jul 06, 2011 at 03:18:35PM -0400, Arnaud Lacombe wrote:
>>> Hi,
>>>
>>> On Wed, Jul 6, 2011 at 12:28 PM, Steve Kargl s...@troutmask.apl.washington.edu wrote:
>>>> On Wed, Jul 06, 2011 at 05:29:24PM +0200, O. Hartmann wrote:
>>>>> I use SCHED_ULE on all machines, since it is supposed to be performing better on multicore boxes, but there are lots of suggestions switching back to the old SCHED_4BSD scheduler.
>>>>
>>>> If you are using MPI in numerical codes, then you want to use SCHED_4BSD. I've posted numerous times about ULE and its very poor performance when using MPI.
>>>>
>>>> http://lists.freebsd.org/pipermail/freebsd-hackers/2008-October/026375.html
>>>
>>> [sarcasm] It is rather funny to see that the post you point out has generated exactly 0 meaningful follow-up then and as you mention later in this thread, the issue still remains today :-) [/sarcasm]
>>
>> Apparently, you are privy to my private email exchanges with jeffr. I'm also not sure why you're being sarcastic here. The issue was and AFAIK still is a problem for anyone using FreeBSD in a HPC cluster. ULE simply performs worse than 4BSD.
>
> Well, I know only very little people using FreeBSD within a HPC cluster or even for scientific purposes, except myself and some people around here.
Re: Heavy I/O blocks FreeBSD box for several seconds
On Thu, Jul 07, 2011 at 09:17:51AM +0800, Adrian Chadd wrote:
> Offer a bounty for getting it fixed?

steve == ENOMONEY
jeffr == ENOTIME

And, 4BSD works.

--
Steve
Re: Heavy I/O blocks FreeBSD box for several seconds
On 7 July 2011 09:51, Steve Kargl s...@troutmask.apl.washington.edu wrote:
> On Thu, Jul 07, 2011 at 09:17:51AM +0800, Adrian Chadd wrote:
>> Offer a bounty for getting it fixed?
>
> steve == ENOMONEY
> jeffr == ENOTIME
>
> And, 4BSD works.

I meant it as a more general observation. If something doesn't work as needed, consider either diving in to fix it, or offering a bounty to someone to do so.

It sounds like these scheduler issues (IO and threads) are well-known and reasonably well-understood. All that's lacking is the last bit of the puzzle - the actual developer to develop it. :)

Adrian
Re: Heavy I/O blocks FreeBSD box for several seconds
On Thu, Jul 07, 2011 at 10:39:00AM +0800, Adrian Chadd wrote:
> On 7 July 2011 09:51, Steve Kargl s...@troutmask.apl.washington.edu wrote:
>> On Thu, Jul 07, 2011 at 09:17:51AM +0800, Adrian Chadd wrote:
>>> Offer a bounty for getting it fixed?
>>
>> steve == ENOMONEY
>> jeffr == ENOTIME
>>
>> And, 4BSD works.
>
> I meant it as a more general observation. If something doesn't work as needed, consider either diving in to fix it, or offering a bounty to someone to do so.

Or take the path of least resistance, use 4BSD, and get my actual work done.

I diagnosed the problem. I gave a fairly easy method for reproducing the problem (including providing a statically linked MPI program, and a script and data files to launch it). I offered access to one of the nodes in my cluster (including root access to install new kernels and to reboot the node). Unfortunately, I have neither the brain capacity and time nor the money to fix the issue.

To solve OP's problem in the short term, the simplest solution may be to switch to 4BSD. Let's face it, ULE is not a silver bullet.

--
Steve
Re: Heavy I/O blocks FreeBSD box for several seconds
Hi,

On Wed, Jul 6, 2011 at 10:39 PM, Adrian Chadd adr...@freebsd.org wrote:
> On 7 July 2011 09:51, Steve Kargl s...@troutmask.apl.washington.edu wrote:
>> On Thu, Jul 07, 2011 at 09:17:51AM +0800, Adrian Chadd wrote:
>>> Offer a bounty for getting it fixed?
>>
>> steve == ENOMONEY
>> jeffr == ENOTIME
>>
>> And, 4BSD works.
>
> I meant it as a more general observation. If something doesn't work as needed, consider either diving in to fix it, or offering a bounty to someone to do so.

What would be the point of even starting to look at an issue? You guys (by you, I mean official committers on public lists) don't care about people providing patches, be it for trivial, obvious fixes. I'm not even talking about complex patches ... When you eventually end up providing a patch, you end up having the door slammed on you by maintainers asserting their code is perfect, until logic and user complaints prove them wrong.

That said, this comment is off-topic, but I will certainly re-state this next month when I'll be ping'ing trivial patches.

- Arnaud

> It sounds like these scheduler issues (IO and threads) are well-known and reasonably well-understood. All that's lacking is the last bit of the puzzle - the actual developer to develop it. :)
>
> Adrian