Re: SCHED_ULE should not be the default
On 12/12/2011 05:47, O. Hartmann wrote:

Do we have any proof at hand for such cases where SCHED_ULE performs much better than SCHED_4BSD?

I complained about poor interactive performance of ULE in a desktop environment for years. I had numerous people try to help, including Jeff, with various tunables, dtrace'ing, etc. The cause of the problem was never found. I switched to 4BSD, problem gone. This is on 2 separate systems with core 2 duos.

hth,
Doug

If the ULE algorithm itself has no problems, then the problem lies either with the Core2Duo or in some piece of code that uses the ULE scheduler. I already wrote on a mailing list that, specifically in my case (Core2Duo), the following patch partially helps:

--- sched_ule.c.orig	2011-11-24 18:11:48.0 +0200
+++ sched_ule.c	2011-12-10 22:47:08.0 +0200
@@ -794,7 +794,8 @@
 	 * 1.5 * balance_interval.
 	 */
 	balance_ticks = max(balance_interval / 2, 1);
-	balance_ticks += random() % balance_interval;
+//	balance_ticks += random() % balance_interval;
+	balance_ticks += ((int)random()) % balance_interval;
 	if (smp_started == 0 || rebalance == 0)
 		return;
 	tdq = TDQ_SELF();
@@ -2118,13 +2119,21 @@
 	struct td_sched *ts;
 
 	THREAD_LOCK_ASSERT(td, MA_OWNED);
+	if (td->td_pri_class & PRI_FIFO_BIT)
+		return;
+	ts = td->td_sched;
+	/*
+	 * We used up one time slice.
+	 */
+	if (--ts->ts_slice > 0)
+		return;
 	tdq = TDQ_SELF();
 #ifdef SMP
 	/*
 	 * We run the long term load balancer infrequently on the first cpu.
 	 */
-	if (balance_tdq == tdq) {
-		if (balance_ticks && --balance_ticks == 0)
+	if (balance_ticks && --balance_ticks == 0) {
+		if (balance_tdq == tdq)
 			sched_balance();
 	}
 #endif
@@ -2144,9 +2153,6 @@
 		if (TAILQ_EMPTY(&tdq->tdq_timeshare.rq_queues[tdq->tdq_ridx]))
 			tdq->tdq_ridx = tdq->tdq_idx;
 	}
-	ts = td->td_sched;
-	if (td->td_pri_class & PRI_FIFO_BIT)
-		return;
 	if (PRI_BASE(td->td_pri_class) == PRI_TIMESHARE) {
 		/*
 		 * We used a tick; charge it to the thread so
@@ -2157,11 +2163,6 @@
 		sched_priority(td);
 	}
 	/*
-	 * We used up one time slice.
-	 */
-	if (--ts->ts_slice > 0)
-		return;
-	/*
 	 * We're out of time, force a requeue at userret().
 	 */
 	ts->ts_slice = sched_slice;

together with not using options FULL_PREEMPTION.

But no one has replied to my message to say whether my patch helps on Core2Duo or not... I suspect that the problems stem from the sections of code associated with SMP... Maybe I'm wrong about something, but I want to help solve this problem...

___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
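For readers following the first hunk of the patch: the comment it touches says the next run of the balancer is scheduled a random number of ticks between 0.5 and 1.5 times balance_interval away. The arithmetic can be sketched in Python as below. This is only an illustration under stated assumptions: the function name `next_balance_ticks` is made up, `random.randrange` stands in for the kernel's random(), and the interval value used in the demo is hypothetical.

```python
import random


def next_balance_ticks(balance_interval, rng=random):
    """Illustrative stand-in for the two balance_ticks lines in the patch."""
    # balance_ticks = max(balance_interval / 2, 1);  (C integer division)
    ticks = max(balance_interval // 2, 1)
    # balance_ticks += random() % balance_interval;
    # randrange(n) yields 0..n-1, the same range as a non-negative value % n.
    ticks += rng.randrange(balance_interval)
    return ticks


if __name__ == "__main__":
    interval = 128  # hypothetical value, chosen only for the demo
    samples = [next_balance_ticks(interval) for _ in range(10000)]
    # Every sample lies in [interval/2, interval/2 + interval - 1].
    print(min(samples), max(samples))
```

The sketch only shows the range of delays the two lines produce; it says nothing about whether the (int) cast in the patch changes behaviour in the kernel.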
Re: SCHED_ULE should not be the default
On Tue, Dec 13, 2011 at 10:40:48AM +0200, Ivan Klymenko wrote:

On 12/12/2011 05:47, O. Hartmann wrote:

Do we have any proof at hand for such cases where SCHED_ULE performs much better than SCHED_4BSD?

I complained about poor interactive performance of ULE in a desktop environment for years. I had numerous people try to help, including Jeff, with various tunables, dtrace'ing, etc. The cause of the problem was never found. I switched to 4BSD, problem gone. This is on 2 separate systems with core 2 duos.

hth,
Doug

If the ULE algorithm itself has no problems, then the problem lies either with the Core2Duo or in some piece of code that uses the ULE scheduler.

I observe ULE interactivity slowness even on a single-core machine (Pentium 4) in very visible places, like 'ps ax' output stalling in the middle for ~1 second. When I switch back to SCHED_4BSD, all slowness is gone.

-- 
http://ache.vniz.net/
Re: SCHED_ULE should not be the default
On 13 December 2011 01:00, Andrey Chernov a...@freebsd.org wrote:

If the ULE algorithm itself has no problems, then the problem lies either with the Core2Duo or in some piece of code that uses the ULE scheduler.

I observe ULE interactivity slowness even on a single-core machine (Pentium 4) in very visible places, like 'ps ax' output stalling in the middle for ~1 second. When I switch back to SCHED_4BSD, all slowness is gone.

Are you able to provide KTR traces of the scheduler results? Something that can be fed to schedgraph?

Adrian
Re: SCHED_ULE should not be the default
On Mon, Dec 12, 2011 at 02:47:57PM +0100, O. Hartmann wrote:

Not fully right, boinc defaults to run on idprio 31 so this isn't an issue. And yes, there are cases where SCHED_ULE shows much better performance than SCHED_4BSD. [...]

Do we have any proof at hand for such cases where SCHED_ULE performs much better than SCHED_4BSD? Whenever the subject comes up, it is mentioned that SCHED_ULE has better performance on boxes with ncpu > 2. But in the end I see contradictory statements here. People complain about poor performance (especially in scientific environments), and others counter that this is not the case.

Within our department, we developed a highly scalable code for planetary science purposes on imagery. It utilizes GPUs via OpenCL if present. Otherwise it grabs as many cores as it can. By the end of this year I'll get a new desktop box based on Intel's new Sandy Bridge-E architecture with plenty of memory. If the colleague who developed the code is willing to perform some benchmarks on the same hardware platform, we'll benchmark both FreeBSD 9.0/10.0 and the most recent Suse. For FreeBSD I also intend to look at performance with both available schedulers.

This is in no way shape or form the same kind of benchmark as what you're planning to do, but I thought I'd throw it out there for folks to take in as they see fit.

I know folks were focused mainly on buildworld.

I personally would find it interesting if someone with a higher-end system (e.g. 2 physical CPUs, with 6 or 8 cores per CPU) was to do the same test (changing -jX to -j{numofcores} of course).

-- 
| Jeremy Chadwick                jdc at parodius.com |
| Parodius Networking       http://www.parodius.com/ |
| UNIX Systems Administrator   Mountain View, CA, US |
| Making life hard for others since 1977.   PGP 4BD6C0CB |

sched_ule
=========
- time make -j2 buildworld
  1689.831u 229.328s 18:46.20 170.4% 6566+2051k 432+4264io 4565pf+0w
- time make -j2 buildkernel
  640.542u 87.737s 9:01.38 134.5% 6490+1920k 134+5968io 0pf+0w

sched_4bsd
==========
- time make -j2 buildworld
  1662.793u 206.908s 17:12.02 181.1% 6578+2054k 23750+4271io 6451pf+0w
- time make -j2 buildkernel
  638.717u 76.146s 8:34.90 138.8% 6530+1927k 6415+5903io 0pf+0w

software
========
* sched_ule test: FreeBSD 8.2-STABLE, Thu Dec 1 04:37:29 PST 2011
* sched_4bsd test: FreeBSD 8.2-STABLE, Mon Dec 12 22:42:54 PST 2011

hardware
========
* Intel Core 2 Duo E8400, 3GHz
* Supermicro X7SBA
* 8GB ECC RAM (4x2GB), DDR2-800
* Intel 320-series SSD, 80GB: /, swap, /var, /tmp, /usr

tuning adjustments / etc.
=========================
* Before each scheduler test, system was rebooted to ensure I/O cache and other whatnots were empty
* All filesystems stock UFS2 + SU (root is non-SU)
* All filesystems had tunefs -t enable applied to them
* powerd(8) in use, with two rc.conf variables (per CPU spec):

  performance_cx_lowest=C2
  economy_cx_lowest=C2

* loader.conf

  kern.maxdsiz=2560M
  kern.dfldsiz=2560M
  kern.maxssiz=256M
  ahci_load=yes
  hint.p4tcc.0.disabled=1
  hint.acpi_throttle.0.disabled=1
  vfs.zfs.arc_max=5120M

* make.conf

  CPUTYPE?=core2

* src.conf

  WITHOUT_INET6=true
  WITHOUT_IPFILTER=true
  WITHOUT_LIB32=true
  WITHOUT_KERBEROS=true
  WITHOUT_PAM_SUPPORT=true
  WITHOUT_PROFILE=true
  WITHOUT_SENDMAIL=true

* kernel configuration
  - note: between kernel builds, config was changed to either use SCHED_4BSD or SCHED_ULE respectively.

cpu             HAMMER
ident           GENERIC

makeoptions     DEBUG=-g                # Build kernel with gdb(1) debug symbols

options         SCHED_4BSD              # Classic BSD scheduler
#options        SCHED_ULE               # ULE scheduler
options         PREEMPTION              # Enable kernel thread preemption
options         INET                    # InterNETworking
options         FFS                     # Berkeley Fast Filesystem
options         SOFTUPDATES             # Enable FFS soft updates support
options         UFS_ACL                 # Support for access control lists
options         UFS_DIRHASH             # Improve performance on big directories
options         UFS_GJOURNAL            # Enable gjournal-based UFS journaling
options         MD_ROOT                 # MD is a potential root device
options         NFSCLIENT               # Network Filesystem Client
options         NFSSERVER               # Network Filesystem Server
options         NFSLOCKD                # Network Lock Manager
options         NFS_ROOT                # NFS usable as /, requires NFSCLIENT
options         MSDOSFS                 # MSDOS Filesystem
options         CD9660                  # ISO 9660 Filesystem
options         PROCFS                  # Process filesystem (requires PSEUDOFS)
options         PSEUDOFS                # Pseudo-filesystem framework
options         GEOM_PART_GPT           # GUID Partition Tables.
options
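To put the buildworld/buildkernel timings from this message side by side, the elapsed (wall-clock) field of each `time` line can be converted to seconds and compared. The Python sketch below does only that arithmetic; the helper names `elapsed_seconds` and `ule_slowdown_percent` are invented for illustration, and the four elapsed values are copied from the results posted in the message. Since each figure is a single run, and the two tests used builds from different dates, the percentages are a rough indication rather than a measurement with error bars.

```python
def elapsed_seconds(field):
    """Convert an elapsed field such as '18:46.20' (or 'h:mm:ss.ss') to seconds."""
    seconds = 0.0
    for part in field.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


# Elapsed fields copied from the posted `time make -j2 ...` results.
RESULTS = {
    "buildworld": {"sched_ule": "18:46.20", "sched_4bsd": "17:12.02"},
    "buildkernel": {"sched_ule": "9:01.38", "sched_4bsd": "8:34.90"},
}


def ule_slowdown_percent(target):
    """How much longer the ULE run took than the 4BSD run, in percent."""
    ule = elapsed_seconds(RESULTS[target]["sched_ule"])
    bsd = elapsed_seconds(RESULTS[target]["sched_4bsd"])
    return (ule - bsd) / bsd * 100.0


if __name__ == "__main__":
    for target in RESULTS:
        pct = ule_slowdown_percent(target)
        print(f"{target}: ULE run took {pct:.1f}% longer than the 4BSD run")
```

On these numbers the ULE run is roughly 9% slower for buildworld and roughly 5% slower for buildkernel.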
Re: SCHED_ULE should not be the default
On 12/12/11 16:13, Vincent Hoffman wrote:

On 12/12/2011 13:47, O. Hartmann wrote:

Not fully right, boinc defaults to run on idprio 31 so this isn't an issue. And yes, there are cases where SCHED_ULE shows much better performance than SCHED_4BSD. [...]

Do we have any proof at hand for such cases where SCHED_ULE performs much better than SCHED_4BSD? Whenever the subject comes up, it is mentioned that SCHED_ULE has better performance on boxes with ncpu > 2. But in the end I see contradictory statements here. People complain about poor performance (especially in scientific environments), and others counter that this is not the case.

It's all a little old now, but some of the stuff in
http://people.freebsd.org/~kris/scaling/
covers improvements that were seen.

http://jeffr-tech.livejournal.com/5705.html
shows a little too; reading through Jeff's blog is worth it as it has some interesting stuff on SCHED_ULE.

I thought there were some more benchmarks floating round but can't find any with a quick google.

Vince

Interesting, there seems to be a much more performant scheduler in 7.0, called SCHED_SMP. I have some faint recollection of that ... where has this beast gone?

Oliver
Re: SCHED_ULE should not be the default
On Tue, Dec 13, 2011 at 12:13:42PM +0100, O. Hartmann wrote:

On 12/12/11 16:13, Vincent Hoffman wrote:

On 12/12/2011 13:47, O. Hartmann wrote:

Not fully right, boinc defaults to run on idprio 31 so this isn't an issue. And yes, there are cases where SCHED_ULE shows much better performance than SCHED_4BSD. [...]

Do we have any proof at hand for such cases where SCHED_ULE performs much better than SCHED_4BSD? Whenever the subject comes up, it is mentioned that SCHED_ULE has better performance on boxes with ncpu > 2. But in the end I see contradictory statements here. People complain about poor performance (especially in scientific environments), and others counter that this is not the case.

It's all a little old now, but some of the stuff in
http://people.freebsd.org/~kris/scaling/
covers improvements that were seen.

http://jeffr-tech.livejournal.com/5705.html
shows a little too; reading through Jeff's blog is worth it as it has some interesting stuff on SCHED_ULE.

I thought there were some more benchmarks floating round but can't find any with a quick google.

Vince

Interesting, there seems to be a much more performant scheduler in 7.0, called SCHED_SMP. I have some faint recollection of that ... where has this beast gone?

Boy I sure hope I remember this right. I strongly urge others to correct me where I'm wrong; thanks in advance!

The classic scheduler, SCHED_4BSD, was implemented back before there was oxygen. sched_4bsd(4) mentions this. No need to discuss it.

Jeff Roberson began working on the first-generation ULE scheduler during the days of FreeBSD 5.x (I believe 5.1), and a paper on it was presented at USENIX circa 2003:

http://www.usenix.org/event/bsdcon03/tech/full_papers/roberson/roberson.pdf

Over the following years, Jeff (and others I assume -- maybe folks like George Neville-Neil and/or Kirk McKusick?) adjusted and tinkered with some of the semantics and models/methods. If I remember right, some of these quirks/fixes were committed. All of this was happening under the scheduler that was then called SCHED_ULE, but it was "ULE 1.0" for lack of better terminology.

This scheduler did not perform well, if I remember right, and Jeff was quite honest about that.

From this point forward, Jeff began idealising and working on a scheduler which he called SCHED_SMP -- think of it as "ULE 2.0", again, for lack of better terminology. It was different than the existing SCHED_ULE scheduler, hence a different name.

Jeff blogged about this in early 2007, using exactly that term ("ULE 2.0"):

http://jeffr-tech.livejournal.com/3729.html

In mid-2007, prior to FreeBSD 7.0-RELEASE, Jeff announced that effectively he wanted to make SCHED_ULE do what SCHED_SMP did, and provided a patch to SCHED_ULE to accomplish just that:

http://unix.derkeiler.com/Mailing-Lists/FreeBSD/current/2007-07/msg00755.html

Full thread is here (beware -- many replies):

http://unix.derkeiler.com/Mailing-Lists/FreeBSD/current/2007-07/threads.html#00755

The patch mentioned above was merged into HEAD on 2007/07/19.

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/kern/sched_ule.c#rev1.202

So in effect, as of 2007/07/19, SCHED_ULE became SCHED_SMP.

FreeBSD 7.0-RELEASE was released on 2008/02/27, and the above commit/changes were available at that time as well (meaning: RELENG_7 and RELENG_7_0 at that moment in time should have included the patch from the above paragraph).

The document released by Kris Kennaway hinted at those changes and performance improvements:

http://people.freebsd.org/~kris/scaling/7.0%20Preview.pdf

Keep in mind, however, that at that time kernel configuration files (GENERIC, etc.) still defaulted to SCHED_4BSD.

The default scheduler in kernel config files (GENERIC, etc.) for i386 and amd64 (not sure about others) was changed on 2007/10/19:

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/i386/conf/GENERIC#rev1.475
http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/amd64/conf/GENERIC#rev1.485

This was done *prior* to FreeBSD 7.1-RELEASE.

So, it first became available as the default scheduler for the masses when 7.1-RELEASE came out on 2009/01/05.

All of the answers, in a roundabout and non-user-friendly way, are available by examining the commit history for src/sys/kern/sched_ule.c. It's hard to follow, especially given that you have to consider all the releases/branchpoints that took place over time, but:

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/kern/sched_ule.c

Are we having fun yet? :-)

-- 
| Jeremy Chadwick                jdc at parodius.com |
| Parodius Networking       http://www.parodius.com/ |
| UNIX Systems Administrator   Mountain View, CA, US |
| Making life hard for others since 1977.   PGP 4BD6C0CB |
Re: [RFC] winbond watchdog driver for FreeBSD/i386 and FreeBSD/amd64
On 13. Dec 2011, at 00:01 , Keith Simonsen wrote:

On 12/12/2011 12:25, Mike Tancsa wrote:

On 12/12/2011 2:49 PM, Keith Simonsen wrote:

I've been using 20110718-02-wbwd.diff for a few months now on a project with PC Engines Alix 1.d boards (http://pcengines.ch/alix1d.htm). They have a Winbond W83627HG chip. I don't see any probing/attach messages on boot but the driver seems to be properly configuring the chip - if I kill watchdogd with -9 the board reboots with watchdog timeout.

Are you sure that's the watchdog that's doing the 'killing' so to speak? If you have option CPU_GEODE in your kernel config, you will get the watchdog code there, no? (/usr/src/sys/i386/i386/geode.c)

Yes I do have CPU_GEODE in my kernel and I see the geode MFGPT probed in the verbose dmesg output. I'm not sure how I can tell what piece of hardware /dev/fido is linked to but I think you're correct and I'm using the geode watchdog and not the winbond chip.

Maybe this has something to do with me not having 'device eisa' in my kernel config! I'm going to start compiling a new nanobsd image right now with eisa and the newer wbwd.c driver and see how it goes.

Thanks

You probably don't need eisa, but if using my variant make sure you have the hint enabled or the hints file installed/updated to include it. You should see the debug sysctls for the watchdog; otherwise it did not attach and is not used.

-- 
Bjoern A. Zeeb
You have to have visions!
Stop bit received. Insert coin for new address family.
Re: SCHED_ULE should not be the default
On 12/12/11 16:51, Steve Kargl wrote:

On Mon, Dec 12, 2011 at 02:47:57PM +0100, O. Hartmann wrote:

Not fully right, boinc defaults to run on idprio 31 so this isn't an issue. And yes, there are cases where SCHED_ULE shows much better performance than SCHED_4BSD. [...]

Do we have any proof at hand for such cases where SCHED_ULE performs much better than SCHED_4BSD? Whenever the subject comes up, it is mentioned that SCHED_ULE has better performance on boxes with ncpu > 2. But in the end I see contradictory statements here. People complain about poor performance (especially in scientific environments), and others counter that this is not the case.

Within our department, we developed a highly scalable code for planetary science purposes on imagery. It utilizes GPUs via OpenCL if present. Otherwise it grabs as many cores as it can. By the end of this year I'll get a new desktop box based on Intel's new Sandy Bridge-E architecture with plenty of memory. If the colleague who developed the code is willing to perform some benchmarks on the same hardware platform, we'll benchmark both FreeBSD 9.0/10.0 and the most recent Suse. For FreeBSD I also intend to look at performance with both available schedulers.

This comes up every 9 months or so, and must be approaching FAQ status.

In a HPC environment, I recommend 4BSD. Depending on the workload, ULE can cause a severe increase in turn around time when doing already long computations. If you have an MPI application, simply launching greater than ncpu+1 jobs can show the problem.

Well, those recommendations should be based on WHY. As the mostly negative experiences with SCHED_ULE in highly computational workloads always get contradicted by "...but there are workloads that show the opposite...", this should be shown by more recent benchmarks and explanations than legacy benchmarks from years ago.

And, indeed, I would highly recommend having a FAQ or a short note in tuning(7) or the handbook which mentions using SCHED_4BSD in HPC environments and SCHED_ULE for other workloads (which has to be more specific).

It is not an easy task to set up an OS for a specific purpose and tune it by crawling the mailing lists. Notes and hints in the documentation are always valuable and highly appreciated by folks who are not deep into development.

And by the way, I have the strong impression that most of these discussions about the poor performance of SCHED_ULE tend to always end up covering up that flaw and the resulting waste of development effort. But this is only my personal impression.
Re: NFS + SVN problem?
Dimitry Andric wrote:

On 2011-11-23 19:26, Sean Bruno wrote:

On Wed, 2011-11-23 at 09:58 -0800, Rick Macklem wrote:

I don't know if Dimitry tried this, but you could also try the nolockd option, so that byte range locking is done locally in the client and avoids the NLM.

Good luck with it and please let us know how it goes, rick

This seems to allow SVN 1.7 to do whatever nonsense it is trying to do. I've modified my fstab on the test host in the cluster to:

dumpster:/vol/volshscratch /dumpster/scratch nfs rw,soft,intr,bg,nolockd,nosuid 0 0

Removing soft,intr had no effect. This, I suspect, will be problematic for clusteradm@ if we start updating hosts in the cluster.

A very late addition to this: I got Subversion 1.7 to work properly over NFSv3 by making sure rpc.lockd runs on both server and client.

E.g., set rpc_lockd_enable to YES in rc.conf; this is off by default, even if you have nfs_client_enable/nfs_server_enable set to YES.

and rpc_statd_enable=YES on all systems, as well.
Re: SCHED_ULE should not be the default
On Tue, Dec 13, 2011 at 02:23:46PM +0100, O. Hartmann wrote:

On 12/12/11 16:51, Steve Kargl wrote:

On Mon, Dec 12, 2011 at 02:47:57PM +0100, O. Hartmann wrote:

Not fully right, boinc defaults to run on idprio 31 so this isn't an issue. And yes, there are cases where SCHED_ULE shows much better performance than SCHED_4BSD. [...]

Do we have any proof at hand for such cases where SCHED_ULE performs much better than SCHED_4BSD? Whenever the subject comes up, it is mentioned that SCHED_ULE has better performance on boxes with ncpu > 2. But in the end I see contradictory statements here. People complain about poor performance (especially in scientific environments), and others counter that this is not the case.

Within our department, we developed a highly scalable code for planetary science purposes on imagery. It utilizes GPUs via OpenCL if present. Otherwise it grabs as many cores as it can. By the end of this year I'll get a new desktop box based on Intel's new Sandy Bridge-E architecture with plenty of memory. If the colleague who developed the code is willing to perform some benchmarks on the same hardware platform, we'll benchmark both FreeBSD 9.0/10.0 and the most recent Suse. For FreeBSD I also intend to look at performance with both available schedulers.

This comes up every 9 months or so, and must be approaching FAQ status.

In a HPC environment, I recommend 4BSD. Depending on the workload, ULE can cause a severe increase in turn around time when doing already long computations. If you have an MPI application, simply launching greater than ncpu+1 jobs can show the problem.

Well, those recommendations should be based on WHY. As the mostly negative experiences with SCHED_ULE in highly computational workloads always get contradicted by "...but there are workloads that show the opposite...", this should be shown by more recent benchmarks and explanations than legacy benchmarks from years ago.

I have given the WHY in previous discussions of ULE, based on what you call legacy benchmarks. I have not seen any commit to sched_ule.c that would lead me to believe that the performance issues with ULE and cpu-bound numerical codes have been addressed. Repeating the benchmark would be a waste of time.

-- 
Steve
8.2-9.prerel: gmirror failed with error 19
csup
make buildworld
make kernel
boot single

root mount waiting for: usbus4
uhub4: 8 ports with 8 removable, self powered
Trying to mount root from ufs:/dev/mirror/boota [rw]...
mountroot: waiting for device /dev/mirror/boota ...
Mounting from ufs:/dev/mirror/boota failed with error 19.

the only way out was via loader

OK unload
OK load boot/kernel.old/kernel
OK load boot/kernel.old/geom_mirror.ko
OK set kern.geom.part.check_integrity=0
OK set vfs.root.mountfrom.options=rw
OK boot -s

this would work with the 8.2 /boot/kernel.old but not with the 9.fresh /boot/kernel

randy
Re: [CFT] pkgng alpha2
on 30/11/2011 22:32 Julien Laffaye said the following:

[1] : https://github.com/pkgng/pkgng/issues
[2] : https://github.com/pkgng/pkgng
[3] : http://wiki.freebsd.org/pkgng
[4] : http://people.freebsd.org/~bapt/pkgng-bsdcan2011.pdf
[5] : http://wiki.freebsd.org/201110DevSummit/Ports?action=AttachFile&do=get&target=pkgng-devsummit.pdf
[6] : http://wiki.freebsd.org/201110DevSummit?action=AttachFile&do=get&target=pkgng-devsummit-track.pdf

Couple of questions/suggestions:

1. Do you plan to have a pkgng port for issuing preview releases of pkgng? The current pkgng installation/bootstrap procedure is really easy, but a port would be even more convenient for prospective testers.

2. Is there a public pre-built package repository with pkgng-format packages that could be used for testing and getting a taste of a packages-only pkgng-managed system?

Thank you.

-- 
Andriy Gapon
Re: [CFT] pkgng alpha2
On 12/13/2011 06:16 PM, Andriy Gapon wrote:

on 30/11/2011 22:32 Julien Laffaye said the following:

[1] : https://github.com/pkgng/pkgng/issues
[2] : https://github.com/pkgng/pkgng
[3] : http://wiki.freebsd.org/pkgng
[4] : http://people.freebsd.org/~bapt/pkgng-bsdcan2011.pdf
[5] : http://wiki.freebsd.org/201110DevSummit/Ports?action=AttachFile&do=get&target=pkgng-devsummit.pdf
[6] : http://wiki.freebsd.org/201110DevSummit?action=AttachFile&do=get&target=pkgng-devsummit-track.pdf

Couple of questions/suggestions:

1. Do you plan to have a pkgng port for issuing preview releases of pkgng? The current pkgng installation/bootstrap procedure is really easy, but a port would be even more convenient for prospective testers.

Yes, this is planned. The port will bootstrap pkgng.

2. Is there a public pre-built package repository with pkgng-format packages that could be used for testing and getting a taste of a packages-only pkgng-managed system?

Unfortunately, no. I think I now have the resources to do that for the next CFT. But it will only be 9.0 amd64, I am afraid. We can't build packages for the entire matrix.

Thank you.
Re: grabbing console (syscons) in kernel
on 11/12/2011 23:45 Andriy Gapon said the following:

There are a few cases when the kernel needs to interact with a user via syscons. These are the cases where the kernel not only spews some output but also expects some input. Some examples are:

- asking for a root filesystem specification
- entering ddb
- asking to press a key for reboot

In these cases the kernel implicitly grabs the console for its own use. I'd like to make that action more explicit. What do you think about the approach and implementation in the following patches?

Thank you!

https://gitorious.org/~avg/freebsd/avgbsd/commit/5248b49ebf84d98a0597fa5aa4d813a38f581acc
https://gitorious.org/~avg/freebsd/avgbsd/commit/a0849c52242378474bb2eaa41726376fbc4c5bf6
https://gitorious.org/~avg/freebsd/avgbsd/commit/a67515cbd720b16f03ba435ed182966a8a338b15
https://gitorious.org/~avg/freebsd/avgbsd/commit/b8864b68b4c0e26ece065a38301c305833be32eb
https://gitorious.org/~avg/freebsd/avgbsd/commit/1017ae425d8abecd7482bd6c6deaaf9f25f5c6cd

I was advised that the above links might not be the best way to present the patches for review, so here they are as a single diff file:

http://people.freebsd.org/~avg/cngrab.diff

P.S. One of the benefits is that the keyboard is put into polling mode once before all the required input is read and taken out of it once afterwards, not around each character as is done now in a rather twisted way.

-- 
Andriy Gapon
Re: [CFT] pkgng alpha2
on 13/12/2011 19:22 Julien Laffaye said the following:

On 12/13/2011 06:16 PM, Andriy Gapon wrote:

on 30/11/2011 22:32 Julien Laffaye said the following:

[1] : https://github.com/pkgng/pkgng/issues
[2] : https://github.com/pkgng/pkgng
[3] : http://wiki.freebsd.org/pkgng
[4] : http://people.freebsd.org/~bapt/pkgng-bsdcan2011.pdf
[5] : http://wiki.freebsd.org/201110DevSummit/Ports?action=AttachFile&do=get&target=pkgng-devsummit.pdf
[6] : http://wiki.freebsd.org/201110DevSummit?action=AttachFile&do=get&target=pkgng-devsummit-track.pdf

Couple of questions/suggestions:

1. Do you plan to have a pkgng port for issuing preview releases of pkgng? The current pkgng installation/bootstrap procedure is really easy, but a port would be even more convenient for prospective testers.

Yes, this is planned. The port will bootstrap pkgng.

Great!

2. Is there a public pre-built package repository with pkgng-format packages that could be used for testing and getting a taste of a packages-only pkgng-managed system?

Unfortunately, no. I think I now have the resources to do that for the next CFT. But it will only be 9.0 amd64, I am afraid. We can't build packages for the entire matrix.

I understand. Those would take an immense amount of compilation time and storage space.

-- 
Andriy Gapon
RC3 won't boot after install
Hello,

At the suggestion of Bjoern I'm using this forum to help resolve and test this issue.

I'm using the 9rc3 memstick img file to do an install on a brand new Samsung Chronos 7. The install booted from the USB stick and I answered defaults for all the install questions. I even went to the shell with no problem. When it got to the end of the install and asked to reboot, I clicked OK, pulled out the media (USB) and heard the machine rebooting, but then it seemed to go into a loop trying to start some kind of graphics mode. The screen kept flashing with no text or images, just complete black and then what looked like a backlight. I thought it might be a graphics intro or splash screen, but no matter what I do I can't break out of it, and it won't load.

-jeff
RC3 won't boot after install
Hello,

I'm using the 9rc3 memstick img file to do an install on a brand new Samsung Chronos 7. The install booted from the USB stick and I answered defaults for all the install questions. I even went to the shell with no problem. When it got to the end of the install and asked to reboot, I clicked OK, pulled out the media (USB) and heard the machine rebooting, but then it seemed to go into a loop trying to start some kind of graphics mode. The screen kept flashing with no text or images, just complete black and then what looked like a backlight. I thought it might be a graphics intro or splash screen, but no matter what I do I can't break out of it, and it won't load.
mount -u /path/containing/a/symlink broken in 9.0
Hi all,

I just discovered after upgrading the portsnap buildbox from 8.2 to 9.0-rc3 that

# mount -u /path/containing/a/symlink

now fails with 'not currently mounted'. Can anyone tell me if this change was deliberate?

-- 
Colin Percival
Security Officer, FreeBSD | freebsd.org | The power to serve
Founder / author, Tarsnap | tarsnap.com | Online backups for the truly paranoid
Re: SCHED_ULE should not be the default
On 12/13/2011 10:54 AM, Steve Kargl wrote:
> I have given the WHY in previous discussions of ULE, based on what you call legacy benchmarks. I have not seen any commit to sched_ule.c that would lead me to believe that the performance issues with ULE and cpu-bound numerical codes have been addressed. Repeating the benchmark would be a waste of time.

Trying a simple pbzip2 on a large file, the results are pretty consistent through iterations. pbzip2 with 4BSD is barely faster on a file that's 322MB in size.

After a reboot, I did a

strings bigfile > /dev/null

then ran

pbzip2 -v xaa -c > /dev/null

7 times. If I do a burnP6 in the background (from sysutils/cpuburn), they perform about the same.

e.g.

pbzip2 -v xaa -c > /dev/null
Parallel BZIP2 v1.1.6 - by: Jeff Gilchrist [http://compression.ca]
[Oct. 30, 2011] (uses libbzip2 by Julian Seward)
Major contributions: Yavor Nikolov nikolov.javor+pbz...@gmail.com

         # CPUs: 4
 BWT Block Size: 900 KB
File Block Size: 900 KB
 Maximum Memory: 100 MB
---
         File #: 1 of 1
     Input Name: xaa
    Output Name: stdout

     Input Size: 352404831 bytes
Compressing data...
    Output Size: 50630745 bytes
---

     Wall Clock: 18.139342 seconds

Wall clock times in seconds:

ULE
18.113204 18.116896 18.123400 18.105894 18.163332 18.139342 18.082888

ULE with burnP6
23.076085 22.003666 21.162987 21.682445 21.935568 23.595781 21.601277

4BSD
17.983395 17.986218 18.009254 18.004312 18.001494 17.997032

4BSD with burnP6
22.215508 21.886459 21.595179 21.361830 21.325351 21.244793

# ministat uleP6 bsdP6
x uleP6
+ bsdP6
[ASCII distribution plot not recoverable]
    N           Min           Max        Median           Avg        Stddev
x   6     21.162987     23.595781     22.003666     22.242755    0.91175566
+   6     21.244793     22.215508     21.595179     21.604853     0.3792413
No difference proven at 95.0% confidence

x ule
+ bsd
[ASCII distribution plot not recoverable]
    N           Min           Max        Median           Avg        Stddev
x   7     18.082888     18.163332     18.116896     18.120708   0.025468695
+   6     17.983395     18.009254     18.001494     17.996951   0.010248473
Difference at 95.0% confidence
	-0.123757 +/- 0.024538
	-0.68296% +/- 0.135414%
	(Student's t, pooled s = 0.0200388)

Hardware is an X3450 with 8G of memory, running RELENG8.

---Mike

-- 
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada http://www.tancsa.com/
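For readers who want to check the "Difference at 95.0% confidence" line by hand, here is a minimal sketch of the pooled-variance Student's t arithmetic applied to the idle ULE and 4BSD timings above. The function names are made up for this illustration, the critical value 2.201 (two-sided 95%, 11 degrees of freedom) is taken from a standard t table, and this is not ministat's actual source. The hand-rolled square root is only there so the sketch links without -lm.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n samples. */
static double mean(const double *x, size_t n)
{
	double sum = 0.0;

	assert(n > 0);
	for (size_t i = 0; i < n; i++)
		sum += x[i];
	return sum / (double)n;
}

/* Sum of squared deviations from the mean. */
static double sum_sq_dev(const double *x, size_t n)
{
	double m = mean(x, n), ss = 0.0;

	for (size_t i = 0; i < n; i++)
		ss += (x[i] - m) * (x[i] - m);
	return ss;
}

/* Newton's method square root (illustrative helper, avoids libm). */
static double sqrt_newton(double v)
{
	double r = v > 1.0 ? v : 1.0;

	for (int i = 0; i < 60; i++)
		r = 0.5 * (r + v / r);
	return r;
}

/* Pooled standard deviation of two samples (equal-variance assumption). */
static double pooled_sd(const double *a, size_t na, const double *b, size_t nb)
{
	return sqrt_newton((sum_sq_dev(a, na) + sum_sq_dev(b, nb)) /
	    (double)(na + nb - 2));
}

/* Half-width of the confidence interval for mean(b) - mean(a). */
static double ci_half_width(const double *a, size_t na,
    const double *b, size_t nb, double t_crit)
{
	return t_crit * pooled_sd(a, na, b, nb) *
	    sqrt_newton(1.0 / (double)na + 1.0 / (double)nb);
}

/* Wall clock times copied from the message above. */
static const double ule[] = { 18.113204, 18.116896, 18.123400, 18.105894,
    18.163332, 18.139342, 18.082888 };
static const double bsd4[] = { 17.983395, 17.986218, 18.009254, 18.004312,
    18.001494, 17.997032 };
```

With these inputs, mean(bsd4, 6) - mean(ule, 7) comes out at about -0.1238 seconds, which is the 0.68% advantage for 4BSD that ministat reports for the idle case.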
Re: CVS removal from the base
On 12/11/2011 06:14, Julian H. Stacey wrote:
> Doug Barton wrote:
>> On 12/02/2011 04:35, Adrian Chadd wrote:
>>> I think you're missing the point a little. The point is, you have to keep in mind how comfortable people feel about things, and progress sometimes makes people uncomfortable. I think you should leave these changes bake for a while and let people get comfortable with the changing status quo.
>>
>> The fact that we have so many people who are radically change-averse, no matter how rational the change; is a bug, not a feature. This particular bug is complicated dramatically by the fact that the majority view seems to lean heavily towards "If I use it, it must be the default and/or in the base" rather than seeing ports as part of the overall operating SYSTEM.
>
> BSD is more conservative. More value given to stability of availability of interfaces tools etc,

Having things in ports doesn't make them less available. :)

> More Long term professionals.

I don't know what this means.

> Doug's attempting to force working FreeBSD ports such as procmail to be discarded is deplorable.

Um, I had nothing to say about procmail. In fact, I use procmail, and would not want to see it removed.

> Doug should stop coercing FreeBSD toward a Linux model, move himself to Linux.

I actually do use Linux sometimes. In many ways it is a far superior desktop. That said, I am certainly *not* trying to turn FreeBSD into another Linux distro. What I am trying to do is to see what we can learn from how Linux does things, and apply those ideas here when they are useful. Just because Linux does it, doesn't mean it's wrong. :)

I've said this before, but it's worth repeating. Decisions that were made 20 years ago about what should and should not be included in the Berkeley Software Distribution, while valid at the time, may not be valid any longer because things have changed since then. Just to take one obvious example, when these decisions were being made it was necessary to distribute a full system, including the 3rd party stuff, all in one go because the software was being distributed on magnetic tape.

Doug

-- 
Breadth of IT experience, and depth of knowledge in the DNS.
Yours for the right price. :) http://SupersetSolutions.com/
Re: SCHED_ULE should not be the default
On 12/13/2011 13:31, Malin Randstrom wrote:
> stop sending me spam mail ... you never stop despite me having unsubscribeb several times. stop this!

If you had actually unsubscribed, the mail would have stopped. :) You can see the instructions you need to follow below.

> ___
> freebsd-sta...@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org

-- 
Breadth of IT experience, and depth of knowledge in the DNS.
Yours for the right price. :) http://SupersetSolutions.com/
Re: SCHED_ULE should not be the default
On Tue, Dec 13, 2011 at 10:40:48AM +0200, Ivan Klymenko wrote:
> If the algorithm ULE does not contain problems - it means the problem has Core2Duo, or in a piece of code that uses the ULE scheduler. I already wrote in a mailing list that specifically in my case (Core2Duo) partially helps the following patch:
> --- sched_ule.c.orig	2011-11-24 18:11:48.0 +0200
> +++ sched_ule.c	2011-12-10 22:47:08.0 +0200
> @@ -794,7 +794,8 @@
>  	 * 1.5 * balance_interval.
>  	 */
>  	balance_ticks = max(balance_interval / 2, 1);
> -	balance_ticks += random() % balance_interval;
> +//	balance_ticks += random() % balance_interval;
> +	balance_ticks += ((int)random()) % balance_interval;
>  	if (smp_started == 0 || rebalance == 0)
>  		return;
>  	tdq = TDQ_SELF();

This avoids a 64-bit division on 64-bit platforms but seems to have no effect otherwise. Because this function is not called very often, the change seems unlikely to help.

> @@ -2118,13 +2119,21 @@
>  	struct td_sched *ts;
>
>  	THREAD_LOCK_ASSERT(td, MA_OWNED);
> +	if (td->td_pri_class & PRI_FIFO_BIT)
> +		return;
> +	ts = td->td_sched;
> +	/*
> +	 * We used up one time slice.
> +	 */
> +	if (--ts->ts_slice > 0)
> +		return;

This skips most of the periodic functionality (long term load balancer, saving switch count (?), insert index (?), interactivity score update for long running thread) if the thread is not going to be rescheduled right now. It looks wrong but it is a data point if it helps your workload.

>  	tdq = TDQ_SELF();
>  #ifdef SMP
>  	/*
>  	 * We run the long term load balancer infrequently on the first cpu.
>  	 */
> -	if (balance_tdq == tdq) {
> -		if (balance_ticks && --balance_ticks == 0)
> +	if (balance_ticks && --balance_ticks == 0) {
> +		if (balance_tdq == tdq)
>  			sched_balance();
>  	}
>  #endif

The main effect of this appears to be to disable the long term load balancer completely after some time. At some point, a CPU other than the first CPU (which uses balance_tdq) will set balance_ticks = 0, and sched_balance() will never be called again.

It also introduces a hypothetical race condition because the access to balance_ticks is no longer restricted to one CPU under a spinlock.

If the long term load balancer may be causing trouble, try setting kern.sched.balance_interval to a higher value with unpatched code.

> @@ -2144,9 +2153,6 @@
>  		if (TAILQ_EMPTY(&tdq->tdq_timeshare.rq_queues[tdq->tdq_ridx]))
>  			tdq->tdq_ridx = tdq->tdq_idx;
>  	}
> -	ts = td->td_sched;
> -	if (td->td_pri_class & PRI_FIFO_BIT)
> -		return;
>  	if (PRI_BASE(td->td_pri_class) == PRI_TIMESHARE) {
>  		/*
>  		 * We used a tick; charge it to the thread so
> @@ -2157,11 +2163,6 @@
>  		sched_priority(td);
>  	}
>  	/*
> -	 * We used up one time slice.
> -	 */
> -	if (--ts->ts_slice > 0)
> -		return;
> -	/*
>  	 * We're out of time, force a requeue at userret().
>  	 */
>  	ts->ts_slice = sched_slice;
>
> and refusal to use options FULL_PREEMPTION
> But no one has unsubscribed to my letter, my patch helps or not in the case of Core2Duo...
> There is a suspicion that the problems stem from the sections of code associated with the SMP...
> Maybe I'm in something wrong, but I want to help in solving this problem ...

-- 
Jilles Tjoelker
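The remark that the reordered test eventually switches the long term balancer off can be checked with a small userland model. This is only a sketch: the struct and function names are invented for the illustration, ticks are delivered round-robin to the CPUs, and the countdown is reset to a fixed interval rather than the randomized value the kernel uses. The race on balance_ticks is not modeled.

```c
#include <assert.h>

/* Hypothetical stand-ins for the scheduler state; not the real sched_ule.c. */
struct sim {
	int balance_ticks;	/* countdown until the next balance */
	int interval;		/* value the countdown is reset to */
	int balance_calls;	/* how many times the balancer ran */
};

/* Stand-in for sched_balance(): count the call and re-arm the countdown. */
static void sim_balance(struct sim *s)
{
	s->balance_calls++;
	s->balance_ticks = s->interval;
}

/* Unpatched order: only the balancing CPU (cpu 0) touches the countdown. */
static void tick_original(struct sim *s, int cpu)
{
	if (cpu == 0) {
		if (s->balance_ticks && --s->balance_ticks == 0)
			sim_balance(s);
	}
}

/* Patched order: every CPU decrements, but only cpu 0 may balance. */
static void tick_patched(struct sim *s, int cpu)
{
	if (s->balance_ticks && --s->balance_ticks == 0) {
		if (cpu == 0)
			sim_balance(s);
	}
}

/* Run `rounds` rounds in which each of `ncpu` CPUs takes one clock tick. */
static int run(int patched, int ncpu, int interval, int rounds)
{
	struct sim s = { interval, interval, 0 };

	assert(ncpu > 0 && interval > 0);
	for (int r = 0; r < rounds; r++)
		for (int cpu = 0; cpu < ncpu; cpu++)
			(patched ? tick_patched : tick_original)(&s, cpu);
	return s.balance_calls;
}
```

In this model, two CPUs with an interval of 4 give 25 balancer runs in 100 rounds with the original ordering and none with the patched ordering, because the second CPU is the one that takes the countdown to zero and nothing ever re-arms it.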
Re: NFS + SVN problem?
On Tue, 2011-12-13 at 07:53 -0800, Rick Macklem wrote:
> Dimitry Andric wrote:
>> On 2011-11-23 19:26, Sean Bruno wrote:
>>> On Wed, 2011-11-23 at 09:58 -0800, Rick Macklem wrote:
>>>> I don't know if Dimitry tried this, but you could also try the nolockd option, so that byte range locking is done locally in the client and avoids the NLM.
>>>>
>>>> Good luck with it and please let us know how it goes, rick
>>>
>>> This seems to allow SVN 1.7 to do whatever nonsense it is trying to do. I've modified my fstab on the test host in the cluster to:
>>>
>>> dumpster:/vol/volshscratch /dumpster/scratch nfs rw,soft,intr,bg,nolockd,nosuid 0 0
>>>
>>> Removing soft,intr had no effect. This, I suspect will be problematic for clusteradm@ if we start updating hosts in the cluster.
>>
>> A very late addition to this: I got Subversion 1.7 to work properly over NFSv3, by making sure rpc.lockd runs on both server and client.
>>
>> E.g, set rpc_lockd_enable to YES in rc.conf; this is off by default, even if you have nfs_client_enable/nfs_server_enable set to YES.
>
> and rpc_statd_enable=YES on all systems, as well.

Thanks for this btw. :-)

Sean
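As a way to test whether byte range locking works on a given mount, independent of Subversion, a small probe along these lines may help. It is a sketch only: the function name is made up, and it assumes that the failing operation is an ordinary POSIX fcntl() lock request, which on an NFSv3 mount without nolockd is the kind of request that goes through rpc.lockd.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Try to take and release an exclusive POSIX byte-range lock on the first
 * byte of `path`, creating the file if needed.
 * Returns 0 on success, -1 on failure (errno describes the error).
 */
static int try_byte_range_lock(const char *path)
{
	struct flock fl = {0};
	int fd, rc;

	assert(path != NULL);
	fd = open(path, O_RDWR | O_CREAT, 0600);
	if (fd == -1)
		return -1;

	fl.l_type = F_WRLCK;	/* exclusive lock */
	fl.l_whence = SEEK_SET;
	fl.l_start = 0;
	fl.l_len = 1;		/* lock a single byte */
	rc = fcntl(fd, F_SETLK, &fl);	/* non-blocking attempt */
	if (rc == 0) {
		fl.l_type = F_UNLCK;
		(void)fcntl(fd, F_SETLK, &fl);
	}
	close(fd);
	return rc == 0 ? 0 : -1;
}
```

Pointing the probe at a scratch file under the NFS mount, once with and once without nolockd, should show whether lock requests reach a working lock daemon.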
Re: SCHED_ULE should not be the default
On Mon, Dec 12, 2011 at 04:29:14PM -0800, Doug Barton wrote:
> On 12/12/2011 05:47, O. Hartmann wrote:
>> Do we have any proof at hand for such cases where SCHED_ULE performs much better than SCHED_4BSD?
>
> I complained about poor interactive performance of ULE in a desktop environment for years. I had numerous people try to help, including Jeff, with various tunables, dtrace'ing, etc. The cause of the problem was never found.

The issues that I've seen with ULE on the desktop seem to be caused by X taking up a steady amount of CPU, and being demoted from being an interactive process. X then becomes the bottleneck for other processes that would otherwise be interactive. Try 'renice -20 pid_of_X' and see if that makes your problems go away.

Marcus
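To make the "demoted from being an interactive process" idea concrete, here is a simplified sketch of a run-time versus sleep-time score with a threshold, loosely modeled on ULE's interactivity heuristic. The constants, the names, and the way the nice value is folded in are assumptions for illustration; the real scheduler works on scaled tick counts and its arithmetic may differ.

```c
#include <assert.h>

/* Assumed constants for the illustration. */
#define INTERACT_MAX	100
#define INTERACT_HALF	(INTERACT_MAX / 2)
#define INTERACT_THRESH	30

/*
 * Score in [0, INTERACT_MAX]: low when a thread mostly sleeps, high when
 * it mostly runs. Both inputs use the same arbitrary time unit.
 */
static int interact_score(int runtime, int sleeptime)
{
	int div;

	assert(runtime >= 0 && sleeptime >= 0);
	if (runtime > sleeptime) {
		div = runtime / INTERACT_HALF;
		if (div < 1)
			div = 1;
		return INTERACT_HALF + (INTERACT_HALF - sleeptime / div);
	}
	if (sleeptime > runtime) {
		div = sleeptime / INTERACT_HALF;
		if (div < 1)
			div = 1;
		return runtime / div;
	}
	return runtime != 0 ? INTERACT_HALF : 0;
}

/* The nice value shifts the score before it is compared to the threshold. */
static int is_interactive(int runtime, int sleeptime, int nice)
{
	int score = interact_score(runtime, sleeptime) + nice;

	if (score < 0)
		score = 0;
	return score < INTERACT_THRESH;
}
```

Under these assumptions, a thread that runs 40% of the time scores 33 and misses the threshold of 30, while the same thread at nice -20 scores 13 and counts as interactive again, which is consistent with the renice suggestion above.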
Re: SCHED_ULE should not be the default
On Wed, 14 Dec 2011 00:04:42 +0100, Jilles Tjoelker jil...@stack.nl wrote:
> On Tue, Dec 13, 2011 at 10:40:48AM +0200, Ivan Klymenko wrote:
>> [...]
>> -	balance_ticks += random() % balance_interval;
>> +//	balance_ticks += random() % balance_interval;
>> +	balance_ticks += ((int)random()) % balance_interval;
>> [...]
>
> This avoids a 64-bit division on 64-bit platforms but seems to have no effect otherwise. Because this function is not called very often, the change seems unlikely to help.

Yes, this section does not apply to this problem :)
I just posted the latest patch that I am using now...

>> [...]
>> +	if (td->td_pri_class & PRI_FIFO_BIT)
>> +		return;
>> +	ts = td->td_sched;
>> +	/*
>> +	 * We used up one time slice.
>> +	 */
>> +	if (--ts->ts_slice > 0)
>> +		return;
>
> This skips most of the periodic functionality (long term load balancer, saving switch count (?), insert index (?), interactivity score update for long running thread) if the thread is not going to be rescheduled right now. It looks wrong but it is a data point if it helps your workload.

Yes, I did it to delay the execution of the code in this section for as long as possible:
...
#ifdef SMP
	/*
	 * We run the long term load balancer infrequently on the first cpu.
	 */
	if (balance_tdq == tdq) {
		if (balance_ticks && --balance_ticks == 0)
			sched_balance();
	}
#endif
...

>> [...]
>> -	if (balance_tdq == tdq) {
>> -		if (balance_ticks && --balance_ticks == 0)
>> +	if (balance_ticks && --balance_ticks == 0) {
>> +		if (balance_tdq == tdq)
>>  			sched_balance();
>>  	}
>>  #endif
>
> The main effect of this appears to be to disable the long term load balancer completely after some time. At some point, a CPU other than the first CPU (which uses balance_tdq) will set balance_ticks = 0, and sched_balance() will never be called again.

That is for the same reason as described above...

> It also introduces a hypothetical race condition because the access to balance_ticks is no longer restricted to one CPU under a spinlock.
>
> If the long term load balancer may be causing trouble, try setting kern.sched.balance_interval to a higher value with unpatched code.

I checked that first, but it did not help fix the situation...

My impression is that rebalancing is malfunctioning...
It seems that the thread is passed on to the same core that is already loaded, and so on...
Perhaps this is a consequence of an incorrect definition of the CPU topology?

>> [...]
Re: SCHED_ULE should not be the default
On Tue, 13 Dec 2011 23:02:15 +, Marcus Reid mar...@blazingdot.com wrote:
> On Mon, Dec 12, 2011 at 04:29:14PM -0800, Doug Barton wrote:
>> On 12/12/2011 05:47, O. Hartmann wrote:
>>> Do we have any proof at hand for such cases where SCHED_ULE performs much better than SCHED_4BSD?
>>
>> I complained about poor interactive performance of ULE in a desktop environment for years. I had numerous people try to help, including Jeff, with various tunables, dtrace'ing, etc. The cause of the problem was never found.
>
> The issues that I've seen with ULE on the desktop seem to be caused by X taking up a steady amount of CPU, and being demoted from being an interactive process. X then becomes the bottleneck for other processes that would otherwise be interactive. Try 'renice -20 pid_of_X' and see if that makes your problems go away.

Why, then, is X not a bottleneck when using 4BSD?

> Marcus
Re: SCHED_ULE should not be the default
On Tue, Dec 13, 2011 at 3:39 PM, Ivan Klymenko fi...@ukr.net wrote:
> On Wed, 14 Dec 2011 00:04:42 +0100, Jilles Tjoelker jil...@stack.nl wrote:
>> On Tue, Dec 13, 2011 at 10:40:48AM +0200, Ivan Klymenko wrote:
>>> If the algorithm ULE does not contain problems - it means the problem has Core2Duo, or in a piece of code that uses the ULE scheduler. I already wrote in a mailing list that specifically in my case (Core2Duo) partially helps the following patch:
>>> [...]
>>
>> [...]
>> If the long term load balancer may be causing trouble, try setting kern.sched.balance_interval to a higher value with unpatched code.
>
> I checked it in the first place - but it did not help fix the situation...
>
> The impression of malfunction rebalancing...
> It seems that the thread is passed on to the same core that is loaded and so...
> Perhaps this is a consequence of an incorrect definition of the topology CPU?
>
>>> [...]
>>> There is a suspicion that the problems stem from the sections of code associated with the SMP...
>>> Maybe I'm in something wrong, but I want to help in solving this problem ...

Has anyone experiencing problems tried to set sysctl kern.sched.steal_thresh=1 ?

I don't remember what our specific problem at $WORK was, perhaps it was just interrupt threads not getting serviced fast enough, but we've hard-coded this to 1 and removed the code that sets it in sched_initticks(). The same effect should be had by setting the sysctl after a box is up.

Thanks,
matthew
Re: SCHED_ULE should not be the default
On Tue, 13 Dec 2011 16:01:56 -0800, m...@freebsd.org wrote:
> On Tue, Dec 13, 2011 at 3:39 PM, Ivan Klymenko fi...@ukr.net wrote:
>> [...]
>
> Has anyone experiencing problems tried to set sysctl kern.sched.steal_thresh=1 ?

In my case, kern.sched.steal_thresh already has the value 1.

> I don't remember what our specific problem at $WORK was, perhaps it was just interrupt threads not getting serviced fast enough, but we've hard-coded this to 1 and removed the code that sets it in sched_initticks(). The same effect should be had by setting the
Re: multihomed nfs server - NLM lock failure on additional interfaces
John De wrote:
> Hi Folks,
>
> I have a 9-prerelease system where I've been testing nfs/zfs. The system has been working quite well until moving the server to a multihomed configuration.
>
> Given the following:
>
> nfsd: master (nfsd)
> nfsd: server (nfsd)
> /usr/sbin/rpcbind -h 10.24.6.38 -h 172.1.1.2 -h 172.21.201.1 -h 172.21.202.1 -h 172.21.203.1 -h 172.21.204.1 -h 172.21.205.1 -h 10.24.6.34 -h 10.24.6.33
> /usr/sbin/mountd -r -l -h 10.24.6.38 -h 172.1.1.2 -h 172.21.201.1 -h 172.21.202.1 -h 172.21.203.1 -h 172.21.204.1 -h 172.21.205.1 -h 10.24.6.34 -h 10.24.6.33
> /usr/sbin/rpc.statd -h 10.24.6.38 -h 172.1.1.2 -h 172.21.201.1 -h 172.21.202.1 -h 172.21.203.1 -h 172.21.204.1 -h 172.21.205.1 -h 10.24.6.34 -h 10.24.6.33
> /usr/sbin/rpc.lockd -h 10.24.6.38 -h 172.1.1.2 -h 172.21.201.1 -h 172.21.202.1 -h 172.21.203.1 -h 172.21.204.1 -h 172.21.205.1 -h 10.24.6.34 -h 10.24.6.33
>
> 10.24.6.38 is the default interface on 1G. The 172 nets are 10G connected to compute systems.
>
> ifconfig_bce0=' inet 10.24.6.38 netmask 255.255.0.0 -rxcsum -txcsum' _c='physical addr which never changes'
> ifconfig_bce1=' inet 172.1.1.2 netmask 255.255.255.0' _c='physcial addr on crossover cable'
> ifconfig_cxgb2='inet 172.21.21.129 netmask 255.255.255.0' _c='physical backside 10g compute net'
> ifconfig_cxgb3='inet 172.21.201.1 netmask 255.255.255.0 mtu 9000' _c='physical backside 10g compute net'
> ifconfig_cxgb6='inet 172.21.202.1 netmask 255.255.255.0 mtu 9000' _c='physical backside 10g compute net'
> ifconfig_cxgb8='inet 172.21.203.1 netmask 255.255.255.0 mtu 9000' _c='physical backside 10g compute net'
> ifconfig_cxgb4='inet 172.21.204.1 netmask 255.255.255.0 mtu 9000' _c='physical backside 10g compute net'
> ifconfig_cxgb0='inet 172.21.205.1 netmask 255.255.255.0 mtu 9000' _c='physical backside 10g compute net'
>
> The 10.24.6.34 and 10.24.6.33 are alias addresses for the system.
>
> Destination        Gateway            Flags    Refs      Use  Netif Expire
> default            10.24.0.1          UGS         0     1049   bce0
>
> The server works correctly (and quite well) for both udp & tcp mounts. Basically, all nfs traffic is great!
>
> However, locking only works for clients connected to the 10.24.6.38 interface.
>
> A tcpdump file from good & bad runs:
>
> http://www.freebsd.org/~jwd/lockgood.pcap
> http://www.freebsd.org/~jwd/lockbad.pcap
>
> Basically, the clients (both FreeBSD & Linux) query the servers rpcbind for the address of the nlm which is returned correctly. For the good run, the NLM is then called. For the bad call, it is not.

Well, first off I think your packet traces are missing packets. If you look at nlm_get_rpc(), which is the function in sys/nlm/nlm_prot_impl.c that is doing this, you will see that it first attempts UDP and then falls back to TCP when talking to rpcbind. Your packet traces only show TCP, so I suspect that the UDP case went through a different interface (or missed getting captured some other way?).

My guess would be that the attempt to connect to the server's NLM does the same thing, since the lockbad.pcap doesn't show any SYN,... to port 844.

If I were you, I'd put lottsa printfs in nlm_get_rpc() showing what is in the address structure ss and, in particular, when it calls clnt_reconnect_create(). { For the client. }

For the server, it starts at sys_nlm_syscall(), which calls ... until you get to nlm_register_services(). It copies in a list of address(es) and I would printf those address(es) once copied into the kernel, to see if they make sense. These are the address(es) that are going to get sobind()'d later by a function called svc_tli_create() { over in sys/rpc/rpc_generic.c }.

That's as far as I got.

Good luck with it, rick

> I've started digging through code, but I do not claim to be an rpc expert. If anyone has suggestions I would appreciate any pointers.
>
> Thanks!
> John
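For the suggested printfs of the address structure, something like the following formatter could be a starting point. It is a userland sketch: the function name is hypothetical, and it relies on inet_ntop() and snprintf(), so inside the kernel the same idea would need the kernel's own address-printing facilities instead.

```c
#include <assert.h>
#include <stdio.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Format the address in `ss` as "addr:port" (IPv4) or "[addr]:port" (IPv6).
 * Returns 0 on success, -1 for an unsupported family or a too-small buffer.
 */
static int fmt_sockaddr(const struct sockaddr_storage *ss, char *buf, size_t len)
{
	char host[INET6_ADDRSTRLEN];
	int n;

	assert(ss != NULL && buf != NULL);
	if (ss->ss_family == AF_INET) {
		const struct sockaddr_in *sin = (const struct sockaddr_in *)ss;

		if (inet_ntop(AF_INET, &sin->sin_addr, host, sizeof(host)) == NULL)
			return -1;
		n = snprintf(buf, len, "%s:%u", host,
		    (unsigned)ntohs(sin->sin_port));
	} else if (ss->ss_family == AF_INET6) {
		const struct sockaddr_in6 *sin6 = (const struct sockaddr_in6 *)ss;

		if (inet_ntop(AF_INET6, &sin6->sin6_addr, host, sizeof(host)) == NULL)
			return -1;
		n = snprintf(buf, len, "[%s]:%u", host,
		    (unsigned)ntohs(sin6->sin6_port));
	} else {
		return -1;
	}
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Logging the formatted address just before each connection attempt would show whether the client is really targeting the 172.21.x.x address and NLM port that rpcbind returned.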
about XHCI_PS_PP
Hi Selasky,

I think XHCI_PS_PP is wrong.

- #define XHCI_PS_PP 0x0100 /* RW - port power */
+ #define XHCI_PS_PP 0x0200 /* RW - port power */

Could you check it?

Best regards,
Kohji Okuno
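A quick way to see why the two values matter is to lay the low PORTSC bits out explicitly. The sketch below assumes the usual xHCI port status layout, with the port link state in bits 8:5 and port power in bit 9; the macro and function names are invented here and are not the ones in the FreeBSD header.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed PORTSC layout for the low bits; names are illustrative only. */
#define PORTSC_PLS_SHIFT	5
#define PORTSC_PLS_MASK		(UINT32_C(0xF) << PORTSC_PLS_SHIFT)	/* bits 8:5 */
#define PORTSC_PP		(UINT32_C(1) << 9)			/* bit 9 */

/* Non-zero if the port power bit is set in a PORTSC value. */
static int port_powered(uint32_t portsc)
{
	return (portsc & PORTSC_PP) != 0;
}

/* Extract the 4-bit port link state field. */
static unsigned port_link_state(uint32_t portsc)
{
	return (unsigned)((portsc & PORTSC_PLS_MASK) >> PORTSC_PLS_SHIFT);
}

/* Under the assumed layout, 0x0100 lands in the link state field. */
static void check_port_power_bit(void)
{
	assert(PORTSC_PP == UINT32_C(0x0200));
	assert((UINT32_C(0x0100) & PORTSC_PLS_MASK) != 0);
}
```

If that layout is right, a driver using 0x0100 as the power bit would be reading and writing the top bit of the link state field instead, which is consistent with the proposed change to 0x0200.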
Re: CVS removal from the base
Hi,
Reference:
> From: Doug Barton do...@freebsd.org
> Date: Tue, 13 Dec 2011 13:29:02 -0800
> Message-id: 4ee7c39e.6040...@freebsd.org

Doug Barton wrote:
> On 12/11/2011 06:14, Julian H. Stacey wrote:
>> Doug Barton wrote:
>>> On 12/02/2011 04:35, Adrian Chadd wrote:
>>>> I think you're missing the point a little. The point is, you have to keep in mind how comfortable people feel about things, and progress sometimes makes people uncomfortable. I think you should leave these changes bake for a while and let people get comfortable with the changing status quo.
>>>
>>> The fact that we have so many people who are radically change-averse, no matter how rational the change; is a bug, not a feature. This particular bug is complicated dramatically by the fact that the majority view seems to lean heavily towards "If I use it, it must be the default and/or in the base" rather than seeing ports as part of the overall operating SYSTEM.
>>
>> BSD is more conservative. More value given to stability of availability of interfaces tools etc,
>
> Having things in ports doesn't make them less available. :)

It didn't used to. It risks it now, since in recent months some ports have been targeted for purging by a few rogue committers, who want to toss ports out from one release to another without the warning of a DEPRECATED= in the previous release's Makefiles.

>> More Long term professionals.
>
> I don't know what this means.

Older folk with more decades of Unix are likely to have had BSD experience way back, and jumped at BSD when e.g. BSD Lite & 386BSD came out. Younger folk may have a higher chance that their first Unix exposure was Linux on a CD from a computer mag. Some of each will have stayed with the BSD or Linux they started with. Hence BSD people tend to have been working a bit longer, I think.

>> Doug's attempting to force working FreeBSD ports such as procmail to be discarded is deplorable.
>
> Um, I had nothing to say about procmail. In fact, I use procmail, and would not want to see it removed.
>
>> Doug should stop coercing FreeBSD toward a Linux model, move himself to Linux.

Whoops! _Apologies_ Doug! I was mixing people up. Apologies!

> I actually do use Linux sometimes. In many ways it is a far superior desktop. That said, I am certainly *not* trying to turn FreeBSD into another Linux distro. What I am trying to do is to see what we can learn from how Linux does things, and apply those ideas here when they are useful. Just because Linux does it, doesn't mean it's wrong. :)

Yup, each distro can have some good & bad.

> I've said this before, but it's worth repeating. Decisions that were made 20 years ago about what should and should not be included in the Berkeley Software Distribution, while valid at the time, may not be valid any longer because things have changed since then. Just to take one obvious example, when these decisions were being made it was necessary to distribute a full system, including the 3rd party stuff, all in one go because the software was being distributed on magnetic tape.

Good point.

> Doug

Apologies again for confusing your name with others.

FYI, URLs to the end of the 1st procmail thread & the beginning of the 2nd:
http://docs.freebsd.org/cgi/getmsg.cgi?fetch=948124+0+archive/2011/freebsd-ports/20110904.freebsd-ports
http://docs.freebsd.org/cgi/getmsg.cgi?fetch=85459+0+/usr/local/www/db/text/2011/freebsd-ports/20111002.freebsd-ports

Cheers,
Julian
-- 
Julian Stacey, BSD Unix Linux C Sys Eng Consultants Munich http://berklix.com
 Reply below not above, cumulative like a play script, & indent with "> ".
 Format: Plain text. Not HTML, multipart/alternative, base64, quoted-printable.
 EU tax to kill London Vetoed http://berklix.com/~jhs/blog/2011_12_11
Re: NFS + SVN problem?
On Mon, Dec 12, 2011 at 11:56 PM, Dimitry Andric d...@freebsd.org wrote:
> A very late addition to this: I got Subversion 1.7 to work properly over
> NFSv3, by making sure rpc.lockd runs on both server and client. E.g., set
> rpc_lockd_enable to YES in rc.conf; this is off by default, even if you
> have nfs_client_enable/nfs_server_enable set to YES.

If nfs_client_enable or nfs_server_enable is set to YES, is it reasonable to have rpc_lockd_enable default to YES, unless the user explicitly sets it to NO in their /etc/rc.conf?

--
Craig Rodrigues
rodr...@crodrigues.org
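For reference, the rc.conf settings discussed in this message might look roughly like the fragment below. This is only a minimal sketch, assuming a FreeBSD NFS client; a server would set nfs_server_enable instead. The rpc_statd_enable line is not mentioned in the thread and is an assumption here, added because rpc.lockd is normally run together with rpc.statd.

```shell
# /etc/rc.conf fragment (sketch, NFS client side)
nfs_client_enable="YES"
# Off by default even when NFS is enabled, per the message above.
rpc_lockd_enable="YES"
# Assumption, not from the thread: rpc.statd usually accompanies rpc.lockd.
rpc_statd_enable="YES"
```

The same two rpc_* settings would be needed on the server as well, since the report above says rpc.lockd must run on both ends.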
Re: CVS removal from the base
On 12/13/11 7:49 PM, Julian H. Stacey wrote:
> Hi,
> Reference:
> > From: Doug Barton do...@freebsd.org
> > Date: Tue, 13 Dec 2011 13:29:02 -0800
> > Message-id: 4ee7c39e.6040...@freebsd.org
>
> Doug Barton wrote:
> > On 12/11/2011 06:14, Julian H. Stacey wrote:
> > > Doug Barton wrote:
> > > > On 12/02/2011 04:35, Adrian Chadd wrote:
> > > > > I think you're missing the point a little. The point is, you
> > > > > have to keep in mind how comfortable people feel about things,
> > > > > and progress sometimes makes people uncomfortable. I think you
> > > > > should leave these changes bake for a while and let people get
> > > > > comfortable with the changing status quo.
> > > >
> > > > The fact that we have so many people who are radically
> > > > change-averse, no matter how rational the change, is a bug, not a
> > > > feature. This particular bug is complicated dramatically by the
> > > > fact that the majority view seems to lean heavily towards "If I use
> > > > it, it must be the default and/or in the base" rather than seeing
> > > > ports as part of the overall operating SYSTEM.
> > >
> > > BSD is more conservative. More value given to stability of
> > > availability of interfaces, tools etc.
> >
> > Having things in ports doesn't make them less available. :)
>
> It didn't used to. It risks it now, since in the last months some ports/
> have been targeted for purging by a few rogue committers, who want to
> toss ports out from one release to another without the warning of a
> DEPRECATED= in the previous release's Makefiles.

which brings up the possibility of 1st class ports.. which are kept more as part of the system..

(sorry for sounding like a broken record..)
Re: CVS removal from the base
On Tue, Dec 13, 2011 at 9:29 PM, Julian Elischer jul...@freebsd.org wrote:
> On 12/13/11 7:49 PM, Julian H. Stacey wrote:
>
> which brings up the possibility of 1st class ports.. which are kept more
> as part of the system..
>
> (sorry for sounding like a broken record..)

*jumps back into the fray*

If it's something that isn't maintainable, because the upstream package is too hard to follow across a major version release cycle, it should be pulled from base. Otherwise, I'd say carry on as usual.

Otherwise, there really isn't any difference in package organization from Linux; granted, I would still like to see granular definitions in packaging metadata so one could pick and choose between base and ports openssh for instance, but that's still a nicety that hasn't come true.

Thanks,
-Garrett