Re: OOM killer and kernel cache reclamation rate limit in vm_pageout_scan()
On 16/10/2014 08:56, Justin T. Gibbs wrote:
> avg pointed out the rate limiting code in vm_pageout_scan() during discussion about PR 187594. While it certainly can contribute to the problems discussed in that PR, a bigger problem is that it can allow the OOM killer to be triggered even though there is plenty of reclaimable memory available in the system. Any load that can consume enough pages within the polling interval to hit the v_free_min threshold (e.g. multiple 'dd if=/dev/zero of=/file/on/zfs') can make this happen. The product I’m working on does not have swap configured and treats any OOM trigger as fatal, so it is very obvious when this happens. :-)
>
> I’ve tried several things to mitigate the problem. The first was to ignore rate limiting for pass 2. However, even though ZFS is guaranteed to receive some feedback prior to OOM being declared, my testing showed that a trivial load (a couple of dd operations) could still consume enough of the reclaimed space to leave the system below its target at the end of pass 2. After removing the rate limiting entirely, I’ve so far been unable to kill the system via a ZFS-induced load.
>
> I understand the motivation behind the rate limiting, but the current implementation seems too simplistic to be safe. The documentation for the Solaris slab allocator provides good motivation for their approach of using a “sliding average” to rein in temporary bursts of usage without unduly harming efficient service for the recorded steady-state memory demand.
>
> Regardless of the approach taken, I believe that the OOM killer must be a last resort and shouldn’t be called when there are caches that can be culled.

FWIW, I have this toy branch:
https://github.com/avg-I/freebsd/compare/experiment/uma-cache-trimming
Not all commits are relevant to the problem and some things are unfinished. Not sure if the changes would help your case either...
> One other thing I’ve noticed in my testing with ZFS is that it needs feedback and a little time to react to memory pressure. Calling its lowmem handler just once isn’t enough for it to limit in-flight writes so it can avoid reuse of pages that it just freed up. But, it doesn’t take too long to react (< 1sec in the profiling I’ve done).

I've been thinking about this and maybe we need to make arc_memory_throttle() more aggressive on FreeBSD. I can't say that I really follow the logic of that code, though.

> Is there a way in vm_pageout_scan() that we can better record that progress is being made (pages were freed in the pass, even if some/all of them were consumed again) and allow more passes before the OOM killer is invoked in this case?

-- 
Andriy Gapon
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
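The policy Justin asks about above — count a pass as progress when pages were freed, even if they were immediately consumed again, and only escalate to OOM after a run of passes that reclaimed nothing — can be modeled in a few lines of shell. This is only an illustration of the proposed policy; the names (max_unproductive, pageout_pass) are made up and this is not vm_pageout_scan() code:

```shell
#!/bin/sh
# Toy model of the proposed OOM escalation policy: the killer fires only
# after several consecutive scan passes in which *nothing* was reclaimed.
# A pass that freed pages -- even pages consumed again right away --
# resets the counter.  All names here are illustrative.
max_unproductive=3
unproductive=0

pageout_pass() {	# $1 = pages freed during this pass; sets $invoke_oom
	if [ "$1" -gt 0 ]; then
		unproductive=0		# progress was made, start over
	else
		unproductive=$((unproductive + 1))
	fi
	if [ "$unproductive" -ge "$max_unproductive" ]; then
		invoke_oom=yes
	else
		invoke_oom=no
	fi
}

# A pass that freed pages keeps OOM away, however many passes have run.
pageout_pass 1000
echo "$invoke_oom"	# no
```

Under this scheme a sustained dd load that keeps eating freed pages never triggers OOM as long as each pass reclaims something; only genuine exhaustion (zero reclamation, several passes in a row) escalates.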
Re: Resizing a zpool as a VMware ESXi guest ...
On 1010T1529, Matthew Grooms wrote:
> All, I am a long time user and advocate of FreeBSD and manage several deployments of FreeBSD in a few data centers. Now that these environments are almost always virtual, it makes sense for FreeBSD to support basic features such as dynamic disk resizing. It looks like most of the parts are intended to work. Kudos to the FreeBSD Foundation for seeing the need and sponsoring dynamic increase of online UFS filesystems via growfs. Unfortunately, it would appear that there are still problems in this area, such as ...
>
> a) cam/geom recognizing when a drive's size has increased
> b) zpool recognizing when a gpt partition size has increased
>
> For example, if I do an install of FreeBSD 10 on VMware using ZFS, I see the following ...
>
> root@zpool-test:~ # gpart show
> =>       34  16777149  da0  GPT  (8.0G)
>          34      1024    1  freebsd-boot  (512K)
>        1058   4194304    2  freebsd-swap  (2.0G)
>     4195362  12581821    3  freebsd-zfs   (6.0G)
>
> If I increase the VM disk size using VMware to 16G and rescan using camcontrol, this is what I see ...

camcontrol rescan does not force fetching the updated disk size. AFAIK there is no way to do that. However, this should happen automatically, if the other side sends a proper Unit Attention after resizing. No idea why this doesn't happen with VMWare. Reboot obviously clears things up.

[..]

> Now I want to claim the additional 14 gigs of space for my zpool ...
>
> root@zpool-test:~ # zpool status
>   pool: zroot
>  state: ONLINE
>   scan: none requested
> config:
>
>         NAME                                          STATE   READ WRITE CKSUM
>         zroot                                         ONLINE     0     0     0
>           gptid/352086bd-50b5-11e4-95b8-0050569b2a04  ONLINE     0     0     0
>
> root@zpool-test:~ # zpool set autoexpand=on zroot
> root@zpool-test:~ # zpool online -e zroot gptid/352086bd-50b5-11e4-95b8-0050569b2a04
> root@zpool-test:~ # zpool list
> NAME    SIZE  ALLOC   FREE  CAP  DEDUP  HEALTH  ALTROOT
> zroot  5.97G   876M  5.11G  14%  1.00x  ONLINE  -
>
> The zpool appears to still only have 5.11G free. Let's reboot and try again ...

Interesting.
This used to work; actually either of those (autoexpand or online -e) should do the trick.

> root@zpool-test:~ # zpool set autoexpand=on zroot
> root@zpool-test:~ # zpool online -e zroot gptid/352086bd-50b5-11e4-95b8-0050569b2a04
> root@zpool-test:~ # zpool list
> NAME    SIZE  ALLOC   FREE  CAP  DEDUP  HEALTH  ALTROOT
> zroot  14.0G   876M  13.1G   6%  1.00x  ONLINE  -
>
> Now I have 13.1G free. I can add this space to any of my zfs volumes and it picks the change up immediately. So the question remains, why do I need to reboot the OS twice to allocate new disk space to a volume? FreeBSD is first and foremost a server operating system. Servers are commonly deployed in data centers. Virtual environments are now commonplace in data centers, not the exception to the rule. VMware still has the vast majority of the private virtual environment market. I assume that most would expect things like this to work out of the box. Did I miss a required step or is this fixed in CURRENT?

Looks like genuine bugs (or rather, one missing feature and one bug). Filing PRs for those might be a good idea.
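For readers hitting the same wall: since the pool sits on a GPT partition rather than the whole disk, the partition normally has to be grown before `zpool online -e` has anything to expand. A sketch of the usual manual sequence, assuming da0 and partition index 3 as in the gpart output above (device names and the index are taken from that example, not universal):

```shell
# After the kernel has noticed the new disk size (post-reboot in this case):
gpart recover da0       # move the backup GPT to the new end of the disk
gpart resize -i 3 da0   # grow the freebsd-zfs partition into the new space
zpool online -e zroot gptid/352086bd-50b5-11e4-95b8-0050569b2a04
zpool list              # SIZE should now reflect the larger disk
```

These are privileged, device-specific commands, so this is a sketch of the procedure rather than something to paste blindly; double-check the partition index with `gpart show` first.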
Re: Resizing a zpool as a VMware ESXi guest ...
On Oct 16, 2014, at 1:10, Edward Tomasz Napierała tr...@freebsd.org wrote:
> camcontrol rescan does not force fetching the updated disk size. AFAIK there is no way to do that. However, this should happen automatically, if the other side sends a proper Unit Attention after resizing. No idea why this doesn't happen with VMWare. Reboot obviously clears things up.
> [..]

Is open-vm-tools installed? I ask because if I don't have it installed and the kernel modules loaded, VMware doesn't notify the guest OS of disks being added/removed. Also, what disk controller are you using?

Cheers.
Re: OOM killer and kernel cache reclamation rate limit in vm_pageout_scan()
Unfortunately ZFS doesn't prevent new in-flight writes until it hits zfs_dirty_data_max, so while what you're suggesting will help, if the writes come in quickly enough I would expect it to still be able to outrun the pageout.

----- Original Message -----
From: Justin T. Gibbs gi...@freebsd.org
To: freebsd-current@freebsd.org
Cc: a...@freebsd.org; Andriy Gapon a...@freebsd.org
Sent: Thursday, October 16, 2014 6:56 AM
Subject: OOM killer and kernel cache reclamation rate limit in vm_pageout_scan()

> avg pointed out the rate limiting code in vm_pageout_scan() during discussion about PR 187594. While it certainly can contribute to the problems discussed in that PR, a bigger problem is that it can allow the OOM killer to be triggered even though there is plenty of reclaimable memory available in the system. Any load that can consume enough pages within the polling interval to hit the v_free_min threshold (e.g. multiple 'dd if=/dev/zero of=/file/on/zfs') can make this happen. The product I’m working on does not have swap configured and treats any OOM trigger as fatal, so it is very obvious when this happens. :-)
>
> I’ve tried several things to mitigate the problem. The first was to ignore rate limiting for pass 2. However, even though ZFS is guaranteed to receive some feedback prior to OOM being declared, my testing showed that a trivial load (a couple of dd operations) could still consume enough of the reclaimed space to leave the system below its target at the end of pass 2. After removing the rate limiting entirely, I’ve so far been unable to kill the system via a ZFS-induced load.
>
> I understand the motivation behind the rate limiting, but the current implementation seems too simplistic to be safe. The documentation for the Solaris slab allocator provides good motivation for their approach of using a “sliding average” to rein in temporary bursts of usage without unduly harming efficient service for the recorded steady-state memory demand.
>
> Regardless of the approach taken, I believe that the OOM killer must be a last resort and shouldn’t be called when there are caches that can be culled.
>
> One other thing I’ve noticed in my testing with ZFS is that it needs feedback and a little time to react to memory pressure. Calling its lowmem handler just once isn’t enough for it to limit in-flight writes so it can avoid reuse of pages that it just freed up. But, it doesn’t take too long to react (< 1sec in the profiling I’ve done). Is there a way in vm_pageout_scan() that we can better record that progress is being made (pages were freed in the pass, even if some/all of them were consumed again) and allow more passes before the OOM killer is invoked in this case?
>
> —
> Justin
Re: installincludes, bsd.incs.mk and param.h
Regarding Ian Lepore's message from 14.10.2014 19:00 (localtime):
…
> The old code that used to work for you got the version via sysctl, so I was recommending that you keep doing that yourself, since it's no longer built in to bsd.ports.mk. So just add
>
>     export OSVERSION=`sysctl kern.osreldate`
>
> to your script that kicks off this update process, something like that.

Thank you for your support!

Like for many others, the former OSVERSION detection wasn't working very well for me with jail environments (python broke, e.g.). Therefore I had worked around it differently; nonetheless I'm not happy with reverting to the old behaviour.

Since /usr/include gets populated regardless of whether WITHOUT_TOOLCHAIN=true was set in src.conf, I think it's a good idea to have the one param.h also installed, regardless of the option.

Please see https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=194401

Thanks,
-Harry
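One nit on the suggested workaround: without -n, sysctl prints the variable name as well, so OSVERSION would end up as "kern.osreldate: NNNNNN" rather than the bare number that the ports framework expects. A FreeBSD-only sketch of the quieter form:

```shell
# sysctl -n prints only the value, so OSVERSION becomes a bare number
# such as 1100000 instead of "kern.osreldate: 1100000".
export OSVERSION="$(sysctl -n kern.osreldate)"
```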
Re: Resizing a zpool as a VMware ESXi guest ...
On 2014-10-16 04:17, Garrett Cooper wrote:
> On Oct 16, 2014, at 1:10, Edward Tomasz Napierała tr...@freebsd.org wrote:
>> camcontrol rescan does not force fetching the updated disk size. AFAIK there is no way to do that. However, this should happen automatically, if the other side sends a proper Unit Attention after resizing. No idea why this doesn't happen with VMWare. Reboot obviously clears things up.
>> [..]
> Is open-vm-tools installed? I ask because if I don't have it installed and the kernel modules loaded, VMware doesn't notify the guest OS of disks being added/removed. Also, what disk controller are you using?
> Cheers.

I duplicated this behavior. According to gpart, the virtual disk does not grow until the FreeBSD guest is rebooted.

FreeBSD freebsd10 10.0-RELEASE-p6 FreeBSD 10.0-RELEASE-p6 #0: Tue Jun 24 07:47:37 UTC 2014 r...@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64

pkg info:
open-vm-tools-nox11-1280544_8,1  Open VMware tools for FreeBSD VMware guests

ESXi reported -- Running, version:2147483647 (3rd-party/Independent)
ESXi-5.5-1331820 (A00)
Guest Hardware version 10

 789  -  S  0:00.54  /usr/local/bin/vmtoolsd -c /usr/local/share/vmware-tools/

Id Refs Address     Size    Name
 1   12 0x8020      15f03b0 kernel
 2    1 0x81a12000  5209    fdescfs.ko
 3    1 0x81a18000  2198    vmmemctl.ko
 4    1 0x81a1b000  23d8    vmxnet.ko
 5    1 0x81a1e000  2bf0    vmblock.ko
 6    1 0x81a21000  81b4    vmhgfs.ko

--mikej
Re: zfs recv hangs in kmem arena
The zfs recv / kmem arena hang happens with -CURRENT as well as 10-STABLE, on two different systems, with 16GB or 32GB of RAM, from memstick or normal multi-user environments. Hangs usually seem to happen 1TB to 3TB in, but last night one run hung after only 4.35MB.

On 9/26/2014 1:42 AM, James R. Van Artsdalen wrote:
> FreeBSD BLACKIE.housenet.jrv 10.1-BETA2 FreeBSD 10.1-BETA2 #2 r272070M: Wed Sep 24 17:36:56 CDT 2014 ja...@blackie.housenet.jrv:/usr/obj/usr/src/sys/GENERIC amd64
>
> With current STABLE10 I am unable to replicate a ZFS pool using zfs send/recv without zfs hanging in state "kmem arena", within the first 4TB or so (of a 23TB pool). The most recent attempt used this command line:
>
> SUPERTEX:/root# zfs send -R BIGTEX/UNIX@syssnap | ssh BLACKIE zfs recv -duvF BIGTOX
>
> though local replications fail in "kmem arena" too. The two machines I've been attempting this on have 16GB and 32GB of RAM each and are otherwise idle. Any suggestions on how to get around, or investigate, "kmem arena"?
>
> # top
> last pid:  3272;  load averages:  0.22,  0.22,  0.23    up 0+08:25:02  01:32:07
> 34 processes:  1 running, 33 sleeping
> CPU:  0.0% user,  0.0% nice,  0.1% system,  0.0% interrupt, 99.9% idle
> Mem: 21M Active, 82M Inact, 15G Wired, 28M Cache, 450M Free
> ARC: 12G Total, 24M MFU, 12G MRU, 23M Anon, 216M Header, 47M Other
> Swap: 16G Total, 16G Free
>
>   PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
>  1173 root        1  52    0 86476K  7780K select  0 124:33  0.00% sshd
>  1176 root        1  46    0 87276K 47732K kmem a  3  48:36  0.00% zfs
>   968 root       32  20    0 12344K  1888K rpcsvc  0   0:13  0.00% nfsd
>  1009 root        1  20    0 25452K  2864K select  3   0:01  0.00% ntpd
> ...
Re: CURRENT: EFI boot failure
Regarding O. Hartmann's message from 04.10.2014 08:47 (localtime):
…
>> Sorry, forget the suggestion, it doesn't work since it leads to CFLAG -march= and the same problem occurs. For my case this works:
>>
>> --- sys/boot/efi/Makefile.inc.orig	2014-09-23 16:22:46.0 +0200
>> +++ sys/boot/efi/Makefile.inc	2014-09-23 16:46:30.0 +0200
>> @@ -2,6 +2,10 @@
>>  BINDIR?= /boot
>>
>> +.if ${CPUTYPE} == core-avx2
>> +CPUTYPE= core-avx-i
>> +.endif
>> +
>>  .if ${MACHINE_CPUARCH} == i386
>>  CFLAGS+=-march=i386
>>  .endif
>>
>> JFI
>> -Harry
>
> Has this problem been seriously addressed in any way? I run into this very often on several platforms with Haswell-based CPUs (other systems with IvyBridge or SandyBridge are still to be migrated to UEFI boot, so I do not have any older architectures at hand to prove whether this issue is still present or not on non-AVX2 systems). If there is no progress so far, would it be well-advised to open a PR? Unfortunately I don't really have qualified knowledge about compiler optimizations nor any EFI binary knowledge.

Opening a PR is really needed; this issue shouldn't be left unchecked. But I'd prefer it to be done by someone who understands what Matt Fleming answered in http://lists.freebsd.org/pipermail/freebsd-current/2014-September/052354.html

Anyone?

Thanks,
-Harry
Re: OOM killer and kernel cache reclamation rate limit in vm_pageout_scan()
On 16/10/2014 12:08, Steven Hartland wrote:
> Unfortunately ZFS doesn't prevent new in-flight writes until it hits zfs_dirty_data_max, so while what you're suggesting will help, if the writes come in quickly enough I would expect it to still be able to outrun the pageout.

As I've mentioned, arc_memory_throttle() also plays a role in limiting the dirty data.

-- 
Andriy Gapon
Re: zfs recv hangs in kmem arena
On 10/16/14 4:25 AM, James R. Van Artsdalen wrote:
> The zfs recv / kmem arena hang happens with -CURRENT as well as 10-STABLE, on two different systems, with 16GB or 32GB of RAM, from memstick or normal multi-user environments. Hangs usually seem to happen 1TB to 3TB in, but last night one run hung after only 4.35MB.
>
> On 9/26/2014 1:42 AM, James R. Van Artsdalen wrote:
>> FreeBSD BLACKIE.housenet.jrv 10.1-BETA2 FreeBSD 10.1-BETA2 #2 r272070M: Wed Sep 24 17:36:56 CDT 2014 ja...@blackie.housenet.jrv:/usr/obj/usr/src/sys/GENERIC amd64
>>
>> With current STABLE10 I am unable to replicate a ZFS pool using zfs send/recv without zfs hanging in state "kmem arena", within the first 4TB or so (of a 23TB pool). The most recent attempt used this command line:
>>
>> SUPERTEX:/root# zfs send -R BIGTEX/UNIX@syssnap | ssh BLACKIE zfs recv -duvF BIGTOX
>>
>> though local replications fail in "kmem arena" too. The two machines I've been attempting this on have 16GB and 32GB of RAM each and are otherwise idle. Any suggestions on how to get around, or investigate, "kmem arena"?
>>
>> # top
>> last pid:  3272;  load averages:  0.22,  0.22,  0.23    up 0+08:25:02  01:32:07
>> 34 processes:  1 running, 33 sleeping
>> CPU:  0.0% user,  0.0% nice,  0.1% system,  0.0% interrupt, 99.9% idle
>> Mem: 21M Active, 82M Inact, 15G Wired, 28M Cache, 450M Free
>> ARC: 12G Total, 24M MFU, 12G MRU, 23M Anon, 216M Header, 47M Other
>> Swap: 16G Total, 16G Free
>>
>>   PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
>>  1173 root        1  52    0 86476K  7780K select  0 124:33  0.00% sshd
>>  1176 root        1  46    0 87276K 47732K kmem a  3  48:36  0.00% zfs
>>   968 root       32  20    0 12344K  1888K rpcsvc  0   0:13  0.00% nfsd
>>  1009 root        1  20    0 25452K  2864K select  3   0:01  0.00% ntpd
>> ...

What does "procstat -kk 1176" (or the PID of your 'zfs' process that is stuck in that state) say?

Cheers,
[CFT] multiple instance support in rc.d script
[Please reply to freebsd-rc@]

Hi,

I would like your feedback on, and testers of, the attached patch. This implements multiple instance support in rc.d scripts. You can try it by replacing /etc/rc.subr with the attached one. More details are as follows.

Typically, an rc.d/foo script has the following structure and rc.conf variables:

/etc/rc.d/foo:
	name="foo"
	rcvar="foo_enable"
	...
	load_rc_config $name
	run_rc_command $*

/etc/rc.conf:
	foo_enable="YES"
	foo_flags="-f -l -a -g"

The above supports one instance per script. After replacing rc.subr, you can specify additional instances in rc.conf:

/etc/rc.conf:
	foo_instances="one two"
	foo_one_enable="YES"
	foo_one_flags="-f -l -a -g"
	foo_two_enable="YES"
	foo_two_flags="-F -L -A -G"

$foo_instances defines instances as a space-separated list of instance names, and the rc.conf variables for them look like ${name}_${instname}_enable. The following command

	# service foo start

starts foo_one and foo_two with the specified flags. Instances can be specified in the following form:

	# service foo start:one

or multiple instances in a particular order:

	# service foo start:two,one

Basically, no change is required in the rc.d/foo script itself. However, there is a problem: default values of the instantiated variables are not defined. For example, if an rc.d script uses $foo_mode, you need to define $foo_one_mode. The default value of $foo_mode is usually defined in etc/defaults/rc.conf for rc.d scripts in the base system, and with the ": ${foo_mode:=value}" idiom in scripts from the Ports Collection. So all of the variables should be defined for each instance, too. As you noticed, this is not easy without editing the script itself. To alleviate this, set_rcvar() can be used:

/etc/rc.d/foo:
	name="foo"
	rcvar="foo_enable"
	set_rcvar foo_enable YES "Enable $name"
	set_rcvar foo_program /tmp/test "Command for $name"
	...
	load_rc_config $name
	run_rc_command $*

The three arguments are variable name, default value, and description. If a variable is defined by set_rcvar(), default values of the instantiated variables will be set automatically---foo_one_program is set from foo_program if it is not defined.

This approach still has another problem: set_rcvar() is not supported in all branches, so a script using it does not work on old supported branches. One solution, which can be used for scripts in the Ports Collection, is adding both definitions, before and after load_rc_config(), until EoL of the old branches, like this:

/etc/rc.d/foo:
	name="foo"
	rcvar="foo_enable"
	if type set_rcvar >/dev/null 2>&1; then
		set_rcvar foo_enable YES "Enable $name"
		set_rcvar foo_program /tmp/test "Command for $name"
	fi
	...
	load_rc_config $name
	# will be removed after all supported branches have set_rcvar().
	if ! type set_rcvar >/dev/null 2>&1; then
		: ${foo_enable:=YES}
		: ${foo_program:=/tmp/test}
		for _i in $foo_instances; do
			for _j in enable program; do
				eval : \${foo_${_i}_${_j}:=\$foo_${_j}}
			done
		done
	fi
	run_rc_command $*

This is a bit ugly but should work fine. I am using this patch to invoke multiple named (caching server/contents server) and syslogd (local only/listens on an INET/INET6 socket only) daemons. While $foo_instances is designed as a user-defined knob, this can be applied to software which needs to invoke multiple/different daemons which depend on each other in a script, too.

I am feeling this patch still needs more careful review from others. Any comments are welcome. Thank you.

-- Hiroki

Index: etc/rc.subr
===================================================================
--- etc/rc.subr	(revision 272976)
+++ etc/rc.subr	(working copy)
@@ -698,7 +698,10 @@
 #	start stop restart rcvar status poll ${extra_commands}
 #	If there's a match, run ${argument}_cmd or the default method
 #	(see below).
+#	_run_rc_command0() is the main routine and run_rc_command() is
+#	a wrapper to handle multiple instances.
 #
+#
 #	If argument has a given prefix, then change the operation as follows:
 #	Prefix	Operation
 #	------	---------
@@ -755,6 +758,9 @@
 #
 #	${name}_nice	n	Nice level to run ${command} at.
 #
+#	${name}_pidfile	n	This to be used in /etc/rc.conf to override
+#				${pidfile}.
+#
 #	${name}_user	n	User to run ${command} as, using su(1) if not
 #				using ${name}_chroot.
 #				Requires /usr to be mounted.
@@ -863,6 +869,57 @@
 #
 run_rc_command()
 {
+	local _act _instances _name _desc _rcvar
+
+	_act=$1
+	shift
+	eval _instances=\$${name}_instances
+
+	# Check if an instance is specified, e.g. start:instance.
+	case ${_act%:*} in
+	$_act) ;;	# no instance specified
+	*)
+		_instances=$(echo ${_act#*:} | tr "," " ")
+		_act=${_act%:*}
+		;;
+	esac
+
+	# Use
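The fallback rule described above — foo_one_program defaulting to foo_program when the per-instance variable is unset — boils down to one eval of a ${var:-default} expansion. A standalone sketch for anyone reviewing the idea; the variable names mimic the example in the mail and this is not the actual rc.subr code:

```shell
#!/bin/sh
# Standalone illustration of per-instance rc.conf variable fallback.
# Names mimic the foo example above; this is not real rc.subr code.
name="foo"
foo_instances="one two"
foo_program="/tmp/test"				# global default
foo_two_program="/usr/local/bin/other"		# per-instance override

get_instance_var() {	# $1 = instance name, $2 = suffix; sets $_val
	# Expands to: _val=${foo_<instance>_<suffix>:-$foo_<suffix>}
	eval _val=\${${name}_$1_$2:-\$${name}_$2}
}

get_instance_var one program
echo "$_val"	# /tmp/test (falls back to $foo_program)
```

Instance "one" has no override, so it inherits the global default; instance "two" keeps its own value.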
Re: [CFT] multiple instance support in rc.d script
On 2014-10-16 21:22, Hiroki Sato wrote:
> [Please reply to freebsd-rc@]
>
> Hi,
>
> I would like your feedback on, and testers of, the attached patch. This implements multiple instance support in rc.d scripts. You can try it by replacing /etc/rc.subr with the attached one. More details are as follows.
>
> Typically, an rc.d/foo script has the following structure and rc.conf variables:
>
> /etc/rc.d/foo:
> 	name="foo"
> 	rcvar="foo_enable"
> 	...
> 	load_rc_config $name
> 	run_rc_command $*
>
> /etc/rc.conf:
> 	foo_enable="YES"
> 	foo_flags="-f -l -a -g"
>
> The above supports one instance per script. After replacing rc.subr, you can specify additional instances in rc.conf:
>
> /etc/rc.conf:
> 	foo_instances="one two"
> 	foo_one_enable="YES"
> 	foo_one_flags="-f -l -a -g"
> 	foo_two_enable="YES"
> 	foo_two_flags="-F -L -A -G"
>
> $foo_instances defines instances as a space-separated list of instance names, and the rc.conf variables for them look like ${name}_${instname}_enable. The following command
>
> 	# service foo start
>
> starts foo_one and foo_two with the specified flags. Instances can be specified in the following form:
>
> 	# service foo start:one
>
> or multiple instances in a particular order:
>
> 	# service foo start:two,one
>
> Basically, no change is required in the rc.d/foo script itself. However, there is a problem: default values of the instantiated variables are not defined. For example, if an rc.d script uses $foo_mode, you need to define $foo_one_mode. The default value of $foo_mode is usually defined in etc/defaults/rc.conf for rc.d scripts in the base system, and with the ": ${foo_mode:=value}" idiom in scripts from the Ports Collection. So all of the variables should be defined for each instance, too. As you noticed, this is not easy without editing the script itself. To alleviate this, set_rcvar() can be used:
>
> /etc/rc.d/foo:
> 	name="foo"
> 	rcvar="foo_enable"
> 	set_rcvar foo_enable YES "Enable $name"
> 	set_rcvar foo_program /tmp/test "Command for $name"
> 	...
> 	load_rc_config $name
> 	run_rc_command $*
>
> The three arguments are variable name, default value, and description. If a variable is defined by set_rcvar(), default values of the instantiated variables will be set automatically---foo_one_program is set from foo_program if it is not defined.
>
> This approach still has another problem: set_rcvar() is not supported in all branches, so a script using it does not work on old supported branches. One solution, which can be used for scripts in the Ports Collection, is adding both definitions, before and after load_rc_config(), until EoL of the old branches, like this:
>
> /etc/rc.d/foo:
> 	name="foo"
> 	rcvar="foo_enable"
> 	if type set_rcvar >/dev/null 2>&1; then
> 		set_rcvar foo_enable YES "Enable $name"
> 		set_rcvar foo_program /tmp/test "Command for $name"
> 	fi
> 	...
> 	load_rc_config $name
> 	# will be removed after all supported branches have set_rcvar().
> 	if ! type set_rcvar >/dev/null 2>&1; then
> 		: ${foo_enable:=YES}
> 		: ${foo_program:=/tmp/test}
> 		for _i in $foo_instances; do
> 			for _j in enable program; do
> 				eval : \${foo_${_i}_${_j}:=\$foo_${_j}}
> 			done
> 		done
> 	fi
> 	run_rc_command $*
>
> This is a bit ugly but should work fine. I am using this patch to invoke multiple named (caching server/contents server) and syslogd (local only/listens on an INET/INET6 socket only) daemons. While $foo_instances is designed as a user-defined knob, this can be applied to software which needs to invoke multiple/different daemons which depend on each other in a script, too.
>
> I am feeling this patch still needs more careful review from others. Any comments are welcome. Thank you.
>
> -- Hiroki

This feature is quite useful. I've used the built-in version of this that the rc.d script for memcached has, and it is very helpful to be able to run multiple named instances.

I wonder if sysrc could be improved to support an 'append', so you can have:

	foo_instances="one two"

and do:

	# sysrc --append foo_instances=three

to get:

	foo_instances="one two three"

instead of having to do:

	# sysrc foo_instances="`sysrc -n foo_instances` three"

or something more convoluted.

-- 
Allan Jude
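Until sysrc grows such a flag, the append-without-duplicates behaviour Allan describes is easy to wrap in a helper. A hypothetical sketch — list_append is not a real sysrc or base-system function, and the --append flag above does not exist yet:

```shell
#!/bin/sh
# Hypothetical helper: append a word to a space-separated list unless it
# is already present.  The result could be fed back into sysrc, e.g.
#   sysrc foo_instances="$(list_append "$(sysrc -n foo_instances)" three)"
list_append() {	# $1 = current list, $2 = word to append; prints result
	for _w in $1; do
		if [ "$_w" = "$2" ]; then
			echo "$1"	# already present, list unchanged
			return 0
		fi
	done
	if [ -n "$1" ]; then
		echo "$1 $2"
	else
		echo "$2"
	fi
}

list_append "one two" three	# prints: one two three
```

Keeping the duplicate check in the helper means a hypothetical `sysrc --append` could be idempotent, which matters when configuration scripts are re-run.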
Re: zfs recv hangs in kmem arena
On 10/16/14 8:43 PM, James R. Van Artsdalen wrote:
> On 10/16/2014 11:12 AM, Xin Li wrote:
>> On 9/26/2014 1:42 AM, James R. Van Artsdalen wrote:
>>> FreeBSD BLACKIE.housenet.jrv 10.1-BETA2 FreeBSD 10.1-BETA2 #2 r272070M: Wed Sep 24 17:36:56 CDT 2014 ja...@blackie.housenet.jrv:/usr/obj/usr/src/sys/GENERIC amd64
>>>
>>> With current STABLE10 I am unable to replicate a ZFS pool using zfs send/recv without zfs hanging in state "kmem arena", within the first 4TB or so (of a 23TB pool).
>>
>> What does "procstat -kk 1176" (or the PID of your 'zfs' process that is stuck in that state) say?
>
> SUPERTEX:/root# ps -lp 866
> UID PID PPID CPU PRI NI   VSZ   RSS MWCHAN   STAT TT     TIME COMMAND
>   0 866  863   0  52  0 66800 29716 kmem are D+    1 57:40.82 zfs recv -duvF BIGTOX
>
> SUPERTEX:/root# procstat -kk 866
>   PID    TID COMM  TDNAME  KSTACK
>   866 101573 zfs   -       mi_switch+0xe1 sleepq_wait+0x3a _cv_wait+0x16d vmem_xalloc+0x568 vmem_alloc+0x3d kmem_malloc+0x33 keg_alloc_slab+0xcd keg_fetch_slab+0x151 zone_fetch_slab+0x7e zone_import+0x40 uma_zalloc_arg+0x34e arc_get_data_buf+0x31a arc_buf_alloc+0xaa dmu_buf_will_fill+0x169 dmu_write+0xfc dmu_recv_stream+0xd40 zfs_ioc_recv+0x94e zfsdev_ioctl+0x5ca

Do you have any special tuning in your /boot/loader.conf?

Cheers,
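For reference when answering the loader.conf question: on 10.x the tunables most often in play when ZFS allocations stall against the kmem arena are the ARC and kmem sizing knobs. An illustrative fragment only — the values here are placeholders, not a recommendation, and on a healthy system the defaults are usually fine:

```shell
# /boot/loader.conf -- illustrative values only
vfs.zfs.arc_max="8G"	# cap the ARC well below physical memory
#vm.kmem_size="24G"	# override the kmem arena size only deliberately
```

The stack trace above shows the zfs process blocked inside vmem_xalloc() on behalf of an ARC buffer allocation, which is why these two knobs are the usual first things to check.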