Re: [zones-discuss] Difference between capped.memory and zone.max-shm-memory
On 04/18/12 02:42 PM, Jordi Espasa Clofent wrote: On 2012-04-18 19.22, Hung-Sheng Tsao (LaoTsao) Ph.D wrote: hi, maybe one could add that Solaris resource control used to be project-based: one needed to set up a project, limit the resource pool, and then assign the pool to a zone. It was not easy to use. Since then, many shortcuts for resource control have been added to zonecfg, making it very easy to add resource controls inside the zone. The question is still, more or less, there: is it possible to limit the amount of RAM that a zone can borrow from the global zone without rcapd? Without rcapd, it is only possible to limit certain types of RAM usage: - zone.max-shm-memory: limits RAM used by SysV mappings. - zone.max-locked-memory: limits RAM used by locked mappings (mlock, ISM, DISM). Neither of these limits RAM used by other means, such as text pages, mapped files, and anonymous memory (like malloc()'ed memory). As far as I can understand, if a zone only uses zone.max-shm-memory, it can potentially borrow all the available RAM. So? Correct. In this case, only memory used by SysV mappings (shmget(2), shmat(2)) is limited. -Steve L. ___ zones-discuss mailing list zones-discuss@opensolaris.org
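The zonecfg shortcuts mentioned in the thread can be sketched as follows; this is a hedged example, not from the thread itself: the zone name "myzone" and all sizes are illustrative, and the capped-memory physical cap additionally requires rcapd to be running.

```
# Sketch only; "myzone" and the sizes are placeholders.
# Cap SysV shared memory and locked memory (no rcapd needed):
zonecfg -z myzone <<'EOF'
set max-shm-memory=2g
set max-locked-memory=1g
EOF

# An overall physical-memory cap (covers text, mapped files, anonymous
# memory too) goes through capped-memory and is enforced by rcapd:
zonecfg -z myzone <<'EOF'
add capped-memory
set physical=4g
end
EOF
svcadm enable system/rcap    # rcapd must be running to enforce "physical"
```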
Re: [zones-discuss] Zones disappeared from zoneadm on OpenSolaris
Never seen this failure before. Is it possible that you ran out of disk space? For reference, which opensolaris build (cat /etc/release, and uname -v)? Did you do any upgrading? If so, from which prior build? It seems that you've lost your /etc/zones/index file. Perhaps there is a backup or temporary copy in /etc/zones. - Steve L. On 04/20/11 02:37 AM, Jonatan Walck wrote: I'm new to this list; I checked the archives and hope I found the right place. A few days ago I rebooted an OpenSolaris server; at the first boot zones:default didn't go up properly, and after another reboot it did, but with one quirk: it finds no zones. zoneadm list -cv shows only the global zone. zonecfg -z [zonename] works for all zones though; all configs are there. zfs list shows all the file systems, zpool status lists no errors. Anyone have an idea for how to get the zones back up? How is the zoneadm list built up? Attaching part of the service log for zones:default, showing something going wrong, but I have been unable to track down the problem. [ Apr 14 22:23:51 Enabled. ] [ Apr 14 22:24:42 Executing start method (/lib/svc/method/svc-zones start).
] Booting zones: badger cannonball lolcat mudkip cerberus rockdove tapir kangaroo molerat llama tuna narwhal tiger medusa dolphin tanuki
ERROR: error while acquiring slave handle of zone console for tapir: No such device or address
console setup: device initialization failed
ERROR: error while acquiring slave handle of zone console for kangaroo: No such file or directory
console setup: device initialization failed
ERROR: error while acquiring slave handle of zone console for tanuki: No such device or address
console setup: device initialization failed
ERROR: error while acquiring slave handle of zone console for cerberus: No such device or address
console setup: device initialization failed
ERROR: error while acquiring slave handle of zone console for narwhal: No such file or directory
console setup: device initialization failed
zone 'tapir': could not start zoneadmd
zoneadm: zone 'tapir': call to zoneadmd failed
zone 'cerberus': could not start zoneadmd
zoneadm: zone 'cerberus': call to zoneadmd failed
zone 'tanuki': could not start zoneadmd
zoneadm: zone 'tanuki': call to zoneadmd failed
zone 'kangaroo': could not start zoneadmd
zoneadm: zone 'kangaroo': call to zoneadmd failed
zone 'narwhal': could not start zoneadmd
zoneadm: zone 'narwhal': call to zoneadmd failed
. [ Apr 14 22:24:56 Method start exited with status 0. ] [ Apr 19 08:51:42 Enabled. ] [ Apr 19 08:52:11 Executing start method (/lib/svc/method/svc-zones start).
] Booting zones: badger cannonball lolcat mudkip cerberus rockdove tapir kangaroo molerat llama tuna narwhal tiger medusa dolphin
ERROR: error while acquiring slave handle of zone console for tuna: No such device or address
console setup: device initialization failed
ERROR: error while acquiring slave handle of zone console for cerberus: No such file or directory
console setup: device initialization failed
zone 'tuna': could not start zoneadmd
zoneadm: zone 'tuna': call to zoneadmd failed
zone 'cerberus': could not start zoneadmd
zoneadm: zone 'cerberus': call to zoneadmd failed
tanuki [ Apr 19 09:09:43 Enabled. ] [ Apr 19 09:10:16 Executing start method (/lib/svc/method/svc-zones start). ] [ Apr 19 09:10:16 Method start exited with status 0. ] Thanks for any advice, Jonatan Walck
Re: [zones-discuss] Solaris 10 zone migration to Solaris 11 Express
Look for unmount on this page: http://download.oracle.com/docs/cd/E19797-01/817-1592/gjwmp/index.html On 04/ 6/11 06:18 AM, Mike Gerdts wrote: On Wed 06 Apr 2011 at 02:33AM, Ketan wrote: I was testing migrating a solaris10 zone to a Solaris 11 Express zone. I used cpio to create the archive with the following syntax: # find db_zone -print | cpio -oP@ | gzip > /swdump/ovpidb_zone.cpio.gz Then I created a solaris10 brand zone in the Solaris 11 environment and tried to attach the zone, but I got the following error: *** zoneadm -z s10zone1 attach -a /home/vneb/ovpidb_zone.cpio.gz Log File: /var/tmp/s10zone1.attach_log.oFaavh Attaching... ERROR: The image was created with an incompatible libc.so.1 hwcap lofs mount. The zone will not boot on this platform. See the zone's documentation for the recommended way to create the archive. I'm moving a solaris 10u8 zone from an M5000 to an LDom 2.0 Solaris 11 Express guest. It sounds like the zone was running when you created the archive. As a result, the version of libc that is optimized for the SPARC64 CPU found in the M5000 was mounted on top of /lib/libc.so.1. On the T-series box that you are moving to, the CPU architecture is different and incompatible with the type of optimization done for the SPARC64 CPU. It looks like you were following the instructions at http://download.oracle.com/docs/cd/E19963-01/html/821-1460/gentextid-12093.html#gcglo but the 'shut down the zone while creating the archive' step seems to have been missed.
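Putting Mike's diagnosis together with the missed documentation step, the archive creation can be sketched like this; a hedged example reusing the zone and path names from the thread, with the flags from the original command:

```
# Sketch only: halt the zone first so the hwcap-optimized libc lofs
# mount is not captured inside the archive.
zoneadm -z db_zone halt
cd /            # run from the zonepath's parent so archive paths stay relative
find db_zone -print | cpio -oP@ | gzip > /swdump/ovpidb_zone.cpio.gz
```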
Re: [zones-discuss] psets for zones
When pooladm -c is run to commit the configuration, it will try to satisfy the min/max constraints of all psets. In your case, there are plenty of CPUs, so pset1 gets 5 CPUs. If you had more psets configured, there might not be enough CPUs to give pset1 5 CPUs. If the pset.min of all psets cannot be satisfied, pooladm -c will fail. -Steve L. On 03/10/11 09:54 AM, Ketan wrote: I configured pset1 in 2 different ways:
1. poolcfg -c 'create pset pset1 (uint pset.min = 5; uint pset.max = 5)'
pset pset1 int pset.sys_id 1 boolean pset.default false uint pset.min 5 uint pset.max 5 string pset.units population uint pset.load 7 uint pset.size 5 string pset.comment
2. poolcfg -c 'create pset pset1 (uint pset.min = 1; uint pset.max = 5)'
pset pset1 int pset.sys_id 1 boolean pset.default false uint pset.min 1 uint pset.max 5 string pset.units population uint pset.load 10 uint pset.size 5 string pset.comment
What's the difference between the two? The only difference I can see is pset.min=5 vs. pset.min=1, but pset.size is 5 in both cases. So is it just a different notation, or is there some other difference too?
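The two configurations can be set up side by side to see the difference in contention behavior; a hedged sketch with illustrative pset names (not the names from the thread):

```
# Illustrative names. On a machine with spare CPUs both psets end up with
# pset.size 5; they differ only in how far each may shrink under pressure.
poolcfg -c 'create pset pset-fixed (uint pset.min = 5; uint pset.max = 5)'
poolcfg -c 'create pset pset-elastic (uint pset.min = 1; uint pset.max = 5)'
pooladm -c    # commit: fails if the pset.min of all psets cannot be satisfied
```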
Re: [zones-discuss] Zone Resource Management Issue.
Do you mean zone_caps? You are looking at project_caps. On 12/13/10 01:50 AM, Ketan wrote: Ok, got it .. but still, if I want to check the current locked-memory usage of a whole zone/project, what would be the best way? I'm using kstat -c project_caps -n 'lockedmem*', but the usage this command reports is much lower than what the application users report; they say the DB is running very slowly and they are getting memory-related errors.
module: caps    instance: 0
name: lockedmem_project_    class: project_caps
usage 2488147563
value 36335423324
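At zone scope, the zone_caps kstats can be read the same way and compared against the cap. A hedged sketch that parses canned kstat-style output (the canned numbers are the ones quoted above; the live command would be `kstat -p -c zone_caps -n 'lockedmem_zone*'`, and the module/name fields in the sample are illustrative):

```shell
# Canned sample standing in for:  kstat -p -c zone_caps -n 'lockedmem_zone*'
kstat_sample='caps:0:lockedmem_zone_0:usage	2488147563
caps:0:lockedmem_zone_0:value	36335423324'
printf '%s\n' "$kstat_sample" | awk '
  /usage/ { u = $2 }        # bytes currently locked
  /value/ { v = $2 }        # the cap, in bytes
  END     { printf "locked: %.1f%% of cap\n", 100 * u / v }'
# prints: locked: 6.8% of cap
```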
Re: [zones-discuss] All zones continuously core dump after upgrade to Solaris Express
On 11/18/10 12:38 PM, Ian Collins wrote: On 11/19/10 09:12 AM, Steve Lawrence wrote: What build are you upgrading from? 134 through 134b as recommended in the release notes. Is this during the attach -u portion of the upgrade for each zone? It happens after rebooting into the new BE. I didn't detach the zones before upgrading. Oh. In that case, your zones are still downrev at build 134. You need to detach them, and attach them again with -u. I'm not sure if you'll be able to detach them successfully with zoneadm detach. If not, you'll need to boot back to the 134 BE, detach them, and upgrade again. -Steve Can you gather any core files (or pstacks of core files)? These might be at zonepath/root/. pstack is short:
core '/tmp/xx/zoneRoot/webhost/root/core' of 3094: /sbin/init
feef3c97 _fxstat (0, 8047560, 180, 8058927) + 7
08058973 st_init (fee201a8, 38, 0, fefccc54, 0, feffb804) + 8f
080543dc main (1, 8047f6c, 8047f74, feffb804) + 150
0805418d _start (1, 8047fe0, 0, 0, 7d8, 8047feb) + 7d
I can send the core (it's only 2MB) if that helps. My guess is that init (in the zone) is starting using a downrev libc (aka a libc not upgraded yet), and is making a system call that has changed. 12 is SIGSYS. -Steve On 11/18/10 11:14 AM, Ian Collins wrote: I ran through the upgrade process on a system with half a dozen zones, and on restart they all got locked into a core dump/restart loop: Nov 19 07:57:50 i7 genunix: [ID 729207 kern.warning] WARNING: init(1M) for zone webhost (pid 3094) core dumped on signal 12: restarting automatically They all run through this cycle in tight loops. Oops.
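The recovery Steve describes can be sketched as a command sequence; a hedged example in which the BE name is invented and "webhost" is borrowed from the thread:

```
# Sketch only: fall back, detach at the zone's current build, then
# attach -u at the new build. "osol-134" is a placeholder BE name.
beadm activate osol-134 && init 6     # boot back into the build 134 BE
zoneadm -z webhost detach             # detach while the zone is still downrev
# ...re-run the upgrade, reboot into the new BE, then:
zoneadm -z webhost attach -u          # -u updates the zone to the new build
zoneadm -z webhost boot
```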
Re: [zones-discuss] zonestat, prstat -Z, and cpu accounting
Peter Tribble wrote: On Tue, Jul 27, 2010 at 5:34 PM, Steve Lawrence stephen.lawre...@oracle.com wrote: Now, if you want to account for cpu utilization by children, why not use the pr_ctime member of the psinfo structure? As far as I understand it, that collects cpu for child processes that exit, so why can't that be used instead of monkeying about with extended accounting? What am I overlooking here? I haven't explored an algorithm for tracking that. At some point, the child is running, in which case I may have added its current usage to the zone's usage. When the children exit later, I would need to figure out which parent the usage was added to so I can avoid double counting usage I've already charged to the zone. Well, no, you just create the process tree and walk down it from the top. Some might worry about missing data from a child that exits while you're doing the measurement; I tend not to worry too much, because you'll catch it on the next interval in any event. I'm more worried about double counting. For instance, if I count a proc, and then it exits, and I then count its parent's child usage, I will count the process's usage twice. I don't think that /proc guarantees to report all parents before all children when doing readdir(). I would need to investigate that further. The other bit to solve would be zone-entered processes, which will have a parent in the global zone. The usage by the in-the-zone children would bubble up to a parent in the global zone. This would certainly be wrong. Well, how common are zone-entered processes? And how should they be accounted for anyway? (What sort of examples do we have?) Anything that is zlogin'ed into the zone. Admins can zlogin anything they want into a zone, and that usage should be counted towards the zone's usage. Thanks!
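The double-counting concern is easy to see with canned numbers: suppose a child accrues 10s of CPU, is sampled while alive, then exits, after which the same 10s reappears in its parent's pr_ctime. A toy sketch (all figures invented for illustration):

```shell
# Sample 1 counts the running child's 10s directly; sample 2, taken after
# the child exits, sees those same 10s again via the parent's pr_ctime.
child_time=10                 # seconds observed while the child was alive
parent_ctime_after_exit=10    # the same seconds, now in the parent's pr_ctime
naive_total=$((child_time + parent_ctime_after_exit))
echo "naive total: ${naive_total}s (actual CPU consumed: ${child_time}s)"
# prints: naive total: 20s (actual CPU consumed: 10s)
```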
Re: [zones-discuss] zonestat, prstat -Z, and cpu accounting
Peter Tribble wrote: Looking at the recent zonestat arc case reminded me of something I've been meaning to ask for a while. In the case, it says: prstat polls /proc, and will not account for cpu used by short-lived processes. and Extended accounting must be used to compute the cpu utilization because it will contain all data associated with processes which have exited. And this is something that's been discussed before. Now, if you want to account for cpu utilization by children, why not use the pr_ctime member of the psinfo structure? As far as I understand it, that collects cpu for child processes that exit, so why can't that be used instead of monkeying about with extended accounting? What am I overlooking here? I haven't explored an algorithm for tracking that. At some point, the child is running, in which case I may have added its current usage to the zone's usage. When the children exit later, I would need to figure out which parent the usage was added to so I can avoid double counting usage I've already charged to the zone. The other bit to solve would be zone-entered processes, which will have a parent in the global zone. The usage by the in-the-zone children would bubble up to a parent in the global zone. This would certainly be wrong. -Steve
Re: [zones-discuss] failed to move zones in os2009.06 (attach failed)
Did you try: # zoneadm -z bibcmi4 attach -d rpool/zones/bibcmi4/ROOT/zbe-2 ? -d is an ipkg-specific option. -Steve L. Gerard Henry wrote: Hello all, I need to move zones from serv1 to serv2. Both servers are os2009.06 b111b. On serv1, I have, after detach:
serv1 # zfs list -r rpool/zones/bibcmi4
NAME                             USED  AVAIL  REFER  MOUNTPOINT
rpool/zones/bibcmi4             1.46G  8.43G    38K  /zones/bibcmi4
rpool/zones/bibcmi4/ROOT        1.46G  8.43G    18K  legacy
rpool/zones/bibcmi4/ROOT/zbe    1.33M  8.43G   828M  legacy
rpool/zones/bibcmi4/ROOT/zbe-2  1.45G  8.43G   877M  /zones/bibcmi4/root
Following this post: http://mail.opensolaris.org/pipermail/zones-discuss/2010-February/006060.html I sent the snapshots to serv2, and I have:
serv2 # zfs list -r rpool/zones/bibcmi4
NAME                             USED  AVAIL  REFER  MOUNTPOINT
rpool/zones/bibcmi4             1.29G  45.9G    38K  /zones/bibcmi4
rpool/zones/bibcmi4/ROOT        1.29G  45.9G    19K  legacy
rpool/zones/bibcmi4/ROOT/zbe     494M  45.9G   494M  legacy
rpool/zones/bibcmi4/ROOT/zbe-2   825M  45.9G   825M  /zones/bibcmi4/root
(the last line was legacy, and I set the mountpoint manually) I did a zonecfg with create -a, but now it fails to attach: serv2 # zoneadm -z bibcmi4 attach ERROR: The -a, -d or -r option is required when there is no active root dataset. I found another thread related to this message: http://opensolaris.org/jive/thread.jspa?messageID=439124 but unfortunately it doesn't help. The option doesn't exist in zoneadm, so I don't understand what's happening. The official docs http://docs.sun.com/app/docs/doc/817-1592/zone?a=view don't take this case into account (zonepaths on ZFS). I need to move the zones before upgrading to b134. Thanks in advance for help, gerard
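The end-to-end move attempted in this thread can be sketched as follows; a hedged example using the dataset names from the thread, with an invented snapshot name, and relying on the ipkg-brand-specific -d option Steve points at:

```
# On serv1, after: zoneadm -z bibcmi4 detach
zfs snapshot -r rpool/zones/bibcmi4@move
zfs send -R rpool/zones/bibcmi4@move | ssh serv2 zfs recv -d rpool

# On serv2: recreate the configuration from the detached manifest,
# then attach against the received boot-environment dataset.
zonecfg -z bibcmi4 'create -a /zones/bibcmi4'
zoneadm -z bibcmi4 attach -d rpool/zones/bibcmi4/ROOT/zbe-2
```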
Re: [zones-discuss] ZFS ARC cache issue
Try zfs-discuss. Ketan wrote: We have a server running ZFS root with 64G RAM; the system has 3 zones running Oracle Fusion apps, and the ZFS cache is using 40G of memory as per kstat zfs:0:arcstats:size. The system shows only 5G of memory free; the rest is taken by the kernel and the 2 remaining zones. Now my problem is that the Fusion guys are getting a not-enough-memory message while starting their application, even though top and vmstat show 5G of free memory. But I read that the ZFS cache releases memory as required by applications, so why is the Fusion application not starting up? Is there something we can do to decrease the ARC cache usage on the fly, without rebooting the global zone?
Re: [zones-discuss] Webrev for CR 6909222
The bug mentions that this can also impact a nevada zone that was p2v'ed. Should you fix usr/src/lib/brand/native/zone as well? -Steve On Mon, Dec 21, 2009 at 03:46:00PM -0800, Jordan Vaughan wrote: I need someone to review my fix for 6909222 reboot of system upgraded from 128 to build 129 generated error from an s10 zone due to boot-archive My webrev is accessible via http://cr.opensolaris.org/~flippedb/onnv-s10c Thanks, Jordan
Re: [zones-discuss] Webrev for CR 6782448
Minor nit. You could use != POC_STRING, put the Z_NO_ENTRY in the {}, and put the success case after. Not a required change. LGTM. -Steve On Fri, Dec 18, 2009 at 07:28:52PM -0800, Jordan Vaughan wrote: I expanded my webrev to include my fix for 6910339 zonecfg coredumps with badly formed 'select net defrouter' I need someone to review my changes. The webrev is still accessible via http://cr.opensolaris.org/~flippedb/onnv-zone2 Thanks, Jordan
Re: [zones-discuss] Application leaking on local zone
I recommend using libumem on the application. Some folks were nice enough to write about it. http://blogs.sun.com/pnayak/entry/finding_memory_leaks_within_solaris http://blogs.sun.com/dlutz/entry/memory_leak_detection_with_libumem -Steve On Thu, Dec 17, 2009 at 12:09:11PM +0200, AdinaKalin wrote: Hello, I'm struggling with the following problem and I have no idea how to solve it. I'm testing an application which runs fine in the global zone, but leaks memory when installed in a local zone. The local zone is whole-root and has a very simple, basic configuration:
bash-3.00# zonecfg -z mdmMDMzone
zonecfg:mdmMDMzone> info
zonename: mdmMDMzone
zonepath: /mdmMDMzone
brand: native
autoboot: true
bootargs:
pool:
limitpriv: default,dtrace_proc,dtrace_user,proc_priocntl,proc_lock_memory
scheduling-class: FSS
ip-type: shared
net: address: 192.168.109.14 physical: e1000g0 defrouter not specified
One of the application processes, when started in the global zone, has an RSS of about 5 GB (prstat -s rss) and it keeps this size to the end of the test. If I stop the application in the global zone and start it in the local zone, the same process starts with the normal size (5 GB in prstat -s rss) but grows during the test (I saw it at 25 GB on a server with 32 GB RAM) until it fails. I don't understand this behavior; if the application has a memory leak, why don't I see it in the global zone? Any help is more than welcome!!!
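The libumem workflow from those posts, in brief; a hedged sketch in which "myapp" is a placeholder process name:

```
# Sketch only: run the suspect process with umem auditing, snapshot it
# once RSS has grown, then ask mdb for leaked buffers.
UMEM_DEBUG=default LD_PRELOAD=libumem.so myapp &
pid=$!
# ...drive the workload until the growth is visible, then:
gcore $pid                          # writes core.<pid>
echo '::findleaks' | mdb core.$pid  # reports leaked allocations with stacks
```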
Re: [zones-discuss] Application leaking on local zone
If you can provide some ::findleaks details about the particular memory leak, perhaps someone can help. Zones do not leak memory by design. -Steve On Thu, Dec 17, 2009 at 09:27:07PM +0200, AdinaKalin wrote: Yes, I already read those sites. The question is why there isn't a memory leak in the global zone but there is in the local zone?! Steve Lawrence wrote: I recommend using libumem on the application. Some folks were nice enough to write about it. http://blogs.sun.com/pnayak/entry/finding_memory_leaks_within_solaris http://blogs.sun.com/dlutz/entry/memory_leak_detection_with_libumem -Steve
Re: [zones-discuss] zoneadm hangs after repeated boot/halt use
Looks a lot like 6894901. Can you try build 128? -Steve On Fri, Dec 11, 2009 at 03:48:52PM -0500, Glenn Brunette wrote: As part of an Immutable Service Container[1] demonstration that I am creating for an event in January, I need to start/stop a zone quite a few times (as part of a Self-Cleansing[2] demo). During the course of my testing, I have been able to repeatedly get zoneadm to hang. Since I am working with a highly customized configuration, I started over with a default zone on OpenSolaris (b127) and was able to repeat this issue. To reproduce this problem, use the following script after creating a zone using the normal/default steps:
isc...@osol-isc:~$ while : ; do
    echo `date`: ZONE BOOT
    pfexec zoneadm -z test boot
    sleep 30
    pfexec zoneadm -z test halt
    echo `date`: ZONE HALT
    sleep 10
done
This script works just fine for a while, but eventually zoneadm hangs (it was at pass #90 in my last test). When this happens, zoneadm is shown to be consuming quite a bit of CPU:
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
16598 root 11M 3140K run 10 0:54:49 74% zoneadm/1
A stack trace of zoneadm shows:
isc...@osol-isc:~$ pfexec pstack `pgrep zoneadm`
16082: zoneadmd -z test
- lwp# 1 -
- lwp# 2 -
feef41c6 door (0, 0, 0, 0, 0, 8)
feed99f7 door_unref_func (3ed2, fef81000, fe33efe8, f39e) + 67
f3f3 _thrp_setup (fe5b0a00) + 9b
f680 _lwp_start (fe5b0a00, 0, 0, 0, 0, 0)
- lwp# 3 -
feef420f __door_return () + 2f
- lwp# 4 -
feef420f door (0, 0, 0, fe140e00, f5f00, a)
feed9f57 door_create_func (0, fef81000, fe140fe8, f39e) + 2f
f3f3 _thrp_setup (fe5b1a00) + 9b
f680 _lwp_start (fe5b1a00, 0, 0, 0, 0, 0)
16598: zoneadm -z test boot
feef3fc8 door (6, 80476d0, 0, 0, 0, 3)
feede653 door_call (6, 80476d0, 400, fe3d43f7) + 7b
fe3d44f0 zonecfg_call_zoneadmd (8047e33, 8047730, 8078448, 1) + 124
0805792d boot_func (0, 8047d74, 100, 805ff0b) + 1cd
08060125 main (4, 8047d64, 8047d78, 805570f) + 2b9
0805576d _start (4, 8047e28, 8047e30, 8047e33, 8047e38, 0) + 7d
A stack trace of zoneadmd shows:
isc...@osol-isc:~$ pfexec pstack `pgrep zoneadmd`
16082: zoneadmd -z test
- lwp# 1 -
- lwp# 2 -
feef41c6 door (0, 0, 0, 0, 0, 8)
feed99f7 door_unref_func (3ed2, fef81000, fe33efe8, f39e) + 67
f3f3 _thrp_setup (fe5b0a00) + 9b
f680 _lwp_start (fe5b0a00, 0, 0, 0, 0, 0)
- lwp# 3 -
feef4147 __door_ucred (80a37c8, fef81000, fe23e838, feed9cfe) + 27
feed9d0d door_ucred (fe23f870, 1000, 0, 0) + 32
08058a88 server (0, fe23f8f0, 510, 0, 0, 8058a04) + 84
feef4240 __door_return () + 60
- lwp# 4 -
feef420f door (0, 0, 0, fe140e00, f5f00, a)
feed9f57 door_create_func (0, fef81000, fe140fe8, f39e) + 2f
f3f3 _thrp_setup (fe5b1a00) + 9b
f680 _lwp_start (fe5b1a00, 0, 0, 0, 0, 0)
A truss of zoneadm (-f -vall -wall -tall) shows this looping:
16598: door_call(6, 0x080476D0) = 0
16598: data_ptr=8047730 data_size=0
16598: desc_ptr=0x0 desc_num=0
16598: rbuf=0x807F2D8 rsize=4096
16598: close(6) = 0
16598: mkdir(/var/run/zones, 0700) Err#17 EEXIST
16598: chmod(/var/run/zones, 0700) = 0
16598: open(/var/run/zones/test.zoneadm.lock, O_RDWR|O_CREAT, 0600) = 6
16598: fcntl(6, F_SETLKW, 0x08046DC0) = 0
16598: typ=F_WRLCK whence=SEEK_SET start=0 len=0 sys=4277003009 pid=6
16598: open(/var/run/zones/test.zoneadmd_door, O_RDONLY) = 7
16598: door_info(7, 0x08047230) = 0
16598: target=16082 proc=0x8058A04 data=0x0
16598: attributes=DOOR_UNREF|DOOR_REFUSE_DESC|DOOR_NO_CANCEL
16598: uniquifier=26426
16598: close(7) = 0
16598: close(6) = 0
16598: open(/var/run/zones/test.zoneadmd_door, O_RDONLY) = 6
16082/3: door_return(0x, 0, 0x, 0xFE23FE00, 1007360) = 0
16082/3: door_ucred(0x080A37C8) = 0
16082/3: euid=0 egid=0
16082/3: ruid=0 rgid=0
16082/3: pid=16598 zoneid=0
16082/3: E: all
16082/3: I: basic
16082/3: P: all
16082/3: L: all
PID 16598 is zoneadm and PID 16082 is zoneadmd. Is this a known issue? Are there any other things that I can do to
Re: [zones-discuss] /var/run/zones not cleaned up ?
Feature. It is the F_WRLCK operation which takes the lock. I suppose this avoids having to deal with stale lock files from dead zoneadms. Similar for the door: what matters is who fattaches to the door file, not who creates it. That said, I don't see a problem with the unlock/fdetach operations removing these files, as long as they are truly done and it is OK for a new lock/door to be created. -Steve On Thu, Dec 10, 2009 at 04:37:17PM +0100, Frank Batschulat (Home) wrote: Is it to be expected that after no zoneadm/zoneadmd is running anymore, /var/run/zones still contains the corresponding lock files? (Also, I looked at the current thread list of my system and no zone-related kernel threads are running anymore.)
osoldev.root./var/run/zones.= zoneadm list -cp
0:global:running:/::ipkg:shared
-:zone2:configured:/tank/zones/zone2::ipkg:shared
osoldev.root./var/run/zones.= ps -eafd|grep zone
root 2961 2734 0 16:35:06 pts/2 0:00 grep zone
osoldev.root./var/run/zones.= ls -la
total 16
drwx------  2 root root  335 Dec 10 12:23 .
drwxr-xr-x 11 root sys  2423 Dec 10 12:21 ..
-rw-r--r--  1 root root    0 Dec 10 12:23 index.lock
-rw-------  1 root root    0 Dec 10 12:21 zone1.zoneadm.lock
-rw-------  1 root root    0 Dec 10 12:21 zone1.zoneadmd_door
This was after a zone boot / zone halt / zone uninstall / zone delete cycle. Bug, feature? --- frankB
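The point about stale lock files can be illustrated with advisory locks from the shell; a hedged sketch using GNU/Linux flock(1) as a stand-in for zoneadm's fcntl(F_SETLKW) (Solaris has no flock(1) utility, so this is an analogy, not the actual mechanism):

```shell
# The lock lives on the open file description, not in the file's contents,
# so a leftover zero-length lock file from a dead process is harmless:
# the next locker simply takes the now-uncontended lock.
lockfile=$(mktemp)
flock "$lockfile" -c 'echo first holder done'
# The first holder's process is gone; the file remains, the lock does not:
flock "$lockfile" -c 'echo second holder done'
rm -f "$lockfile"
```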
Re: [zones-discuss] s10 p2v
This feature exists in nevada (nevada global to nevada zone), and is currently being backported to s10u9. -Steve L. On Tue, Nov 24, 2009 at 07:41:03AM -0500, Dr. Hung-Sheng Tsao wrote: hi Is there p2v in s10 to move from a physical host to a zone env? It seems that cpio of the apps directory should work regards
Re: [zones-discuss] restrict physical memory with zone.max-locked-memory
It limits the amount of physical memory that can be pinned by a zone via mlock() or shmat(SHM_SHARE_MMU). These are typically done by databases or performance-critical apps. Locked memory cannot be paged out. -Steve L. On Wed, Oct 28, 2009 at 10:25:01AM -0700, Ketan wrote: So for what purpose is zone.max-locked-memory used? -- This message posted from opensolaris.org
Re: [zones-discuss] Difference between resource management attributes
There is a kstat. Look at the output of: $ kstat -c project_caps -n 'lockedmem*' -Steve L. On Wed, Oct 21, 2009 at 05:43:03AM -0700, Ketan wrote: But there is one more thing: if I set max-rss, I can test it and see that the task under the specified project does not exceed the specified RSS value, but if I set max-locked-memory to 1 gig, how can I test that one?
Re: [zones-discuss] Difference between resource management attributes
On Tue, Oct 20, 2009 at 10:20:14AM -0700, Ketan wrote: Can anyone answer my questions? 1. What's the difference between project.max-locked-memory and max-rss, and of these two, which is the preferred way of limiting physical memory in a project or zone? max-rss limits both pageable and locked physical memory used by a project, so it is an overall physical memory cap. max-locked-memory is useful for limiting applications which specifically lock physical memory. This matters because if a project locks memory up to its max-rss, there will be no pageable memory left for it, and it will page non-stop. max-locked-memory can be set lower than max-rss to protect against this. -Steve L. 2. How do I restrict swap memory in projects? There is no mechanism; swap limits exist only for zones.
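Setting the two caps Steve contrasts can be sketched with projmod; a hedged example in which the project name "user.dbproj" and both limits are invented for illustration:

```
# Sketch only: cap overall physical memory (rcap.max-rss, enforced by
# rcapd) at 4 GB, with a lower 1 GB locked-memory cap so a fully-locked
# project cannot consume its entire RSS allowance and thrash.
projmod -sK 'rcap.max-rss=4294967296' user.dbproj
projmod -sK 'project.max-locked-memory=(privileged,1073741824,deny)' user.dbproj
```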
Re: [zones-discuss] Zone copy in Live Upgrade
On Wed, Jul 29, 2009 at 08:43:05AM +0200, Martin Rehak wrote: Hi Steve, On 2009.07.23 14:34:22 -0700, Steve Lawrence wrote: On Thu, Jul 23, 2009 at 09:14:55AM +0200, Martin Rehak wrote: Hi Steve, On 2009.07.22 12:32:01 -0700, Steve Lawrence wrote: The issue is that from the global zone context (non-zlogin), stuff like symbolic links to something like /etc could copy files from the global zone. I don't understand it. cpio preserves symlinks, so symlinks will appear just like symlinks in the NGZ and files as files. That means no mapping/no risk. Am I right? I'm not sure why this is dangerous in this case, as we are only reading from the zone, and cpio does not traverse/open symlinks; it just copies the link itself. I don't see a problem with it, but you should get feedback from others as well. I see a problem with the current implementation. A spoofed cpio program in an evil non-global zone could create a destructive cpio stream. The cpio -icdmP@ in the global zone could write to /. Another solution could be to do the restore within the context of the zlogin, to a path mounted within the zone's root. I see. Is there any reason why we are doing the zone copy in zlogin at all? What problems would we face if we copied a zone from the global zone? That would eliminate problems with an evil zone environment completely. I don't see one, but check with the install team. I'm also not sure what is being copied here. Is this clause to copy the / filesystem inside a zone, or just those added via add fs? If the latter, I'm not sure why they are being copied. Does LU treat any of the zone's filesystems as shared between BEs, similar to how it treats /export/home in the global zone? -Steve L. Many thanks -- Martin -Steve L. That's what I think. Does this all end up going through zlogin one byte at a time? Yes, the whole stream goes through zlogin from NGZ to GZ where it is expanded. What would be the problem if we didn't use any zlogin?
Just a cpio on the zone root to a cpio on the other zone root? What is the risk there? Thank you -- Martin -Steve On Wed, Jul 22, 2009 at 04:57:47PM +0200, Martin Rehak wrote: Hi, I am trying to make Live Upgrade better by reimplementing some parts of the code. What I am not sure of is whether it is safe to do a copy of non-global zone imports (filesystems dedicated to a zone in its config) from the global zone. This is existing code (lucopy.sh:1808, install-nv-clone): http://grok.czech.sun.com:8080/source/xref/install-nv-clone/usr/src/cmd/inst/liveupgrade/scripts/lucopy.sh
1808 (
1809     fgrep -xv $mountpoint /tmp/lucopy.zonefs.$$
1810     cat /tmp/lucopy.zoneipd.$$
1811 ) | sed 's+.*+^/+' |
1812 zlogin $ozonename \
1813     cat /tmp/lucopy.excl.$$; \
1814     (
1815     if [ -s /tmp/lucopy.excl.$$ ]; then
1816         cd $zroot$mountpoint \
1817         find . -depth -print | \
1818         egrep -vf /tmp/lucopy.excl.$$ | \
1819         cpio -ocmP@
1820     else
1821         cd $zroot$mountpoint \
1822         find . -depth -print | cpio -ocmP@
1823     fi
1824 ) |
1825 ( cd $tdir cpio -icdmP@ )
1826 lulib_unmount_pathname $tdir
To describe it, I would say that it zlogins into the non-global zone, generates a listing there, and sends it to the stdin of cpio, which writes an archive to its stdout. That archive is directed to the stdin of the cpio running _OUTSIDE_ the zone (in the global zone), which finally expands it and writes it to a target directory. Unfortunately, a few lines above there is this comment:
1769 # Mount each non-lofs zone import in a temporary location
1770 # and copy over the bits that belong there, extracted from
1771 # the running zone. We are now reaching through zone-
1772 # controlled paths and thus must be extremely careful.
1773 # Direct copies are not safe.
And the question is: what can happen if I simply do not generate the listing and the archive inside the zone, but do it in the global zone using 'cpio -p'?
If I generalize the problem a little bit more, I would like to know your opinion about my idea of copying the whole BE, including zones, in just one 'cpio -p'. Why wouldn't it work? Thank you very much for any reply -- Martin Rehak ___ zones-discuss mailing list zones-discuss@opensolaris.org
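Martin's symlink question can be checked locally. A minimal sketch, using tar as a stand-in for the cpio -o | cpio -i pair (cpio is not installed everywhere) and made-up /tmp paths instead of real zone roots: an archive-mode copy writes the symlink itself into the destination rather than following it, so the danger is not in reading the zone, but in what later dereferences the link on the restore side.

```shell
# Local stand-in for the zone-root copy; all paths are hypothetical.
rm -rf /tmp/zcopy-demo
mkdir -p /tmp/zcopy-demo/src /tmp/zcopy-demo/dst
echo "global zone data" > /tmp/zcopy-demo/outside-file
# A zone could plant a symlink pointing outside its own root:
ln -s /tmp/zcopy-demo/outside-file /tmp/zcopy-demo/src/sneaky
# Archive-mode copy, analogous to find | cpio -ocmP@ | cpio -icdmP@:
( cd /tmp/zcopy-demo/src && tar -cf - . ) | ( cd /tmp/zcopy-demo/dst && tar -xf - )
# The destination holds a symlink, not a copy of the target file:
ls -l /tmp/zcopy-demo/dst/sneaky
```

Reading through the copied link does reach the outside file, which is why Steve's point about a spoofed cpio stream matters: the extracting side in the global zone is the trust boundary, not the reading side.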
Re: [zones-discuss] Zone copy in Live Upgrade
On Thu, Jul 23, 2009 at 09:14:55AM +0200, Martin Rehak wrote: Hi Steve, On 2009.07.22 12:32:01 -0700, Steve Lawrence wrote:

The issue is that from the global zone context (non-zlogin), stuff like symbolic links to something like /etc could copy files from the global zone.

I don't understand it. cpio preserves symlinks, so symlinks will appear just like symlinks in the NGZ, and files as files. That means no mapping/no risk. Am I right?

I'm not sure why this is dangerous in this case, as we are only reading from the zone; cpio does not traverse/open symlinks, it just copies the link itself. I don't see a problem with it, but you should get feedback from others as well. I see a problem with the current implementation. A spoofed cpio program in an evil non-global zone could create a destructive cpio stream. The cpio -icdmP@ in the global zone could write to /. Another solution could be to do the restore within the context of the zlogin, to a path mounted within the zone's root. -Steve L.

That's what I think. Does this all end up going through zlogin one byte at a time?

Yes, the whole stream goes through zlogin from the NGZ to the GZ, where it is expanded. What would be the problem if we didn't use zlogin at all? Just a cpio on one zone root piped to a cpio on the other zone root? What is the risk there? Thank you -- Martin

-Steve

On Wed, Jul 22, 2009 at 04:57:47PM +0200, Martin Rehak wrote: Hi, I am trying to improve Live Upgrade by reimplementing some parts of the code. What I am not sure of is whether it is safe to do a copy of non-global zone imports (filesystems dedicated to a zone in its config) from the global zone.
This is the existing code (lucopy.sh:1808, install-nv-clone): http://grok.czech.sun.com:8080/source/xref/install-nv-clone/usr/src/cmd/inst/liveupgrade/scripts/lucopy.sh

1808 (
1809     fgrep -xv $mountpoint /tmp/lucopy.zonefs.$$
1810     cat /tmp/lucopy.zoneipd.$$
1811 ) | sed 's+.*+^/+' |
1812 zlogin $ozonename \
1813     cat /tmp/lucopy.excl.$$; \
1814     (
1815     if [ -s /tmp/lucopy.excl.$$ ]; then
1816         cd $zroot$mountpoint \
1817         find . -depth -print | \
1818         egrep -vf /tmp/lucopy.excl.$$ | \
1819         cpio -ocmP@
1820     else
1821         cd $zroot$mountpoint \
1822         find . -depth -print | cpio -ocmP@
1823     fi
1824     ) |
1825 ( cd $tdir cpio -icdmP@ )
1826 lulib_unmount_pathname $tdir

To describe it: it zlogins into the non-global zone and generates there a listing, which it sends to the stdin of a cpio that writes an archive on its stdout. That archive is directed to the stdin of a cpio running _OUTSIDE_ the zone (in the global zone), which finally expands it and writes it to a target directory. Unfortunately, a few lines above there is this comment:

1769 # Mount each non-lofs zone import in a temporary location
1770 # and copy over the bits that belong there, extracted from
1771 # the running zone. We are now reaching through zone-
1772 # controlled paths and thus must be extremely careful.
1773 # Direct copies are not safe.

And the question is: what can happen if I simply do not generate the listing and the archive inside the zone, but do it in the global zone using 'cpio -p'? If I generalize the problem a little bit more, I would like to know your opinion about my idea of copying the whole BE, including zones, in just one 'cpio -p'. Why wouldn't it work? Thank you very much for any reply -- Martin Rehak ___ zones-discuss mailing list zones-discuss@opensolaris.org
Re: [zones-discuss] Zone copy in Live Upgrade
The issue is that from the global zone context (non-zlogin), stuff like symbolic links to something like /etc could copy files from the global zone. I'm not sure why this is dangerous in this case, as we are only reading from the zone; cpio does not traverse/open symlinks, it just copies the link itself. Does this all end up going through zlogin one byte at a time? -Steve

On Wed, Jul 22, 2009 at 04:57:47PM +0200, Martin Rehak wrote: Hi, I am trying to improve Live Upgrade by reimplementing some parts of the code. What I am not sure of is whether it is safe to do a copy of non-global zone imports (filesystems dedicated to a zone in its config) from the global zone. This is the existing code (lucopy.sh:1808, install-nv-clone): http://grok.czech.sun.com:8080/source/xref/install-nv-clone/usr/src/cmd/inst/liveupgrade/scripts/lucopy.sh

1808 (
1809     fgrep -xv $mountpoint /tmp/lucopy.zonefs.$$
1810     cat /tmp/lucopy.zoneipd.$$
1811 ) | sed 's+.*+^/+' |
1812 zlogin $ozonename \
1813     cat /tmp/lucopy.excl.$$; \
1814     (
1815     if [ -s /tmp/lucopy.excl.$$ ]; then
1816         cd $zroot$mountpoint \
1817         find . -depth -print | \
1818         egrep -vf /tmp/lucopy.excl.$$ | \
1819         cpio -ocmP@
1820     else
1821         cd $zroot$mountpoint \
1822         find . -depth -print | cpio -ocmP@
1823     fi
1824     ) |
1825 ( cd $tdir cpio -icdmP@ )
1826 lulib_unmount_pathname $tdir

To describe it: it zlogins into the non-global zone and generates there a listing, which it sends to the stdin of a cpio that writes an archive on its stdout. That archive is directed to the stdin of a cpio running _OUTSIDE_ the zone (in the global zone), which finally expands it and writes it to a target directory. Unfortunately, a few lines above there is this comment:

1769 # Mount each non-lofs zone import in a temporary location
1770 # and copy over the bits that belong there, extracted from
1771 # the running zone. We are now reaching through zone-
1772 # controlled paths and thus must be extremely careful.
1773 # Direct copies are not safe.
And the question is: what can happen if I simply do not generate the listing and the archive inside the zone, but do it in the global zone using 'cpio -p'? If I generalize the problem a little bit more, I would like to know your opinion about my idea of copying the whole BE, including zones, in just one 'cpio -p'. Why wouldn't it work? Thank you very much for any reply -- Martin Rehak ___ zones-discuss mailing list zones-discuss@opensolaris.org
Re: [zones-discuss] sysidcfg requires zlogin
If you want to configure the IP address within the zone with sysidcfg/hostname.* files, then you need to use exclusive-IP-stack zones:

zonecfg -z zweb$Z set ip-type=exclusive

-Steve L.

On Wed, Jul 15, 2009 at 03:55:27PM -0700, Patrick J. McEvoy wrote: the only thing that comes to mind for me right now is a possible mismatch of the installation repository used and the repository when the zone was created. I could re-create the master -- i.e. that which I am cloning -- if you think that would make a difference. BTW, the master has never been booted -- I assume this makes no difference. This is the script I run to test cloning:

Z=1
zoneadm -z zweb$Z halt
zoneadm -z zweb$Z uninstall -F
zonecfg -z zweb$Z delete -F
zonecfg -z zclone export | zonecfg -z zweb$Z -f -
zonecfg -z zweb$Z set zonepath=/zonefs/zweb$Z
zonecfg -z zweb$Z "add net; set physical=vphys1; end"
zonecfg -z zweb$Z "add net; set physical=vweb$Z; end"
zoneadm -z zweb$Z clone zclone
zoneadm -z zweb$Z ready
cp ./sysidcfg /zonefs/zweb$Z/root/etc/sysidcfg
touch /zonefs/zweb$Z/root/etc/hostname.vphys1
touch /zonefs/zweb$Z/root/etc/hostname.vweb1
#zoneadm -z zweb$Z halt
zoneadm -z zweb$Z boot
#zlogin -C zweb$Z

-- This message posted from opensolaris.org ___ zones-discuss mailing list zones-discuss@opensolaris.org
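A hedged sketch of folding Steve's suggestion into the clone script: emit the zonecfg commands to a file and apply it in one shot. Zone and NIC names (zweb1, vweb1) are taken from the script in the thread; the /tmp path is arbitrary, and only the file generation runs here since the zonecfg/zoneadm calls need a Solaris global zone.

```shell
# Build a zonecfg command file for an exclusive-IP clone of zclone.
cat > /tmp/zweb1.zonecfg <<'EOF'
create -t zclone
set zonepath=/zonefs/zweb1
set ip-type=exclusive
add net
set physical=vweb1
end
EOF
# On the global zone (not run here):
#   zonecfg -z zweb1 -f /tmp/zweb1.zonecfg
#   zoneadm -z zweb1 clone zclone
# With ip-type=exclusive, the sysidcfg and /etc/hostname.vweb1 files
# dropped under the zone's root configure the address at first boot.
cat /tmp/zweb1.zonecfg
```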
Re: [zones-discuss] Parallel mount question
On Mon, Jun 29, 2009 at 08:00:28PM +0200, William Roche wrote: Hello Vladi, Yes, you can use LOFS in all your zones to share the file system with r/w access. I would even say that this is your BEST option. An NFS mount in your local zones of a file system shared by the global zone is absolutely not supported (including autofs access, of course).

I think each zone's automounter is smart enough to use lofs instead of nfs for mounts from a non-global to the global zone. -Steve L.

HTH, William.

On 06/29/09 18:25, Yanakiev, Vladimir wrote: Need help with a problem. We have a VxFS file system, created in the global zone and mounted under a non-global zone as LOFS. Later, two new zones were created on the same server that needed access to the very same file system. Someone decided to NFS-share this file system from the global zone and NFS-mount it on these two new zones. This (to my understanding) badly corrupted the file system after a few weeks, and today we experienced the same for a second time. My question is: can I keep the file system in the global zone and loopback-mount it (with LOFS) into all three zones, providing r/w access to all of them, without risk of corrupting it again? Thanks in advance for the help! Vladi ___ zones-discuss mailing list zones-discuss@opensolaris.org
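For completeness, the LOFS route William recommends is a plain `add fs` in each zone's configuration. A sketch with made-up paths (/export/data in the global zone, /data inside each zone); generating the command file is the only part that runs outside Solaris.

```shell
# zonecfg fragment to loopback-mount a global-zone directory read/write.
cat > /tmp/lofs.zonecfg <<'EOF'
add fs
set dir=/data
set special=/export/data
set type=lofs
add options rw
end
EOF
# Apply to each of the three zones (not run here), e.g.:
#   zonecfg -z zone1 -f /tmp/lofs.zonecfg
# A running zone can also get the mount immediately from the GZ:
#   mount -F lofs /export/data /zones/zone1/root/data
cat /tmp/lofs.zonecfg
```

Unlike the NFS-from-the-global-zone setup that corrupted Vladi's VxFS filesystem, all three lofs mounts go through the one in-kernel filesystem instance, so there is a single cache and lock state.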
Re: [zones-discuss] Weird Solaris 8 container problem (fwd)
Hey Rich, Looks like it is crashing in the jvm (in JIT code). There are also some jni libraries loaded and being called into by other threads. It would help to know what library is mapped in. I think support should be contacted. My first guess is that they are hitting either an old bug in the jvm, or whatever jni library they are using. This could be an s8c issue, but I'd have somebody look at the Java first. -Steve L.

On Tue, Jun 23, 2009 at 09:35:06AM -0700, Rich Teer wrote: Hi Steve, Here are the answers to your questions, as provided by my customer. I hope the attached pstack and mdb sessions get through unscathed! Cheers, -- Rich Teer, SCSA, SCNA, SCSECA URLs: http://www.rite-group.com/rich http://www.linkedin.com/in/richteer

-- Forwarded message -- Date: Thu, 18 Jun 2009 12:55:23 -0700 Thank you Rich, I am collecting the requested information. Can you please help me answer these questions better? Here are preliminary versions of the answers.

Do you mean you've tried both 1.2.2 and 1.5? Is the failure identical with both jvms? Answer: The server would not start up with Java 1.5. The Solaris native version of 1.2.2 crashes on the first (?) execution of the Java code.

The ::stack should be correct. Have you looked at the core from the global zone? It shouldn't matter, but it can't hurt. Also use pstack on the core. Answer: I am attaching the results of pstack and the MDB session to this email. I did not look at the core from the global zone. I am not sure how to do that.

Is it in hotspot (dynamically generated) code? If so the function names will just be hex. Is it dying in the jvm itself, or in jni code (java bindings to native code provided by Vantive)? I don't know how to answer this question. The server uses libjvm.so rather than creating a JVM session (process) using fork() or exec().

Is the application threaded? Yes, it is threaded. Thank you, Vlad P.S. Currently playing with MDB with a little success.
I don't know assembler. -Original Message- From: Rich Teer [mailto:rich.t...@rite-group.com] Sent: Thursday, June 18, 2009 10:14 AM To: Vladimir Ryzhov; Andy Woodward Subject: Re: [zones-discuss] Weird Solaris 8 container problem (fwd) Hi guys, Here's a response I got on the Zones mailing list about the weird Vantive crashes. Could you please reply to me with the answers to Steve's questions, and I'll forward them to the list. -- Rich Teer, SCSA, SCNA, SCSECA URLs: http://www.rite-group.com/rich http://www.linkedin.com/in/richteer

-- Forwarded message -- Date: Wed, 17 Jun 2009 17:59:04 -0700 From: Steve Lawrence stephen.lawre...@sun.com To: Rich Teer rich.t...@rite-group.com Cc: Zones discuss zones-discuss@opensolaris.org Subject: Re: [zones-discuss] Weird Solaris 8 container problem

On Wed, Jun 17, 2009 at 02:56:59PM -0700, Rich Teer wrote: Hi all, IHAC who is trying to run one of their applications in a Solaris 8 branded zone. The global OS is Solaris 10 5/09 and we're using Solaris 8 containers version 1.0.1, on a Sun Fire 280R server with a 750 MHz CPU and 6 GB of RAM. Although apparently quite temperamental, their app runs acceptably when run on Solaris 8 natively (i.e., S8 on bare metal rather than in a branded zone), but crashes very frequently when run in the branded container. Of course, the application source code is unavailable... :-( The application, called Vantive, talks to an Oracle 8.1.6 database, and is written in Java. The Vantive app ships with version 1.2.2 of the Java runtime, and we've tried version 1.5.

Do you mean you've tried both 1.2.2 and 1.5? Is the failure identical with both jvms?

Annoyingly, when we try trussing the errant process, it doesn't crash! When a crash does happen, a core dump usually occurs, but hasn't been too helpful. The crashes do seem to be happening from within the JVM, if the ::stack output from mdb is to be believed.

The ::stack should be correct. Have you looked at the core from the global zone?
It shouldn't matter, but it can't hurt. Also use pstack on the core. Is it in hotspot (dynamically generated) code? If so the function names will just be hex. Is it dying in the jvm itself, or in jni code (java bindings to native code provided by Vantive)? Does this ring any bells? Is there anything we can do to help debug this? Note that the branded zone seems to work just fine apart from this one (rather major) issue. No bells. The only Java problem I've seen was in 1.1.8, and it does not exist on java 1.2+. Best to contact support to debug. Could be a jvm (or other bug) that they were lucky enough to never hit on their native s8 system. Given the truss thing, it is likely a race/timing
Re: [zones-discuss] Probs with S8 Branded Zone on b114
On Fri, Jun 05, 2009 at 07:13:14AM -0700, Rich Teer wrote: On Thu, 4 Jun 2009, Steve Lawrence wrote: That's correct. You need only install the 1.0.1 SUNWs?brandk package for each, which enable the brand(s). Cool. So I don't need to install the packages under the 1.0 tree before I install the 1.0.1 package? No. Those are only used for s10u4 or s10u5 systems. You should already have them if you install s10u6 or later. -Steve -- Rich Teer, SCSA, SCNA, SCSECA URLs: http://www.rite-group.com/rich http://www.linkedin.com/in/richteer ___ zones-discuss mailing list zones-discuss@opensolaris.org
Re: [zones-discuss] Probs with S8 Branded Zone on b114
S8C and S9C do not run on SXCE or OpenSolaris. They can be hosted on Solaris 10, using any filesystem which supports zones, including zfs. -Steve L.

On Thu, Jun 04, 2009 at 12:06:10PM -0700, Rich Teer wrote: Hi all, I'm trying to install a Solaris 8 branded zone on a 280R running SXCE b114. I've downloaded the Solaris 8 zone software version 1.0.1 and installed the SUNWs8brandr and SUNWs8brandu packages from the 1.0/Product directory, followed by the SUNWs8brandk package from the 1.0.1/Product directory. All packages appeared to install OK, except for a warning about a missing S10 kernel patch, which I ignored because I'm running SXCE b114. :-) When I run the zonecfg command to create the branded zone, it all goes pear-shaped:

bash-3.2# zonecfg -z vantive_test
vantive_test: No such zone configured
Use 'create' to begin configuring a new zone.
zonecfg:vantive_test> create -t SUNWsolaris8
zonecfg:vantive_test> set zonepath=/zones/vantive_test
zonecfg:vantive_test> set autoboot=true
zonecfg:vantive_test> verify
vantive_test: unknown brand.
vantive_test: Invalid document

The zonepath will be living on ZFS, but I doubt that's important. Any clues gratefully received! Cheers, -- Rich Teer, SCSA, SCNA, SCSECA URLs: http://www.rite-group.com/rich http://www.linkedin.com/in/richteer ___ zones-discuss mailing list zones-discuss@opensolaris.org
Re: [zones-discuss] Probs with S8 Branded Zone on b114
On Thu, Jun 04, 2009 at 01:12:00PM -0700, Rich Teer wrote: On Thu, 4 Jun 2009, Steve Lawrence wrote: Hi Steve, S8C and S9C do not run on sxce or opensolaris. They can be hosted on Solaris 10, using any filesystem which supports zones, including zfs. Thanks for confirming the bad news :-( Would I be correct in thinking that S8C and S9C work fine in the latest version of Solaris 10, i.e., S10 5/09? That's correct. You need only install the 1.0.1 SUNWs?brandk package for each, which enable the brand(s). -Steve Cheers, -- Rich Teer, SCSA, SCNA, SCSECA URLs: http://www.rite-group.com/rich http://www.linkedin.com/in/richteer ___ zones-discuss mailing list zones-discuss@opensolaris.org
Re: [zones-discuss] Zone Stuck in a shutting_down state
I already tried killing the zoneadmd process and issuing the halt, and all it does is start back up the zoneadmd process and hang. I can't force a crashdump on the system since I can't take the box down. Bug 6272846 makes reference to NFS version 3 (which is the version we are using) and the client apparently leaking rnodes. Is there any way to verify this other than a forced crashdump? I might take a live core of the system and open a case to see if that yields anything.

The zone_ref 1 means that something in the kernel is holding the zone. You should be able to use mdb -k on the live system, and issue dcmds similar to the comments of 6272846. No need to force a crashdump or take a live crashdump. -Steve L.

Derek

On Wed, May 6, 2009 at 4:08 PM, Steve Lawrence stephen.lawre...@sun.com wrote: zsched is always unkillable. It will only exit when instructed to by zoneadmd. Is the remaining zone shutting down, or down? (zoneadm list -v). What is the ref_count on the zone?

# mdb -k
> ::walk zone | ::print zone_t zone_name zone_ref

If the refcount is greater than 0x1, it could be:

6272846 User orders zone death; NFS client thumbs nose

No workaround for this one. A crashdump would help investigate a zone_ref greater than 1. Is there a zoneadmd process for the given zone?

# pgrep -lf zoneadmd

If so, please provide truss -p pid of this process. You may also attempt killing this zoneadmd process (which lives in the global zone), and then re-attempt zoneadm -z zonename halt. Thanks, -Steve L. ___ zones-discuss mailing list zones-discuss@opensolaris.org
Re: [zones-discuss] Zone Stuck in a shutting_down state
Related comments from bug below (X'ed out some paths): The zone in question clearly has too many references:

> 030004a09680::print zone_t zone_ref
zone_ref = 0t11

Ten too many, to be precise. So what's holding onto the zone? Well, the rnode cache has 5 entries:

> ::kmem_cache ! grep rnode
030003a1e988 rnode_cache 00 640 572988
030003a20988 rnode4_cache 00 9840

> 030003a1e988::walk kmem | ::print rnode_t r_vnode | ::vnode2path
/opt/zones/z1/root/
/opt/zones/z1/root/
/opt/zones/z1/root/
/opt/zones/z1/root/
/opt/zones/z1/root/

even though no NFS filesystems are mounted:

> ::fsinfo
VFSP            FS     MOUNT
0187f420        ufs    /
0187f508        devfs  /devices
03315780        ctfs   /system/contract
033156c0        proc   /proc
03315600        mntfs  /etc/mnttab
03315480        tmpfs  /etc/svc/volatile
033153c0        objfs  /system/object
0300039987c0    namefs /etc/svc/volatile/repository_door
0300039984c0    fd     /dev/fd
030003a99e00    ufs    /var
030003998400    tmpfs  /tmp
030003a99680    tmpfs  /var/run
030003a98f00    namefs /var/run/name_service_door
030003a98b40    namefs /var/run/sysevent_channels/syseventd_channel...
030003a989c0    namefs /etc/sysevent/sysevent_door
030003a98780    namefs /etc/sysevent/devfsadm_event_channel/1
030003a98540    namefs /dev/.zone_reg_door
030003a983c0    namefs /dev/.devfsadm_synch_door
030003a99380    namefs /etc/sysevent/piclevent_door
0300044b1d80    namefs /var/run/picld_door
030003a99200    ufs    /opt
0300044b0700    namefs /var/run/zones/z1.zoneadmd_door

And as apparent from the path, all of those rnodes refer to zone z1 through their mntinfo structure:

> 030003a1e988::walk kmem | ::print rnode_t r_vnode->v_vfsp->vfs_data | ::print mntinfo_t mi_zone | ::zone
ADDR            ID  NAME  PATH
030004a09680    1   z1    /opt/zones/z1/root/
030004a09680    1   z1    /opt/zones/z1/root/
030004a09680    1   z1    /opt/zones/z1/root/
030004a09680    1   z1    /opt/zones/z1/root/
030004a09680    1   z1    /opt/zones/z1/root/

So if each of those rnodes has two holds on the zone, then that accounts for all of the extra holds exactly.
On Wed, May 06, 2009 at 09:04:54PM -0500, Derek McEachern wrote: I don't believe that I can see the comments since they are not public. Is that something you can pass along?

On Wed, May 6, 2009 at 5:27 PM, Steve Lawrence stephen.lawre...@sun.com wrote: I already tried killing the zoneadmd process and issuing the halt, and all it does is start back up the zoneadmd process and hang. I can't force a crashdump on the system since I can't take the box down. Bug 6272846 makes reference to NFS version 3 (which is the version we are using) and the client apparently leaking rnodes. Is there any way to verify this other than a forced crashdump? I might take a live core of the system and open a case to see if that yields anything.

The zone_ref 1 means that something in the kernel is holding the zone. You should be able to use mdb -k on the live system, and issue dcmds similar to the comments of 6272846. No need to force a crashdump or take a live crashdump. -Steve L.

Derek

On Wed, May 6, 2009 at 4:08 PM, Steve Lawrence stephen.lawre...@sun.com wrote: zsched is always unkillable. It will only exit when instructed to by zoneadmd. Is the remaining zone shutting down, or down? (zoneadm list -v). What is the ref_count on the zone?

# mdb -k
> ::walk zone | ::print zone_t zone_name zone_ref

If the refcount is greater than 0x1, it could be:

6272846 User orders zone death; NFS client thumbs nose

No workaround for this one. A crashdump would help investigate a zone_ref greater than 1. Is there a zoneadmd process for the given zone?

# pgrep -lf zoneadmd

If so, please provide truss -p pid of this process. You may also attempt killing this zoneadmd process (which lives in the global zone), and then re-attempt zoneadm -z zonename halt. Thanks, -Steve L.
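The live-system inspection Steve describes can be scripted so no interactive mdb session (and no crashdump) is needed. A sketch using the dcmds quoted from the bug; only writing the script file runs here, since the mdb invocation needs root on the affected Solaris host.

```shell
# Collect zone refcounts and rnode-cache evidence in one pass.
cat > /tmp/zoneref.mdb <<'EOF'
::walk zone | ::print zone_t zone_name zone_ref
::kmem_cache ! grep rnode
EOF
# On the affected global zone (not run here):
#   mdb -k < /tmp/zoneref.mdb
# A zone_ref above 0t1 on the stuck zone points at leaked kernel holds,
# e.g. the NFSv3 rnodes described in bug 6272846.
cat /tmp/zoneref.mdb
```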
Re: [zones-discuss] Zone in a pset with high load generating high packet loss at the frame level
On Thu, Mar 05, 2009 at 01:22:25PM -0500, Jeff Victor wrote: Thanks for the great feedback, Gael. Comments below.

On Thu, Mar 5, 2009 at 11:00 AM, Gael gael.marti...@gmail.com wrote: On Wed, Mar 4, 2009 at 9:06 AM, Jeff Victor jeff.j.vic...@gmail.com wrote: Some questions: 1. Do you use set pool= anymore, now that the dedicated-cpu feature exists?

We've got over one hundred physical frames running zones here, covering nearly all versions of Solaris 10; we are currently sticking to set pool until we can get the whole environment upgraded. Before that, we cannot afford to have the whole team of admins handling zones differently depending on the OS version. Headache...

It is now clear to me that this feature would need to support disabling interrupts when a zone uses set pool=. Currently, all pool attributes are configured using the pool tools (poolcfg, pooladm) and I don't see any reason not to continue. When I write this up, it will fulfill that need.

Are you proposing that we add support for pset-interrupt disposition config to the pools framework? Such as a boolean property on a pool-pset: pset.interrupts = false? I think the right solution for pool= is this or similar. It could also be a string value, such as:

none: no interrupts handled on cpus in the pool-pset.
zone: device interrupts for bound zones are serviced.
any: any device interrupts can be dispatched to the pset.

Zonecfg could make use of these pool-pset properties to implement the desired behavior for dedicated-cpu. The default value should be any. zonecfg should set zone for all dedicated-cpu zones. zoneadm could warn if pool= is set, the zone has dedicated devices, and the pset for that pool has not been configured to be zone. Legacy psets (psrset) could be extended to support this property via some new flags. The other part of this is how to reconcile zonecfg and/or pools settings for interrupts with device-cpu mappings that are specified via dladm.
Currently, dladm allows the specification of a list of cpu ids. Another way to approach this would be to point dladm directly at the desired pool. -Steve

2. Is it sufficient to simply disable interrupts on a zone's pset?

In our case, we do psets only when licensing requires it (aka Oracle, DataStage, Sybase, Borland apps) or when the applications behave poorly and we keep hearing that for lack of budget/resources the issue cannot be addressed, and without direct impact on the business itself, nothing will change.

Gael, I realized that my question was vague. When you use a pool, you're using a pset. Do you mean that you only use pools and psets when licensing requires it? Also, I couldn't tell how the comment responded to the question.

What about creating an I/O pset, and then disabling interrupts on everything else while using it as an FSS pool or pset pools? Very similar to LDoms, I would think...

Yes, that occurred to me, too. You can do that now, either with a pset that's being used by a zone or with the default pset. But I'm not convinced there's enough reason to separate an I/O pset from the default pset. There's great potential for wasted CPU cycles. --JeffV ___ zones-discuss mailing list zones-discuss@opensolaris.org
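Until something like the pset.interrupts property exists (it is only a proposal in this thread, not a shipped feature), the manual route Jeff alludes to is to find which CPUs ended up in the zone's pset and mark them no-intr with psradm. A sketch with assumed CPU ids 2 and 3; only writing the helper script runs here, the commands themselves need a Solaris global zone.

```shell
# Helper for today's workaround: disable interrupt handling on the CPUs
# that pooladm assigned to the zone's pset (cpu ids 2 and 3 assumed).
cat > /tmp/pset-nointr.sh <<'EOF'
#!/bin/sh
# psrset with no arguments shows pset membership; then, per CPU:
psradm -i 2 3     # no-intr: still runs LWPs, no device interrupts
psrinfo           # verify the CPUs now report no-intr status
EOF
chmod +x /tmp/pset-nointr.sh
cat /tmp/pset-nointr.sh
```

Note this sets the cpu.status behavior Jeff describes by hand; nothing re-enables interrupts automatically when the pset is destroyed, so the script's effect must be undone with psradm -n.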
Re: [zones-discuss] Zone in a pset with high load generating high packet loss at the frame level
On Thu, Mar 05, 2009 at 04:12:19PM -0500, Jeff Victor wrote: On Thu, Mar 5, 2009 at 1:48 PM, Steve Lawrence stephen.lawre...@sun.com wrote: On Thu, Mar 05, 2009 at 01:22:25PM -0500, Jeff Victor wrote: On Thu, Mar 5, 2009 at 11:00 AM, Gael gael.marti...@gmail.com wrote: On Wed, Mar 4, 2009 at 9:06 AM, Jeff Victor jeff.j.vic...@gmail.com wrote: Some questions: 1. Do you use set pool= anymore, now that the dedicated-cpu feature exists?

It is now clear to me that this feature would need to support disabling interrupts when a zone uses set pool=. Currently, all pool attributes are configured using the pool tools (poolcfg, pooladm) and I don't see any reason not to continue. When I write this up, it will fulfill that need.

Are you proposing that we add support for pset-interrupt disposition config to the pools framework? Such as a boolean property on a pool-pset: pset.interrupts = false?

The short answer is yes. BobN and I came to the same conclusion just a few hours ago... :-) CPUs already have cpu.status, which can be on-line, no-intr (LWPs but no interrupt handlers), or off-line (no LWPs but still able to handle interrupts). A pset.interrupts field would allow Solaris to set cpu.status on CPUs as they enter the pset. Zones could then use that so we can increase their isolation. When a CPU re-enters the default pset, it becomes able to handle interrupts again. When needed, intrd will give it one (or more).

I think the right solution for pool= is this or similar. It could also be a string value, such as:

none: no interrupts handled on cpus in the pool-pset.
zone: device interrupts for bound zones are serviced.
any: any device interrupts can be dispatched to the pset.

I don't see how we could do zone in all situations - there isn't a 1:1 mapping between zone and device (except for exclusive-IP). Imagine zoneA and zoneB on a pset (psetAB) with pset.interrupts=zone. Further, zoneA and zoneC share e1000g0, but zoneB doesn't. Finally, zoneC has its own pset.
Where does the interrupt handler for e1000g0 go - psetAB or psetC?

I was thinking in the exclusive case. For shared stack zones, the devices would all be bound to the global zone's (aka default) pset.

Or are you suggesting that interrupts from one device can be intercepted and diverted to a CPU associated with a specific pset, based on which process the interrupt is/should be associated with?

No, although I'm not sure what is configurable for vnics. It may be possible for shared stack zones using an exclusive vnic (not an exclusive stack) to have some of the vnic workload bound to its pset.

Or am I misunderstanding the description of zone? Zonecfg could make use of these pool-pset properties to implement the desired behavior for dedicated-cpu.

Exactly.

The default value should be any. zonecfg should set zone for all dedicated-cpu zones. zoneadm could warn if pool= is set, the zone has dedicated devices, and the pset for that pool has not been configured to be zone.

The only devices we can be sure are dedicated for the boot-session of a zone are NICs. So this whole segregate-the-interrupts-per-zone/pset combo will be limited at best. It would be nice if we could generalize it like you say, but I don't think it's workable yet.

Agreed. This is really just for network devices at this point.

Legacy psets (psrset) could be extended to support this property via some new flags. The other part of this is how to reconcile zonecfg and/or pools settings for interrupts with device-cpu mappings that are specified via dladm. Currently, dladm allows the specification of a list of cpu ids. Another way to approach this would be to point dladm directly at the desired pool.

Which build are you currently on? :-) I'm on NV94 and I don't see anything like that in dladm(1M).

Crossbow went into 105.
http://blogs.sun.com/nitin/entry/resource_allocation_for_network_processing

I'm beginning to think this is really a two-phase project:
* Phase 1: make it easier to disable interrupts on a zone's pset (one configured with the pool property or dedicated-cpu resource)
* Phase 2: optimize this by enabling a zone's pset to handle interrupts from a device which is exclusively bound to this zone.

As long as phase one is compatible with phase two, meaning that cases such as this one are properly defined:
1. pool mypool has property interrupts=disabled.
2. Zone has pool=mypool
3. Zone property stating to bind network interrupts to pool.

One solution would be to allow this config, and bind the net interrupts to mypool anyway. Another would be to only allow auto-net-binding in zonecfg when using dedicated-cpu.

I think that most people that need any of this only need Phase 1.

Agreed. Philosophically, shifting interrupt handlers into the default pset is consistent
Re: [zones-discuss] Capped-Memory - swap physical? (was: Failing to NFS mount on non-global zone)
Your config is basically what you want. The zone will be able to reserve up to 6gb of memory, 4 of which may reside in physical memory. The 4gb of ram is not reserved for the zone, so the zone could get less RAM due to global memory pressure, and use more than 2gb of disk swap. For instance, the zone could get 3gb of ram and 3gb of disk swap (for a total of 6gb of virtual memory). -Steve L.

On Mon, Feb 23, 2009 at 09:06:13PM +0100, Bernd Schemmer wrote: Hi, So if we want to configure a zone with 4 GB physical memory and max 2 GB swap, the values have to be: capped-memory: physical: 4G [swap: 6G] Is that correct? I just reread the documentation about swap, and from that it's not clear to me that swap in the zone configuration is used that way. regards Bernd

Steve Lawrence wrote: Swap limits how much of the system's total memory (ram + disk) can be reserved. When this limit is hit, allocations, such as malloc, will fail. Physical memory limits resident memory. When this limit is hit, the zone will page pages in memory to disk swap. In general, your example config is only useful if the zone uses a lot of physical memory, but does not reserve as much swap. An example is an application which maps a large on-disk file into memory. No swap is needed for the file (because the file can be paged back to the filesystem), but a large amount of physical memory may be needed to pull the file into RAM. Such applications are rare, so your example config is not often used. You're basically right in saying that this config does not make any sense in most cases. -Steve L.

On Fri, Feb 20, 2009 at 08:36:20PM +0100, Alexander Skwar wrote: Hi! On Fri, Feb 20, 2009 at 17:50, Asif Iqbal vad...@gmail.com wrote: capped-memory: physical: 1G [swap: 512M] A question regarding this setting - does that setting really make sense? I suppose he tries to achieve that the zone uses a max. of 1G of real memory and no more than 512M of swap. But does it really do that?
Or is he rather limiting the amount of allocable mem to 512M? Alexander -- [ Soc. = http://twitter.com/alexs77 | http://www.plurk.com/alexs77 ] [ Mehr = http://zyb.com/alexws77 ] [ Chat = Jabber: alexw...@jabber80.com | Google Talk: a.sk...@gmail.com ] [ Mehr = MSN: alexw...@live.de | Yahoo!: askwar | ICQ: 350677419 ] ___ zones-discuss mailing list zones-discuss@opensolaris.org -- Bernd Schemmer, Frankfurt am Main, Germany http://home.arcor.de/bnsmb/index.html Sooner rather than later, the world will change. Fidel Castro ___ zones-discuss mailing list zones-discuss@opensolaris.org
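As a sketch of the configuration discussed in this thread (4 GB resident cap, 6 GB total virtual memory), the zonecfg session would look roughly like this. The zone name `z1` is hypothetical; `physical` is enforced by rcapd paging the zone out, while `swap` caps total virtual-memory reservation (RAM + disk swap), so the zone is not guaranteed 4 GB of RAM and is not limited to exactly 2 GB of disk swap:

```shell
# Run from the global zone; zone name "z1" is illustrative.
zonecfg -z z1
zonecfg:z1> add capped-memory
zonecfg:z1:capped-memory> set physical=4g
zonecfg:z1:capped-memory> set swap=6g
zonecfg:z1:capped-memory> end
zonecfg:z1> commit
zonecfg:z1> exit
```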
Re: [zones-discuss] Update on attach and upgrades
On Thu, Nov 06, 2008 at 10:20:43AM -0500, Dr. Hung-Sheng Tsao (LaoTsao) wrote: anyone know when the brandz for s10 will be out? e.g. running s10 with opensolaris zone? No target has been set for this. We cannot reasonably manage such a project until s10 begins taking less change. The current understanding is the need for such a feature will coincide with the release of an enterprise version of opensolaris, or an early update (6 months?) to an enterprise opensolaris-based release. -Steve L. Jerry Jelinek wrote: Mike Gerdts wrote: On Thu, Nov 6, 2008 at 8:16 AM, Jerry Jelinek [EMAIL PROTECTED] wrote: Henrik Johansson wrote: The easiest way would probably be to identify packages that are not to be updated; in my experience packages do not differ that much between local zones in production environments, but that is only based on the systems I have worked with. I always keep zones as similar as possible, but full zones still leave the possibility to make some changes to the packages and patches in case it's necessary. Unfortunately we have no way to know which pkgs you deliberately want to be different between the global and non-global zone and which you want to be in sync. That's why a list where the user could control that would be needed. Isn't that the purpose of pkgadd -G? -G Add package(s) in the current zone only. When used in the global zone, the package is added to the global zone only and is not propagated to any existing or yet-to-be-created non-global zone. When used in a non-global zone, the package(s) are added to the non-global zone only. This option causes package installation to fail if, in the pkginfo file for a package, SUNW_PKG_ALLZONES is set to true. See pkginfo(4). A package added to the global zone with pkgadd -G should not be upgraded in the non-global zone. The problem is when you look at a zone, how do you know what to sync with the global zone?
For example, if you have a whole-root zone, that means you've explicitly decided you want the ability to manage pkgs in /usr, etc. independently of the global zone. With a true upgrade, those pkgs that are part of the release are upgraded anyway. What do we do with a zone migration? What pkg metadata do we have inside the zone to tell us which pkgs to sync and which not to? Jerry ___ zones-discuss mailing list zones-discuss@opensolaris.org
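The pkgadd -G behavior quoted above can be illustrated with a short session. The package name `SUNWfoo` and the datastream directory are hypothetical; `pkgparam` is the standard way to inspect a package's pkginfo parameters:

```shell
# Install a package into the current zone only. This fails if the
# package sets SUNW_PKG_ALLZONES=true in its pkginfo file.
pkgadd -G -d /var/spool/pkg SUNWfoo

# Check whether an installed package must be kept identical in all
# zones (prints "true" for such packages).
pkgparam SUNWfoo SUNW_PKG_ALLZONES
```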
Re: [zones-discuss] Confirming Zone running Container
The other way that the global zone identity normally leaks through to the non-global zones is through the system's hostid. So if you compare the output of `/usr/bin/hostid` with `for e in $allglobalzones ; do ssh $e /usr/bin/hostid ; done`, you can easily see which global zone matches your local. That's also a way for your application administrators (using application-level clustering) to verify that they are not running on the same physical node. If their hostids are different, they're different. This is not always reliable. S8 and S9 branded zones have configurable hostids, and in the future this is likely to be available for native zones. -Steve L. ___ zones-discuss mailing list zones-discuss@opensolaris.org
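The comparison described above could be scripted roughly as follows. The global-zone hostnames are placeholders, and per the caveat in the message, branded zones with configurable hostids can defeat this check:

```shell
#!/bin/sh
# Compare the local hostid against each candidate global zone's hostid.
# Hostnames gz1..gz3 are assumptions for illustration.
local_id=$(/usr/bin/hostid)
for h in gz1 gz2 gz3; do
    remote_id=$(ssh "$h" /usr/bin/hostid)
    if [ "$remote_id" = "$local_id" ]; then
        echo "$h reports the same hostid ($local_id) -- likely our global zone"
    fi
done
```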
Re: [zones-discuss] Oracle/RMAN in a Zone
I think you need this or later: http://www-01.ibm.com/support/docview.wss?rs=666&context=SSTFZR&uid=swg21254543&loc=en_US&cs=utf-8&lang=en Some ibm docs: http://publib.boulder.ibm.com/infocenter/tivihelp/v1r1/index.jsp?topic=/com.ibm.itsmreadme.doc/readme_server541.html I'm not sure if the tivoli device driver protects against multiple zones accessing the same tape device at the same time, or if that is up to the admin to enforce. -Steve L. On Mon, Sep 22, 2008 at 12:57:54PM -0700, Arif Khan wrote: Is anyone out there? Can someone please point me in the right direction, any other alias that I should post this to? Thanks Arif On Sep 19, 2008, at 10:01 AM, Arif Khan wrote: Hi, Please let me know if there is another alias that I should post this to. (already tried [EMAIL PROTECTED] and it doesn't exist) My Customer is trying to test LAN-Free backups using Tivoli Data Protection (TDP) for Oracle/RMAN in a zone. First, is this supported by Sun and Oracle, and secondly, do we have any documentation on setting this up? Thanks Arif ___ zones-discuss mailing list zones-discuss@opensolaris.org
Re: [zones-discuss] zones/SNAP design
During the zone installation and after the zone is installed, the zone's ZBE1 dataset is explicitly mounted by the global zone onto the zone root (note, the dataset is a ZFS legacy mount so the zones infrastructure itself must manage the mounting. It uses the dataset properties to determine which dataset to mount, as described below.): e.g. # mount -F zfs rpool/export/zones/z1/rpool/ZBE1 /export/zones/z1/root The rpool dataset (and by default, its child datasets) will be implicitly delegated to the zone. That is, the zonecfg for the zone does not need to explicitly mention this as a delegated dataset. The zones code must be enhanced to delegate this automatically: Is there any requirement to have a flag to disallow a zone from doing zfs/BE operations? I'm not sure when an admin may want to make this restriction. rpool/export/zones/z1/rpool Once the zone is booted, running a sw management operation within the zone does the equivalent of the following sequence of commands: 1) Create the snapshot and clone # zfs snapshot rpool/export/zones/z1/rpool/ZBE1@<snapshot> # zfs clone rpool/export/zones/z1/rpool/ZBE1@<snapshot> \ rpool/export/zones/z1/rpool/ZBE2 2) Mount the clone and install sw into ZBE2 # mount -F zfs rpool/export/zones/z1/rpool/ZBE2 /a 3) Install sw 4) Finish # umount /a Within the zone, the admin then makes the new BE active by the equivalent of the following sequence of commands: # zfs set org.opensolaris.libbe:active=off rpool/export/zones/z1/rpool/ZBE1 # zfs set org.opensolaris.libbe:active=on rpool/export/zones/z1/rpool/ZBE2 Note that these commands will not need to be explicitly performed by the zone admin. Instead, a utility such as beadm does this work (see issue #2). Inside a zone, beadm should fix this. From the global zone, beadm should be able to fix a (halted?) zone in this state so that it may be booted.
I think this means that the global zone should be able to do some explicit beadm operations on a zone (perhaps only when it is halted?), in addition to the automatic ones that happen when the GBE is manipulated. When the zone boots, the zones infrastructure code in the global zone will look for the zone's dataset that has the org.opensolaris.libbe:active property set to on and explicitly mount it on the zone root, as with the following commands to mount the new BE based on the sw management task just performed within the zone: # umount /export/zones/z1/root # mount -F zfs rpool/export/zones/z1/rpool/ZBE2 /export/zones/z1/root Note that the global zone is still running GBE1 but the non-global zone is now using its own ZBE2. If there is more than one dataset with a matching org.opensolaris.libbe:parentbe property and the org.opensolaris.libbe:active property set to on, the zone won't boot. Likewise, the zone won't boot if none of the datasets have this property set. When global zone sw management takes place, the following will happen. Only the active zone BE will be cloned. This is the equivalent of the following commands: # zfs snapshot -r rpool/export/zones/z1/rpool@<snapshot> # zfs clone rpool/export/zones/z1/rpool/ZBE2@<snapshot> rpool/export/zones/z1/rpool/ZBE3 (Note that this is using the zone's ZBE2 dataset created in the previous example to create a zone ZBE3 dataset, even though the global zone is going from GBE1 to GBE2.) When the global zone BE is activated and the system reboots, the zone root must be explicitly mounted by the zones code: # mount -F zfs rpool/export/zones/z1/rpool/ZBE3 /export/zones/z1/root Note that the global zone and non-global zone BE names move along independently as sw management operations are performed in the global and non-global zone and the different BEs are activated, again by the global and non-global zone. One concern with this design is that the zone has access to its datasets that correspond to a global zone BE which is not active.
The zone admin could delete the zone's inactive BE datasets which are associated with a non-active global zone BE, causing the zone to be unusable if the global zone boots back to an earlier global BE. One solution is for the global zone to turn off the zoned property on the datasets that correspond to a non-active global zone BE. However, there seems to be a bug in ZFS, since these datasets can still be mounted within the zone. This is being looked at by the ZFS team. If necessary, we can work around this by using a combination of a mountpoint along with turning off the canmount property, although a ZFS fix is the preferred solution. Another concern is that the zone must be able to promote one of its datasets that is associated with a non-active global zone BE. This can occur if the global zone boots back to one of its earlier BEs. This would then cause an earlier non-global zone BE to become the active BE for that zone. If the zone then
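Pulling the command sequences of this design together, the in-zone BE update cycle sketched above is roughly the following. Dataset names follow the examples in the proposal; the snapshot name `pre-update` is illustrative, since the design does not fix one:

```shell
# 1. Snapshot the active zone BE and clone it into a new BE.
zfs snapshot rpool/export/zones/z1/rpool/ZBE1@pre-update
zfs clone rpool/export/zones/z1/rpool/ZBE1@pre-update \
    rpool/export/zones/z1/rpool/ZBE2

# 2. Mount the clone, install software into it, then unmount.
mount -F zfs rpool/export/zones/z1/rpool/ZBE2 /a
# ... software installation under /a ...
umount /a

# 3. Mark the new BE active via the dataset properties that the
#    zones infrastructure reads at zone boot.
zfs set org.opensolaris.libbe:active=off rpool/export/zones/z1/rpool/ZBE1
zfs set org.opensolaris.libbe:active=on rpool/export/zones/z1/rpool/ZBE2
```

On the next zone boot, the global zone's infrastructure mounts whichever ZBE carries active=on onto the zone root.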
Re: [zones-discuss] [smf-discuss] 6725004 - installing single-user-mode patches automatically
On Thu, Aug 21, 2008 at 12:54:14PM -0700, Jordan Brown wrote: [ Which brain-dead mail client turns all of the spaces in the Subject into tabs? ] Zones folks: the current proposed answers to this problem involve moving system/filesystem/local into milestone/single-user. That was apparently considered and rejected as the answer for the patchadd problem that resulted in the fix that brought us here. Can you offer any insight into why that change was rejected? I assume you are targeting this change for s10. The single-user milestone is intended to mimic the traditional unix run-level 1 (S?) This is typically where an admin would run stuff like fsck (on filesystems that are not yet mounted). I don't think it is ok to change this behavior in a patch. I'm not sure I understand all the details of the problem you are trying to solve. For example, I thought it was desired that the patch service run during a boot to all, but then I saw a follow-up mail stating that the patch service should not run in this case, and something about the user explicitly booting to single user. I don't think I know what the use cases are. You may want to draft a brief ARC fastrack describing the desired behavior(s), and the issues, and perhaps proposed solutions. Getting it all on one page will facilitate a solution. -Steve L. ___ zones-discuss mailing list zones-discuss@opensolaris.org
Re: [zones-discuss] [smf-discuss] 6725004 - installing single-user-mode patches automatically
The list of use cases is really pretty simple: 1) Administrator has in hand a patch that says install in single user mode. What does this administrator do? The answer seems self-evident: take the system to single-user mode (either by booting the system in single-user mode using boot -s or boot -m milestone/single-user, or dropping the system to single-user mode using init s or svcadm milestone milestone/single-user) and install the patch using patchadd. 2) An automated tool has in hand a patch that says install in single user mode. What does it do? How about: A. Make patchadd verify that the system is in single user milestone when installing a single-user patch. B. If patchadd discovers that it needs to patch a zone, patchadd should first make sure the zone's zonepath is properly mounted. An overkill for this could be to issue a svcadm enable -srt filesystem/local IF patchadd is not being run from the context of an SMF service, otherwise, fail. (sorry, no patchadd from smf services or rc*.d scripts). An alternate solution is to fail patchadd with a message stating that filesystem/local must be enabled to install the patch due to the installed zones. The admin could then do as instructed. C. (2) above will need to somehow set the milestone to single-user, wait until single user is reached, and then do the patchadd, which will do A and B. This automated tool could also do the: svcadm enable -rt filesystem/local If B fails due to the alternate solution. The automation tool could also enable filesystem/local in cases where the patchadd version on the system does not have this functionality. For simplicity, perhaps just always enable filesystem/local in the automation tool after single-user is reached. I think to implement (2), at some point you are going to need to fork off some asynchronous process which changes the milestone, waits, and then adds the patch, potentially also enabling filesystem/local before patching if needed (or just always). -Steve L. 
It is when we start to look at solutions that the problem becomes more difficult. ___ zones-discuss mailing list zones-discuss@opensolaris.org
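The automated sequence sketched in (2)/(C) above could look roughly like the following. This assumes patchadd gains the zonepath check described; the patch ID and paths are hypothetical:

```shell
#!/bin/sh
# Drop the system to the single-user milestone, then wait for it.
svcadm milestone milestone/single-user
while [ "$(svcs -H -o state milestone/single-user)" != "online" ]; do
    sleep 5
done

# Zones live on local filesystems, so bring those up before patching
# (temporary, with required dependencies, synchronously).
svcadm enable -srt svc:/system/filesystem/local:default

# Install the single-user patch.
patchadd /var/tmp/123456-01
```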
Re: [zones-discuss] [smf-discuss] 6725004 - installing single-user-mode patches automatically
On Thu, Aug 21, 2008 at 04:01:43PM -0700, Jordan Brown wrote: Steve Lawrence wrote: A. Make patchadd verify that the system is in single user milestone when installing a single-user patch. That's a non-starter. *Many* of our customers ignore our recommendation to install patches in single-user mode, and will revolt if we attempt to enforce it. In addition, for many patches the single-user mode recommendation is only the first approximation, primarily intended for automata. If a human is installing the patch, it may be acceptable to install it after manually shutting down the affected services. That seems completely unsupportable, but ok. Admins are used to getting away with not following the patch README or suggested procedure. (A) could be dropped without much impact to the solution. B. If patchadd discovers that it needs to patch a zone, patchadd should first make sure the zone's zonepath is properly mounted. An overkill for this could be to issue a svcadm enable -srt filesystem/local IF patchadd is not being run from the context of an SMF service, otherwise, fail. (sorry, no patchadd from smf services or rc*.d scripts). No patchadd from smf services or rc*.d scripts means no automated installation of single-user patches. That's a non-starter. Your final comment addresses this. Post filesystem/local, it is safe for SMF services to call patchadd. An alternate solution is to fail patchadd with a message stating that filesystem/local must be enabled to install the patch due to the installed zones. The admin could then do as instructed. Also a killer for automated installation. C. (2) above will need to somehow set the milestone to single-user, wait until single user is reached, and then do the patchadd, which will do A and B. This automated tool could also do the: svcadm enable -rt filesystem/local If B fails due to the alternate solution. 
The automation tool could also enable filesystem/local in cases where the patchadd version on the system does not have this functionality. For simplicity, perhaps just always enable filesystem/local in the automation tool after single-user is reached. I think to implement (2), at some point you are going to need to fork off some asynchronous process which changes the milestone, waits, and then adds the patch, potentially also enabling filesystem/local before patching if needed (or just always). I'm not happy with doing this stuff outside the bounds of SMF, or with approaches where the user is offered a single-user login while the automated tools are installing patches in the background and will asynchronously reboot the system. I don't think either is necessary. Call this requirement (no login prompt) out in your use case. I assume the patch service will patch, set the boot milestone, and reboot before the patch milestone is actually met, avoiding the maintenance prompt. Definitely get some console messages out of the patch-service so folks don't think their boot is hung and freak out. :) My favorite approach is, approximately: 1) Move system/filesystem/local into milestone/single-user. Note that this alone addresses the issues for interactive patchadd. 2) Define milestone/patching, dependent on milestone/single-user. 3) Define a new patching service (or services), dependent on milestone/single-user and depended on by milestone/patching. 4) When patch automation needs to install a single-user patch, have it boot the system to milestone/patching. Or just set the boot milestone if patching is deferred to next reboot. 5) When the patch services are done with their work, have them let the system come up to its default milestone, or reboot it to its default milestone, as required. There are approximately two tricky parts to this puzzle: a) How does the patch tool reboot the system to milestone/patching? 
It could use reboot -- -m milestone/patching, but that would mean that the patching work wouldn't get done if the reboot was done through other mechanisms. It could set the system default milestone, but then how would it determine the milestone to set the system back to when patching was complete? Neither answer is pretty, but either is workable. I'm sure you could write the old boot milestone down somewhere. If the admin modifies the milestone after the patch tool sets it to milestone-patching, the patch-on-next-boot will just get clobbered. I suppose the patch-service could then reinstate it on the next boot, and hope to get it on the subsequent boot. I suppose being inside the bounds of SMF also makes the implementation vulnerable to other admins manipulating SMF. The above issue is basically by design. b) How do the patching services *avoid* running when the system is coming up normally - even if they have work to do? Probably
Re: [zones-discuss] [smf-discuss] 6725004 - installing single-user-mode patches automatically
So you want to be able to interrupt any boot to any milestone, and instead do the patch processing if a patch is pending. You basically want to interrupt the current milestone, and instead just boot to filesystem-local and do the patching. The question is, can the smf milestone be changed mid-milestone? My test shows that it can. How about: 1. Create patch-test-service, on which single-user depends. This will svcadm milestone patch-install-milestone if a patch needs to be installed. This service is always enabled. 2. Create patch-install-milestone, which depends on patch-install-service below. 3. Create patch-install service, which depends on: single-user filesystem-local This service is always enabled. It will install a patch if it is pending, otherwise, do nothing. If the service fails, it might need to: # svcadm milestone single-user So that a maintenance prompt will appear on the console. This might not be necessary; you might get this anyway, as console-login is not reached. It should be ok to issue smf commands from an smf service, as long as they do not try to do any synchronous operations (-s). This approach is also good because an explicit boot to single user WILL NOT attempt to install pending patches. Disabling the patch-test and patch-install services will disable the automatic installation of pending patches on reboot. Thoughts? -Steve L. On Tue, Aug 19, 2008 at 01:03:55PM -0700, Jordan Brown wrote: Bob Netherton wrote: And further refinement would only impact patching rather than the booting process as a whole. Hmm. I don't know how to have a service that runs when a particular milestone is selected, that *doesn't* run when all is selected. (Other than by dynamically enabling and disabling it.) rc scripts doing things with SMF seem a permanent solution to a temporary problem. In my virtual universe there are no rc scripts :) And then the alarm clock goes off and I return to reality. 
But it does promote rc hackery rather than fixing the problem in SMF where it belongs. Agreed. Besides, I believe that SMF is locked while rc scripts are running, and that any attempt to manipulate it deadlocks. There are related schemes that could work, but the problem is getting them properly sequenced into system startup. reboot -- -m milestone=patchinstall seems elegantly simple. Plausible, though it doesn't exactly fit the current application usage model. At the moment, the reboot might or might not be triggered by the patching application. The patching application leaves the system set up to do the patching at the next shutdown/reboot, whenever that might be. (For SunUC-S's shutdown-time processing, it's required that the reboot be via the clean mechanisms - init, shutdown - so that the processing gets done.) This scheme would require either 1) having the patch application set the default milestone, and then having the startup-time processing set it back, or 2) having the patch application do the reboot. There's still the issue with how to keep this service from running when you boot to all. Hmm. How does single-user-mode login work? What stops it from running on a normal boot? Is it a special case? --- BTW: I'm not in a position to commit the patch applications. I'm in the middle here because I'm relatively familiar with all of the players and the issues, but in my day job I'm not responsible for *any* of them. ___ zones-discuss mailing list zones-discuss@opensolaris.org
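A start method for the patch-test-service proposed in this thread might be sketched as follows. The pending-patch marker file and the milestone FMRI are hypothetical names for illustration; the key point, per the discussion, is that the milestone change is issued asynchronously (no -s) from inside the SMF method:

```shell
#!/sbin/sh
# Hypothetical start method for patch-test-service: if a patch is
# pending, redirect the boot to the patch-install milestone instead
# of letting it proceed to the default milestone.
. /lib/svc/share/smf_include.sh

if [ -f /var/sadm/patch/pending ]; then
    # Asynchronous milestone change; -s must not be used from
    # within an SMF method.
    svcadm milestone svc:/milestone/patch-install:default
fi

exit $SMF_EXIT_OK
```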
Re: [zones-discuss] [smf-discuss] 6725004 - installing single-user-mode patches automatically
2. Create patch-install-milestone, which depends on patch-install-service below. The patch-install-milestone could also depend on single-user and filesystem-local so that it is generally useful for admins manually installing patches as well, even if they don't have the patch-test and patch-install services, but want to safely install patches manually when they have zones on local filesystems. -Steve L. On Tue, Aug 19, 2008 at 01:03:55PM -0700, Jordan Brown wrote: Bob Netherton wrote: And further refinement would only impact patching rather than the booting process as a whole. Hmm. I don't know how to have a service that runs when a particular milestone is selected, that *doesn't* run when all is selected. (Other than by dynamically enabling and disabling it.) rc scripts doing things with SMF seem a permanent solution to a temporary problem. In my virtual universe there are no rc scripts :) And then the alarm clock goes off and I return to reality. But it does promote rc hackery rather than fixing the problem in SMF where it belongs. Agreed. Besides, I believe that SMF is locked while rc scripts are running, and that any attempt to manipulate it deadlocks. There are related schemes that could work, but the problem is getting them properly sequenced into system startup. reboot -- -m milestone=patchinstall seems elegantly simple. Plausible, though it doesn't exactly fit the current application usage model. At the moment, the reboot might or might not be triggered by the patching application. The patching application leaves the system set up to do the patching at the next shutdown/reboot, whenever that might be. (For SunUC-S's shutdown-time processing, it's required that the reboot be via the clean mechanisms - init, shutdown - so that the processing gets done.) This scheme would require either 1) having the patch application set the default milestone, and then having the startup-time processing set it back, or 2) having the patch application do the reboot. 
There's still the issue with how to keep this service from running when you boot to all. Hmm. How does single-user-mode login work? What stops it from running on a normal boot? Is it a special case? --- BTW: I'm not in a position to commit the patch applications. I'm in the middle here because I'm relatively familiar with all of the players and the issues, but in my day job I'm not responsible for *any* of them. ___ zones-discuss mailing list zones-discuss@opensolaris.org
Re: [zones-discuss] how to configure port listening in a zone?
By default, a zone does not have privilege to snoop: http://blogs.sun.com/JeffV/entry/snoop_zoney_zone Could just be a network config/routing issue. Can you ping 10.5.185.103? Can you access other network services, like ssh? -Steve L. On Tue, Aug 19, 2008 at 03:01:37PM -0700, Russ Petruzzelli wrote: I have a glassfish webserver running in my Solaris 10 zone. It will not respond to remote jmxrmi requests on port 8686. If I connect locally with (for instance) jconsole it works fine. I believe it is somehow related to being in a zone. Because snoop -p 8686 from my zone... tells me: no network interface devices found. [EMAIL PROTECTED]: ifconfig -a lo0:1: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1 inet 127.0.0.1 netmask ff000000 bge0:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2 inet 10.5.185.103 netmask fffffc00 broadcast 10.5.187.255 Is there something I can configure to do this? Thanks, Russ ___ zones-discuss mailing list zones-discuss@opensolaris.org
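Since snoop isn't available inside the zone, the reachability checks suggested above have to come from another host. Addresses and ports are taken from the report in this thread; telnet is used here only as a quick TCP probe:

```shell
# From a remote machine: does the zone's address answer at all?
ping 10.5.185.103

# Is the jmxrmi port reachable?
telnet 10.5.185.103 8686

# Compare against a service known to work (e.g. ssh on port 22) to
# separate a routing problem from a service-binding problem.
telnet 10.5.185.103 22
```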
Re: [zones-discuss] [smf-discuss] 6725004 - installing single-user-mode patches automatically
The only way that you can get *that* guarantee is by using the milestone mechanism to limit the system to a particular milestone, as you suggest. In fact, argh. This problem affects even your proposed scheme. By the time that your patch-test-service is running, there could (in theory) be all kinds of services running that didn't happen to depend on anything. Maybe in practice we could ignore that possibility, but it's still bothersome. Argh. Not quite back to Square One, but that certainly tosses a wrench into most of my theories on how to solve this problem. Argh again. Currently startd hard codes the allowable milestones. My proposal would require patching startd :(

static int
dgraph_set_milestone(const char *fmri, scf_handle_t *h, boolean_t norepository)
{
	const char *cfmri, *fs;
	graph_vertex_t *nm, *v;
	int ret = 0, r;
	scf_instance_t *inst;
	boolean_t isall, isnone, rebound = B_FALSE;

	/* Validate fmri */
	isall = (strcmp(fmri, "all") == 0);
	isnone = (strcmp(fmri, "none") == 0);

	if (!isall && !isnone) {
		if (fmri_canonify(fmri, (char **)&cfmri, B_FALSE) == EINVAL)
			goto reject;

		if (strcmp(cfmri, single_user_fmri) != 0 &&
		    strcmp(cfmri, multi_user_fmri) != 0 &&
		    strcmp(cfmri, multi_user_svr_fmri) != 0) {
			startd_free((void *)cfmri, max_scf_fmri_size);
reject:
			log_framework(LOG_WARNING,
			    "Rejecting request for invalid milestone \"%s\".\n",
			    fmri);
			return (EINVAL);
		}
	}

-Steve L. ___ zones-discuss mailing list zones-discuss@opensolaris.org
Re: [zones-discuss] [osol-discuss] solaris 8 container installation failed
You could try: s8 cpio patch:112097-08 s8 compress patch:108823-02 s8 flar patch:109318-39 (requires some other patches) You could be hitting 4384301, fixed in 109318-12, which was obsoleted by the flar patch above. -Steve L. On Fri, Aug 08, 2008 at 04:58:40PM -0400, Asif Iqbal wrote: On Fri, Aug 8, 2008 at 2:13 PM, Edward Pilatowicz [EMAIL PROTECTED] wrote: [ changing the cc to zones-discuss which is a more appropriate alias for this kind of question. ] hey asif, i'm guessing that cpio is not your problem. the key error below is: uncompress: stdin: corrupt input perhaps the s8_install script is somehow not detecting the compression type correctly? could you run the following command: grep files_compressed_method /engr.flar bash-3.00# grep files_compressed_method /engr.flar files_compressed_method=compress bash-3.00# flar info /engr.flar archive_id=d37bfbb8e8f8940362746dfc5af2fbed files_archived_method=cpio creation_date=20080807215755 creation_master=engr-01 content_name=engr creation_node=engr-01 creation_hardware_class=sun4u creation_platform=SUNW,UltraSPARC-IIi-cEngine creation_processor=sparc creation_release=5.8 creation_os_name=SunOS creation_os_version=Generic_117350-39 files_compressed_method=compress content_architectures=sun4u thanks ed On Fri, Aug 08, 2008 at 12:24:25AM -0400, Asif Iqbal wrote: I am failing to install flash archive of a solaris 8 02/04 running on a netra t1. I can install the example solaris8-image.flar (optional download from sun). Here is the failed log bash-3.00# more /var/tmp/engr-01.install.7078.log [Thu Aug 7 23:59:44 EDT 2008] Log File: /var/tmp/engr-01.install.7078.log [Thu Aug 7 23:59:44 EDT 2008]Product: Solaris 8 Containers 1.0 [Thu Aug 7 23:59:44 EDT 2008] Installer: solaris8 brand installer 1.22 [Thu Aug 7 23:59:44 EDT 2008] Zone: engr-01 [Thu Aug 7 23:59:44 EDT 2008] Path: /zones/solaris8 [Thu Aug 7 23:59:44 EDT 2008] Starting pre-installation tasks. 
[Thu Aug 7 23:59:44 EDT 2008] Installation started for zone engr-01 [Thu Aug 7 23:59:44 EDT 2008] Source: /engr.flar [Thu Aug 7 23:59:44 EDT 2008] Media Type: flash archive [Thu Aug 7 23:59:44 EDT 2008] Installing: This may take several minutes... [Thu Aug 7 23:59:44 EDT 2008] cd /zones/solaris8/root [Thu Aug 7 23:59:44 EDT 2008] do_flar /engr.flar uncompress: stdin: corrupt input [Fri Aug 8 00:01:31 EDT 2008] Postprocessing: This may take several minutes... [Fri Aug 8 00:01:31 EDT 2008] running: p2v -u eug-engr-01 [Fri Aug 8 00:01:31 EDT 2008]Postprocess: Gathering information about zone engr-01 [Fri Aug 8 00:01:31 EDT 2008]Postprocess: Creating mount points touch: /zones/solaris8/root/etc/mnttab cannot create chmod: WARNING: can't access /zones/solaris8/root/etc/mnttab [Fri Aug 8 00:01:31 EDT 2008]Postprocess: Processing /etc/system cp: cannot access /zones/solaris8/root/etc/system [Fri Aug 8 00:01:32 EDT 2008] Result: Postprocessing failed. [Fri Aug 8 00:01:32 EDT 2008] [Fri Aug 8 00:01:32 EDT 2008] Result: *** Installation FAILED *** [Fri Aug 8 00:01:32 EDT 2008] Log File: /var/tmp/engr-01.install.7078.log Here is the good log where installing from the example solaris 8 image from SUN site that comes as optional download with solaris 8 container software bash-3.00# more /var/tmp/engr-01.install.5279.log [Thu Aug 7 23:31:12 EDT 2008] Log File: /var/tmp/engr-01.install.5279.log [Thu Aug 7 23:31:12 EDT 2008]Product: Solaris 8 Containers 1.0 [Thu Aug 7 23:31:12 EDT 2008] Installer: solaris8 brand installer 1.22 [Thu Aug 7 23:31:12 EDT 2008] Zone: engr-01 [Thu Aug 7 23:31:12 EDT 2008] Path: /zones/solaris8 [Thu Aug 7 23:31:12 EDT 2008] Starting pre-installation tasks. [Thu Aug 7 23:31:12 EDT 2008] Installation started for zone engr-01 [Thu Aug 7 23:31:12 EDT 2008] Source: /tmp/solaris8-image.flar [Thu Aug 7 23:31:12 EDT 2008] Media Type: flash archive [Thu Aug 7 23:31:13 EDT 2008] Installing: This may take several minutes... 
[Thu Aug 7 23:31:13 EDT 2008] cd /zones/solaris8/root [Thu Aug 7 23:31:13 EDT 2008] do_flar /tmp/solaris8-image.flar [Thu Aug 7 23:35:28 EDT 2008] Sanity Check: Passed. Looks like a Solaris 8 system. [Thu Aug 7 23:35:28 EDT 2008] Postprocessing: This may take several minutes... [Thu Aug 7 23:35:28 EDT 2008] running: p2v -u eug-engr-01 [Thu Aug 7 23:35:28 EDT 2008]Postprocess: Gathering information about zone eug-engr-01 [Thu Aug 7 23:35:29 EDT 2008]Postprocess: Creating mount points [Thu Aug 7 23:35:30 EDT 2008]Postprocess: Processing /etc/system [Thu Aug 7 23:35:30 EDT 2008]Postprocess: Booting zone to single user mode [Thu Aug 7
Re: [zones-discuss] mongrel rails in a zone
My guess is that your zones lack /var/ruby.* Did you install ruby+friends in the global zone using packages, from a tar file, or from source compilation? A package install from the global zone should install the package contents into all zones, properly handling /usr versus /var. If your means of installation insists on writing to /usr/ruby, then you could create a writable /usr/ruby filesystem (using zonecfg add fs) so that you can install ruby into every zone. Adding a lofs filesystem that maps to a directory in the global zone is straightforward. -Steve L. On Sat, Jul 12, 2008 at 08:10:24PM +0100, Matt Harrison wrote: Hello, I've got an opensolaris box running several websites from mongrel clusters, and I'd like to have each site's cluster running in an isolated zone. I've been through the administrative guides and FAQs pertaining to zones and managed to get a zone created and running without problems. The problem I'm having is that although the zone is sharing /usr and /var, the zone cannot use any of the ruby gems that I have installed. For example I've installed mongrel, rails, capistrano and all dependencies into the global zone and they work perfectly. Unfortunately when I try to use these gems from within the non-global zone I get gem not found errors like so: $ mongrel_rails cluster::start /usr/ruby/1.8/lib/ruby/site_ruby/1.8/rubygems.rb:304:in `report_activate_error': Could not find RubyGem mongrel ( 0) (Gem::LoadError) [] Of course I can't re-install the gems from the zone as /usr and /var are not writable from there. Does anyone have any experience with running ruby gems from zones and might be able to give me some guidance with this? Many Thanks Matt ___ zones-discuss mailing list zones-discuss@opensolaris.org
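The lofs mapping Steve mentions can be sketched like this; the zone name myzone and the global-zone backing directory /export/zones/myzone/ruby are hypothetical placeholders:

```shell
# Give the zone its own writable /usr/ruby, backed by a directory in
# the global zone. Create the backing directory first.
mkdir -p /export/zones/myzone/ruby

# zonecfg reads subcommands from stdin:
zonecfg -z myzone <<'EOF'
add fs
set dir=/usr/ruby
set special=/export/zones/myzone/ruby
set type=lofs
end
EOF

# The new filesystem is mounted when the zone next boots.
zoneadm -z myzone reboot
```

With the lofs mount in place, gems can be installed from inside the zone without touching the global zone's /usr.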
Re: [zones-discuss] Memory Cap accounting for shared memory
As of s10u4 (and nevada build 56?), rcapd (and prstat -ZJTta) account for shared memory (sysV, anon, and text) between processes in the same zone, project, etc. -Steve L. On Wed, Jun 18, 2008 at 01:13:21PM -0500, Brian Smith wrote: My reading of the documentation is that if I try to cap the memory of a zone, that cap applies to the total RSS, calculated by summing the RSS of each process in the zone. According to the documentation, there is not any accounting for processes sharing memory. For example, if I have 10 processes with private RSS of 2MB each and shared RSS of 100MB, then the zone would be considered to be using 10*(2+100)=1,020MB of memory, and not 100+2*10=120MB of memory. Restricting the zone to 500MB would make it practically unusable even though only 120MB of memory is being used. Is that correct? Is there any way to cause rcapd to take shared memory into account when deciding which zones to start swapping to disk? Regards, Brian ___ zones-discuss mailing list zones-discuss@opensolaris.org
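Brian's arithmetic can be checked directly; this sketch just redoes the numbers from his example (10 processes, 2MB private each, one 100MB shared segment):

```shell
# Naive accounting charges every process for the full shared segment;
# shared-aware accounting (what rcapd does as of s10u4) counts the
# shared segment once.
naive=$(( 10 * (2 + 100) ))
shared_aware=$(( 100 + 10 * 2 ))
echo "naive=${naive}MB shared-aware=${shared_aware}MB"
# prints: naive=1020MB shared-aware=120MB
```

So under the old accounting a 500MB cap looks exhausted twice over, while the zone actually touches only 120MB of memory.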
Re: [zones-discuss] Making zoneadm more like the other adms...
Hey Darren, Are you interested in drafting an arc fasttrack for these interface additions? Do you see zoneadm being used as: # zoneadm boot myzone -s That would be: - myzone is an operand to zoneadm that comes after the subcommand. This is not compliant with getopt or clip guidelines. You may want to review the info here: http://opensolaris.org/os/community/arc/caselog/2006/062/spec-clip-html/ I think the usability issue is with the zoneadm syntax: # command -opt optarg subcommand ... The above clip guidelines make this comment: Q: I'd like to have options before my subcommands. This makes sense because some options apply to all operations. A: This often makes sense from an engineering perspective, but our usability data says most users don't understand the system model well enough to be able to predict whether the option should go before or after the subcommand. From this you could argue that the current zoneadm command is getopt, but not fully clip compliant. I think that you are proposing that the zonename could also be a special operand to the zoneadm command, which comes after the subcommand. The good commands that you are comparing zoneadm to were nice enough to not have any options/optargs before the subcommand. You might try running a fasttrack, arguing that the syntax: # zoneadm boot myzone -m milestone=single-user being: # command subcommand operand suboptions suboperands while conforming to no standards/guidelines, is more usable than: # command option optarg subcommand suboptions suboperands. I think this could be defined as: if -z is not present, a subcommand is present, and the token after the subcommand is not an option, then it is the operand to an implicit -z. This is of course not compliant with anything. You could argue that this is more usable anyway, or you could find a compliant solution. -Steve L.
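For comparison, the two syntaxes under discussion look like this; myzone is a placeholder, and only the first form works on a stock system:

```shell
# Current getopt-style syntax: the -z option and its optarg come
# before the subcommand, and boot arguments follow "--":
zoneadm -z myzone boot -- -m milestone=single-user

# Darren's proposed operand-style syntax, with the zonename as an
# operand after the subcommand (not valid today):
#   zoneadm boot myzone -m milestone=single-user
```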
On Sun, Jun 15, 2008 at 06:15:46PM -0700, Darren Reed wrote: Tony Ambrozie wrote: Your code changes for both zoneadm and zonecfg would preserve the current zonexxx -z zonename for backwards compatibility purposes, is that correct? Correct. There are some command line options that the changes I've made don't support, such as using -R. That's quite deliberate. The aim of the changes was to address the common use cases of the commands and make their use more intuitive when viewed with the other commands in OpenSolaris. Darren Thank you, On Mon, Jun 9, 2008 at 11:51 AM, Darren Reed [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] wrote: Someone mentioned zonecfg was the cause of some similar awkwardness... So here's a patch attached for that. Darren --- usr/src/cmd/zonecfg/zonecfg.c --- Index: usr/src/cmd/zonecfg/zonecfg.c *** /biscuit/onnv/usr/src/cmd/zonecfg/zonecfg.c Mon Mar 24 17:30:38 2008 --- /biscuit/onnv_20080608/usr/src/cmd/zonecfg/zonecfg.c Mon Jun 9 11:47:41 2008 *** *** 1071,1076 --- 1071,1077 execname, cmd_to_str(CMD_HELP)); (void) fprintf(fp, \t%s -z zone\t\t\t(%s)\n, execname, gettext(interactive)); + (void) fprintf(fp, \t%s command zone\n, execname); (void) fprintf(fp, \t%s -z zone command\n, execname); (void) fprintf(fp, \t%s -z zone -f command-file\n, execname); *** *** 6653,6689 return (execbasename); } ! int ! main(int argc, char *argv[]) { ! int err, arg; ! struct stat st; ! ! /* This must be before anything goes to stdout. */ ! setbuf(stdout, NULL); ! ! saw_error = B_FALSE; ! cmd_file_mode = B_FALSE; ! execname = get_execbasename(argv[0]); ! ! (void) setlocale(LC_ALL, ); ! (void) textdomain(TEXT_DOMAIN); ! ! if (getzoneid() != GLOBAL_ZONEID) { ! zerr(gettext(%s can only be run from the global zone.), ! execname); ! exit(Z_ERR); ! } ! ! if (argc 2) { ! usage(B_FALSE, HELP_USAGE | HELP_SUBCMDS); exit(Z_USAGE); } ! if (strcmp(argv[1], cmd_to_str(CMD_HELP)) == 0) { ! (void) one_command_at_a_time(argc - 1, (argv[1])); ! exit(Z_OK); ! 
} while ((arg = getopt(argc, argv, "?f:R:z:")) != EOF) { switch (arg) { case '?': --- 6654,6679 return (execbasename); } ! static void ! set_zonename(char *zonename) {
Re: [zones-discuss] S10 branded zones
Are you running sparc or x86? On x86, you can use Xen or VirtualBox today to run s10 guests. On sparc, you can use ldoms on sun4v. If you indeed need a zones-based solution, please elaborate on your requirements. -Thanks, -Steve L. On Sun, May 25, 2008 at 10:41:29AM +1000, Rodney Lindner - SCT wrote: Hi all, has there been any thought of an S10 branded zone running under NV? Most of my servers run NV, but at times I need to test software that only runs on S10. Running up a branded zone would make my life very simple. Regards Rodney -- = Rodney Lindner Services Chief Technologist Sun Microsystems Australia Phone: +61 (0)2 94669674 (EXTN:59674) Mobile +61 (0)404 815 842 Email: [EMAIL PROTECTED] = ___ zones-discuss mailing list zones-discuss@opensolaris.org
Re: [zones-discuss] code review: native brand refactoring
Hey Jerry, Does this address this comment in 6621020: This appears to point out at least one bug in zlogin, namely that it keeps stdout_pipe[1] and stderr_pipe[1] from noninteractive_login() open when returning to the parent. Basically, I think the filer expected to see something like: pipe(stdout); pipe(stderr); if (fork() == 0) { /* in child, close pipe sides read by parent */ close(stdout[0]); close(stderr[0]); .. write to std*[1]... ... exit(..); } /* in parent, close pipe sides written by child */ close(stdout[1]); close(stderr[1]); ... read from std*[0] ... ... I think the in child part is handled by the closefrom on line 1559, but the parent does not close the sides of the pipes that the child writes to. -Steve On Tue, May 27, 2008 at 08:48:17AM -0600, Jerry Jelinek wrote: I have updated the webrev at: http://cr.opensolaris.org/~gjelinek/webrev/ This includes the changes for the feedback I have received so far. I also added the zlogin.c file to the webrev with two bug fixes. One of these was for a bug I was hitting during testing of these changes and there is a second bug in zlogin that came in which I also fixed. So, at a minimum, it would be good to take a look at that additional file. Thanks, Jerry ___ zones-discuss mailing list zones-discuss@opensolaris.org
Re: [zones-discuss] code review: native brand refactoring
It seems to me that the first comment in the NOTES section of fork(2) would only apply to vfork(). ?? -Steve On Fri, May 23, 2008 at 01:31:58PM +0200, Joerg Barfurth wrote: Hi, I just stumbled over this: Edward Pilatowicz schrieb: - nit: in start_zoneadmd(), instead of: if ((child_pid = fork()) == -1) { zperror(gettext(could not fork)); goto out; } else if (child_pid == 0) { ... _exit(1); } else { ... } how about: if ((child_pid = fork()) == -1) { zperror(gettext(could not fork)); goto out; } if (child_pid == 0) { ... _exit(1); } ... - nit: we have direct bindings now. :) so why bother with: _exit(1) instead just call: exit(1) Beware: _exit() is not just a linker synonym for exit(). See the exit(2) man page for differences. When fork-ing without exec, calling _exit is the proper way to exit; see the Notes section in the fork(2) man page. - Jörg -- Jörg Barfurth phone: +49 40 23646662 / x2 Software Engineer mailto:[EMAIL PROTECTED] Desktop Technology http://reserv.ireland/twiki/bin/view/Argus/ Thin Client Software http://www.sun.com/software/sunray/ Sun Microsystems GmbH http://www.sun.com/software/javadesktopsystem/ ___ zones-discuss mailing list zones-discuss@opensolaris.org
Re: [zones-discuss] physical= not obeyed when ip-type=shared and physical dev part of IPMP group in global zone
In the global zone, do you have two ip addresses (one on vnet0, one on vnet1) or is vnet1 configured as standby? Steve L. On Wed, May 21, 2008 at 09:12:35AM +0100, Lewis Thompson wrote: On Tue, 2008-05-20 at 13:56 -0700, Steve Lawrence wrote: It is documented here: http://docs.sun.com/app/docs/doc/819-2450/z.admin.task-60?l=koa=viewq=multipathing The behavior you are seeing is not specifically documented, but it seems reasonable. What behavior are you expecting? Hi Steve, For now I am just looking for confirmation that this is expected behaviour. However, as the customer has specifically set physical equal to the other interface in the IPMP group, I would expect this to be used, when the interface is not in a FAILED state. Thanks, Lewis ___ zones-discuss mailing list zones-discuss@opensolaris.org ___ zones-discuss mailing list zones-discuss@opensolaris.org
Re: [zones-discuss] physical= not obeyed when ip-type=shared and physical dev part of IPMP group in global zone
This appears to be the effect of selecting an interface that is a member of an ipmp group in the global zone. It is documented here: http://docs.sun.com/app/docs/doc/819-2450/z.admin.task-60?l=koa=viewq=multipathing The behavior you are seeing is not specifically documented, but it seems reasonable. What behavior are you expecting? -Steve L. On Tue, May 20, 2008 at 03:20:04PM +0100, Lewis Thompson wrote: Hi, I have a customer who has a basic IPMP config in his global zone: vnet0 vnet1 [currently vnet0 has the 'floating' IP] In addition he has a zone with ip-type=shared where physical=vnet1 When the zone boots the zone interface gets created on vnet0 instead of vnet1 I have verified this behaviour on a test system and can confirm that the zone interface acts in the same way as the floating IP. i.e. if I run 'if_mpadm -d vnet0', then the zone's IP fails over to vnet1 Unfortunately I can't find any documentation that discusses this behaviour. Is anybody aware of any documents that explain what we are seeing? Thanks, Lewis ___ zones-discuss mailing list zones-discuss@opensolaris.org
Re: [zones-discuss] Can zone id be same on different global zones?
/usr/bin/tar on solaris 10. My comment was incorrect. I was referring to the preservation of hard links. I need to investigate the status of this in the various versions of tar. Thanks, -Steve L. On Fri, May 16, 2008 at 10:59:10AM +0200, Joerg Schilling wrote: Steve Lawrence [EMAIL PROTECTED] wrote: tar is usually not the best archiving tool, as it tends to deal poorly with symbolic links. There are some examples of migrating a zone using both ufs (via pax) and zfs (via zfs send/receive). Please elaborate on what you are referring to when using the term tar? The Sun tar implementation has _security_ bugs related to archives that contain symbolic links, but the same problem applies to the OpenGroup-owned pax implementation currently found at /usr/bin/pax. star, on the other hand, has no known issues related to symbolic links. What are you talking about? Jörg -- EMail:[EMAIL PROTECTED] (home) Jörg Schilling D-13353 Berlin [EMAIL PROTECTED] (uni) [EMAIL PROTECTED] (work) Blog: http://schily.blogspot.com/ URL: http://cdrecord.berlios.de/old/private/ ftp://ftp.berlios.de/pub/schily ___ zones-discuss mailing list zones-discuss@opensolaris.org
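One way to copy a zonepath on ufs while preserving hard links, as the thread suggests, is pax in read-write mode. The paths here are hypothetical, and the zone should be halted first:

```shell
# Copy a halted zone's zonepath with pax: -rw copies directly without
# an intermediate archive, and -p e preserves ownership, modes,
# timestamps, and hard links.
mkdir -p /zones/myzone-copy
cd /zones/myzone && pax -rw -p e . /zones/myzone-copy
```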
Re: [zones-discuss] How to migrate a separate running server into a zone?
Are you talking about zone migration? (attach/detach). That should work well. Both hosts need to be running the same version of solaris (patchlevel). We are planning an update-on-attach feature, which will allow a zone to be detached, and then attached to a host running a higher patch level. -Steve L. On Thu, May 15, 2008 at 11:20:20AM +0300, mehmet cebeci wrote: Hi Steve, Thanks for the info, I was searching for an unresolved solution all day. In this case my solution will be preparing the zones with the same configuration as existing servers and copying disks. It seems the only stable way of solving this issue. Regards, Mehmet On Thu, May 15, 2008 at 6:02 AM, Steve Lawrence [EMAIL PROTECTED] wrote: We currently don't have p2v support for native zones. Future work for this is under consideration, but no timeframe established (yet). I could think of ways to make it work in the short term, but nothing that we could support. It could also result in further issues later when patching/upgrading. -Steve L. On Wed, May 14, 2008 at 03:12:36PM +0300, mehmet cebeci wrote: Hi all, I would like to get any idea about how to migrate separate running servers on solaris 10 into solaris zones separately? For detail: Our customer has running servers and he requests to migrate them one by one into another server with zones inside. In solaris 8 I found that it may be done by flar, but I don't have any idea how to achieve it on solaris 10. Thanks for any opinion, ___ zones-discuss mailing list zones-discuss@opensolaris.org
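The detach/attach migration Steve refers to, combined with zfs send/receive for moving the zonepath, might be sketched like this. All names (myzone, rpool/zones/myzone, newhost) are hypothetical, and as noted both hosts must be at the same patch level:

```shell
# On the source host: halt and detach the zone, then ship its dataset.
zoneadm -z myzone halt
zoneadm -z myzone detach
zfs snapshot -r rpool/zones/myzone@migrate
zfs send -R rpool/zones/myzone@migrate | ssh newhost zfs receive -d rpool

# On newhost: recreate the configuration from the detached zonepath
# (create -a uses the state stored by detach), then attach.
zonecfg -z myzone 'create -a /rpool/zones/myzone'
zoneadm -z myzone attach
```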
Re: [zones-discuss] Dynamic dedicated-cpu
A zone reboot is required. -Steve L. On Tue, Apr 15, 2008 at 04:20:40PM +0100, Terry Smith wrote: Hi When adding dedicated-cpus to a zone does the configuration take effect immediately or is a zone reboot required? T ___ zones-discuss mailing list zones-discuss@opensolaris.org
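A minimal sketch of the sequence, with myzone as a placeholder zone name:

```shell
# Add a dedicated-cpu resource. The setting is recorded in the zone's
# configuration immediately, but the temporary pset is only created
# when the zone (re)boots.
zonecfg -z myzone 'add dedicated-cpu; set ncpus=2-4; end'
zoneadm -z myzone reboot
```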
Re: [zones-discuss] zlogin and locales
On Wed, Apr 02, 2008 at 11:11:12AM -0400, Moore, Joe wrote: Steve Lawrence wrote: Looks like the environment contained in /etc/default/init is read and set by startd and init. Since zlogin'ed processes are not children of startd or init in the zone, they do not have these environment settings. Given brands, to fix this, we would need to add a hook that asks the zone: Please fetch me the default login environment. And hope that the zone administrator hasn't figured out a way to violate security constraints by setting malicious variables in that default login environment... You mean fetch the environment, but don't set it for the zlogin process in the global zone. Just pass it into the zone and set it when exec'ing the login process. Such as a specially-corrupted termcap (pushing data to the global-zone xterm, for example), or a locale with similar features. It would be similar to the hook that we currently have for fetching the passwd entry for a given user. passwd entries are fairly easy to validate. Arbitrary environment variables should not be accepted from an untrusted source. --Joe ___ zones-discuss mailing list zones-discuss@opensolaris.org
Re: [zones-discuss] Capped memory observability
On Mon, Feb 05, 2007 at 10:17:42PM -0600, Mike Gerdts wrote: I just got a chance to start playing with the capped memory resource controls in build 56. At first blush, this looks to be *very* good stuff. My initial testing included some very basic single process memory hog tests and multiple process mmap(..., MAP_SHARED,...) tests. In each case, the limits kicked in as I expected, and prstat -Z running from the global zone gave what appeared to be accurate information. Great job! One of the effects of setting capped-memory resource control for swap is that the size of /tmp is also limited. Unlike when a tmpfs size limit is set with the size=... mount option, df /tmp does not display a value that is reflective of the limits that are put in place. Similarly, vmstat and swap -l running inside the zone give no indication that there is a cap smaller than the system-wide limits. Am I missing something here? swap -l shows details about swap devices, and we don't know in particular how much of each swap device each zone is using. We could do something with vmstat and swap -s. We are looking to improve the general observability of resource limits and utilization for zones. It is a bit tricky though, as caps are not reservations. Since all zones share the same swap devices, mocking up size to be equal to cap can lead to confusing output, such as used > size, capacity 100%, etc. Other confusing things include the fact that your usage can be less than your cap, but there can still be no swap available if the swap devices are full. The swap and vmstat commands as they currently exist cannot express these scenarios. The short answer is, we're working on it. I do see that some of the values I am looking for are available through kstat (thank you!). Is there some more user-friendly tool (already or coming) to use inside the zone? No ETA at this time. Oh, and the question that everyone at work will ask when I tell them about this - when will it find its way into Solaris? 
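The kstats Mike found can be read with the kstat command; the zone id 1 below is hypothetical (zoneadm list -p shows the real one), and the names follow the then-uncommitted caps:{zoneid}:swapresv_zone_{zoneid} scheme discussed in the related ARC case, so they may differ on a given build:

```shell
# Swap cap (value) and current reservation (usage), in bytes:
kstat -m caps -n swapresv_zone_1
# Locked-memory cap and usage for the same zone:
kstat -m caps -n lockedmem_zone_1
```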
:) Mike -- Mike Gerdts http://mgerdts.blogspot.com/ ___ zones-discuss mailing list zones-discuss@opensolaris.org
Re: [zones-discuss] Dynamic Resource Pools (DRP) and RCM ?
The short answer is no. When a processor is transferred from one pset to another, no RCM event is generated in the global zone, or in any non-global zones. RCM events are only generated when DR operations take place. The rcm_daemon only runs in the global zone, as part of the sysevent:default service. Currently, no events of any kind are generated when a processor is transferred from one pset to another. pool_conf_update(3pool) can be used to detect a change in a processor set. pool_conf_update(3pool) is not a blocking interface, so you need to call it periodically to detect a change. -Steve L. On Tue, Jan 23, 2007 at 01:42:08PM -0800, Gary Combs wrote: If a zone is bound to a resource pool with a processor set also bound there and DRP is enabled, will the addition of a processor added to the pset via DRP generate an insertion event via RCM? Would all zones bound to the same pool see the RCM event? Would only the global zone see the insert event? Thanks, Gary -- Gary Combs * Technical Marketing * Sun Microsystems, Inc. 3295 NW 211th Terrace Hillsboro, OR 97124 US Phone x32604/+1 503 715 3517 Cell 1-503-887-7519 Fax 503-715-3517 Email [EMAIL PROTECTED] Arguing with an engineer is like wrestling in mud with a pig; after a while you realize the pig likes it!! ___ zones-discuss mailing list zones-discuss@opensolaris.org
[zones-discuss] Re: Restart: PSARC/2006/598 Swap resource control; locked memory RM improvements
Good question. These are essentially virtual system requirements. What is the behaviour of Solaris intended to be when someone makes these changes (or attempts to make them) on a system that has no swap space? All systems have reservable swap space. Systems with no swap devices use physical memory to back swap reservations. Furthermore, why shouldn't I be able to say a zone has no swap space available to it - i.e. to force it to all run from RAM? Solaris's vm system has no such concept. All anonymous allocations reserve swap. I think you are suggesting a zone switch so that an admin can choose from one of: A. reserve swap from disk only B. reserve swap from memory only C. reserve swap from disk, then memory D. reserve swap from memory, then disk. Currently, system behavior is C for everyone. zone.max-swap simply limits swap reservation. It does not provide an interface for choosing a swap allocation policy. These concepts are orthogonal. I can see a swap sets feature addressing allocation policy, since swap sets could be used to associate a given zone with a particular set of swap devices. -Steve Darren ___ zones-discuss mailing list zones-discuss@opensolaris.org
[zones-discuss] Re: Restart: PSARC/2006/598 Swap resource control; locked memory RM improvements
On Sat, Nov 11, 2006 at 09:02:48PM -0800, Gary Winiger wrote: First off, sorry for the stutter in the spec update mail. The project team didn't supply a summary of the changes, so I'll be asking for one in a follow-on. I've addressed your comments way below. Here is my change summary and case discussion summary: SUMMARY OF CHANGES 1. Change to the proposed uncommitted kstat names and statistics. From the form: zone:{zoneid}:vm with statistics: zonename swap_reserved max_swap_reserved locked_memory max_locked_memory To the form: caps:{zoneid}:swapresv_zone_{zoneid} caps:{zoneid}:lockedmem_zone_{zoneid} caps:{zoneid}:lockedmem_project_{projid} with statistics: zonename usage value This sets up a generic scheme for adding kstats to project and zone rctls. A kstat is created per rctl, instead of per zone. 2. Addition of zonecfg(1m) minimums for setting zone.max-swap. When setting zone.max-swap via zonecfg(1m), a minimum value will be enforced: global zone: 100M non-global zone: 50M Currently, this is about 20M more than is needed to boot after a default installation. 3. Addition of zonecfg(1m) warnings when setting zone.max-swap and zone.max-lwps on the global zone. zonecfg:global:capped-memory> set swap=200M Warning: Setting capped swap on the global zone can impact system availability. SUMMARY OF CASE DISCUSSION: The case discussion has focused on the problem that the zone.max-swap rctl on the global zone can affect system availability. An identical problem exists today with task/project/zone.max-lwps. Solutions to this problem may involve one or more of: - Exempting project 0 in the global zone from zone.* rctls. - Preventing task/project.* rctls from being set on project 0 in the global zone. - Modifying root's default project. - Adding a new privilege to exempt a process from rctls. - Updating system service manifests to drop the new privilege. 
Solving this problem in a way that will prevent the global zone (on a default system) from becoming unavailable due to a resource control setting will require a significant change to the system. I believe solving this problem is outside the scope of the zone.max-swap case, and would be better solved by another case which is not seeking patch binding. To minimize this problem for zone.max-swap (and zone.max-lwps), I've instead proposed zonecfg enhancements to assist the admin in configuring these rctls safely. 1. This case proposes adding the following resource control: INTERFACE COMMITMENT BINDING zone.max-swap Committed Patch This control will limit the swap reserved by processes and tmpfs mounts within the global zone and non-global zones. This resource control serves to address the referenced RFE[6]. There was some considerable discussion on the global zone aspect of this part of the proposal. Perhaps I missed in the spec how the new proposal mitigates the risk of the global zone not being able to administer the system. DETAIL: 1. zone.max-swap resource control. Limits swap consumed by user process address space mappings and tmpfs mounts within a zone. While a low zone.max-swap setting for the global zone can lead to a difficult-to-administer global zone, the same problem exists today when configuring the zone.max-lwps resource control on the global zone, or when all system swap is reserved. The zonecfg(1m) enhancements detailed below will help administrators configure zone.max-swap safely. Perhaps I misunderstood the interaction between project 0 and zone.max-lwps in the global zone. If a max-lwps is set, is project 0 bound by it? Currently yes. zone.* rctls bind all processes in the global zone, regardless of project. This is the issue that my other proposal is attempting to address. Perhaps a short summary of the offline discussion on project 0 and the project team's feeling that the discussion's conclusions might not be patch qualified. 
I realize the need for this project to have a patch binding. I've added this summary above. 2. swap and locked properties for zonecfg(1m) capped-memory resource. To prevent administrators from configuring a low swap limit that will prevent a system from booting, zonecfg will not allow a swap limit to be configured to less than: Global zone: 100M Non-global zone: 50M. These numbers are based on the swap needed to boot a zone after a default installation.
[zones-discuss] Re: PSARC/2006/598 Swap resource control; locked memory RM improvements
On Tue, Oct 31, 2006 at 03:28:31PM -0800, Dan Price wrote: On Tue 31 Oct 2006 at 03:24PM, Steve Lawrence wrote: It seems reasonable to amend this case to say: 1. Any process with priv_sys_resource running in the global zone's system project (project 0) will not be subject to project.* or zone.* resource controls. System daemons which wish to be subject to the global zone's resource controls can drop priv_sys_resource. This feels risky for a patch... the effect here is to un-manage things which the customer may be expecting to be resource managed, potentially including a workload. Or is the point that no-one using RM will be relying on the system project to limit things? Currently, if rctls are specified in project(4) for the system project, they are never actually applied during boot. While this could be considered a bug, this also means that currently, by default, admins will have to take additional manual steps to resource-manage project 0. They are more likely to put what they want to manage in a new/different project, and use project(4) to configure rctls. -Steve This implies that we can't give the system project 10 (or 100) shares after this proposal? -dp -- Daniel Price - Solaris Kernel Engineering - [EMAIL PROTECTED] - blogs.sun.com/dp ___ zones-discuss mailing list zones-discuss@opensolaris.org
[zones-discuss] Re: PSARC/2006/598 Swap resource control; locked memory RM improvements
I am working on a new spec. I have an unanswered question from the discussion: The SIZE column will also be changed to SWAP for prstat options a, T, and J, for users, tasks, and projects. The reason for not changing this column in the default output would be helpful. I have a separate private interface used by prstat(1m) to get aggregate swap reserved by users, tasks, projects, and zones. Default prstat output is per-process, and the information is accessed via /proc. Currently, per-process swap reservation is not counted or made available via /proc. From proc(4): typedef struct psinfo { ... size_t pr_size; /* size of process image in Kbytes */ ... size of process image is pretty meaningless. If we can change pr_size to be swap reserved by process, then we could change SIZE to SWAP for all prstat(1m) output. Would such a change to psinfo_t be reasonable? If not, we could potentially convert pr_pad1 to pr_swap. -Steve On Mon, Nov 06, 2006 at 12:13:41PM -0800, Gary Winiger wrote: At the request of the project team, I've put this case into waiting need spec. When the spec is updated, it will be sent out and the timer reset. Gary.. ___ zones-discuss mailing list zones-discuss@opensolaris.org
[zones-discuss] Re: PSARC/2006/598 Swap resource control; locked memory RM improvements
On Fri, Nov 03, 2006 at 09:36:45AM +, Darren J Moffat wrote: Steve Lawrence wrote: Given a lack of supportive feedback, I'm going to revoke the proposed amendment below. To mitigate a zone admin setting a problematic swap limit on the global zone, we will enhance zonecfg to: 1. Print a warning when setting swap (and lwp) limits on the global zone. Since the swap limit will not go into effect until reboot, the admin has a chance to modify his setting before it takes effect. 2. Enforce a reasonable minimum when setting swap (and lwp) limits on the global zone. What is the definition of reasonable ? I think that being a default should be part of the case. I'm not sure what your second sentence means. As for the first, how about: reasonable minimum: The amount of resource necessary to boot a zone that has a default installation and configuration. By minimum, I do not mean the system default value. For example, for the following rctl, 128 is the system default value: # prctl -n project.max-shm-ids $$ process: 425781: -sh NAME PRIVILEGE VALUE FLAG ACTION RECIPIENT project.max-shm-ids privileged 128 - deny - system 16.8M max deny - zone.max-swap will not have a system default value. By default, every zone will have access to all swap available on the system. By minimum, I mean the minimum that zonecfg will allow you to set. zonecfg> add capped-memory zonecfg:capped-memory> set zone.max-swap=1K Error, minimum value for zone.max-swap is ### -Steve -- Darren J Moffat ___ zones-discuss mailing list zones-discuss@opensolaris.org
Re: [zones-discuss] Re: PSARC/2006/598 Swap resource control; locked memory RM improvements
I'm not sure it is within the domain of this case to tell admins what they should and shouldn't use the global zone for. In any event, we are making it easy for admins to manage swap limits for zones via zonecfg. On Tue, Oct 31, 2006 at 05:58:24PM -0800, Michael Barto wrote: After all this juggling, let's make it simple for the system admin and use some sort of fair-share process to assign and manage the swap for all the zones. Personally I think that the global zone should use minimum resources and be considered in the IT management processes to be only like a system controller on a complex server. Keep your applications out of the global zone!!! Gary Winiger wrote: This will not help root logins directly, but could by setting: usermod -K project=system root Or perhaps deliver root's entry this way to start with. Would that be a reasonable change to make via patch? Perhaps this change could be delivered to nevada, but not backported. It would be confusing to deliver this change, and also deliver the user.root project. If we made root's default project system, then the user.root project should be removed. user.root is kind of a bug anyhow, as SMF does not run root services in user.root. Currently, only root processes spawned by login/pam run in user.root. Perhaps this issue should be run as a separate fasttrack? I need to investigate the implementation impact. I'm looking for this case to define how to preserve the current model of unlimited unless one asks for a limit model in the global zone. I believe it is important from a system integrity and maintenance perspective. Others may have different opinions. If there is a compelling reason to deliver in phases, please discuss that. The global zone will have no swap limit by default. The default zone.max-swap rctl delivered on the global zone is UINT64_MAX, which is essentially unlimited. Is that what you mean? 
My point(s) here is not so much how things get done, but that the global zone is in some ways special. IIRC, before this project, the GZ doesn't have a swap limit. After this project an administrator could set a swap limit on the GZ. Granted, this is administrative action and they get what they deserve/ask for. However, it seemed to me that part of this case should (my judgement) include some way to override the limit in case override is really desired. As implied, perhaps putting root into project 0 at login or as part of daemon/service start is a way to bypass the administrator's choice in the GZ for some processes. What I didn't see as part of this case is the architecture to allow this bypass. Perhaps I'm off base for thinking it's necessary to protect against inadvertently not being able to administer the system from the GZ. Gary..

-- Michael Barto, Software Architect, LogiQwest Inc., http://www.logiqwest.com/
[zones-discuss] Re: PSARC/2006/598 Swap resource control; locked memory RM improvements
Given a lack of supportive feedback, I'm going to revoke the proposed amendment below. To mitigate a zone admin setting a problematic swap limit on the global zone, we will enhance zonecfg to: 1. Print a warning when setting swap (and lwp) limits on the global zone. Since the swap limit will not go into effect until reboot, the admin has a chance to modify the setting before it takes effect. 2. Enforce a reasonable minimum when setting swap (and lwp) limits on the global zone.

Currently, the rctl framework provides many mechanisms by which the admin can make the system difficult to manage. For instance, setting task.max-lwps on project user.root can prevent root login. If we want to make changes to prevent admins from resource-controlling their way out of the box, I think we need a broader case to address the whole problem. -Steve

On Tue, Oct 31, 2006 at 03:24:18PM -0800, Steve Lawrence wrote: I'm looking for this case to define how to preserve the current model of "unlimited unless one asks for a limit" in the global zone. I believe it is important from a system integrity and maintenance perspective. Others may have different opinions. If there is a compelling reason to deliver in phases, please discuss that.

The global zone will have no swap limit by default. The default zone.max-swap rctl delivered on the global zone is UINT64_MAX, which is essentially unlimited. Is that what you mean?

My point(s) here is not so much how things get done, but that the global zone is in some ways special. IIRC, before this project, the GZ doesn't have a swap limit. After this project an administrator could set a swap limit on the GZ. Granted, this is administrative action and they get what they deserve/ask for. However, it seemed to me that part of this case should (my judgement) include some way to override the limit in case override is really desired.
As implied, perhaps putting root into project 0 at login or as part of daemon/service start is a way to bypass the administrator's choice in the GZ for some processes. What I didn't see as part of this case is the architecture to allow this bypass. Perhaps I'm off base for thinking it's necessary to protect against inadvertently not being able to administer the system from the GZ.

It seems reasonable to amend this case to say: 1. Any process with priv_sys_resource running in the global zone's system project (project 0) will not be subject to project.* or zone.* resource controls. System daemons which wish to be subject to the global zone's resource controls can drop priv_sys_resource. 2. The user.root project will be removed, and root's default project will be set to the system project via /etc/user_attr.

I'm not sure if (2) can be delivered via patch. I need some guidance here. I'm also not sure how implementable (1) is until I do more investigation. -Steve

Gary..
[zones-discuss] Re: PSARC/2006/598 Swap resource control; locked memory RM improvements
Would it be reasonable to propose special treatment of the global zone's project 0 for all project and zone rctls? One could argue that capping system daemons can only lead to some sort of undesirable system failure. This would of course exempt all global zone system daemons from resource management. To mitigate this, SMF could be leveraged to run application daemons (or leaky/bad system daemons) in other projects.

Please don't do that in a hardcoded way. I don't mind if it is the default but it can't be hard coded. One of the main reasons we have nfsd (and kcfd) is so that resource controls can be placed on system services.

Do you mean one of the reasons that nfsd and kcfd run in the system project?

With the advent of SMF this is actually really easy to do, since you just need to set the project/pool the start method runs in. Also it is perfectly reasonable to have a case where no useful customer work happens in the global zone, i.e. it is really a service processor, and all the real work happens in the non-global zones.

Please elaborate on this. I'm not sure I understand what you're getting at. -Steve
-- Darren J Moffat
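The SMF mechanism alluded to above is the start method's method context, which can name the project and pool the method runs in. A sketch, where the service FMRI svc:/application/myapp:default and the project name appdaemons are illustrative (the project must already exist):

```
# projadd appdaemons
# svccfg -s svc:/application/myapp:default setprop start/project = astring: appdaemons
# svcadm refresh svc:/application/myapp:default
# svcadm restart svc:/application/myapp:default
```

After the restart, the daemon and its children run in project appdaemons, so project.* rctls set on that project apply to them without any special-casing of project 0.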
Re: [zones-discuss] PSARC/2006/598 Swap resource control; locked memory RM improvements
Comments inline. I've snipped stuff not relevant to comments.

4. prstat(1m) output changes to report swap reserved.

  INTERFACE          COMMITMENT   BINDING
  prstat(1m) output  Uncommitted  Patch

This case proposes changing the SIZE column of prstat -Z zone output lines to SWAP. The swap reported will be the total swap consumed by the zone's processes and tmpfs mounts. This value will assist administrators in monitoring the swap reserved by each zone, allowing them to choose a reasonable zone.max-swap setting. The SIZE column will also be changed to SWAP for prstat options a, T, and J, for users, tasks, and projects.

The reason for not changing this column in the default output would be helpful.

I have a separate private interface used by prstat(1m) to get aggregate swap reserved by users, tasks, projects, and zones. Default prstat output is per-process, and the information is accessed via /proc. Currently, per-process, or per-address-space, swap reservation is not counted or made available via /proc. From proc(4):

  typedef struct psinfo {
          ...
          size_t pr_size;    /* size of process image in Kbytes */
          ...

"Size of process image" is pretty meaningless. If we can change pr_size to be swap reserved by the process, then we could change SIZE to SWAP for all prstat(1m) output. Would such a change to psinfo_t be reasonable?

Currently a global or non-global zone can consume all swap resources available on the system, limiting the usefulness of zones as an application container. zone.max-swap provides a mechanism to

I would rephrase that as "the container of an application" to avoid confusion with the Solaris feature set called Containers. I assume that the former was meant moreso than the latter, even though Containers are Solaris' implementation of an application container.

I'm not sure what you mean, but ok. By "the Solaris feature set called Containers", do you mean zones + RM, or do you mean zones, xen, ldoms? zone.max-swap will be configurable on both the global zone and non-global zones.
The effect on processes in a zone reaching its zone.max-swap limit is the same as if all system swap were reserved. Callers of mmap(2) and sbrk(2) will receive EAGAIN. Writes to tmpfs will return ENOSPC, which is the same errno returned when a tmpfs mount reaches its size mount option. The size mount option limits the quantity of swap that a tmpfs mount can reserve.

With S10 11/06, some zone limitations are now configurable, e.g. setting the system time clock. Similarly, the ability to modify a zone's swap limit could be given to the zone's root user, which might be valuable in some situations. This would be analogous to the 'basic' privilege level. It would allow an advisory limit to be placed on a zone - a limit that the zone admin could modify in unusual circumstances. I realize that this opens a can of worms, in that most rctls are protected by the sys_res_config priv, which is not allowed in a zone even with 11/06. Further, it makes sense to consistently allow or forbid rctl modification in zones. I just wanted to mention this idea so that it is not unintentionally overlooked.

Currently, all zone.* rctls are not modifiable from a non-global zone. The established mechanism for a zone admin to set rctls within the zone is via project.* rctls set on projects within the zone. Granted, in the zone.max-swap case, we are not proposing a project.max-swap, due to implementation complexity and risk. With sufficient customer demand, we could investigate implementing project.max-swap in the future.

Currently no zone.* rctls allow basic rctl values to be set. The only project.* rctl which allows basic is project.max-contracts, and perhaps that is a bug. A basic rctl is an unprivileged rctl that only affects the process within the task, project, or zone which sets it. It is pretty useless, except for process.* rctls. I'd be happy to address the general issues of privilege related to project and zone rctls as a separate case.
A possible solution may be to redefine basic for project and zone rctls, and/or introduce more fine-grained privileges.

I agree that work is needed here.

  STATISTIC       DESCRIPTION
  zonename        The name of the zone with {zoneid}
  swap_reserved   swap reserved by zone in bytes

Does swap_reserved include pages shared with other zones, e.g. text pages?

Each process mapping text reserves unique swap for that mapping. Even though the underlying physical page may be shared between processes/zones, each process needs its own swap reservation. This is because each process may COW (copy-on-write) the page, and then may need to page the private copy to disk.

  max_swap_reserved: current zone.max-swap limit