Re: [zones-discuss] Difference between capped-memory and zone.max-shm-memory

2012-04-18 Thread Steve Lawrence



On 04/18/12 02:42 PM, Jordi Espasa Clofent wrote:

El 2012-04-18 19.22, Hung-Sheng Tsao (LaoTsao) Ph.D escribió:

hi
maybe one could add that in Solaris, resource control used to be project based:
one needed to set up a project and limit the resource pool, then assign the
pool to the zone. It was not easy to use.

Since then, many shortcuts for resource pool control have been added to
zonecfg, making it very easy to add resource controls inside the zone.


The question is still, more or less, there: is it possible to limit the 
amount of RAM that a zone can borrow from the global zone without rcapd?


Without rcapd, it is only possible to limit certain types of RAM usage:
- zone.max-shm-memory: limits RAM used by SysV shared memory mappings.
- zone.max-locked-memory: limits RAM used by locked mappings 
(mlock, ISM, DISM).


Both of these do not limit RAM used by other means, such as text pages, 
mapped files, and anonymous memory (like malloc()'ed memory).
As far as I can understand, if a zone only uses zone.max-shm-memory 
instead, it can potentially borrow all the available RAM. So?


Correct.  In this case, only memory used by SysV mappings (shmget(2), 
shmat(2)) is limited.
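
For illustration, a minimal zonecfg sketch of setting these two rctls (the
zone name "myzone" and the limit values are hypothetical; on recent releases
max-shm-memory is the zonecfg shorthand for zone.max-shm-memory):

# zonecfg -z myzone
zonecfg:myzone> set max-shm-memory=4g
zonecfg:myzone> add rctl
zonecfg:myzone:rctl> set name=zone.max-locked-memory
zonecfg:myzone:rctl> add value (priv=privileged,limit=1073741824,action=deny)
zonecfg:myzone:rctl> end
zonecfg:myzone> commit

Neither of these caps overall RAM usage; only the rcapd-based capped-memory
physical cap does that.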


-Steve L.





___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] Zones disappeared from zoneadm on OpenSolaris

2011-04-20 Thread Steve Lawrence

 Never seen this failure before.

Is it possible that you ran out of disk space?

For reference, which OpenSolaris build is this (cat /etc/release and uname 
-v)?  Did you do any upgrading?  If so, from which prior build?


It seems that you've lost your /etc/zones/index file.  Perhaps there is 
a backup or temporary copy in /etc/zones.
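
If no copy survives, /etc/zones/index can be recreated by hand.  It is a
plain colon-separated list, roughly zonename:state:zonepath (with an optional
trailing UUID field on newer builds); the zone names and paths here are just
placeholders:

# cat /etc/zones/index
global:installed:/
badger:installed:/zones/badger
tapir:installed:/zones/tapir

Each zone configured in zonecfg needs a matching line before zoneadm will
list it again.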


- Steve L.




On 04/20/11 02:37 AM, Jonatan Walck wrote:

I'm new to this list, checked the archives and hope I found the right
place.

A few days ago I rebooted an OpenSolaris server.  At the first boot,
zones:default didn't come up properly, and after another reboot it did,
but with one quirk: it finds no zones.

zoneadm list -cv shows only the global zone. zonecfg -z [zonename]
works for all zones though, all configs are there. zfs list shows all
the file systems, zpool status lists no errors.

Anyone have an idea for how to get the zones back up? How is the zoneadm
list built up?

Attaching part of the service log for zones:default, showing something
going wrong but I have been unable to track down the problem.

[ Apr 14 22:23:51 Enabled. ]
[ Apr 14 22:24:42 Executing start method (/lib/svc/method/svc-zones start). ]
Booting zones: badger cannonball lolcat mudkip cerberus rockdove tapir kangaroo 
molerat llama tuna narwhal tiger medusa dolphin tanukiERROR: error while 
acquiring slave h
andle of zone console for tapir: No such device or address
console setup: device initialization failed
ERROR: error while acquiring slave handle of zone console for kangaroo: No such 
file or directory
console setup: device initialization failed
ERROR: error while acquiring slave handle of zone console for tanuki: No such 
device or address
console setup: device initialization failed
ERROR: error while acquiring slave handle of zone console for cerberus: No such 
device or address
console setup: device initialization failed
ERROR: error while acquiring slave handle of zone console for narwhal: No such 
file or directory
console setup: device initialization failed
zone 'tapir': could not start zoneadmd
zoneadm: zone 'tapir': call to zoneadmd failed
zone 'cerberus': zone 'zone 'could not start tanuki': zoneadmd
zone 'kangaroo': narwhal': could not start could not start could not start 
zoneadmd
zoneadmdzoneadmdzoneadm: zone 'cerberus': call to zoneadmd failed
zoneadm

: zoneadmzoneadm: zone 'tanuki': call to zoneadmd failed
zone ': narwhalzone '': kangaroo': call to zoneadmd failed
call to zoneadmd failed
.
[ Apr 14 22:24:56 Method start exited with status 0. ]
[ Apr 19 08:51:42 Enabled. ]
[ Apr 19 08:52:11 Executing start method (/lib/svc/method/svc-zones start). ]
Booting zones: badger cannonball lolcat mudkip cerberus rockdove tapir kangaroo 
molerat llama tuna narwhal tiger medusa dolphinERROR: error while acquiring 
slave handle of zone console for tuna: No such device or address
console setup: device initialization failed
ERROR: error while acquiring slave handle of zone console for cerberus: No such 
file or directory
console setup: device initialization failed
zone 'tuna': could not start zoneadmd
zone 'cerberus': zoneadm: could not start zoneadmd
zone 'tuna': zoneadm: zone 'cerberus': call to zoneadmd failed
call to zoneadmd failed
  tanuki[ Apr 19 09:09:43 Enabled. ]
[ Apr 19 09:10:16 Executing start method (/lib/svc/method/svc-zones start). ]
[ Apr 19 09:10:16 Method start exited with status 0. ]

Thanks for any advice,
Jonatan Walck


___
zones-discuss mailing list
zones-discuss@opensolaris.org

Re: [zones-discuss] Solaris 10 zone migration to Solaris 11 Express

2011-04-06 Thread Steve Lawrence

 Look for unmount on this page:

http://download.oracle.com/docs/cd/E19797-01/817-1592/gjwmp/index.html

On 04/ 6/11 06:18 AM, Mike Gerdts wrote:

On Wed 06 Apr 2011 at 02:33AM, Ketan wrote:

I was testing migrating a Solaris 10 zone to a Solaris 11 Express zone. I 
used cpio to create the archive with the following syntax:

# find db_zone -print | cpio -oP@ | gzip > /swdump/ovpidb_zone.cpio.gz

Then I created a solaris10 brand zone in the Solaris 11 environment and tried 
to attach the zone, but I got the following error.

***

  zoneadm  -z s10zone1 attach -a /home/vneb/ovpidb_zone.cpio.gz
Log File: /var/tmp/s10zone1.attach_log.oFaavh
Attaching...

ERROR: The image was created with an incompatible libc.so.1 hwcap lofs mount.
The zone will not boot on this platform.  See the zone's
documentation for the recommended way to create the archive.


I'm moving a Solaris 10u8 zone from an M5000 to an LDoms 2.0 Solaris 11 Express guest

It sounds like the zone was running when you created the archive.  As a
result, the version of libc that is optimized for the SPARC64 CPU found
in the M5000 was mounted on top of /lib/libc.so.1.  On the T-series box
that you are moving to, the CPU architecture is different and
incompatible with the type of optimization done for the SPARC64 CPU.

It looks like you were following the instructions at
http://download.oracle.com/docs/cd/E19963-01/html/821-1460/gentextid-12093.html#gcglo
but the "shut down the zone while creating the archive" step seems to
be missing.
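
A sketch of the same archive step with the zone halted first (run from the
parent directory of the db_zone zonepath; the paths are the ones used above
and are otherwise hypothetical):

# zoneadm -z db_zone halt
# find db_zone -print | cpio -oP@ | gzip > /swdump/ovpidb_zone.cpio.gz

With the zone down, the hwcap-optimized libc is no longer lofs-mounted over
/lib/libc.so.1, so the archive should attach cleanly on the target machine.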


___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] psets for zones

2011-03-10 Thread Steve Lawrence
 When pooladm -c is done to commit the configuration, it will try and 
satisfy the min/max constraints of all psets.
In your case, there are plenty of cpus, so pset1 gets 5 cpus.  If you 
had more psets configured, there might not be enough cpus to give pset1 
5 cpus.


If the pset.min of all psets cannot be satisfied, pooladm -c will fail.
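
For illustration, a sketch of putting such a pset to use (the pool and zone
names are hypothetical; pset1 is the pset configured in the commands quoted
below):

# poolcfg -c 'create pool pool1 (string pool.scheduler="FSS")'
# poolcfg -c 'associate pool pool1 (pset pset1)'
# pooladm -c
# poolstat -r pset
# zonecfg -z myzone set pool=pool1

pooladm -c is the commit step described above; poolstat -r pset then shows
how many cpus each pset actually received.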

-Steve L.

On 03/10/11 09:54 AM, Ketan wrote:

I configured pset1 in 2 different ways:

1. poolcfg -c 'create pset pset1 (uint pset.min = 5; uint pset.max = 5)'

pset pset1
int pset.sys_id 1
boolean pset.default false
uint pset.min 5
uint pset.max 5
string pset.units population
uint pset.load 7
uint pset.size 5
string pset.comment

2. poolcfg -c 'create pset pset1 (uint pset.min = 1; uint pset.max = 5)'

pset pset1
int pset.sys_id 1
boolean pset.default false
uint pset.min 1
uint pset.max 5
string pset.units population
uint pset.load 10
uint pset.size 5
string pset.comment

What's the difference between the two? The only difference I can see is
pset.min=5 vs pset.min=1, but pset.size is 5 in both cases.

So is it just a different notation, or is there some other difference too?

___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] Zone Resource Management Issue.

2010-12-13 Thread Steve Lawrence

 Do you mean zone_caps?  You are looking at project_caps.
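
A sketch of what that would look like (the kstat name pattern and the zone
name "myzone" are assumptions; prctl gives an alternative view from the
global zone):

# kstat -c zone_caps -n 'lockedmem_zone*'
# prctl -n zone.max-locked-memory -i zone myzone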


On 12/13/10 01:50 AM, Ketan wrote:

OK, got it .. but still, if I want to check the current locked-memory usage
of a whole zone/project, what would be the best way? I'm using

kstat -c project_caps -n 'lockedmem*'

but with this command the usage is much lower than what is reported by the
application user, who says that the db is running very slowly and they are
getting memory-related errors.

module: caps                            instance: 0
name:   lockedmem_project_              class:    project_caps
 usage   2488147563
 value   36335423324



___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] All zones continuously core dump after upgrade to Solaris Express

2010-11-18 Thread Steve Lawrence



On 11/18/10 12:38 PM, Ian Collins wrote:

On 11/19/10 09:12 AM, Steve Lawrence wrote:

 What build are you upgrading from?


134 through 134b as recommended in the release notes.


Is this during the attach -u portion of the upgrade for each zone?

It happens after rebooting into the new BE.  I didn't detach the zones 
before upgrading.


Oh.  In that case, your zones are still downrev at build 134.  You need 
to detach them, and attach them again with -u.


I'm not sure if you'll be able to detach them successfully with zoneadm 
detach.  If not, you'll need to boot back to the 134 BE, detach them, 
and upgrade again.
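
For each zone, the cycle would look roughly like this (webhost is the zone
from the log below; run it from whichever BE the detach succeeds in):

# zoneadm -z webhost detach
# zoneadm -z webhost attach -u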


-Steve

Can you gather any core files (or pstacks of core files)?  These 
might be at zonepath/root/



pstack is short:

core '/tmp/xx/zoneRoot/webhost/root/core' of 3094:/sbin/init
 feef3c97 _fxstat  (0, 8047560, 180, 8058927) + 7
 08058973 st_init  (fee201a8, 38, 0, fefccc54, 0, feffb804) + 8f
 080543dc main (1, 8047f6c, 8047f74, feffb804) + 150
 0805418d _start   (1, 8047fe0, 0, 0, 7d8, 8047feb) + 7d

I can send the core (it's only 2MB) if that helps.

My guess is that init (in the zone) is starting using a downrev 
libc (aka libc not upgraded yet), and is making a system call that 
has changed.  12 is SIGSYS.


-Steve

On 11/18/10 11:14 AM, Ian Collins wrote:
I ran through the upgrade process on a system with half a dozen 
zones and on restart, they all get locked into a core dump/restart 
loop:


Nov 19 07:57:50 i7 genunix: [ID 729207 kern.warning] WARNING: 
init(1M) for zone webhost (pid 3094) core dumped on signal 12: 
restarting automatically


They all run through this cycle in tight loops.

Oops.


___
zones-discuss mailing list
zones-discuss@opensolaris.org







Re: [zones-discuss] zonestat, prstat -Z, and cpu accounting

2010-08-17 Thread Steve Lawrence



Peter Tribble wrote:

On Tue, Jul 27, 2010 at 5:34 PM, Steve Lawrence
stephen.lawre...@oracle.com wrote:
  

Now, if you want to account for cpu utilization by children,
why not use the pr_ctime member of the psinfo structure?
As far as I understand it, that collects cpu for child processes
that exit, so why can't that be used instead of monkeying
about with extended accounting?

What am I overlooking here?
  

I haven't explored an algorithm for tracking that.  At some point, the child
is running, in which case I may have added its current usage to the zone's
usage.  When the children exit later, I would need to figure out the parent
the usage was added to so I can avoid double counting usage I've already
charged to the zone.



Well, no, you just create the process tree and walk down it from the top.

Some might worry about missing data from a child that exits while you're
doing the measurement; I tend not to worry too much, because you'll
catch it on the next interval in any event.

  
I'm more worried about double counting.  For instance, if I count a 
proc, and then it exits, and I then count its parent's child usage, I 
will count the process's usage twice.  I don't think that /proc 
guarantees to report all parents before all children when doing 
readdir().  I would need to investigate that further.




The other bit to solve would be zone-entered processes, which will have a
parent in the global zone.  The usage by the in-the-zone children would
bubble up to a parent in the global zone.  This would certainly be wrong.



Well, how common are zone-entered processes? And how should they be
accounted for anyway? (What sort of examples do we have?)

  
Anything that is zlogin'ed into the zone.   Admins can zlogin anything 
they want into a zone, and that usage should be counted towards the 
zone's usage.




Thanks!

  

___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] zonestat, prstat -Z, and cpu accounting

2010-07-27 Thread Steve Lawrence



Peter Tribble wrote:

Looking at the recent zonestat arc case reminded me of something
I've been meaning to ask for a while.

In the case, it says:

  

prstat polls /proc, and will not account for
cpu used by short-lived processes.



and

  

Extended accounting must be used to
compute the cpu utilization because it will contain all data
associated with processes which have exited.



And this is something that's been discussed before.

Now, if you want to account for cpu utilization by children,
why not use the pr_ctime member of the psinfo structure?
As far as I understand it, that collects cpu for child processes
that exit, so why can't that be used instead of monkeying
about with extended accounting?

What am I overlooking here?

  
I haven't explored an algorithm for tracking that.  At some point, the 
child is running, in which case I may have added its current usage to 
the zone's usage.  When the children exit later, I would need to figure 
out the parent the usage was added to so I can avoid double counting 
usage I've already charged to the zone.


The other bit to solve would be zone-entered processes, which will have 
a parent in the global zone.  The usage by the in-the-zone children 
would bubble up to a parent in the global zone.  This would certainly be 
wrong.


-Steve










___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] failed to move zones in os2009.06 (attach failed)

2010-07-16 Thread Steve Lawrence

Did you try:

 # zoneadm -z bibcmi4 attach -d rpool/zones/bibcmi4/ROOT/zbe-2

-d is an ipkg-specific option.

-Steve L.

Gerard Henry wrote:
Hello all,
I need to move zones from serv1 to serv2. Both servers run OpenSolaris 2009.06 b111b.

On serv1, after detach, I have:

serv1 # zfs list -r rpool/zones/bibcmi4
NAME USED  AVAIL  REFER  MOUNTPOINT
rpool/zones/bibcmi4 1.46G  8.43G38K  /zones/bibcmi4
rpool/zones/bibcmi4/ROOT1.46G  8.43G18K  legacy
rpool/zones/bibcmi4/ROOT/zbe1.33M  8.43G   828M  legacy
rpool/zones/bibcmi4/ROOT/zbe-2  1.45G  8.43G   877M  /zones/bibcmi4/root

following this post:
http://mail.opensolaris.org/pipermail/zones-discuss/2010-February/006060.html

I sent the snapshots to serv2, and I have:
serv2 # zfs list -r rpool/zones/bibcmi4
NAME USED  AVAIL  REFER  MOUNTPOINT
rpool/zones/bibcmi4 1.29G  45.9G38K  /zones/bibcmi4
rpool/zones/bibcmi4/ROOT1.29G  45.9G19K  legacy
rpool/zones/bibcmi4/ROOT/zbe 494M  45.9G   494M  legacy
rpool/zones/bibcmi4/ROOT/zbe-2   825M  45.9G   825M  /zones/bibcmi4/root

(the last line was legacy, and I set the mountpoint manually)
I did a zonecfg with create -a

but now, it fails to attach:
serv2 # zoneadm -z bibcmi4 attach 
ERROR: The -a, -d or -r option is required when there is no active root dataset.


I found another thread related to this message:
http://opensolaris.org/jive/thread.jspa?messageID=439124

but unfortunately, it doesn't help. 
The option doesn't exist in my zoneadm, so I don't understand what's happening.

The official docs at http://docs.sun.com/app/docs/doc/817-1592/zone?a=view don't 
take this case into account (zonepaths on ZFS).

I need to move zones before upgrading to b134, 


thanks in advance for help,

gerard
  

___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] ZFS ARC cache issue

2010-06-03 Thread Steve Lawrence

Try zfs-discuss.


Ketan wrote:
We have a server running ZFS root with 64G of RAM, and the system has 3 zones running an Oracle Fusion app. The ZFS cache is using 40G of memory as per 

kstat zfs:0:arcstats:size, and the system shows only 5G of memory free; the rest is taken by the kernel and the 2 remaining zones. 


Now my problem is that the Fusion guys are getting a not-enough-memory message while 
starting their application, because top and vmstat show 5G as free memory. But I 
read that the ZFS cache releases memory as required by applications, so why is the 
Fusion application not starting up? Is there something we can do to decrease the ARC 
cache usage on the fly without rebooting the global zone?
  

___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] Webrev for CR 6909222

2009-12-21 Thread Steve Lawrence
The bug mentions that this can also impact a nevada zone that was p2v'ed.
Should you fix usr/src/lib/brand/native/zone as well?

-Steve

On Mon, Dec 21, 2009 at 03:46:00PM -0800, Jordan Vaughan wrote:
 I need someone to review my fix for

 6909222 reboot of system upgraded from 128 to build 129 generated error  
 from an s10 zone due to boot-archive

 My webrev is accessible via

 http://cr.opensolaris.org/~flippedb/onnv-s10c

 Thanks,
 Jordan
 ___
 zones-discuss mailing list
 zones-discuss@opensolaris.org
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] Webrev for CR 6782448

2009-12-21 Thread Steve Lawrence
Minor nit.  You could use != POC_STRING, put the Z_NO_ENTRY in the {}, and
put the success case after.  Not a required change.

LGTM.

-Steve

On Fri, Dec 18, 2009 at 07:28:52PM -0800, Jordan Vaughan wrote:
 I expanded my webrev to include my fix for

 6910339 zonecfg coredumps with badly formed 'select net defrouter'

 I need someone to review my changes.  The webrev is still accessible via

 http://cr.opensolaris.org/~flippedb/onnv-zone2

 Thanks,
 Jordan
 ___
 zones-discuss mailing list
 zones-discuss@opensolaris.org
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] Application leaking on local zone

2009-12-17 Thread Steve Lawrence
I recommend using libumem on the application.

Some folks were nice enough to write about it.

http://blogs.sun.com/pnayak/entry/finding_memory_leaks_within_solaris
http://blogs.sun.com/dlutz/entry/memory_leak_detection_with_libumem
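
A rough sketch of the libumem workflow from those posts (the application path
and pid are placeholders):

# LD_PRELOAD=libumem.so.1 UMEM_DEBUG=default UMEM_LOGGING=transaction /path/to/app &
# gcore <pid>
# mdb core.<pid>
> ::findleaks

Comparing the ::findleaks output from a run in the global zone and a run in
the local zone should show where the extra allocations come from.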

-Steve

On Thu, Dec 17, 2009 at 12:09:11PM +0200, AdinaKalin wrote:
 Hello,

 I'm struggling with the following problem and I have no idea how to
 solve it.
 I'm testing an application which runs fine in the global zone, but
 leaks memory when installed in a local zone.

 The local zone has its whole root and a very simple, basic configuration:
 bash-3.00# zonecfg -z mdmMDMzone
 zonecfg:mdmMDMzone info
 zonename: mdmMDMzone
 zonepath: /mdmMDMzone
 brand: native
 autoboot: true
 bootargs:
 pool:
 limitpriv: default,dtrace_proc,dtrace_user,proc_priocntl,proc_lock_memory
 scheduling-class: FSS
 ip-type: shared
 net:
  address: 192.168.109.14
  physical: e1000g0
  defrouter not specified

 One of the application processes, when started on global zone, has an
 rss of about 5 GB ( prstat -s rss ) and it keeps this size to the end of
 the test. If I stop the application on global zone and I start it on
 local zone, the same process starts with the normal size ( 5gb on prstat
 -s rss ) but is growing  during the test ( I saw it 25GB on a server
 with 32 gb RAM ) until is failing. I don't understand why is this
 behavior and if the application has a memory leak, why I don't see it on
 the
 global zone.

 Any help is more than welcome!!!






 ___
 zones-discuss mailing list
 zones-discuss@opensolaris.org

___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] Application leaking on local zone

2009-12-17 Thread Steve Lawrence
If you can provide some ::findleaks details about the particular memory leak,
perhaps someone can help.  Zones do not leak memory by design.

-Steve

On Thu, Dec 17, 2009 at 09:27:07PM +0200, AdinaKalin wrote:
Yes. I already read these sites. The question is why there is no memory
leak in the global zone but there is one in the local zone?!
 
Steve Lawrence wrote:
 
  I recommend using libumem on the application.
 
  Some folks were nice enough to write about it.
 
  [1]http://blogs.sun.com/pnayak/entry/finding_memory_leaks_within_solaris
  [2]http://blogs.sun.com/dlutz/entry/memory_leak_detection_with_libumem
 
  -Steve
 
  On Thu, Dec 17, 2009 at 12:09:11PM +0200, AdinaKalin wrote:
 
 
  Hello,
 
  I'm struggling with the following problem and I have no idea how to
  solve it.
  I'm testing an application which is running fine on a global zone,but
  memory leaking when installed on a local zone.
 
  The local zone has its whole root and a very simple, basic configuration:
  bash-3.00# zonecfg -z mdmMDMzone
  zonecfg:mdmMDMzone info
  zonename: mdmMDMzone
  zonepath: /mdmMDMzone
  brand: native
  autoboot: true
  bootargs:
  pool:
  limitpriv: default,dtrace_proc,dtrace_user,proc_priocntl,proc_lock_memory
  scheduling-class: FSS
  ip-type: shared
  net:
   address: 192.168.109.14
   physical: e1000g0
   defrouter not specified
 
  One of the application processes, when started on global zone, has an
  rss of about 5 GB ( prstat -s rss ) and it keeps this size to the end of
  the test. If I stop the application on global zone and I start it on
  local zone, the same process starts with the normal size ( 5gb on prstat
  -s rss ) but is growing  during the test ( I saw it 25GB on a server
  with 32 gb RAM ) until is failing. I don't understand why is this
  behavior and if the application has a memory leak, why I don't see it on
  the
  global zone.
 
  Any help is more than welcome!!!
 
 
 
 
 
 
 
 
 
  ___
  zones-discuss mailing list
  [3]zones-disc...@opensolaris.org
 
 
 
 
 References
 
Visible links
1. http://blogs.sun.com/pnayak/entry/finding_memory_leaks_within_solaris
2. http://blogs.sun.com/dlutz/entry/memory_leak_detection_with_libumem
3. mailto:zones-discuss@opensolaris.org


___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] zoneadm hangs after repeated boot/halt use

2009-12-11 Thread Steve Lawrence
Looks a lot like 6894901.  Can you try build 128?

-Steve

On Fri, Dec 11, 2009 at 03:48:52PM -0500, Glenn Brunette wrote:

 As part of an Immutable Service Container[1] demonstration that I am
 creating for an event in January, I need to start/stop a zone
 quite a few times (as part of a Self-Cleansing[2] demo).  During the
 course of my testing, I have been able to repeatedly get zoneadm to
 hang.

 Since I am working with a highly customized configuration, I started
 over with a default zone on OpenSolaris (b127) and was able to repeat
 this issue.  To reproduce this problem use the following script after
 creating a zone using the normal/default steps:

 isc...@osol-isc:~$ while : ; do
  echo `date`: ZONE BOOT
  pfexec zoneadm -z test boot
  sleep 30
  pfexec zoneadm -z test halt
  echo `date`: ZONE HALT
  sleep 10
  done

 This script works just fine for a while, but eventually zoneadm hangs
 (was at pass #90 in my last test).  When this happens, zoneadm is shown
 to be consuming quite a bit of CPU:

PID USERNAME  SIZE   RSS STATE  PRI NICE  TIME  CPU PROCESS/NLWP 

  16598 root   11M 3140K run  10   0:54:49  74% zoneadm/1


 A stack trace of zoneadm shows:

 isc...@osol-isc:~$ pfexec pstack `pgrep zoneadm`
 16082:zoneadmd -z test
 -  lwp# 1  
 -  lwp# 2  
  feef41c6 door (0, 0, 0, 0, 0, 8)
  feed99f7 door_unref_func (3ed2, fef81000, fe33efe8, f39e) + 67
  f3f3 _thrp_setup (fe5b0a00) + 9b
  f680 _lwp_start (fe5b0a00, 0, 0, 0, 0, 0)
 -  lwp# 3  
  feef420f __door_return () + 2f
 -  lwp# 4  
  feef420f door (0, 0, 0, fe140e00, f5f00, a)
  feed9f57 door_create_func (0, fef81000, fe140fe8, f39e) + 2f
  f3f3 _thrp_setup (fe5b1a00) + 9b
  f680 _lwp_start (fe5b1a00, 0, 0, 0, 0, 0)
 16598:zoneadm -z test boot
  feef3fc8 door (6, 80476d0, 0, 0, 0, 3)
  feede653 door_call (6, 80476d0, 400, fe3d43f7) + 7b
  fe3d44f0 zonecfg_call_zoneadmd (8047e33, 8047730, 8078448, 1) + 124
  0805792d boot_func (0, 8047d74, 100, 805ff0b) + 1cd
  08060125 main (4, 8047d64, 8047d78, 805570f) + 2b9
  0805576d _start   (4, 8047e28, 8047e30, 8047e33, 8047e38, 0) + 7d


 A stack trace of zoneadmd shows:

 isc...@osol-isc:~$ pfexec pstack `pgrep zoneadmd`
 16082:zoneadmd -z test
 -  lwp# 1  
 -  lwp# 2  
  feef41c6 door (0, 0, 0, 0, 0, 8)
  feed99f7 door_unref_func (3ed2, fef81000, fe33efe8, f39e) + 67
  f3f3 _thrp_setup (fe5b0a00) + 9b
  f680 _lwp_start (fe5b0a00, 0, 0, 0, 0, 0)
 -  lwp# 3  
  feef4147 __door_ucred (80a37c8, fef81000, fe23e838, feed9cfe) + 27
  feed9d0d door_ucred (fe23f870, 1000, 0, 0) + 32
  08058a88 server   (0, fe23f8f0, 510, 0, 0, 8058a04) + 84
  feef4240 __door_return () + 60
 -  lwp# 4  
  feef420f door (0, 0, 0, fe140e00, f5f00, a)
  feed9f57 door_create_func (0, fef81000, fe140fe8, f39e) + 2f
  f3f3 _thrp_setup (fe5b1a00) + 9b
  f680 _lwp_start (fe5b1a00, 0, 0, 0, 0, 0)


 A truss of zoneadm (-f -vall -wall -tall) shows this looping:

 16598:  door_call(6, 0x080476D0)= 0
 16598:  data_ptr=8047730 data_size=0
 16598:  desc_ptr=0x0 desc_num=0
 16598:  rbuf=0x807F2D8 rsize=4096
 16598:  close(6)= 0
 16598:  mkdir(/var/run/zones, 0700)   Err#17 EEXIST
 16598:  chmod(/var/run/zones, 0700)   = 0
 16598:  open(/var/run/zones/test.zoneadm.lock, O_RDWR|O_CREAT, 0600) = 6
 16598:  fcntl(6, F_SETLKW, 0x08046DC0)  = 0
 16598:  typ=F_WRLCK  whence=SEEK_SET start=0 len=0  
 sys=4277003009 pid=6
 16598:  open(/var/run/zones/test.zoneadmd_door, O_RDONLY) = 7
 16598:  door_info(7, 0x08047230)= 0
 16598:  target=16082 proc=0x8058A04 data=0x0
 16598:  attributes=DOOR_UNREF|DOOR_REFUSE_DESC|DOOR_NO_CANCEL
 16598:  uniquifier=26426
 16598:  close(7)= 0
 16598:  close(6)= 0
 16598:  open(/var/run/zones/test.zoneadmd_door, O_RDONLY) = 6
 16082/3:door_return(0x, 0, 0x, 0xFE23FE00,  
 1007360) = 0
 16082/3:door_ucred(0x080A37C8)  = 0
 16082/3:euid=0 egid=0
 16082/3:ruid=0 rgid=0
 16082/3:pid=16598 zoneid=0
 16082/3:E: all
 16082/3:I: basic
 16082/3:P: all
 16082/3:L: all


 PID 16598 is zoneadm and PID 16082 is zoneadmd.


 Is this a known issue?  Are there any other things that I can do to
 

Re: [zones-discuss] /var/run/zones not cleaned up ?

2009-12-10 Thread Steve Lawrence
Feature.  It is the F_WRLCK operation which takes the lock.  I suppose this
avoids having to deal with stale lock files from dead zoneadmds.

Similar for the door.  The door is owned by whoever fattaches, not whoever
creates the door file.

That said, I don't see a problem with the unlock/fdetach operations
removing these files, as long as they are truly done and it is OK for
a new lock/door to be created.

-Steve

On Thu, Dec 10, 2009 at 04:37:17PM +0100, Frank Batschulat (Home) wrote:
 is it to be expected that after no zoneadm/zoneadmd is running
 anymore, /var/run/zones still contains the corresponding lock files ?
 
 (also I looked at the current threadlist of my system and no zone related
  kernel threads are running anymore)
 
 osoldev.root./var/run/zones.= zoneadm list -cp
 0:global:running:/::ipkg:shared
 -:zone2:configured:/tank/zones/zone2::ipkg:shared
 osoldev.root./var/run/zones.= ps -eafd|grep zone
 root  2961  2734   0 16:35:06 pts/2   0:00 grep zone
 osoldev.root./var/run/zones.= ls -la
 total 16
 drwx--   2 root root 335 Dec 10 12:23 .
 drwxr-xr-x  11 root sys 2423 Dec 10 12:21 ..
 -rw-r--r--   1 root root   0 Dec 10 12:23 index.lock
 -rw---   1 root root   0 Dec 10 12:21 zone1.zoneadm.lock
 -rw---   1 root root   0 Dec 10 12:21 zone1.zoneadmd_door
 
 this was after a zone boot/zone halt/zone uninstall/zone delete cycle.
 
 bug, feature ? 
 
 ---
 frankB
 ___
 zones-discuss mailing list
 zones-discuss@opensolaris.org
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] s10 p2v

2009-11-25 Thread Steve Lawrence
This feature exists in nevada (nevada global to nevada zone), and is
currently being backported to s10u9.

-Steve L.

On Tue, Nov 24, 2009 at 07:41:03AM -0500, Dr. Hung-Sheng Tsao wrote:
 hi
 Is there p2v in s10 to move  from physical host to zone env?
 It seems that cpio of the apps directory should work
 regards



 ___
 zones-discuss mailing list
 zones-discuss@opensolaris.org

___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] restrict physical memory with zone.max-locked-memory

2009-10-28 Thread Steve Lawrence
It limits the amount of physical memory that can be pinned by a zone by
mlock() or shmat(SHM_SHARE_MMU).  These are typically done by databases
or performance critical apps.

locked memory cannot be paged out.
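
For reference, a hedged zonecfg sketch of setting it (the zone name and the
1g value are hypothetical; on releases with the capped-memory resource, its
locked property is shorthand for zone.max-locked-memory):

# zonecfg -z myzone
zonecfg:myzone> add capped-memory
zonecfg:myzone:capped-memory> set locked=1g
zonecfg:myzone:capped-memory> end
zonecfg:myzone> commit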

-Steve L.

On Wed, Oct 28, 2009 at 10:25:01AM -0700, Ketan wrote:
 So for what purpose is zone.max-locked-memory used?
 -- 
 This message posted from opensolaris.org
 ___
 zones-discuss mailing list
 zones-discuss@opensolaris.org
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] Difference between resource management attributes

2009-10-21 Thread Steve Lawrence
There is a kstat.  Look at the output of:

$ kstat -c project_caps -n 'lockedmem*'

-Steve L.

On Wed, Oct 21, 2009 at 05:43:03AM -0700, Ketan wrote:
 But there is one more thing if i set max-rss i can test it and see the task 
 under specified project does not exceeds the specified rss value but if i set 
 max-locked-memory to 1 gig how can i test that one ?
 -- 
 This message posted from opensolaris.org
 ___
 zones-discuss mailing list
 zones-discuss@opensolaris.org
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] Difference between resource management attributes

2009-10-20 Thread Steve Lawrence
On Tue, Oct 20, 2009 at 10:20:14AM -0700, Ketan wrote:
 Can anyone answer my questions?
 
 1. What's the difference between project.max-locked-memory and max-rss?
 And of these 2, which is the preferred way of limiting physical memory in 
 a project or zone?

max-rss limits both pageable and locked physical memory used by projects, so
it is an overall physical memory cap.  max-locked-memory is useful in limiting
applications which specifically lock physical memory.  This is useful because
if the project locks up to its max-rss, there will be no memory left for
it, and it will page non-stop.  max-locked-memory can be set lower than
max-rss to protect against this.
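
A hedged sketch of setting both on a project (the project name user.oracle
and the limits are hypothetical; rcap.max-rss is enforced by rcapd, while
the rctl is enforced by the kernel):

# projmod -sK "rcap.max-rss=8GB" user.oracle
# projmod -sK "project.max-locked-memory=(privileged,2147483648,deny)" user.oracle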

-Steve L.

 
 2. How to restrict the swap memory in projects

There is no mechanism.  Swap limits exist only for zones.

 -- 
 This message posted from opensolaris.org
 ___
 zones-discuss mailing list
 zones-discuss@opensolaris.org
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] Zone copy in Live Upgrade

2009-07-29 Thread Steve Lawrence
On Wed, Jul 29, 2009 at 08:43:05AM +0200, Martin Rehak wrote:
 Hi Steve,
 
 On 2009.07.23 14:34:22 -0700, Steve Lawrence wrote:
  On Thu, Jul 23, 2009 at 09:14:55AM +0200, Martin Rehak wrote:
   Hi Steve,
   
   On 2009.07.22 12:32:01 -0700, Steve Lawrence wrote:
The issue is that from the global zone context (non-zlogin), stuff like
symbolic links to something like /etc could copy files from the global
zone.
   
   I don't understand it. cpio preserves symlinks, so symlinks will appear
   just like symlinks in NGZ and files as a files. That means no mapping/no
   risk. Am I right?
   
I'm not sure why this is dangerous in this case, as we are only reading
from the zone, as cpio does not traverse/open sym links, it just copies 
the
link itself.
  
  I don't see a problem with it, but you should get feedback from others as 
  well.
  I see a problem with the current implementation.  A spoofed cpio program in
an evil non-global zone could create a destructive cpio stream.  The
  cpio -icdmP@ in the global zone could write to /.
  
  Another solution could be to do the restore within the context of the
  zlogin, to a path mounted within the zone's root.
 
 I see.
 
 Is there any reason why we are doing the zone copy in the zlogin at all?
 Which problems would we face if we copied a zone from the global zone? That
 would eliminate problems with an evil zone environment completely.

I don't see one, but check with the install team.  I'm also not sure what
is being copied here.  Is this clause to copy the / filesystem inside
a zone, or just those added via add fs?  If the latter, I'm not sure why
they are being copied.  Does LU treat any of the zone's filesystems as
shared between BE's, similar to how it treats /export/home in the global
zone?

-Steve L.

 
 Many thanks
 -- 
 Martin
 
  -Steve L.
  
   
   That's what I think.
   
Does this all end up going through zlogin one byte at a time?
   
   Yes, the whole stream goes through zlogin from NGZ to GZ where it is
   expanded.
   
   What would be the problem if we wouldn't use any zlogin? Just a cpio on
   zone root to a cpio to other zone root? What is the risk there?
   
   Thank you
   -- 
   Martin
   
-Steve

On Wed, Jul 22, 2009 at 04:57:47PM +0200, Martin Rehak wrote:
 Hi,
 
 I am trying to get Live Upgrade better by reimplementing some parts of
 the code. What I am not sure of is whether is it safe to do a copy of
 non global zone imports (filesystems dedicated to a zone in its 
 config)
 from the global zone.
 
 This is existing code (lucopy.sh:1808, install-nv-clone):
 http://grok.czech.sun.com:8080/source/xref/install-nv-clone/usr/src/cmd/inst/liveupgrade/scripts/lucopy.sh
 
 1808  (
 1809  fgrep -xv $mountpoint /tmp/lucopy.zonefs.$$
 1810  cat /tmp/lucopy.zoneipd.$$
 1811  ) | sed 's+.*+^/+' |
 1812  zlogin $ozonename \
1813  cat > /tmp/lucopy.excl.$$; \
1814  (
1815  if [ -s /tmp/lucopy.excl.$$ ]; then
1816  cd $zroot$mountpoint && \
1817  find . -depth -print | \
1818  egrep -vf /tmp/lucopy.excl.$$ | \
1819  cpio -ocmP@
1820  else
1821  cd $zroot$mountpoint && \
1822  find . -depth -print | cpio -ocmP@
1823  fi
1824  ) |
1825  ( cd $tdir && cpio -icdmP@ )
 1826  lulib_unmount_pathname $tdir
 
 To describe it, I would say that it will zlogin into the non global
 zone, generates there a listing which it sends onto stdin of cpio 
 which
 writes an archive on its stdout. That archive is directed to the
 stdin of cpio running _OUTSIDE_ the zone (in the global zone) which
 finally expands it and writes it to a target directory.
 
 Unfortunatelly few lines above there is this comment:
 
 1769  # Mount each non-lofs zone import in a temporary location
 1770  # and copy over the bits that belong there, extracted from
 1771  # the running zone.  We are now reaching through zone-
 1772  # controlled paths and thus must be extremely careful.
 1773  # Direct copies are not safe.
 
 And the question is: What can happen if I simply will not generate the
 listing and the archive inside the zone but will do it in the global
 zone and using 'cpio -p'?
 
 If I generalize the problem a little bit more I would like to know 
 your
 opinion about my idea of copying whole BE including zones in just one
 'cpio -p'. Why it wouldn't work, please?
 
 Thank you very much for your any reply
 -- 
 Martin Rehak
 ___
 zones-discuss mailing list
 zones-discuss@opensolaris.org
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] Zone copy in Live Upgrade

2009-07-23 Thread Steve Lawrence
On Thu, Jul 23, 2009 at 09:14:55AM +0200, Martin Rehak wrote:
 Hi Steve,
 
 On 2009.07.22 12:32:01 -0700, Steve Lawrence wrote:
  The issue is that from the global zone context (non-zlogin), stuff like
  symbolic links to something like /etc could copy files from the global
  zone.
 
 I don't understand it. cpio preserves symlinks, so symlinks will appear
 just like symlinks in NGZ and files as a files. That means no mapping/no
 risk. Am I right?
 
  I'm not sure why this is dangerous in this case, as we are only reading
from the zone, as cpio does not traverse/open sym links, it just copies the
  link itself.

I don't see a problem with it, but you should get feedback from others as well.
I see a problem with the current implementation.  A spoofed cpio program in
an evil non-global zone could create a destructive cpio stream.  The
cpio -icdmP@ in the global zone could write to /.

Another solution could be to do the restore within the context of the
zlogin, to a path mounted within the zone's root.

-Steve L.

 
 That's what I think.
 
  Does this all end up going through zlogin one byte at a time?
 
 Yes, the whole stream goes through zlogin from NGZ to GZ where it is
 expanded.
 
 What would be the problem if we wouldn't use any zlogin? Just a cpio on
 zone root to a cpio to other zone root? What is the risk there?
 
 Thank you
 -- 
 Martin
 
  -Steve
  
  On Wed, Jul 22, 2009 at 04:57:47PM +0200, Martin Rehak wrote:
   Hi,
   
   I am trying to get Live Upgrade better by reimplementing some parts of
   the code. What I am not sure of is whether is it safe to do a copy of
   non global zone imports (filesystems dedicated to a zone in its config)
   from the global zone.
   
   This is existing code (lucopy.sh:1808, install-nv-clone):
   http://grok.czech.sun.com:8080/source/xref/install-nv-clone/usr/src/cmd/inst/liveupgrade/scripts/lucopy.sh
   
   1808  (
   1809  fgrep -xv $mountpoint /tmp/lucopy.zonefs.$$
   1810  cat /tmp/lucopy.zoneipd.$$
   1811  ) | sed 's+.*+^/+' |
   1812  zlogin $ozonename \
   1813  cat > /tmp/lucopy.excl.$$; \
   1814  (
   1815  if [ -s /tmp/lucopy.excl.$$ ]; then
   1816  cd $zroot$mountpoint && \
   1817  find . -depth -print | \
   1818  egrep -vf /tmp/lucopy.excl.$$ | \
   1819  cpio -ocmP@
   1820  else
   1821  cd $zroot$mountpoint && \
   1822  find . -depth -print | cpio -ocmP@
   1823  fi
   1824  ) |
   1825  ( cd $tdir && cpio -icdmP@ )
   1826  lulib_unmount_pathname $tdir
   
   To describe it, I would say that it will zlogin into the non global
   zone, generates there a listing which it sends onto stdin of cpio which
   writes an archive on its stdout. That archive is directed to the
   stdin of cpio running _OUTSIDE_ the zone (in the global zone) which
   finally expands it and writes it to a target directory.
   
   Unfortunatelly few lines above there is this comment:
   
   1769  # Mount each non-lofs zone import in a temporary location
   1770  # and copy over the bits that belong there, extracted from
   1771  # the running zone.  We are now reaching through zone-
   1772  # controlled paths and thus must be extremely careful.
   1773  # Direct copies are not safe.
   
   And the question is: What can happen if I simply will not generate the
   listing and the archive inside the zone but will do it in the global
   zone and using 'cpio -p'?
   
   If I generalize the problem a little bit more I would like to know your
   opinion about my idea of copying whole BE including zones in just one
   'cpio -p'. Why it wouldn't work, please?
   
   Thank you very much for your any reply
   -- 
   Martin Rehak
   ___
   zones-discuss mailing list
   zones-discuss@opensolaris.org
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] Zone copy in Live Upgrade

2009-07-22 Thread Steve Lawrence
The issue is that from the global zone context (non-zlogin), stuff like
symbolic links to something like /etc could copy files from the global
zone.

I'm not sure why this is dangerous in this case, as we are only reading
from the zone, as cpio does not traverse/open sym links, it just copies the
link itself.

Does this all end up going through zlogin one byte at a time?

-Steve

On Wed, Jul 22, 2009 at 04:57:47PM +0200, Martin Rehak wrote:
 Hi,
 
 I am trying to get Live Upgrade better by reimplementing some parts of
 the code. What I am not sure of is whether is it safe to do a copy of
 non global zone imports (filesystems dedicated to a zone in its config)
 from the global zone.
 
 This is existing code (lucopy.sh:1808, install-nv-clone):
 http://grok.czech.sun.com:8080/source/xref/install-nv-clone/usr/src/cmd/inst/liveupgrade/scripts/lucopy.sh
 
 1808  (
 1809  fgrep -xv $mountpoint /tmp/lucopy.zonefs.$$
 1810  cat /tmp/lucopy.zoneipd.$$
 1811  ) | sed 's+.*+^/+' |
 1812  zlogin $ozonename \
  1813  cat > /tmp/lucopy.excl.$$; \
  1814  (
  1815  if [ -s /tmp/lucopy.excl.$$ ]; then
  1816  cd $zroot$mountpoint && \
  1817  find . -depth -print | \
  1818  egrep -vf /tmp/lucopy.excl.$$ | \
  1819  cpio -ocmP@
  1820  else
  1821  cd $zroot$mountpoint && \
  1822  find . -depth -print | cpio -ocmP@
  1823  fi
  1824  ) |
  1825  ( cd $tdir && cpio -icdmP@ )
 1826  lulib_unmount_pathname $tdir
 
 To describe it, I would say that it will zlogin into the non global
 zone, generates there a listing which it sends onto stdin of cpio which
 writes an archive on its stdout. That archive is directed to the
 stdin of cpio running _OUTSIDE_ the zone (in the global zone) which
 finally expands it and writes it to a target directory.
 
 Unfortunatelly few lines above there is this comment:
 
 1769  # Mount each non-lofs zone import in a temporary location
 1770  # and copy over the bits that belong there, extracted from
 1771  # the running zone.  We are now reaching through zone-
 1772  # controlled paths and thus must be extremely careful.
 1773  # Direct copies are not safe.
 
 And the question is: What can happen if I simply will not generate the
 listing and the archive inside the zone but will do it in the global
 zone and using 'cpio -p'?
 
 If I generalize the problem a little bit more I would like to know your
 opinion about my idea of copying whole BE including zones in just one
 'cpio -p'. Why it wouldn't work, please?
 
 Thank you very much for your any reply
 -- 
 Martin Rehak
 ___
 zones-discuss mailing list
 zones-discuss@opensolaris.org
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] sysidcfg requires zlogin

2009-07-21 Thread Steve Lawrence
If you want to configure the IP address within the zone with
sysidcfg/hostname.* files, then you need to use exclusive ip stack zones:

zonecfg -z zweb$Z set ip-type=exclusive

-Steve L.

On Wed, Jul 15, 2009 at 03:55:27PM -0700, Patrick J. McEvoy wrote:
  the only thing that comes to mind for me right now is
  a possible 
  mis-match of the installation repository used and the
  repository when 
  the zone was created.
 
 I could re-create the master -- i.e. that which I am cloning -- if you think 
 that would make a difference. BTW, the master has never been booted -- I 
 assume this makes no difference. This is the script I run to test cloning:
 
 Z=1
 
 zoneadm -z zweb$Z halt
 zoneadm -z zweb$Z uninstall -F
 zonecfg -z zweb$Z delete -F
 
 zonecfg -z zclone export | zonecfg -z zweb$Z -f -
 zonecfg -z zweb$Z set zonepath=/zonefs/zweb$Z
 zonecfg -z zweb$Z add net; set physical=vphys1; end
 zonecfg -z zweb$Z add net; set physical=vweb$Z; end
 zoneadm -z zweb$Z clone zclone
 zoneadm -z zweb$Z ready
 cp ./sysidcfg /zonefs/zweb$Z/root/etc/sysidcfg
 touch /zonefs/zweb$Z/root/etc/hostname.vphys1
 touch /zonefs/zweb$Z/root/etc/hostname.vweb1
 #zoneadm -z zweb$Z halt
 
 zoneadm -z zweb$Z boot
 #zlogin -C zweb$Z
 -- 
 This message posted from opensolaris.org
 ___
 zones-discuss mailing list
 zones-discuss@opensolaris.org
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] Parallel mount question

2009-06-29 Thread Steve Lawrence
On Mon, Jun 29, 2009 at 08:00:28PM +0200, William Roche wrote:
 Hello Vladi,

 Yes, you can use LOFS in all your zones to share the file system, providing 
 r/w access. I would even say that this is your BEST option.

 NFS mount in your local zones of a file system shared by the global zone is 
 absolutely not supported (including autofs access of course).

I think each zone's automounter is smart enough to use lofs instead of nfs for
mounts from a non-global to a global zone.
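
For completeness, a hedged sketch of the lofs approach for each zone (the
zone name and paths are hypothetical; the VxFS file system stays mounted
only in the global zone):

# zonecfg -z z1
zonecfg:z1> add fs
zonecfg:z1:fs> set dir=/data
zonecfg:z1:fs> set special=/global/vxfs_data
zonecfg:z1:fs> set type=lofs
zonecfg:z1:fs> end
zonecfg:z1> commit

Repeat for the other zones; all of them then see the same files read/write
through the loopback mount.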

-Steve L.


 HTH,
 William.


 On 06/29/09 18:25, Yanakiev, Vladimir wrote:
 Need a help with a problem. We have VxFS file system, created in a
 global zone, and mounted under non-global zone as LOFS. Later, two new
 zones were created on the same server, that needed access to the very
 same file system. Someone decided to NFS-shareout this file system from
 the global zone, and NFS mount it on these two new zones. This (to my
 understanding) after a few weeks badly corrupted the file system, and
 today we experienced the same for the second time.

 My question is - can I keep the file system in the global zone, loop
 back it (with LOFS) to all three zones, providing r/w access to all of
 them, without risk to corrupt it again?

 Thanks in advance for the help!

 Vladi
 ___
 zones-discuss mailing list
 zones-discuss@opensolaris.org
 ___
 zones-discuss mailing list
 zones-discuss@opensolaris.org
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] Weird Solaris 8 container problem (fwd)

2009-06-23 Thread Steve Lawrence

Hey Rich,

Looks like it is crashing in the jvm (in JIT code). There are also some
jni libraries loaded and being called into by other threads.  It would help
to know what library is mapped in.  I think support should be contacted.
My first guess is that they are hitting either an old bug in the jvm, or
whatever jni library they are using.  This could be an s8c issue, but I'd
have somebody look at the java first.

-Steve L.

On Tue, Jun 23, 2009 at 09:35:06AM -0700, Rich Teer wrote:
 Hi Steve,
 
 Here are the answers to your questions, as provided by my customer.
 I hope the attached pstack and mdb sessions get through unscathed!
 
 Cheers,
 
 -- 
 Rich Teer, SCSA, SCNA, SCSECA
 
 URLs: http://www.rite-group.com/rich
   http://www.linkedin.com/in/richteer
 
 -- Forwarded message --
 Date: Thu, 18 Jun 2009 12:55:23 -0700
 
 Thank you Rich, 
 
 I am collecting the requested information. Can you help me please answer
 these questions better? Here are preliminary versions of the answers
 
 
  Do you mean you've tried both 1.2.2 and 1.5?  Is the failure identical
  with
  both jvms?
 
 Answer: The server would not start up with Java 1.5. The Solaris
 native version of 1.2.2 crashes on the first (?) execution of the Java
 code. 
 
 
  The ::stack should be correct.  Have you looked at the core from the
  global zone?  It shouldn't matter, but it can't hurt.  Also use pstack
  on the core.
 
 Answer: I am attaching the results of pstack and MDB session to this
 email 
 I did not look at the core from the global zone. I am not sure how to do
 that. 
 
  
  Is it in hotspot (dynamically generated) code?  If so, the function names
  will just be hex.  Is it dying in the jvm itself, or in jni code (java
  bindings to native code provided by Vantive)?
 
 I don't know how to answer this question. 
 The server uses libjvm.so rather than creating a JVM session (process)
 using fork() or exec().
 
 
  Is the application threaded?
 
 Yes, it is threaded. 
 
 
 Thank you, 
 
 Vlad
 
 P.S. Currently playing with MDB with a little success. I don't know
 assembler 
 
 
 
 
  -Original Message-
  From: Rich Teer [mailto:rich.t...@rite-group.com]
  Sent: Thursday, June 18, 2009 10:14 AM
  To: Vladimir Ryzhov; Andy Woodward
  Subject: Re: [zones-discuss] Weird Solaris 8 container problem (fwd)
  
  Hi guys,
  
  Here's a response I got on the Zones mailing list about the weird
  Vantive crashes.  Could you please reply to me with the answers to
  Steve's questions, and I'll forward them to the list.
  
  --
  Rich Teer, SCSA, SCNA, SCSECA
  
  URLs: http://www.rite-group.com/rich
http://www.linkedin.com/in/richteer
  
  -- Forwarded message --
  Date: Wed, 17 Jun 2009 17:59:04 -0700
  From: Steve Lawrence stephen.lawre...@sun.com
  To: Rich Teer rich.t...@rite-group.com
  Cc: Zones discuss zones-discuss@opensolaris.org
  Subject: Re: [zones-discuss] Weird Solaris 8 container problem
  
  On Wed, Jun 17, 2009 at 02:56:59PM -0700, Rich Teer wrote:
   Hi all,
  
   IHAC who is trying to run one of their applications in a Solaris 8
   branded zone.  The global OS is Solaris 10 5/09 and we're using
   Solaris 8 containers version 1.0.1, on a Sun Fire 280R server with
   a 750 MHz CPU and 6 GB of RAM.
  
   Although apparently quite temperamental, their app runs acceptably
   when run on Solaris 8 natively (i.e., S8 on bare metal rather than
   in a branded zone), but crashes very frequently when run in the
   branded container.  Of course, the applicatiopn source code is
   unavailable...  :-(
  
   The application, called Vantive, talks to an Oracle 8.1.6 database,
   and is written in Java.  The Vantive app ships with version 1.2.2
   of the Java runtime, and we've tried version 1.5.
  
  Do you mean you've tried both 1.2.2 and 1.5?  Is the failure identical
  with
  both jvms?
  
  
   Annoyingly, when we try trussing the errant process, it doesn't
 crash!
   When a crash does happen, a core dump usually occurs, but hasn't
 been
   too helpful.  The crashes do seem to be happening from within the
 JVM,
   if the ::stack output from mdb is to be believed.
  
  The ::stack should be correct.  Have you looked at the core from the
  global zone?  It shouldn't matter, but it can't hurt.  Also use pstack
  on the core.
  
  Is it in hotspot (dynamically generated) code?  If so, the function names
  will just be hex.  Is it dying in the jvm itself, or in jni code (java
  bindings to native code provided by Vantive)?
  
  
   Does this ring any bells?  Is there anything we can do to help debug
  this?
   Note that the branded zone seems to work just fine apart from this
 one
   (rather major) issue.
  
  No bells.  The only Java problem I've seen was in 1.1.8, and it does
 not
  exist on java 1.2+.  Best to contact support to debug.  Could be a jvm
 (or
  other bug) that they were lucky enough to never hit on their native s8
  system.  Given the truss thing, it is likely a race/timing

Re: [zones-discuss] Probs with S8 Branded Zone on b114

2009-06-05 Thread Steve Lawrence
On Fri, Jun 05, 2009 at 07:13:14AM -0700, Rich Teer wrote:
 On Thu, 4 Jun 2009, Steve Lawrence wrote:
 
  That's correct.  You need only install the 1.0.1 SUNWs?brandk package for
  each, which enables the brand(s).
 
 Cool.  So I don't need to install the packages under the 1.0 tree before
 I install the 1.0.1 package?

No.  Those are only used for s10u4 or s10u5 systems.  You should already
have them if you install s10u6 or later.
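
So on s10u6 or later, the install step reduces to something like (run from
the unpacked download; the path follows the layout mentioned earlier in this
thread):

# cd 1.0.1/Product
# pkgadd -d . SUNWs8brandk

with SUNWs9brandk being the Solaris 9 Containers equivalent from its own
1.0.1 download.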

-Steve

 
 -- 
 Rich Teer, SCSA, SCNA, SCSECA
 
 URLs: http://www.rite-group.com/rich
   http://www.linkedin.com/in/richteer
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] Probs with S8 Branded Zone on b114

2009-06-04 Thread Steve Lawrence
S8C and S9C do not run on sxce or opensolaris.  They can be hosted on
Solaris 10, using any filesystem which supports zones, including zfs.

-Steve L.

On Thu, Jun 04, 2009 at 12:06:10PM -0700, Rich Teer wrote:
 Hi all,
 
 I'm trying to install a Solaris 8 branded zone on a 280R running
 SXCE b114.  I've downloaded the Solaris 8 zone software version
 1.0.1 and installed the SUNWs8brandr and SUNWs8brandu packages
 from the 1.0/Product directory, followed by the SUNWs8brandk
 package from the 1.0.1/Product directory.  All packages appeared
 to install OK, except for a warning about a missing S10 Kernel
 patch which I ignored because I'm running SXCE b114.  :-)
 
 When I run the zonecfg command to create the branded zone, it all
 goes pear-shaped:
 
 bash-3.2# zonecfg -z vantive_test
 vantive_test: No such zone configured
 Use 'create' to begin configuring a new zone.
 zonecfg:vantive_test create -t SUNWsolaris8
 zonecfg:vantive_test set zonepath=/zones/vantive_test
 zonecfg:vantive_test set autoboot=true
 zonecfg:vantive_test verify
 vantive_test: unknown brand.
 
 vantive_test: Invalid document
 
 The zonepath will be living on ZFS, but I doubt that's important.
 
 Any clues greatfully received!
 
 Cheers,
 
 -- 
 Rich Teer, SCSA, SCNA, SCSECA
 
 URLs: http://www.rite-group.com/rich
   http://www.linkedin.com/in/richteer
 ___
 zones-discuss mailing list
 zones-discuss@opensolaris.org
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] Probs with S8 Branded Zone on b114

2009-06-04 Thread Steve Lawrence
On Thu, Jun 04, 2009 at 01:12:00PM -0700, Rich Teer wrote:
 On Thu, 4 Jun 2009, Steve Lawrence wrote:
 
 Hi Steve,
 
  S8C and S9C do not run on sxce or opensolaris.  They can be hosted on
  Solaris 10, using any filesystem which supports zones, including zfs.
 
 Thanks for confirming the bad news  :-(
 
 Would I be correct in thinking that S8C and S9C work fine in the latest
 version of Solaris 10, i.e., S10 5/09?

That's correct.  You need only install the 1.0.1 SUNWs?brandk package for
each, which enables the brand(s).

-Steve

 
 Cheers,
 
 -- 
 Rich Teer, SCSA, SCNA, SCSECA
 
 URLs: http://www.rite-group.com/rich
   http://www.linkedin.com/in/richteer
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] Zone Stuck in a shutting_down state

2009-05-06 Thread Steve Lawrence
I already tried killing the zoneadmd process and issuing the halt, and all
it does is start the zoneadmd process back up and hang. I can't force a
crashdump on the system since I can't take the box down.

Bug 6272846 makes reference to NFS version 3 (which is the version we are
using), and the client apparently leaking rnodes. Is there any way to
verify this other than a forced crashdump? I might take a live core of the
system and open a case to see if that yields anything.

The zone_ref > 1 means that something in the kernel is holding the zone.
You should be able to use mdb -k on the live system, and issue dcmds similar
to the comments of 6272846.  No need to force a crashdump or take a live
crashdump.

-Steve L.

 
Derek
 
On Wed, May 6, 2009 at 4:08 PM, Steve Lawrence
[1]stephen.lawre...@sun.com wrote:
 
  zsched is always unkillable.  It will only exit when instructed to by
  zoneadmd.

  Is the remaining zone shutting down, or down?  (zoneadm list -v).

  What is the ref_count on the zone?

  # mdb -k
   ::walk zone | ::print zone_t zone_name zone_ref

  If the refcount is greater than 0x1, it could be:
      6272846 User orders zone death; NFS client thumbs nose

  No workaround for this one.  A crashdump would help investigate a
  zone_ref greater than 1.

  Is there a zoneadmd process for the given zone?
  # pgrep -lf zoneadmd

  If so, please provide truss -p pid of this process.  You may also
  attempt killing this zoneadmd process (which lives in the global zone),
  and then re-attempting zoneadm -z zonename halt.
 
  Thanks,
 
  -Steve L.
 
 References
 
Visible links
1. mailto:stephen.lawre...@sun.com
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] Zone Stuck in a shutting_down state

2009-05-06 Thread Steve Lawrence
Related comments from bug below (X'ed out some paths):

The zone in question clearly has too many references

 030004a09680::print zone_t zone_ref  
zone_ref = 0t11

Ten too many, to be precise.  So what's holding onto the zone?  Well the rnode
cache has 5 entries

 ::kmem_cache ! grep rnode
030003a1e988 rnode_cache    00  640   572988
030003a20988 rnode4_cache   00  9840
 030003a1e988::walk kmem | ::print rnode_t r_vnode | ::vnode2path
/opt/zones/z1/root/
/opt/zones/z1/root/
/opt/zones/z1/root/
/opt/zones/z1/root/
/opt/zones/z1/root/

even though no nfs filesystems are mounted

 ::fsinfo
VFSP FS  MOUNT
0187f420 ufs /
0187f508 devfs   /devices
03315780 ctfs/system/contract
033156c0 proc/proc
03315600 mntfs   /etc/mnttab
03315480 tmpfs   /etc/svc/volatile
033153c0 objfs   /system/object
0300039987c0 namefs  /etc/svc/volatile/repository_door
0300039984c0 fd  /dev/fd
030003a99e00 ufs /var
030003998400 tmpfs   /tmp
030003a99680 tmpfs   /var/run
030003a98f00 namefs  /var/run/name_service_door
030003a98b40 namefs  /var/run/sysevent_channels/syseventd_channel...
030003a989c0 namefs  /etc/sysevent/sysevent_door
030003a98780 namefs  /etc/sysevent/devfsadm_event_channel/1
030003a98540 namefs  /dev/.zone_reg_door
030003a983c0 namefs  /dev/.devfsadm_synch_door
030003a99380 namefs  /etc/sysevent/piclevent_door
0300044b1d80 namefs  /var/run/picld_door
030003a99200 ufs /opt
0300044b0700 namefs  /var/run/zones/z1.zoneadmd_door

And as apparent from the path, all of those rnodes refer to zone z1 through
their mntinfo structure

 030003a1e988::walk kmem | ::print rnode_t r_vnode-v_vfsp-vfs_data | 
 ::print mntinfo_t mi_zone | ::zone
ADDR ID NAME PATH
030004a09680  1 z1   /opt/zones/z1/root/
030004a09680  1 z1   /opt/zones/z1/root/
030004a09680  1 z1   /opt/zones/z1/root/
030004a09680  1 z1   /opt/zones/z1/root/
030004a09680  1 z1   /opt/zones/z1/root/

So if each of those rnodes has two holds on the zone, then that accounts for
all of the extra holds exactly.


On Wed, May 06, 2009 at 09:04:54PM -0500, Derek McEachern wrote:
I don't believe that I can see the comments since they are not public.
 
Is that something you can pass along?
 
On Wed, May 6, 2009 at 5:27 PM, Steve Lawrence
[1]stephen.lawre...@sun.com wrote:
 
   I already tried killing the zoneadmd process and issuing the halt
  and all
   it does is start back up the zoneadmd process and hang.  I can't
  force a
   crashdump on the system since I can't take the box down.
  
   Bug 6272846 makes reference to nfs version 3, (which is the version
  we are
   using), and the client apparently leaking rnodes. Is there any way
  to
   verify this other then a forced crashdump? I might take a live core
  of the
   system and open a case to see if that yields anything.
 
  The zone_ref > 1 means that something in the kernel is holding the zone.
  You should be able to use mdb -k on the live system, and issue dcmds
  similar
  to the comments of 6272846.  No need to force a crashdump or take a live
  crashdump.
 
  -Steve L.
  
   Derek
  
   On Wed, May 6, 2009 at 4:08 PM, Steve Lawrence
   [1][2]stephen.lawre...@sun.com wrote:
  
     zsched is always unkillable.  It will only exit when instructed
  to by
     zoneadmd.
  
     Is the remaining zone shutting down, or down?  (zoneadm list
  -v).
  
     What is the ref_count on the zone?
  
     # mdb -k
     > ::walk zone | ::print zone_t zone_name zone_ref
  
     If the refcount is greater than 0x1, it could be:
          6272846 User orders zone death; NFS client thumbs nose
  
     No workaround for this one.  A crashdump would help investigate a
     zone_ref
     greater than 1.
  
     Is there a zoneadmd process for the given zone?
     # pgrep -lf zoneadmd
  
     If so, please provide truss -p <pid> of this process.  You may
  also
     attempt
     killing this zoneadmd process (which lives in the global zone),
  and then
     re-attempting zoneadm -z <zonename> halt.
  
     Thanks,
  
     -Steve L.
  
   References
  
   Visible links
   1. mailto:[3]stephen.lawre...@sun.com

Re: [zones-discuss] Zone in a pset with high load generating high packet loss at the frame level

2009-03-05 Thread Steve Lawrence
On Thu, Mar 05, 2009 at 01:22:25PM -0500, Jeff Victor wrote:
 Thanks for the great feedback Gael. Comments below.
 
 On Thu, Mar 5, 2009 at 11:00 AM, Gael gael.marti...@gmail.com wrote:
 
  On Wed, Mar 4, 2009 at 9:06 AM, Jeff Victor jeff.j.vic...@gmail.com wrote:
 
  Some questions:
  1. Do you use set pool= anymore, now that the dedicated-cpu feature 
  exists?
 
  We got over one hundred physical frames running zones here, covering nearly
  all versions of Solaris 10, we are currently sticking to set pool until we
  can get the whole environment upgraded. Before that, cannot afford to have
  the whole team of admins handling zones differently depending on the OS
  version. Headache...
 
 It is now clear to me that this feature would need to support
 disabling interrupts when a zone uses set pool=. Currently, all pool
 attributes are configured using the pool tools (poolcfg, pooladm) and
 I don't see any reason to not continue. When I write this up, it will
 fulfill that need.

Are you proposing that we add support for pset-interrupt disposition config
to the pools framework?  Such as a property on a pool-pset
boolean pset.interrupts = false??

I think the right solution for pool= is this or similar.  It could also
be a string value, such as:

none  no interrupts handled on cpus in the pool-pset.
zone  Device interrupts for bound zones are serviced.
any   Any device interrupts can be dispatched to the pset.

Zonecfg could make use of these pool-pset properties to implement the
desired behavior for dedicated-cpu.

The default value should be "any".  zonecfg should set "zone" for all
dedicated-cpu zones.  zoneadm could warn if pool= is set, the zone has
dedicated devices, and the pset for that pool has not been configured to
be "zone".

legacy psets (psrset) could be extended to support this property via some
new flags.
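
Purely as an illustration -- pset.interrupts is a hypothetical property that
does not exist today -- such a setting might be expressed with the existing
poolcfg/pooladm syntax along these lines (mypset stands for the pset backing
the zone's pool):

# poolcfg -c 'modify pset mypset (string pset.interrupts = "zone")'
# pooladm -c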

The other part of this is how to reconcile zonecfg and/or pools settings
for interrupts with the device-to-cpu mappings that are specified via dladm.
Currently, dladm allows the specification of a list of cpu ids.  Another
way to approach this would be to point dladm directly at the desired pool.

-Steve
 
  2. Is it sufficient to simply disable interrupts on a zone's pset?
 
  In our case, we do pset only when licensing requires it (aka
  oracle,datastage,sybase,borland apps) or when the applications behave poorly
  and we keep hearing that by lack of budget/resources, the issue cannot be
  addressed and without direct impact on the business itself, nothing will
  change.
 
 Gael, I realized that my question was vague. When you use a pool,
 you're using a pset. Do you mean that you only use pools and psets
 when licensing requires it?
 
 Also, I couldn't tell how the comment responded to the question.
 
  What about creating an IO pset, and then disabling the interrupt on
  everything else while using it as a FSS pool or psets pools ? Very similar
  to ldom I would think...
 
 Yes, that occurred to me, too. You can do that now, either with a pset
 that's being used by a zone or with the default pset. But I'm not
 convinced there's enough reason to separate an I/O pset from the
 default pset. There's great potential for wasted CPU cycles.
 
 
 --JeffV
 ___
 zones-discuss mailing list
 zones-discuss@opensolaris.org
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] Zone in a pset with high load generating high packet loss at the frame level

2009-03-05 Thread Steve Lawrence
On Thu, Mar 05, 2009 at 04:12:19PM -0500, Jeff Victor wrote:
 On Thu, Mar 5, 2009 at 1:48 PM, Steve Lawrence stephen.lawre...@sun.com 
 wrote:
  On Thu, Mar 05, 2009 at 01:22:25PM -0500, Jeff Victor wrote:
  On Thu, Mar 5, 2009 at 11:00 AM, Gael gael.marti...@gmail.com wrote:
   On Wed, Mar 4, 2009 at 9:06 AM, Jeff Victor jeff.j.vic...@gmail.com 
   wrote:
  
   Some questions:
   1. Do you use set pool= anymore, now that the dedicated-cpu feature 
   exists?
  
  It is now clear to me that this feature would need to support
  disabling interrupts when a zone uses set pool=. Currently, all pool
  attributes are configured using the pool tools (poolcfg, pooladm) and
  I don't see any reason to not continue. When I write this up, it will
  fulfill that need.
 
  Ae you proposing that we add support for pset-interrupt disposition config
  to the pools framework?  Such as a property on a pool-pset
 boolean pset.interrupts = false??
 
 The short answer is yes.  BobN and I came to the same conclusion
 just a few hours ago... :-)
 
 CPUs already have cpu.status which can be on-line, no-intr (LWPs but
 no interrupt handlers), or off-line (no LWPs but still able to handle
 interrupts). A pset.interrupts field would allow Solaris to set
 cpu.status on CPUs as they enter the pset.  Zones could then use that
 so we can increase their isolation. When a CPU re-enters the default
 pset, it becomes able to handle interrupts again. When needed, intrd
 will give it one (or more).
 
 
 
  I think the right solution for pool= is this or similar.  It could also
  be a string value, such as:
 
 none  no interrupts handled on cpus in the pool-pset.
 zone  Device interrupts for bound zones are serviced.
 any   Any device interrupts can be dispatched to the pset.
 
 I don't see how we could do zone in all situations - there isn't a
 1:1 mapping between zone and device (except for exclusive-IP).
 
  Imagine zoneA and zoneB on a pset (psetAB) with pset.interrupts=zone.
 Further, zoneA and zoneC share e1000g0, but zoneB doesn't. Finally,
 zoneC has its own pset. Where does the interrupt handler for e1000g0
 go - psetAB or psetC?

I was thinking in the exclusive case.  For shared stack zones, the devices
would all be bound to the global zone's (aka default) pset.

 
 Or are you suggesting that interrupts from one device can be
 intercepted and diverted to a CPU associated with a specific pset,
 based on which process the interrupt is/should be associated with?

No, although I'm not sure what is configurable for vnics.  It may be possible
for shared stack zones using an exclusive vnic (not an exclusive stack) to have
some of the vnic workload bound to its pset.

 
 Or am I misunderstanding the description of zone?
 
 
  Zonecfg could make use of these pool-pset properties to implement the
  desired behavior for dedicated-cpu.
 
 Exactly.
 
  The default value should be any.  zonecfg should set zone for all
  dedicated-cpu zones.  zoneadm could warn if pool= is set, the zone has
  dedicated devices, zone the pset for that pool has not been configured to
  be zone.
 
 The only devices we can be sure are dedicated for the boot-session of
 a zone are NICs. So this whole segregate the interrupts per zone/pset
 combo will be limited at best. It would be nice if we could
 generalize it like you say, but I don't think it's workable yet.

Agreed.  This is really just for network devices at this point.

 
  legacy psets (psrset) could be extended to support this property via some 
  new flags.
 
  Ther other part of this is how to reconsile zonecfg and/or pools settings
  for interrupts, with device-cpu mappings that are specified via dladm.
  Currently, dladm allows the specification of a list of cpu ids.  Another
  way to approach this would be to point dladm directly at the desired pool.
 
 Which build are you on? :-)  I'm on NV94 and I don't see
 anything like that in dladm(1M)

Crossbow went into build 105.

http://blogs.sun.com/nitin/entry/resource_allocation_for_network_processing
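
For reference, the Crossbow binding mentioned above looks roughly like this
(the link name e1000g0 comes from the thread; the cpu list is only an example):

# dladm set-linkprop -p cpus=0,1,2,3 e1000g0
# dladm show-linkprop -p cpus e1000g0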

 
 I'm beginning to think this is really a two-phase project:
 * Phase 1: make it easier to disable interrupts on a zone's pset (one
 configured with the pool property or dedicated-cpu resource)
 * Phase 2: optimize this by enabling a zone's pset to handle
 interrupts from a device which is exclusively bound to this zone.

As long as phase one is compatible with phase two, meaning that cases
such as this one are properly defined:
1. pool mypool has property interrupts=disabled.
2. Zone has pool=mypool
3. Zone property stating to bind network interrupts to pool.

One solution would be to allow this config, and bind the net interrupts to
mypool anyway.  Another would be to only allow auto-net-binding in zonecfg
when using dedicated-cpu.

 
 I think that most people that need any of this only need Phase 1.

Agreed.

 Philosophically, shifting interrupt handlers into the default pset is
 consistent

Re: [zones-discuss] Capped-Memory - swap physical? (was: Failing to NFS mount on non-global zone)

2009-02-23 Thread Steve Lawrence
Your config is basically what you want.  The zone will be able to reserve
up to 6gb of memory, 4 of which may reside in physical memory.  The 4gb of
ram is not reserved for the zone, so the zone could get less RAM due to
global memory pressure, and use more than 2gb of disk swap.  For instance,
the zone could get 3 gb of ram and 3gb of disk swap (for a total of 6gb of
virtual memory).
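
For reference, that config is created with zonecfg along these lines (the
zone name myzone is just a placeholder):

# zonecfg -z myzone
zonecfg:myzone> add capped-memory
zonecfg:myzone:capped-memory> set physical=4g
zonecfg:myzone:capped-memory> set swap=6g
zonecfg:myzone:capped-memory> end
zonecfg:myzone> commit
zonecfg:myzone> exit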

-Steve L.

On Mon, Feb 23, 2009 at 09:06:13PM +0100, Bernd Schemmer wrote:
 Hi,
 
 So if we want to configure a Zone with 4 GB physical memory and max 2
 GB of swap, the values have to be
 
 capped-memory:
physical: 4G
[swap: 6G]
 
 
 Is that correct?
 
 I just reread the documentation about swap and from that it's not clear 
 to me that swap in the zone configuration is used that way
 
 regards
 
 Bernd
 
 
 Steve Lawrence wrote:
 Swap limits how much of the systems total memory (ram + disk) can be 
 reserved.
 When this limit is hit, allocations, such as malloc, will fail.  Physical
 memory limits resident memory.  When this limit is hit, the zone will page
 pages in memory to disk swap.
 
 In general, your example config is only useful if the zone uses a lot of
 physical memory, but does not reserve as much swap.  An example is an
 application which maps a large on-disk file into memory.  No swap is needed
 for the file, (because the file can be paged back to the a filesystem), 
 but a
 large amount of physical memory may be needed to pull the file into RAM.
 
 Such applications are rare, so your example config is not often used.  Your
 basically right is saying that this config does not make any sense in most
 cases.
 
 -Steve L.
 
 On Fri, Feb 20, 2009 at 08:36:20PM +0100, Alexander Skwar wrote:
   
 Hi!
 
 On Fri, Feb 20, 2009 at 17:50, Asif Iqbal vad...@gmail.com wrote:
 
 
 capped-memory:
physical: 1G
[swap: 512M]
   
 A question regarding this setting - does that setting really make
 sense? I suppose he tries to achieve that the zone as a max.
 uses 1G of real memory and no more than 512M of Swap.
 
 But does it really do that?
 
 Or is he rather limiting the amount of allocable mem to 512M?
 
 Alexander
 -- 
 [ Soc. = http://twitter.com/alexs77 | http://www.plurk.com/alexs77 ]
 [ Mehr = http://zyb.com/alexws77 ]
 [ Chat = Jabber: alexw...@jabber80.com | Google Talk: a.sk...@gmail.com ]
 [ Mehr = MSN: alexw...@live.de | Yahoo!: askwar | ICQ: 350677419 ]
 ___
 zones-discuss mailing list
 zones-discuss@opensolaris.org
 
 ___
 zones-discuss mailing list
 zones-discuss@opensolaris.org
 
   
 
 
 -- 
 Bernd Schemmer, Frankfurt am Main, Germany
 http://home.arcor.de/bnsmb/index.html
 
 Sooner rather than later, the world will change.
    Fidel Castro
 
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] Capped-Memory - swap physical? (was: Failing to NFS mount on non-global zone)

2009-02-20 Thread Steve Lawrence
Swap limits how much of the system's total memory (ram + disk) can be reserved.
When this limit is hit, allocations, such as malloc, will fail.  Physical
memory limits resident memory.  When this limit is hit, the zone will page
memory out to disk swap.

In general, your example config is only useful if the zone uses a lot of
physical memory, but does not reserve as much swap.  An example is an
application which maps a large on-disk file into memory.  No swap is needed
for the file (because the file can be paged back to the filesystem), but a
large amount of physical memory may be needed to pull the file into RAM.

Such applications are rare, so your example config is not often used.  You're
basically right in saying that this config does not make any sense in most
cases.
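
Both limits can be inspected on a running system, for example (the zone name
myzone is just a placeholder):

# prctl -n zone.max-swap -i zone myzone
# rcapstat -z 5

prctl shows the swap cap, which is implemented as the zone.max-swap resource
control; rcapstat shows rcapd enforcing the physical cap per zone.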

-Steve L.

On Fri, Feb 20, 2009 at 08:36:20PM +0100, Alexander Skwar wrote:
 Hi!
 
 On Fri, Feb 20, 2009 at 17:50, Asif Iqbal vad...@gmail.com wrote:
 
  capped-memory:
 physical: 1G
 [swap: 512M]
 
 A question regarding this setting - does that setting really make
 sense? I suppose he tries to achieve that the zone as a max.
 uses 1G of real memory and no more than 512M of Swap.
 
 But does it really do that?
 
 Or is he rather limiting the amount of allocable mem to 512M?
 
 Alexander
 -- 
 [ Soc. = http://twitter.com/alexs77 | http://www.plurk.com/alexs77 ]
 [ Mehr = http://zyb.com/alexws77 ]
 [ Chat = Jabber: alexw...@jabber80.com | Google Talk: a.sk...@gmail.com ]
 [ Mehr = MSN: alexw...@live.de | Yahoo!: askwar | ICQ: 350677419 ]
 ___
 zones-discuss mailing list
 zones-discuss@opensolaris.org
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] Update on attach and upgrades

2008-11-06 Thread Steve Lawrence
On Thu, Nov 06, 2008 at 10:20:43AM -0500, Dr. Hung-Sheng Tsao (LaoTsao) wrote:
 
 anyone know when the brandz for s10 will be out?
 e.g. running s10 with opensolaris zone?

No target has been set for this.  We cannot reasonably manage such a
project until the rate of change in s10 slows down.  The current understanding
is that the need for such a feature will coincide with the release of an
enterprise version of opensolaris, or an early update (6 months?) to an
enterprise opensolaris-based release.

-Steve L.

 
 
 Jerry Jelinek wrote:
 Mike Gerdts wrote:
   
 On Thu, Nov 6, 2008 at 8:16 AM, Jerry Jelinek [EMAIL PROTECTED] 
 wrote:
 
 Henrik Johansson wrote:
   
 The easiest way would probably be to identify packages that are not to
 be updated, in my experience packages do not differ that much between
 local zones in production environments, but that is only based on the
 system I have worked with. I always keep zones as similar as possible,
 but full zones still leaves the possibility to make some changes to
 the packages and patches in case its necessary.
 
 Unfortunately we have no way to know which pkgs you deliberately
 want to be different between the global and non-global zone and
 which you want to be in sync.  Thats why a list where the user
 could control that would be needed.
   
 Isn't that the purpose of pkgadd -G?
 
  -G  Add package(s) in  the  current  zone  only.
  When used in the global zone, the package is
  added to the global zone  only  and  is  not
  propagated  to  any  existing  or yet-to-be-
  created non-global  zone.  When  used  in  a
  non-global zone, the package(s) are added to
  the non-global zone only.
 
  This option causes package  installation  to
  fail  if, in the pkginfo file for a package,
  SUNW_PKG_ALLZONES  is  set  to   true.   See
  pkginfo(4).
 
 A package added to the global zone with pkgadd -G should not be
 upgraded in the non-global zone.
 
 
 The problem is when you look at a zone, how do you know what
 to sync with the global zone?  For example, if you have a
 whole-root zone, that means you've explicitly decided you want
 the ability to manage pkgs in /usr, etc. independently of the
 global zone.   With a true upgrade, those pkgs that are part of
 the release are upgraded anyway.  What do we do with a zone
 migration?   What pkg metadata do we have inside the zone to tell
 us which pkgs to sync and which not to?
 
 Jerry
 ___
 zones-discuss mailing list
 zones-discuss@opensolaris.org
   


 ___
 zones-discuss mailing list
 zones-discuss@opensolaris.org

___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] Confirming Zone running Container

2008-10-02 Thread Steve Lawrence
 The other way that the global zone identity normally leaks through to the 
 non-global zones is through the system's hostid.  So if you compare the 
 output of `/usr/bin/hostid` with `for e in $allglobalzones ; do ssh $e 
 /usr/bin/hostid ; done`, you can easily see which global zone matches your 
 local.
 
 That's also a way for your application administrators (using 
 application-level clustering) to verify that they are not running on the same 
 physical node.  If their hostids are different, they're different.
 

This is not always reliable.  S8 and S9 branded zones have configurable
hostids, and in the future this is likely to be available for native zones.

-Steve L.

___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] Oracle/RMAN in a Zone

2008-09-22 Thread Steve Lawrence
I think you need this or later:


http://www-01.ibm.com/support/docview.wss?rs=666context=SSTFZRuid=swg21254543loc=en_UScs=utf-8lang=en

Some ibm docs:

http://publib.boulder.ibm.com/infocenter/tivihelp/v1r1/index.jsp?topic=/com.ibm.itsmreadme.doc/readme_server541.html

http://publib.boulder.ibm.com/infocenter/tivihelp/v1r1/index.jsp?topic=/com.ibm.itsmreadme.doc/readme_server541.html

http://publib.boulder.ibm.com/infocenter/tivihelp/v1r1/index.jsp?topic=/com.ibm.itsmreadme.doc/readme_server541.html

I'm not sure if the tivoli device driver protects against multiple zones
accessing the same tape device at the same time, or if that is up to the
admin to enforce.

-Steve L.


On Mon, Sep 22, 2008 at 12:57:54PM -0700, Arif Khan wrote:
 Is anyone out there. Can someone please point me in the right  
 direction, any other alias that I should post this to ?
 
 Thanks
 
 Arif
 
 On Sep 19, 2008, at 10:01 AM, Arif Khan wrote:
 
  Hi,
  Please let me know if there is another alias that I should post this  
  to.
  (already tried [EMAIL PROTECTED] and it doesn't exist )
 
  My Customer is trying to test LAN-Free backups using Tivoli Data
  Protection (TDP) for Oracle/RMAN in a zone.
 
  First, Is this supported by Sun and Oracle and secondly, do we have  
  any
  documentation on setting this up ?
 
  Thanks
 
  Arif
 
 
 
 
 ___
 zones-discuss mailing list
 zones-discuss@opensolaris.org
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] zones/SNAP design

2008-08-25 Thread Steve Lawrence
 During the zone installation and after the zone is installed, the zone's ZBE1
 dataset is explicitly mounted by the global zone onto the zone root (note, the
 dataset is a ZFS legacy mount so zones infrastructure itself must manage the
 mounting.  It uses the dataset properties to determine which dataset to
 mount, as described below.): e.g.
 
  # mount -f zfs rpool/export/zones/z1/rpool/ZBE1 /export/zones/z1/root
 
 The rpool dataset (and by default, its child datasets) will be implicitly
 delegated to the zone.  That is, the zonecfg for the zone does not need to
 explicitly mention this as a delegated dataset.  The zones code must be
 enhanced to delegate this automatically:

Is there any requirement to have a flag to disallow a zone from doing zfs/BE
operations?  I'm not sure when an admin may want to make this restriction.

 
  rpool/export/zones/z1/rpool
 
 Once the zone is booted, running a sw management operation within the zone
 does the equivalent of the following sequence of commands:
 1) Create the snapshot and clone
  # zfs snapshot rpool/export/zones/z1/rpool/[EMAIL PROTECTED]
  # zfs clone rpool/export/zones/z1/rpool/[EMAIL PROTECTED] \
rpool/export/zones/z1/rpool/ZBE2
 2) Mount the clone and install sw into ZBE2
  # mount -f zfs rpool/export/zones/z1/rpool/ZBE2 /a
 3) Install sw
 4) Finish
  # unmount /a
 
 Within the zone, the admin then makes the new BE active by the equivalent of
 the following sequence of commands:
 
  # zfs set org.opensolaris.libbe:active=off 
 rpool/export/zones/z1/rpool/ZBE1
  # zfs set org.opensolaris.libbe:active=on 
 rpool/export/zones/z1/rpool/ZBE2
 
 Note that these commands will not need to be explicitly performed by the
 zone admin.  Instead, a utility such as beadm does this work (see issue #2).

Inside a zone, beadm should fix this.

From the global zone, beadm should be able to fix a (halted?) zone in this
state so that it may be booted.

I think this means that the global zone should be able to do some explicit
beadm operations on a zone (perhaps only when it is halted?), in addition
to the automatic ones that happen when the GBE is manipulated.

 
 When the zone boots, the zones infrastructure code in the global zone will 
 look
 for the zone's dataset that has the org.opensolaris.libbe:active property 
 set
 to on and explicitly mount it on the zone root, as with the following
 commands to mount the new BE based on the sw management task just performed
 within the zone:
 
 # umount /export/zones/z1/root
 # mount -f zfs rpool/export/zones/z1/rpool/ZBE2 /export/zones/z1/root
 
 Note that the global zone is still running GBE1 but the non-global zone is
 now using its own ZBE2.
 
 If there is more than one dataset with a matching
 org.opensolaris.libbe:parentbe property and the
 org.opensolaris.libbe:active property set to on, the zone won't boot.
 Likewise, if none of the datasets have this property set.
 
 When global zone sw management takes place, the following will happen.
 
 Only the active zone BE will be cloned.  This is the equivalent of the
 following commands:
 
  # zfs snapshot -r rpool/export/zones/z1/[EMAIL PROTECTED]
  # zfs clone rpool/export/zones/z1/[EMAIL PROTECTED] 
 rpool/export/zones/z1/ZBE3
 
 (Note that this is using the zone's ZBE2 dataset created in the previous
 example to create a zone ZBE3 dataset, even though the global zone is
 going from GBE1 to GBE2.)
 
 When global zone BE is activated and the system reboots, the zone root must
 be explicitly mounted by the zones code:
 
  # mount -f zfs rpool/export/zones/z1/rpool/ZBE3 /export/zones/z1/root
 
 Note that the global zone and non-global zone BE names move along 
 independently
 as sw management operations are performed in the global and non-global
 zone and the different BEs are activated, again by the global and non-global
 zone.
 
 One concern with this design is that the zone has access to its datasets that
 correspond to a global zone BE which is not active.  The zone admin could
 delete the zone's inactive BE datasets which are associated with a non-active
 global zone BE, causing the zone to be unusable if the global zone boots back
 to an earlier global BE.
 
 One solution is for the global zone to turn off the zoned property on
 the datasets that correspond to a non-active global zone BE.  However, there
 seems to be a bug in ZFS, since these datasets can still be mounted within
 the zone.  This is being looked at by the ZFS team.  If necessary, we can work
 around this by using a combination of a mountpoint along with turning off
 the canmount property, although a ZFS fix is the preferred solution.
 
 Another concern is that the zone must be able to promote one of its datasets
 that is associated with a non-active global zone BE.  This can occur if the
 global zone boots back to one of its earlier BEs.  This would then cause an
 earlier non-global zone BE to become the active BE for that zone.  If the zone
 then 

Re: [zones-discuss] [smf-discuss] 6725004 - installing single-user-mode patches automatically

2008-08-21 Thread Steve Lawrence
On Thu, Aug 21, 2008 at 12:54:14PM -0700, Jordan Brown wrote:
 [ Which brain-dead mail client turns all of the spaces in the Subject 
 into tabs? ]
 
 Zones folks:  the current proposed answers to this problem involve 
 moving system/filesystem/local into milestone/single-user.  That was 
 apparently considered and rejected as the answer for the patchadd 
 problem that resulted in the fix that brought us here.  Can you offer 
 any insight into why that change was rejected?

I assume you are targeting this change for s10.

The single-user milestone is intended to mimic the traditional unix
run-level 1 (S?)  This is typically where an admin would run stuff like
fsck (on filesystems that are not yet mounted).  I don't think it is ok
to change this behavior in a patch.

I'm not sure I understand all the details of the problem you are trying
to solve.  For example, I thought it was desired that the patch service
run during a boot to all, but then I saw following mail stating that
the patch service should not run in this case, and something about the
user explicitly booting to single user.  I don't think I know what the
use cases are.

You may want to draft a brief ARC fasttrack describing the desired behavior(s),
and the issues, and perhaps proposed solutions.  Getting it all on one page
will facilitate a solution.

-Steve L.

 ___
 zones-discuss mailing list
 zones-discuss@opensolaris.org
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] [smf-discuss] 6725004 - installing single-user-mode patches automatically

2008-08-21 Thread Steve Lawrence
 The list of use cases is really pretty simple:
 
 1)  Administrator has in hand a patch that says install in single user 
 mode.  What does this administrator do?  The answer seems self-evident: 
   take the system to single-user mode (either by booting the system in 
 single-user mode using boot -s or boot -m milestone/single-user, or 
 dropping the system to single-user mode using init s or svcadm 
 milestone milestone/single-user) and install the patch using patchadd.
 
 2)  An automated tool has in hand a patch that says install in single 
 user mode.  What does it do?

How about:

A.  Make patchadd verify that the system is in the single-user milestone when
installing a single-user patch (a rough sketch of such a check appears below).

B.  If patchadd discovers that it needs to patch a zone, patchadd should first
make sure the zone's zonepath is properly mounted.  An overkill for this
could be to issue a svcadm enable -srt filesystem/local IF patchadd is
not being run from the context of an SMF service, otherwise, fail.
(sorry, no patchadd from smf services or rc*.d scripts).

An alternate solution is to fail patchadd with a message stating that
filesystem/local must be enabled to install the patch due to the
installed zones.  The admin could then do as instructed.

C.  (2) above will need to somehow set the milestone to single-user, wait
until single user is reached, and then do the patchadd, which will do
A and B.  This automated tool could also do the:

svcadm enable -rt filesystem/local

If B fails due to the alternate solution.  The automation tool could also
enable filesystem/local in cases where the patchadd version on the system does
not have this functionality.  For simplicity, perhaps just always
enable filesystem/local in the automation tool after single-user is
reached.

I think to implement (2), at some point you are going to need to fork off
some asynchronous process which changes the milestone, waits, and then
adds the patch, potentially also enabling filesystem/local before patching
if needed (or just always).
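
A rough sketch of the check in (A), as a Bourne shell fragment (this is an
assumption about how patchadd might test for single-user, not existing
patchadd behavior):

state=`svcs -H -o state milestone/multi-user 2>/dev/null`
if [ "$state" = "online" ]; then
        echo "single-user patch: bring the system to the single-user milestone first" >&2
        exit 1
fi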

-Steve L.

 
 It is when we start to look at solutions that the problem becomes more 
 difficult.
 ___
 zones-discuss mailing list
 zones-discuss@opensolaris.org
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] [smf-discuss] 6725004 - installing single-user-mode patches automatically

2008-08-21 Thread Steve Lawrence
On Thu, Aug 21, 2008 at 04:01:43PM -0700, Jordan Brown wrote:
 Steve Lawrence wrote:
  A.  Make patchadd verify that the system is in single user milestone when
  installing a single-user patch.
 
 That's a non-starter.  *Many* of our customers ignore our recommendation 
 to install patches in single-user mode, and will revolt if we attempt to 
 enforce it.
 
 In addition, for many patches the single-user mode recommendation is 
 only the first approximation, primarily intended for automata.  If a 
 human is installing the patch, it may be acceptable to install it after 
 manually shutting down the affected services.

That seems completely unsupportable, but ok.  Admins are used to getting
away with not following the patch README or suggested procedure. (A)
could be dropped without much impact to the solution.

 
  B.  If patchadd discovers that it needs to patch a zone, patchadd should 
  first
  make sure the zone's zonepath is properly mounted.  An overkill for this
  could be to issue a svcadm enable -srt fileystem/local IF patchadd is
  not being run from the context of an SMF service, otherwise, fail.
  (sorry, no patchadd from smf services or rc*.d scripts).
 
 No patchadd from smf services or rc*.d scripts means no automated 
 installation of single-user patches.  That's a non-starter.

Your final comment addresses this.  Post filesystem/local, it is safe for
SMF services to call patchadd.

 
  An alternate solution is to fail patchadd with a message stating that
  filesystem/local must be enabled to install the patch due to the
  installed zones.  The admin could then do as instructed.
 
 Also a killer for automated installation.
 
  C.  (2) above will need to somehow set the milestone to single-user, wait
  until single user is reached, and then do the patchadd, which will do
  A and B.  This automated tool could also do the:
  
  svcadm enable -rt fileystem/local
  
  If B fails do to the alternate solution.  The automation tool could also
  enable filesystem/local in cases where the patchadd version the system 
  does
  not have this functionality.  For simplicity, perhaps just always
  enable filesystem/local in the automation tool after single-user is
  reached.
  
  I think to implement (2), at some point you are going to need to fork off
  some asyncronous process which changes the milestone, waits, and then
  addes the patch, potentially also enabling filesystem/local before patching
  if needed (or just always).
 
 I'm not happy with doing this stuff outside the bounds of SMF, or with 
 approaches where the user is offered a single-user login while the 
 automated tools are installing patches in the background and will 
 asynchronously reboot the system.  I don't think either is necessary.

Call this requirement (no login prompt) out in your use case.  I assume
the patch service will patch, set the boot milestone, and reboot before
the patch milestone is actually met, avoiding the maint prompt.

Definitely get some console messages out of the patch-service so folks don't
think their boot is hung and freak out. :)


 
 My favorite approach is, approximately:
 
 1)  Move system/filesystem/local into milestone/single-user.
 
   Note that this alone addresses the issues for interactive
   patchadd.
 
 2)  Define milestone/patching, dependent on milestone/single-user.
 3)  Define new a new patching service (or services), dependent on 
 milestone/single-user and depended on by milestone/patching.
 4)  When patch automation needs to install a single-user patch, have it 
 boot the system to milestone/patching.

Or just set the boot milestone if patching is deferred to the next reboot.

 5)  When the patch services are done with their work, have them let the 
 system come up to its default milestone, or reboot it to its default 
 milestone, as required.
 
 There are approximately two tricky parts to this puzzle:
 a)  How does the patch tool reboot the system to milestone/patching?  It 
 could use reboot -- -m milestone/patching, but that would mean that 
 the patching work wouldn't get done if the reboot was done through other 
 mechanisms.  It could set the system default milestone, but then how 
 would it determine the milestone to set the system back to when patching 
 was complete?  Neither answer is pretty, but either is workable.

I'm sure you could write the old boot milestone down somewhere.

If the admin modifies the milestone after the patch tool sets it
to milestone-patching, the patch-on-next-boot will just get clobbered.
I suppose the patch-service could then re-instate it on the next boot, and
hope to get it on the subsequent boot. 

I suppose being inside the bounds of SMF also makes the implementation
vulnerable to other admins manipulating SMF.  The above issue is basically
by design.

 b)  How do the patching services *avoid* running when the system is 
 coming up normally - even if they have work to do?  Probably

Re: [zones-discuss] [smf-discuss] 6725004 - installing single-user-mode patches automatically

2008-08-19 Thread Steve Lawrence
So you want to be able to interrupt any boot to any milestone, and instead do
the patch processing if a patch is pending.  You basically want to interrupt
the current milestone, and instead just boot to filesystem-local and do the
patching.

The question is, can the smf milestone be changed mid-milestone?

My test shows that it can.  How about:

1. Create patch-test-service, on which single-user depends.  This will
   svcadm milestone patch-install-milestone if a patch needs to be
   installed.  This service is always enabled.

2. Create patch-install-milestone, which depends on patch-install-service
   below.
   
3. Create patch-install service, which depends on:
single-user
filesystem-local

   This service is always enabled. It will install a patch if it is pending,
   otherwise, do nothing.  If the service fails, it might need to:

# svcadm milestone single-user

   So that a maintenance prompt will appear on the console.  This might
   not be necessary.  You might get this anyway, as console-login is not
   reached.

It should be ok to issue smf commands from an smf service, as long as they
do not try to do any synchronous operations (-s).
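
As a sketch, the start method of the proposed patch-test service could boil
down to something like this (the marker file and the patch-install milestone
FMRI are hypothetical, and startd would have to be extended to accept the new
milestone):

#!/bin/sh
# Redirect this boot to the patch-install milestone if a patch is pending.
if [ -f /var/run/patch_pending ]; then
        /usr/sbin/svcadm milestone svc:/milestone/patch-install:default
fi
exit 0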

This approach is also good because an explicit boot to single user WILL NOT
attempt to install pending patches.

Disabling the patch-test and patch-install services will disable the
automatic installation of pending patches on reboot.

Thoughts?

-Steve L.

On Tue, Aug 19, 2008 at 01:03:55PM -0700, Jordan Brown wrote:
 Bob Netherton wrote:
  And further
  refinement would only impact patching rather than the booting process
  as a whole.
 
 Hmm.  I don't know how to have a service that runs when a particular 
 milestone is selected, that *doesn't* run when all is selected. 
 (Other than by dynamically enabling and disabling it.)
 
  rc scripts doing things with SMF seem a permanent solution to a
  temporary problem.  In my virtual universe there are no rc scripts :)
  And then the alarm clock goes off and I return to reality.  But it does
  promote rc hackery rather than fixing the problem in SMF where it
  belongs.
 
 Agreed.  Besides, I believe that SMF is locked while rc scripts are 
 running, and that any attempt to manipulate it deadlocks.
 
 There are related schemes that could work, but the problem is getting 
 them properly sequenced into system startup.
 
  reboot -- -m milestone=patchinstall seems elegantly simple.
 
 Plausible, though it doesn't exactly fit the current application usage 
 model.  At the moment, the reboot might or might not be triggered by the 
 patching application.  The patching application leaves the system set up 
 to do the patching at the next shutdown/reboot, whenever that might be. 
   (For SunUC-S's shutdown-time processing, it's require that that reboot 
 be via the clean mechanisms - init, shutdown - so that the processing 
 gets done.)
 
 This scheme would require either
 
 1)  having the patch application set the default milestone, and then 
 having the startup-time processing set it back, or
 2)  having the patch application do the reboot.
 
 There's still the issue with how to keep this service from running when 
 you boot to all.
 
 Hmm.  How does single-user-mode login work?  What stops it from running 
 on a normal boot?  Is it a special case?
 
 ---
 
 BTW:  I'm not in a position to commit the patch applications.  I'm in 
 the middle here because I'm relatively familiar with all of the players 
 and the issues, but in my day job I'm not responsible for *any* of them.
 ___
 zones-discuss mailing list
 zones-discuss@opensolaris.org
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] [smf-discuss] 6725004 - installing single-user-mode patches automatically

2008-08-19 Thread Steve Lawrence
 2. Create patch-install-milestone, which depends on patch-install-service
below.

The patch-install-milestone could also depend on single-user and
filesystem-local so that it is generally useful for admins manually
installing patches as well, even if they don't have the patch-test
and patch-install services, but want to safely install patches
manually when they have zones on local filesystems.

-Steve L.

 
 On Tue, Aug 19, 2008 at 01:03:55PM -0700, Jordan Brown wrote:
  Bob Netherton wrote:
   And further
   refinement would only impact patching rather than the booting process
   as a whole.
  
  Hmm.  I don't know how to have a service that runs when a particular 
  milestone is selected, that *doesn't* run when all is selected. 
  (Other than by dynamically enabling and disabling it.)
  
   rc scripts doing things with SMF seem a permanent solution to a
   temporary problem.  In my virtual universe there are no rc scripts :)
   And then the alarm clock goes off and I return to reality.  But it does
   promote rc hackery rather than fixing the problem in SMF where it
   belongs.
  
  Agreed.  Besides, I believe that SMF is locked while rc scripts are 
  running, and that any attempt to manipulate it deadlocks.
  
  There are related schemes that could work, but the problem is getting 
  them properly sequenced into system startup.
  
   reboot -- -m milestone=patchinstall seems elegantly simple.
  
  Plausible, though it doesn't exactly fit the current application usage 
  model.  At the moment, the reboot might or might not be triggered by the 
  patching application.  The patching application leaves the system set up 
  to do the patching at the next shutdown/reboot, whenever that might be. 
(For SunUC-S's shutdown-time processing, it's require that that reboot 
  be via the clean mechanisms - init, shutdown - so that the processing 
  gets done.)
  
  This scheme would require either
  
  1)  having the patch application set the default milestone, and then 
  having the startup-time processing set it back, or
  2)  having the patch application do the reboot.
  
  There's still the issue with how to keep this service from running when 
  you boot to all.
  
  Hmm.  How does single-user-mode login work?  What stops it from running 
  on a normal boot?  Is it a special case?
  
  ---
  
  BTW:  I'm not in a position to commit the patch applications.  I'm in 
  the middle here because I'm relatively familiar with all of the players 
  and the issues, but in my day job I'm not responsible for *any* of them.
  ___
  zones-discuss mailing list
  zones-discuss@opensolaris.org
 ___
 zones-discuss mailing list
 zones-discuss@opensolaris.org
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] how to configure port listening in a zone?

2008-08-19 Thread Steve Lawrence
By default, a zone does not have privilege to snoop:

http://blogs.sun.com/JeffV/entry/snoop_zoney_zone

Could just be a network config/routing issue.  Can you ping 10.5.185.103?
Can you access other network services, like ssh?
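
A few quick checks along those lines (the address and port are taken from
your mail).  From another host:

# ping 10.5.185.103
# telnet 10.5.185.103 8686

and from inside the zone, to confirm something is listening on 8686:

# netstat -an | grep 8686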

-Steve L.

On Tue, Aug 19, 2008 at 03:01:37PM -0700, Russ Petruzzelli wrote:
I have a glassfish webserver running in my Solaris 10 zone.
It will not respond to remote jmxrmi requests on port 8686.
If I connect locally with (for instance) jconsole it works fine.
 
I believe it is somehow related to being in a zone.  Because
snoop -p 8686 from my zone...
tells me, no network interface devices found.
 
  [EMAIL PROTECTED]: ifconfig -a
  lo0:1: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu
  8232 index 1
  inet 127.0.0.1 netmask ff000000
  bge0:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500
  index 2
  inet 10.5.185.103 netmask fffffc00 broadcast 10.5.187.255
 
Is there something I can configure to do this?
 
Thanks,
Russ

 ___
 zones-discuss mailing list
 zones-discuss@opensolaris.org

___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] [smf-discuss] 6725004 - installing single-user-mode patches automatically

2008-08-19 Thread Steve Lawrence
   The only way that you can get *that* guarantee is by using the 
 milestone mechanism to limit the system to a particular milestone, as 
 you suggest.
 
 In fact, argh.  This problem affects even your proposed scheme.  By the 
 time that your patch-test-service is running, there could (in theory) be 
 all kinds of services running that didn't happen to depend on anything. 
   Maybe in practice we could ignore that possibility, but it's still 
 bothersome.
 
 Argh.  Not quite back to Square One, but that certainly tosses a wrench 
 into most of my theories on how to solve this problem.

Argh again.  Currently startd hard codes the allowable milestones.  My
proposal would require patching startd :(

static int
dgraph_set_milestone(const char *fmri, scf_handle_t *h, boolean_t norepository)
{
        const char *cfmri, *fs;
        graph_vertex_t *nm, *v;
        int ret = 0, r;
        scf_instance_t *inst;
        boolean_t isall, isnone, rebound = B_FALSE;

        /* Validate fmri */
        isall = (strcmp(fmri, "all") == 0);
        isnone = (strcmp(fmri, "none") == 0);

        if (!isall && !isnone) {
                if (fmri_canonify(fmri, (char **)&cfmri, B_FALSE) == EINVAL)
                        goto reject;

                if (strcmp(cfmri, single_user_fmri) != 0 &&
                    strcmp(cfmri, multi_user_fmri) != 0 &&
                    strcmp(cfmri, multi_user_svr_fmri) != 0) {
                        startd_free((void *)cfmri, max_scf_fmri_size);
reject:
                        log_framework(LOG_WARNING,
                            "Rejecting request for invalid milestone \"%s\".\n",
                            fmri);
                        return (EINVAL);
                }
        }

-Steve L.

 ___
 zones-discuss mailing list
 zones-discuss@opensolaris.org
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] [osol-discuss] solaris 8 container installation failed

2008-08-08 Thread Steve Lawrence
You could try:

  s8 cpio patch:112097-08
  s8 compress patch:108823-02
  s8 flar patch:109318-39 (requires some other patches)

You could be hitting 4384301, fixed in 109318-12, which was obsoleted
by the flar patch above.
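
To see which revision the source system already has, something like

# showrev -p | grep 109318

run on the Solaris 8 host will show the installed revision of the flar patch.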

-Steve L.

On Fri, Aug 08, 2008 at 04:58:40PM -0400, Asif Iqbal wrote:
 On Fri, Aug 8, 2008 at 2:13 PM, Edward Pilatowicz
 [EMAIL PROTECTED] wrote:
  [ changing the cc to zones-discuss which is a more appropriate alias for
   this kind of question. ]
 
  hey asif,
 
  i'm guessing that cpio is not your problem.  the key error below is:
 uncompress: stdin: corrupt input
 
  perhaps the s8_install script is somehow not detecting the compression
  type correctly?  could you run the following command:
 grep files_compressed_method /engr.flar
 
 bash-3.00# grep files_compressed_method /engr.flar
 files_compressed_method=compress
 
 
 bash-3.00# flar info  /engr.flar
 archive_id=d37bfbb8e8f8940362746dfc5af2fbed
 files_archived_method=cpio
 creation_date=20080807215755
 creation_master=engr-01
 content_name=engr
 creation_node=engr-01
 creation_hardware_class=sun4u
 creation_platform=SUNW,UltraSPARC-IIi-cEngine
 creation_processor=sparc
 creation_release=5.8
 creation_os_name=SunOS
 creation_os_version=Generic_117350-39
 files_compressed_method=compress
 content_architectures=sun4u
 
 
  thanks
  ed
 
  On Fri, Aug 08, 2008 at 12:24:25AM -0400, Asif Iqbal wrote:
  I am failing to install flash archive of a solaris 8 02/04 running on
  a netra t1. I can install the example solaris8-image.flar (optional
  download from sun).
 
  Here is the failed log
 
  bash-3.00# more /var/tmp/engr-01.install.7078.log
  [Thu Aug  7 23:59:44 EDT 2008]   Log File: 
  /var/tmp/engr-01.install.7078.log
  [Thu Aug  7 23:59:44 EDT 2008]Product: Solaris 8 Containers 1.0
  [Thu Aug  7 23:59:44 EDT 2008]  Installer: solaris8 brand installer 
  1.22
  [Thu Aug  7 23:59:44 EDT 2008]   Zone: engr-01
  [Thu Aug  7 23:59:44 EDT 2008]   Path: /zones/solaris8
  [Thu Aug  7 23:59:44 EDT 2008] Starting pre-installation tasks.
  [Thu Aug  7 23:59:44 EDT 2008] Installation started for zone engr-01
  [Thu Aug  7 23:59:44 EDT 2008] Source: /engr.flar
  [Thu Aug  7 23:59:44 EDT 2008] Media Type: flash archive
  [Thu Aug  7 23:59:44 EDT 2008] Installing: This may take several 
  minutes...
  [Thu Aug  7 23:59:44 EDT 2008] cd /zones/solaris8/root 
  [Thu Aug  7 23:59:44 EDT 2008] do_flar  /engr.flar
  uncompress: stdin: corrupt input
  [Fri Aug  8 00:01:31 EDT 2008] Postprocessing: This may take several 
  minutes...
  [Fri Aug  8 00:01:31 EDT 2008] running: p2v  -u eug-engr-01
  [Fri Aug  8 00:01:31 EDT 2008]Postprocess: Gathering information
  about zone engr-01
  [Fri Aug  8 00:01:31 EDT 2008]Postprocess: Creating mount points
  touch: /zones/solaris8/root/etc/mnttab cannot create
  chmod: WARNING: can't access /zones/solaris8/root/etc/mnttab
  [Fri Aug  8 00:01:31 EDT 2008]Postprocess: Processing /etc/system
  cp: cannot access /zones/solaris8/root/etc/system
  [Fri Aug  8 00:01:32 EDT 2008] Result: Postprocessing failed.
  [Fri Aug  8 00:01:32 EDT 2008]
  [Fri Aug  8 00:01:32 EDT 2008] Result: *** Installation FAILED ***
  [Fri Aug  8 00:01:32 EDT 2008]   Log File: 
  /var/tmp/engr-01.install.7078.log
 
 
 
  Here is the good log where installing from the example solaris 8 image
  from SUN site that comes as optional download with solaris 8 container
  software
 
  bash-3.00# more /var/tmp/engr-01.install.5279.log
  [Thu Aug  7 23:31:12 EDT 2008]   Log File: 
  /var/tmp/engr-01.install.5279.log
  [Thu Aug  7 23:31:12 EDT 2008]Product: Solaris 8 Containers 1.0
  [Thu Aug  7 23:31:12 EDT 2008]  Installer: solaris8 brand installer 
  1.22
  [Thu Aug  7 23:31:12 EDT 2008]   Zone: engr-01
  [Thu Aug  7 23:31:12 EDT 2008]   Path: /zones/solaris8
  [Thu Aug  7 23:31:12 EDT 2008] Starting pre-installation tasks.
  [Thu Aug  7 23:31:12 EDT 2008] Installation started for zone engr-01
  [Thu Aug  7 23:31:12 EDT 2008] Source: /tmp/solaris8-image.flar
  [Thu Aug  7 23:31:12 EDT 2008] Media Type: flash archive
  [Thu Aug  7 23:31:13 EDT 2008] Installing: This may take several 
  minutes...
  [Thu Aug  7 23:31:13 EDT 2008] cd /zones/solaris8/root 
  [Thu Aug  7 23:31:13 EDT 2008] do_flar  /tmp/solaris8-image.flar
  [Thu Aug  7 23:35:28 EDT 2008]   Sanity Check: Passed.  Looks like a
  Solaris 8 system.
  [Thu Aug  7 23:35:28 EDT 2008] Postprocessing: This may take several 
  minutes...
  [Thu Aug  7 23:35:28 EDT 2008] running: p2v  -u eug-engr-01
  [Thu Aug  7 23:35:28 EDT 2008]Postprocess: Gathering information
  about zone eug-engr-01
  [Thu Aug  7 23:35:29 EDT 2008]Postprocess: Creating mount points
  [Thu Aug  7 23:35:30 EDT 2008]Postprocess: Processing /etc/system
  [Thu Aug  7 23:35:30 EDT 2008]Postprocess: Booting zone to single user 
  mode
  [Thu Aug  7 

Re: [zones-discuss] mongrel rails in a zone

2008-07-13 Thread Steve Lawrence
My guess is that your zones lack /var/ruby.*

Did you install ruby+friends in the global zone using packages, from
a tar file, or from source compilation?  A package install from the global
zone should install the package contents into all zones, properly handling
/usr versus /var.

If your means of installation insists on writing to /usr/ruby, then you
could create a writable /usr/ruby filesystem (using zonecfg add fs) so
that you can install ruby into every zone.  Adding a lofs filesystem that
maps to a directory in the global zone is straightforward.
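
For example, the following loops a writable global-zone directory back into
the zone at /usr/ruby (the zone name and the global-zone path are only
placeholders):

# zonecfg -z myzone
zonecfg:myzone> add fs
zonecfg:myzone:fs> set dir=/usr/ruby
zonecfg:myzone:fs> set special=/export/ruby-myzone
zonecfg:myzone:fs> set type=lofs
zonecfg:myzone:fs> end
zonecfg:myzone> commit
zonecfg:myzone> exit

Reboot the zone afterwards so the new filesystem gets mounted.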

-Steve L.

On Sat, Jul 12, 2008 at 08:10:24PM +0100, Matt Harrison wrote:
 Hello,
 
 I've got an opensolaris box running several websites from mongrel 
 clusters, and I'd like to have each site's cluster running in an 
 isolated zone.
 
 I've been through the administrative guides and FAQs pertaining to zones 
 and managed to get a zone created and running without problems.
 
 The problem I'm having is that although the zone is sharing /usr and 
 /var, the zone cannot use any of the ruby gems that I have installed.
 
 For example I've installed mongrel, rails, capistrano and all 
 dependencies into the global zone and they work perfectly.
 
 Unfortunately when I try to use these gems from within the non-global 
 zone I get gem not found errors like so:
 
 $ mongrel_rails cluster::start
 /usr/ruby/1.8/lib/ruby/site_ruby/1.8/rubygems.rb:304:in 
 `report_activate_error': Could not find RubyGem mongrel ( 0) 
 (Gem::LoadError)
 []
 
 Of course I can't re-install the gems from the zone as /usr and /var are 
 not writable from there.
 
 Does anyone have any experience with running ruby gems from zones and 
 might be able to give me some guidance with this?
 
 
 Many Thanks
 
 Matt
 ___
 zones-discuss mailing list
 zones-discuss@opensolaris.org
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] Memory Cap accounting for shared memory

2008-06-18 Thread Steve Lawrence
As of s10u4 (and nevada build 56?), rcapd (and prstat -ZJTta) account for
shared memory (sysV, anon, and text) between processes in the same zone,
project, etc.
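
The per-zone view that rcapd acts on can be watched with, for example:

# rcapstat -z 5

which reports each zone's RSS against its cap at 5-second intervals.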

-Steve L.

On Wed, Jun 18, 2008 at 01:13:21PM -0500, Brian Smith wrote:
 My reading of the documentation is that if I try to cap the memory of a
 zone, that cap applies to the total RSS, calculated by summing the RSS of
 each process in the zone. According to the documentation, there is not any
 accounting for processes sharing memory. For example, if I have 10 processes
 with private RSS of 2MB each and shared RSS of 100MB, then the zone would be
 considered to be using 10*(2+100)=1,020MB of memory, and not 100+2*10=120MB
 of memory. Restricting the zone to 500MB would make it practically unusable
 even though only 120MB of memory is being used. Is that correct? Is there
 any way to cause rcapd to take shared memory into account when deciding
 which zones to start swapping to disk?
 
 Regards,
 Brian
 
 ___
 zones-discuss mailing list
 zones-discuss@opensolaris.org
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] Making zoneadm more like the other adms...

2008-06-18 Thread Steve Lawrence
Hey Darren,

Are you interested in drafting an arc fasttrack for these interface additions?
Do you see zoneadm being used as:

# zoneadm boot myzone -s

That would be:
- myzone is an operand to zoneadm that comes after the subcommand.

This is not compliant with getopt or clip guidelines.  You may
want to review the info here:

http://opensolaris.org/os/community/arc/caselog/2006/062/spec-clip-html/

I think the usability issue is with the zoneadm syntax:

# command -opt optarg subcommand ...

The above CLIP guidelines make this comment:

Q: I'd like to have options before my subcommands. This makes sense because
   some options apply to all operations.

A: This often makes sense from an engineering perspective, but our
usability data says most users don't understand the system model well
enough to be able to predict whether the option should go before or after   
the subcommand.

From this you could argue that the current zoneadm command is getopt compliant,
but not fully CLIP compliant.

I think that you are proposing that the zonename could also be a special
operand to the zoneadm command, one which comes after the subcommand.

The good commands that you are comparing zoneadm to were nice enough to not
have any options/optargs before the subcommand.

You might try running a fasttrack, arguing that the syntax:

# zoneadm boot myzone -m milestone=single-user
being:
# command subcommand operand suboptions suboperands

while conforming to no standards/guidelines, is more usable than:

# command option optarg subcommand suboptions suboperand.

I think this could be defined as:

if -z is not present, a subcommand is present, and the token after
the subcommand is not an option, then it is the operand to an
implicit -z.

This is of course not compliant with anything.  You could argue that this
is more usable anyway, or you could find a compliant solution.

-Steve L.

On Sun, Jun 15, 2008 at 06:15:46PM -0700, Darren Reed wrote:
 Tony Ambrozie wrote:
 
  Your code changes for both zoneadm and zonecfg would preserve the 
  current zonexxx -z zonename for backwards compatibility purposes, is 
  that correct?
 
 
 Correct.  There are some command line options that the changes I've
 made don't support, such as using -R.  That's quite deliberate.
 
 The aim of the changes was to address the common use cases of the
 commands and make their use more intuitive when viewed with the
 other commands in OpenSolaris.
 
 Darren
 
 
  Thank you,
 
 
  On Mon, Jun 9, 2008 at 11:51 AM, Darren Reed [EMAIL PROTECTED] 
  mailto:[EMAIL PROTECTED] wrote:
 
  Someone mentioned zonecfg was the cause of some similar awkwardness...
 
  So here's a patch attached for that.
 
  Darren
 
 
 
  --- usr/src/cmd/zonecfg/zonecfg.c ---
 
  Index: usr/src/cmd/zonecfg/zonecfg.c
  *** /biscuit/onnv/usr/src/cmd/zonecfg/zonecfg.c Mon Mar 24
  17:30:38 2008
  --- /biscuit/onnv_20080608/usr/src/cmd/zonecfg/zonecfg.c  
   Mon Jun  9 11:47:41 2008
  ***
  *** 1071,1076 
  --- 1071,1077 
 execname, cmd_to_str(CMD_HELP));
 (void) fprintf(fp, "\t%s -z <zone>\t\t\t(%s)\n",
 execname, gettext("interactive"));
  +   (void) fprintf(fp, "\t%s <command> <zone>\n",
  execname);
 (void) fprintf(fp, "\t%s -z <zone> <command>\n",
  execname);
 (void) fprintf(fp, "\t%s -z <zone> -f <command-file>\n",
 execname);
  ***
  *** 6653,6689 
 return (execbasename);
   }
 
  ! int
  ! main(int argc, char *argv[])
   {
  !   int err, arg;
  !   struct stat st;
  !
  !   /* This must be before anything goes to stdout. */
  !   setbuf(stdout, NULL);
  !
  !   saw_error = B_FALSE;
  !   cmd_file_mode = B_FALSE;
  !   execname = get_execbasename(argv[0]);
  !
  !   (void) setlocale(LC_ALL, "");
  !   (void) textdomain(TEXT_DOMAIN);
  !
  !   if (getzoneid() != GLOBAL_ZONEID) {
  !   zerr(gettext("%s can only be run from the global
  zone."),
  !   execname);
  !   exit(Z_ERR);
  !   }
  !
  !   if (argc < 2) {
  !   usage(B_FALSE, HELP_USAGE | HELP_SUBCMDS);
 exit(Z_USAGE);
 }
  !   if (strcmp(argv[1], cmd_to_str(CMD_HELP)) == 0) {
  !   (void) one_command_at_a_time(argc - 1, &(argv[1]));
  !   exit(Z_OK);
  !   }
 
 while ((arg = getopt(argc, argv, "?f:R:z:")) != EOF) {
 switch (arg) {
 case '?':
  --- 6654,6679 
 return (execbasename);
   }
 
  ! static void
  ! set_zonename(char *zonename)
   {
 

Re: [zones-discuss] S10 branded zones

2008-05-27 Thread Steve Lawrence
Are you running sparc or x86?  On x86, you can use Xen or Virtualbox today
to run s10 guests.  On sparc, you can use ldoms on sun4v.  If you indeed need
a zones-based solution, please elaborate on your requirements.

-Thanks,

-Steve L.


On Sun, May 25, 2008 at 10:41:29AM +1000, Rodney Lindner - SCT wrote:
 Hi all,
 has there been any thought of a S10 branded zone running under NV.
 
 Most of my servers run NV, but at times I need to test software that 
 only runs on S10. Running up a branded zone would
 make my life very simple.
 
 Regards
 Rodney
 
 -- 
 =
 Rodney Lindner
 Services Chief Technologist
 Sun Microsystems Australia
 Phone: +61 (0)2 94669674 (EXTN:59674)
 Mobile +61 (0)404 815 842
 Email: [EMAIL PROTECTED]
 =
 
 ___
 zones-discuss mailing list
 zones-discuss@opensolaris.org
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] code review: native brand refactoring

2008-05-27 Thread Steve Lawrence
Hey Jerry,

Does this address this comment in 6621020:


This appears to point out at least one bug in zlogin, namely that it
keeps stdout_pipe[1] and stderr_pipe[1] from noninteractive_login()
open when returning to the parent.


Basically, I think the filer expected to see something like:


pipe(stdout_pipe);
pipe(stderr_pipe);

if (fork() == 0) {
        /* in child, close pipe sides read by parent */
        close(stdout_pipe[0]);
        close(stderr_pipe[0]);

        ... write to std*_pipe[1] ...
        ...
        exit(..);
}
/* in parent, close pipe sides written by child */
close(stdout_pipe[1]);
close(stderr_pipe[1]);

... read from std*_pipe[0] ...
...

I think the "in child" part is handled by the closefrom() on line 1559, but
the parent does not close the sides of the pipes that the child writes to.

-Steve


On Tue, May 27, 2008 at 08:48:17AM -0600, Jerry Jelinek wrote:
 I have updated the webrev at:
 
 http://cr.opensolaris.org/~gjelinek/webrev/
 
 This includes the changes for the feedback I have
 received so far.  I also added the zlogin.c file
 to the webrev with two bug fixes.  One of these was for
 a bug I was hitting during testing of these changes
 and there is a second bug in zlogin that came in
 which I also fixed.  So, at a minimum, it would be
 good to take a look at that additional file.
 
 Thanks,
 Jerry
 ___
 zones-discuss mailing list
 zones-discuss@opensolaris.org
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] code review: native brand refactoring

2008-05-23 Thread Steve Lawrence
It seems to me that the first comment in the NOTES section of fork(2) would
only apply to vfork().

??

-Steve


On Fri, May 23, 2008 at 01:31:58PM +0200, Joerg Barfurth wrote:
 Hi,
 
 I just stumbled over this:
 
 Edward Pilatowicz wrote:
  - nit: in start_zoneadmd(), instead of:
  if ((child_pid = fork()) == -1) {
  zperror(gettext(could not fork));
  goto out;
  } else if (child_pid == 0) {
  ...
  _exit(1);
  } else {
  ...
  }
how about:
  if ((child_pid = fork()) == -1) {
  zperror(gettext(could not fork));
  goto out;
  }
  
  if (child_pid == 0) {
  ...
  _exit(1);
  }
  
  ...
  
  - nit: we have direct bindings now. :)  so why bother with:
  _exit(1)
instead just call:
  exit(1)
  
 
 Beware: _exit() is not just a linker synonym for exit(). See the exit(2) 
 man page for differences. When fork-ing without exec, calling _exit is 
 the proper way to exit; see the Notes section in the fork(2) man page.
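
A minimal sketch of the fork-without-exec pattern described above
(illustrative only, not code from the webrev):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main(void)
{
        pid_t pid;

        if ((pid = fork()) == -1) {
                perror("fork");
                return (1);
        }
        if (pid == 0) {
                /*
                 * Child that never calls exec: use _exit(2) so the child's
                 * copies of the parent's stdio buffers are not flushed a
                 * second time and atexit(3C) handlers registered by the
                 * parent do not run again.
                 */
                _exit(0);
        }
        /* Parent: a normal exit(3C) is fine here. */
        exit(0);
}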
 
 - Jörg
 
 -- 
 Joerg Barfurth   phone: +49 40 23646662 / x2
 Software Engineermailto:[EMAIL PROTECTED]
 Desktop Technology   http://reserv.ireland/twiki/bin/view/Argus/
 Thin Client Software http://www.sun.com/software/sunray/
 Sun Microsystems GmbHhttp://www.sun.com/software/javadesktopsystem/
 
 
 ___
 zones-discuss mailing list
 zones-discuss@opensolaris.org
___
zones-discuss mailing list
zones-discuss@opensolaris.org

Re: [zones-discuss] physical= not obeyed when ip-type=shared and physical dev part of IPMP group in global zone

2008-05-21 Thread Steve Lawrence
In the global zone, do you have two ip addresses (one on vnet0, one on vnet1)
or is vnet1 configured as standby?

Steve L.

On Wed, May 21, 2008 at 09:12:35AM +0100, Lewis Thompson wrote:
 On Tue, 2008-05-20 at 13:56 -0700, Steve Lawrence wrote:
  It is documented here:
  
  http://docs.sun.com/app/docs/doc/819-2450/z.admin.task-60?l=koa=viewq=multipathing
  
  The behavior you are seeing is not specifically documented, but it seems
  reasonable.  What behavior are you expecting?
 
 Hi Steve,
 
 For now I am just looking for confirmation that this is expected behaviour.
 
 However, as the customer has specifically set physical equal to the
 other interface in the IPMP group, I would expect this to be used, when
 the interface is not in a FAILED state.
 
 Thanks, Lewis
 
 ___
 zones-discuss mailing list
 zones-discuss@opensolaris.org
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] physical= not obeyed when ip-type=shared and physical dev part of IPMP group in global zone

2008-05-20 Thread Steve Lawrence
This appears to be the effect of selecting an interface that is a member
of an ipmp group in the global zone.

It is documented here:

http://docs.sun.com/app/docs/doc/819-2450/z.admin.task-60?l=koa=viewq=multipathing

The behavior you are seeing is not specifically documented, but it seems
reasonable.  What behavior are you expecting?

-Steve L.


On Tue, May 20, 2008 at 03:20:04PM +0100, Lewis Thompson wrote:
 Hi,
 
 I have a customer who has a basic IPMP config in his global zone:
 
vnet0  vnet1 [currently vnet0 has the 'floating' IP]
 
 In addition he has a zone with ip-type=shared where physical=vnet1
 
 When the zone boots the zone interface gets created on vnet0 instead of
 vnet1
 
 I have verified this behaviour on a test system and can confirm that the
 zone interface acts in the same way as the floating IP.  i.e. if I run
 'if_mpadm -d vnet0', then the zone's IP fails over to vnet1
 
 Unfortunately I can't find any documentation that discusses this
 behaviour.  Is anybody aware of any documents that explain what we are
 seeing?
 
 Thanks, Lewis
 
 ___
 zones-discuss mailing list
 zones-discuss@opensolaris.org
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] Can zone id be same on different global zones?

2008-05-16 Thread Steve Lawrence
/usr/bin/tar on solaris 10.

My comment was incorrect.  I was referring to the preservation of hard links.
I need to investigate the status of this in the various versions of tar.

Thanks,

-Steve L.

On Fri, May 16, 2008 at 10:59:10AM +0200, Joerg Schilling wrote:
 Steve Lawrence [EMAIL PROTECTED] wrote:
 
  tar is usually not the best archiving tool, as it tends to deal poorly with
  symbolic links.  There are some examples of migrating a zone using both
  ufs (via pax) and zfs (via zfs send/receive).
 
 Please elaborate what you like to point to when using the term tar?
 
 The Sun tar implementation has _security_ bugs related to archives that 
 contain 
 symbolic links but the same problem applies to the OpenGroup owned pax 
 implementation currently found at /usr/bin/pax.
 
 star on the other side has no known issues related to symbolic links.
 
 What are you talking about?
 
 Jörg
 
 -- 
  EMail:[EMAIL PROTECTED] (home) Jörg Schilling D-13353 Berlin
[EMAIL PROTECTED](uni)  
[EMAIL PROTECTED] (work) Blog: http://schily.blogspot.com/
  URL:  http://cdrecord.berlios.de/old/private/ ftp://ftp.berlios.de/pub/schily
___
zones-discuss mailing list
zones-discuss@opensolaris.org

Re: [zones-discuss] How to migrate a separate running server into a zone?

2008-05-15 Thread Steve Lawrence
Are you talking about zone migration? (attach/detach).  That should work well.
Both hosts need to be running the same version of solaris (patchlevel).  We
are planning an update-on-attach feature, which will allow a zone to be
detached, and then attached to a host running a higher patch level.

-Steve L.


On Thu, May 15, 2008 at 11:20:20AM +0300, mehmet cebeci wrote:
Hi Steve,
 
Thanks for info , i was searching for an unresolved solution all  day. In
this case my solution will be preparing the zones with the same
configuration as existing servers and copying disks.It seems the only
stable way of solving this issue.
 
Regards,
Mehmet
 
On Thu, May 15, 2008 at 6:02 AM, Steve Lawrence
[EMAIL PROTECTED] wrote:
 
  We currently don't have p2v support for native zones.  Future work for
  this
  is under consideration, but no timeframe established (yet).
 
  I could think of ways to make it work in the short term, but nothing
  that
  we could support.  It could also result in further issues later when
  patching/upgrading.
 
  -Steve L.
  On Wed, May 14, 2008 at 03:12:36PM +0300, mehmet cebeci wrote:
  Hi all,
  
   I would like to get any idea about how to migrate separate
   running
   servers on solaris 10 into a solaris zone separately? For detail;
  Our
  customer has running servers and he requests to migrate them one by
  one
  into another server with zones inside. In solaris 8 i searched that
  it may
  be done by flar,but i dont have any idea how to achieve it on
  solaris 10.
  
  Thanks for any opinion,
 
   ___
   zones-discuss mailing list
   [EMAIL PROTECTED]
 
 References
 
Visible links
1. mailto:[EMAIL PROTECTED]
2. mailto:zones-discuss@opensolaris.org
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] Dynamic dedicated-cpu

2008-04-16 Thread Steve Lawrence
zone reboot is required.

-Steve L.

On Tue, Apr 15, 2008 at 04:20:40PM +0100, Terry Smith wrote:
 Hi
 
 When adding dedicated-cpus to a zone does the configuration take effect 
 immediately or is a zone reboot required?
 
 T
 ___
 zones-discuss mailing list
 zones-discuss@opensolaris.org
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] zlogin and locales

2008-04-02 Thread Steve Lawrence
On Wed, Apr 02, 2008 at 11:11:12AM -0400, Moore, Joe wrote:
 Steve Lawrence wrote:
  Looks like the environment contained in /etc/default/init is 
  read and set
  by startd and init.  Since zlogin'ed processes are not children
  of startd or init
  in the zone, they do not have these environment settings.
  
  Given brands, to fix this, we would need to add a hook that 
  asks the zone:
  
  Please fetch me the default login environment.
 
 And hope that the zone adminstrator hasn't figured out a way to violate
 security constraints by setting malicious variables in that default
 login environment...

You mean fetch the environment, but don't set it for the zlogin process in
the global zone.  Just pass it into the zone and set it when exec'ing
the login process.

 
 Such as a specially-corrupted termcap (pushing data to the global-zone
 xterm, for example), or a locale with similar features
 
  
  It would be similar to the hook that we currently have for 
  fetching the
  passwd entry for a given user.
 
 passwd entries are fairly easy to validate.  Arbitrary environment
 variables should not be accepted from an untrusted source.
 
 --Joe
 ___
 zones-discuss mailing list
 zones-discuss@opensolaris.org
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] Capped memory observability

2007-02-06 Thread Steve Lawrence
On Mon, Feb 05, 2007 at 10:17:42PM -0600, Mike Gerdts wrote:
 I just got a chance to start playing with the capped memory resource
 controls in build 56.  At first blush, this looks to be *very* good
 stuff.  My initial testing included some very basic single process
 memory hog tests and multiple process mmap(..., MAP_SHARED,...) tests.
 In each case, the limits kicked in as I expected, and prstat -Z
 running from the global zone gave what appeared to be accurate
 information.  Great job!
 
 One of the effects of setting capped-memory resource control for swap
 is that the size of /tmp is also limited.  Unlike when a tmpfs size
 limit is set with the size=... mount option, df /tmp does not
 display a value that is reflective of the limits that are put in
 place.  Similarly, vmstat and swap -l running inside the zone give
 no indication that there is a cap smaller than the system-wide limits.
 Am I missing something here?

Swap -l shows details about swap devices, and we don't know in particular
how much each zone is using each swap device.  We could do something with
vmstat and swap -s.

We are looking to improve the general observability of resource limits
and utilization for zones.  It is a bit tricky though, as caps are not
reservations.  Since all zones share the same swap devices, mocking up size
to be equal to cap can lead to confusing output, such as used > size,
capacity > 100%, etc.  Other confusing things include the fact that your
usage can be less than your cap, but there can still be no swap available
if the swap devices are full.  The swap and vmstat commands as they currently
exist cannot express these scenarios.

The short answer is, we're working on it.

 
 I do see that some of the values I am looking for are available
 through kstat (thank you!).  Is there some more user-friendly tool
 (already or coming) to use inside the zone?

No ETA at this time.

 
 Oh, and the question that everyone at work will ask when I tell them
 about this - when will it find its way into Solaris?  :)
 
 Mike
 
 -- 
 Mike Gerdts
 http://mgerdts.blogspot.com/
 ___
 zones-discuss mailing list
 zones-discuss@opensolaris.org
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] Dynamic Resource Pools (DRP) and RCM ?

2007-01-23 Thread Steve Lawrence
The short answer is no.

When a processor is transferred from one pset to another, no RCM event is
generated in the global zone, or in any non-global zones.  RCM events are only
generated when DR operations take place.

The rcm_daemon only runs in the global zone, as part of the sysevent:default
service.

Currently, no events of any kind are generated when a processor is transferred
from one pset to another.  pool_conf_update(3pool) can be used to detect
a change in a processor set.  pool_conf_update(3pool) is not a blocking
interface, so you need to call it periodically to detect a change.
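
A minimal polling sketch (illustrative only; the exact return values and the
meaning of the "changed" bitmap are described in pool_conf_update(3POOL)):

#include <pool.h>
#include <stdio.h>
#include <unistd.h>

int
main(void)
{
        pool_conf_t *conf;
        int changed;

        if ((conf = pool_conf_alloc()) == NULL ||
            pool_conf_open(conf, pool_dynamic_location(), PO_RDONLY) !=
            PO_SUCCESS) {
                (void) fprintf(stderr, "cannot open dynamic pool config\n");
                return (1);
        }
        for (;;) {
                (void) sleep(10);       /* polling interval is arbitrary */
                changed = 0;
                if (pool_conf_update(conf, &changed) != PO_SUCCESS)
                        break;
                if (changed != 0)
                        (void) printf("pool configuration changed\n");
        }
        (void) pool_conf_close(conf);
        pool_conf_free(conf);
        return (0);
}

(Compile with -lpool.)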

-Steve L.



On Tue, Jan 23, 2007 at 01:42:08PM -0800, Gary Combs wrote:
 If a zone is bound to a resource pool with a processor set also bound 
 there and DRP is enabled, will the addition of a processor added to the 
 pset via DRP generate an insertion event via RCM? Would all zones bound 
 to the same pool see the RCM event? Would only the global zone see the 
 insert event?
 
 Thanks,
 Gary
 -- 
 http://www.sun.com/solaris  * Gary Combs *
 Technical Marketing
 
 *Sun Microsystems, Inc.*
 3295 NW 211th Terrace
 Hillsboro, OR 97124 US
 Phone x32604/+1 503 715 3517
 Cell 1-503-887-7519
 Fax 503-715-3517
 Email [EMAIL PROTECTED]
 
 Arguing with an engineer is like wrestling in mud with a pig; after a 
 while you realize the pig likes it!!
 
 http://www.sun.com/solaris
 
 

 ___
 zones-discuss mailing list
 zones-discuss@opensolaris.org

___
zones-discuss mailing list
zones-discuss@opensolaris.org


[zones-discuss] Re: Restart: PSARC/2006/598 Swap resource control; locked memory RM improvements

2006-11-14 Thread Steve Lawrence
 Good question.  These are essentially virtual system 
 requirements.
  
 
 
 What is the behaviour of Solaris intended to be when someone
 makes these changes (or attempts to make them) on a system
 that has no swap space?

All systems have reservable swap space.  Systems with no swap
devices use physical memory to back swap reservations.

 
 Furthermore, why shouldn't I be able to say a zone has no swap
 space available to it - i.e. to force it to all run from RAM?

Solaris's vm system has no such concept.  All anonymous allocations
reserve swap.  I think you are suggesting a zone switch so that an admin
can choose from one of:

A. reserve swap from disk only
B. reserve swap from memory only
C. reserve swap from disk, then memory
D. reserve swap from memory, then disk.

Currently, system behavior is C for everyone.  zone.max-swap simply
limits swap reservation.  It does not provide an interface for choosing
a swap allocation policy.  These concepts are orthogonal.  I can see
a swap sets feature addressing allocation policy, since swap sets
could be used to associate a given zone with a particular set of swap
devices.

-Steve

 
 Darren
 
___
zones-discuss mailing list
zones-discuss@opensolaris.org


[zones-discuss] Re: Restart: PSARC/2006/598 Swap resource control; locked memory RM improvements

2006-11-13 Thread Steve Lawrence
On Sat, Nov 11, 2006 at 09:02:48PM -0800, Gary Winiger wrote:
 
   First off, sorry for the stutter in the spec update mail.
 
  The project team didn't supply a summary of the changes, so I'll be
  asking for one in a follow on.
 

I've addressed your comments way below.  Here is my change summary and case
discussion summary:

SUMMARY OF CHANGES

1.  Change to the proposed uncommitted kstat names and statistics.
From the form:

zone:{zoneid}:vm

with statistics:
zonename
swap_reserved
max_swap_reserved
locked_memory
max_locked_memory

To the form:

caps:{zoneid}:swapresv_zone_{zoneid}
caps:{zoneid}:lockedmem_zone_{zoneid}
caps:{zoneid}:lockedmem_project_{projid}

with statistics:
zonename
usage
value

This sets up a generic scheme for adding kstats to project and
zone rctls.  A kstat is created per rctl, instead of per zone.
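
    For illustration, reading one of these kstats with libkstat might look
    roughly like this (a sketch only: it assumes the module/instance/name
    layout above and that the usage and value statistics are 64-bit byte
    counters):

#include <kstat.h>
#include <stdio.h>
#include <zone.h>

int
main(void)
{
        kstat_ctl_t *kc = kstat_open();
        zoneid_t zid = getzoneid();
        char name[KSTAT_STRLEN];
        kstat_t *ksp;
        kstat_named_t *usage, *value;

        if (kc == NULL)
                return (1);
        (void) snprintf(name, sizeof (name), "swapresv_zone_%d", (int)zid);
        if ((ksp = kstat_lookup(kc, "caps", (int)zid, name)) == NULL ||
            kstat_read(kc, ksp, NULL) == -1) {
                (void) fprintf(stderr, "no swap cap kstat for zone %d\n",
                    (int)zid);
                return (1);
        }
        usage = kstat_data_lookup(ksp, "usage");
        value = kstat_data_lookup(ksp, "value");
        if (usage != NULL && value != NULL)
                (void) printf("swap reserved: %llu of %llu bytes\n",
                    (unsigned long long)usage->value.ui64,
                    (unsigned long long)value->value.ui64);
        (void) kstat_close(kc);
        return (0);
}

    (Compile with -lkstat.)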

2.  Addition of zonecfg(1m) minimums for setting zone.max-swap.

When setting zone.max-swap via zonecfg(1m), a minimum value will be
enforced:

global zone: 100M
non-global zone: 50M

Currently, this is about 20M more than is needed to boot after a
default installation.

3.  Addition of zonecfg(1m) warnings when setting zone.max-swap and
zone.max-lwps on the global zone.

global:capped-memory> set swap=200M
Warning: Setting capped swap on the global zone can impact
system availability.

SUMMARY OF CASE DISCUSSION:

The case discussion has focused on the problem that the zone.max-swap
rctl on the global zone can affect system availability.  An identical
problem exists today with task/project/zone.max-lwps.

Solutions to this problem may involve one or more of:

- Exempting project 0 in the global zone from zone.* rctls.
- Preventing task/project.* rctls from being set on project 0
  in the global zone.
- Modifying root's default project.
- Adding a new privilege to exempt a process from rctls.
- Updating system service manifests to drop the new privilege.

Solving this problem in a way that will prevent the global zone (on a
default system) from becoming unavailable due to a resource control setting
will require a significant change to the system.  I believe solving this
problem is outside the scope of the zone.max-swap case, and would be better
solved by another case which is not seeking patch binding.

To minimize this problem for zone.max-swap (and zone.max-lwps), I've instead
proposed zonecfg enhancements to assist the admin in configuring these rctls
safely.

 
1. This case proposes adding the following resource control:
  
  INTERFACE   COMMITMENT  BINDING
  zone.max-swap  CommittedPatch
  
   This control will limit the swap reserved by processes and tmpfs
   mounts within the global zone and non-global zones.  This resource
   control serves to address the referenced RFE[6].
 
   There was some considerable discussion on the global zone aspect
   of this part of the proposal.  Perhaps I missed in the spec how
   the new proposal mitigates the risk of the global zone not being
   able to administer the system.
 
  DETAIL:
  
1. zone.max-swap resource control.
  
   Limits swap consumed by user process address space mappings and
   tmpfs mounts within a zone.
 
   While a low zone.max-swap setting for the global zone can lead to
   a difficult-to-administer global zone, the same problem exists
   today when configuring the zone.max-lwps resource control on the
   global zone, or when all system swap is reserved.  The zonecfg(1m)
   enhancements detailed below will help administrators configure
   zone.max-swap safely.
 
   Perhaps I misunderstood the interaction between project 0
   and zone.max-lwps in the global zone.  If a max-lwps is set,
   is project 0 bound by it?

Currently yes.  zone.* rctls bind all processes in the global zone, regardless
of project.  This is the issue that my other proposal is attempting to
address.

   Perhaps a short summary of the offline discussion on project 0
   and the project teams feeling that the discussions conclusions
   might not be patch qualified.  I realize the need for this project
   to have a patch binding.

I've added this summary above.

 
2. swap and locked properties for zonecfg(1m) capped_memory
   resource.
 
   To prevent administrators from configuring a low swap limit that
   will prevent a system from booting, zonecfg will not allow a
   swap limit to be configured to less than:
  
  Global zone: 100M
  Non-global zone: 50M.
  
   These numbers are based on the swap needed to boot a zone after a
   default installation.

[zones-discuss] Re: PSARC/2006/598 Swap resource control; locked memory RM improvements

2006-11-09 Thread Steve Lawrence
On Tue, Oct 31, 2006 at 03:28:31PM -0800, Dan Price wrote:
 On Tue 31 Oct 2006 at 03:24PM, Steve Lawrence wrote:
  It seems reasonable to amend this case to say:
  
  1.
  Any process with priv_sys_resource running in the global zone's
  system project (project 0) will not be subject to project.* or zone.*
  resource controls.  System daemons which wish to be subject to the
  global zone's resource controls can drop priv_sys_resource.
 
 This feels risky for a patch... the effect here is to un-manage
 things which the customer may be expecting to be resource managed,
 potentially including a workload.  Or is the point that no-one using
 RM will be relying on the system project to limit things?

Currently, if rctls are specified in project(4) for the system project, they
are never actually applied during boot.  While this could be considered a bug,
this also means that currently, by default, admins will have to take additional
manual steps to resource manage project 0.  They are more likely to put what
they want to manage in a new/different project, and use project(4) to configure
rctls.

-Steve

 
 This implies that can't give the system project 10 (or 100) shares
 after this proposal?
 
 -dp
 
 -- 
 Daniel Price - Solaris Kernel Engineering - [EMAIL PROTECTED] - 
 blogs.sun.com/dp
___
zones-discuss mailing list
zones-discuss@opensolaris.org


[zones-discuss] Re: PSARC/2006/598 Swap resource control; locked memory RM improvements

2006-11-08 Thread Steve Lawrence
I am working on a new spec.  I have an unanswered question from the
discussion:

  The SIZE column will also be changed to SWAP for prstat
  options a, T, and J, for users, tasks, and projects.

 The reason for not changing this column in the default output would be
 helpful.

I have a separate private interface used by prstat(1m) to get aggregate swap
reserved by users, tasks, projects, and zones.  Default prstat output is
per-process, and the information is accessed via /proc.

Currently, per-process swap reservation is not counted or made available via
/proc.  From proc(4):

 typedef struct psinfo {
...
size_t pr_size;   /* size of process image in Kbytes */
...

"size of process image" is pretty meaningless.  If we can change pr_size to
be swap reserved by process, then we could change SIZE to SWAP for all
prstat(1m) output.  Would such a change to psinfo_t be reasonable?  If
not, we could potentially convert pr_pad1 to pr_swap.
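
For reference, a minimal sketch of how a tool reads pr_size from /proc today
(illustrative only):

#include <procfs.h>
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

int
main(int argc, char *argv[])
{
        char path[64];
        psinfo_t ps;
        int fd;

        if (argc != 2)
                return (1);
        (void) snprintf(path, sizeof (path), "/proc/%s/psinfo", argv[1]);
        if ((fd = open(path, O_RDONLY)) == -1 ||
            read(fd, &ps, sizeof (ps)) != sizeof (ps)) {
                perror(path);
                return (1);
        }
        /* pr_size is the "size of process image in Kbytes" field above */
        (void) printf("%s: pr_size = %lu Kbytes\n", ps.pr_fname,
            (unsigned long)ps.pr_size);
        (void) close(fd);
        return (0);
}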

-Steve


On Mon, Nov 06, 2006 at 12:13:41PM -0800, Gary Winiger wrote:
 At the request of the project team, I've put this case into waiting need spec.
 When the spec is updated, it will be sent out and the timer reset.
 
 Gary..
___
zones-discuss mailing list
zones-discuss@opensolaris.org


[zones-discuss] Re: PSARC/2006/598 Swap resource control; locked memory RM improvements

2006-11-03 Thread Steve Lawrence
On Fri, Nov 03, 2006 at 09:36:45AM +, Darren J Moffat wrote:
 Steve Lawrence wrote:
 Given a lack of supportive feedback, I'm going to revoke the proposed 
 amendment
 below.  To mitigate a zone admin setting a problematic swap limit on the 
 global
 zone, we will enhance zonecfg to:
 
  1. Print a warning when setting swap (and lwp) limits on the global
 zone.  Since the swap limit will not go into effect until reboot,
 the admin has a chance to modify his setting before it takes
 effect.
 
  2. Enforce a reasonable minimum when setting swap (and lwp) limits
 on the global zone.
 
 What is the definition of "reasonable"?  I think that being a
 default should be part of the case.

I'm not sure what your second sentence means.  As for the first, how about:

reasonable minimum:  The amount of resource necessary to boot a zone that
has a default installation and configuration.

By minimum, I do not mean the system default value.  For example, for
the following rctl, 128 is the system default value:

# prctl -n project.max-shm-ids $$
process: 425781: -sh
NAME    PRIVILEGE       VALUE    FLAG   ACTION                       RECIPIENT
project.max-shm-ids
        privileged        128       -   deny                                 -
        system          16.8M     max   deny                                 -

zone.max-swap will not have a system default value.  By default, every zone
will have access to all swap available on the system.

By minimum, I mean the minimum that zonecfg will allow you to set.

zonecfg> add capped-memory
zonecfg:capped-memory> set zone.max-swap=1K
Error, minimum value for zone.max-swap is ###

-Steve

 
 -- 
 Darren J Moffat
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] Re: PSARC/2006/598 Swap resource control; locked memory RM improvements

2006-11-02 Thread Steve Lawrence
I'm not sure it is within the domain of this case to tell admins what they
should and shouldn't use the global zone for.

In any event, we are making it easy for admins to manage swap limits for zones
via zonecfg.

On Tue, Oct 31, 2006 at 05:58:24PM -0800, Michael Barto wrote:
 After all this juggling, let's make it simple for the system admin and use
 some sort of fair-share process to assign and manage the swap for
 all the zones. Personally I think that the global zone should use
 minimal resources and be considered in the IT management processes to be
 only like a system controller on a complex server. Keep your
 applications out of the global zone!!!
 
 Gary Winiger wrote:
 
 This will not help root logins directly, but could by setting:
 
   usermod -K project=system root

 
Or perhaps deliver root's entry this way to start with.
  
 
 Would that be a reasonable change to make via patch?  Perhaps this change
 could be delivered to nevada, but not backported.
 
 It would be confusing to deliver this change, and also deliver the 
 user.root
 project.  If we made root's default project system, then the user.root
 project should be removed.  user.root is kind of a bug anyhow, as SMF 
 does
 not run root services in user.root.  Currently, only root processes 
 spawned
 by login/pam run in user.root.

 
 
 
  
 
 Perhaps this issue should be run as a separate fast track?  I need to
 investigate the implementation impact.

 
I'm looking for this case to define how to preserve the current
model of "unlimited unless one asks for a limit" in the
global zone.  I believe it is important from a system integrity and
maintenance perspective.  Other's may have different opinions.
If there is a compelling reason to deliver in phases, please discuss
that.
  
 
 The global zone will have no swap limit by default.  The default 
 zone.max-swap
 rctl delivered on the global zone is UINT64_MAX, which is essentially
 unlimited.  Is that what you mean?

 
 
  My point(s) here is not so much how things get done, but that
  the global zone is in some ways special.  IIRC, before this
  project, the GZ doesn't have a swap limit.  After this project
  an administrator could set swap limit on the GZ.  Granted this
  is administrative action and they get what they deserve/ask for.
  However, it seemed to me that part of this case should (my
  judgement) include some way to override the limit in case 
  override is really desired.  As implied, perhaps by putting
  root into project 0 at login or as part of daemon/service start
  is a way to bypass the administrator's choice in the GZ for
  some processes.  What I didn't see as part of this case is
  the architecture to allow this bypass.  Perhaps I'm off base
  for thinking it's necessary to protect against inadvertently
  not being able to administer the system from the GZ.
 
 Gary..
  
 ___
 zones-discuss mailing list
 zones-discuss@opensolaris.org
 
  
 
 
 -- 
 
 *Michael Barto*
 Software Architect
 
 LogiQwest Inc.
 16458 Bolsa Chica Street, # 15
 Huntington Beach, CA  92649
 http://www.logiqwest.com/
 
   [EMAIL PROTECTED]
 Tel:  714 377 3705
 Fax: 714 840 3937
 Cell: 714 883 1949
 
 *'tis a gift to be simple*
 
 This e-mail may contain LogiQwest proprietary information and should be 
 treated as confidential.
 
___
zones-discuss mailing list
zones-discuss@opensolaris.org


[zones-discuss] Re: PSARC/2006/598 Swap resource control; locked memory RM improvements

2006-11-02 Thread Steve Lawrence
Given a lack of supportive feedback, I'm going to revoke the proposed amendment
below.  To mitigate a zone admin setting a problematic swap limit on the global
zone, we will enhance zonecfg to:

1. Print a warning when setting swap (and lwp) limits on the global
   zone.  Since the swap limit will not go into effect until reboot,
   the admin has a chance to modify his setting before it takes
   effect.

2. Enforce a reasonable minimum when setting swap (and lwp) limits
   on the global zone.

Currently, the rctl framework provides many mechanisms by which the admin
can make the system difficult to manage.  For instance, setting task.max-lwps
on project user.root can prevent root login.  If we want to make changes to
prevent admins from resource-controlling their way out of the box, I think we
need a broader case to address the whole problem.

-Steve

On Tue, Oct 31, 2006 at 03:24:18PM -0800, Steve Lawrence wrote:
I'm looking for this case to define how to preserve the current
model of "unlimited unless one asks for a limit" in the
global zone.  I believe it is important from a system integrity 
and
maintenance perspective.  Other's may have different opinions.
If there is a compelling reason to deliver in phases, please 
discuss
that.
   
   The global zone will have no swap limit by default.  The default 
   zone.max-swap
   rctl delivered on the global zone is UINT64_MAX, which is essentially
   unlimited.  Is that what you mean?
  
  My point(s) here is not so much how things get done, but that
  the global zone is in some ways special.  IIRC, before this
  project, the GZ doesn't have a swap limit.  After this project
  an administrator could set swap limit on the GZ.  Granted this
  is administrative action and they get what they deserve/ask for.
  However, it seemed to me that part of this case should (my
  judgement) include some way to override the limit in case 
  override is really desired.  As implied, perhaps by putting
  root into project 0 at login or as part of daemon/service start
  is a way to bypass the administrator's choice in the GZ for
  some processes.  What I didn't see as part of this case is
  the architecture to allow this bypass.  Perhaps I'm off base
  for thinking it's necessary to protect against inadvertently
  not being able to administer the system from the GZ.
  
 
 It seems reasonable to amend this case to say:
 
   1.
   Any process with priv_sys_resource running in the global zone's
   system project (project 0) will not be subject to project.* or zone.*
   resource controls.  System daemons which wish to be subject to the
   global zone's resource controls can drop priv_sys_resource.
 
   2.
   The user.root project will be removed, and root's default project
   will be set to the system project via /etc/user_attr.
 
 I'm not sure if (2) can be delivered via patch.  I need some guidance here.
 I'm also not sure how implementable (1) is until I do more investigation.
 
 -Steve
 
  Gary..
  
___
zones-discuss mailing list
zones-discuss@opensolaris.org


[zones-discuss] Re: PSARC/2006/598 Swap resource control; locked memory RM improvements

2006-11-01 Thread Steve Lawrence
 Would it be reasonable to propose special treatment of the global project 0
 for all project and zone rctls? One could argue that capping system
 daemons
 can only lead to some sort of undesirable system failure.
 
 This would of course exempt all global zone system daemons from resource
 management.  To mitigate this, SMF could be leveraged to run application
 daemons (or leaky/bad system daemons) in other projects. 
 
 Please don't do that in a hardcoded way.  I don't mind if it is the 
 default but it can't be hard coded.  One of the main reasons we have 
 nfsd (and kcfd) is so that resource controls can be placed on system 
 services.

Do you mean one of the reasons that nfs and kcfd run in the system project??

 
 With the advent of SMF this is actually really easy to do since you
 just need to set the project/pool the start method runs in.
 
 Also it is perfectly reasonable to have a case where no useful 
 customer work happens in the global zone, ie it is a service processor
 really, and all the real work happens in the non global zones.

Please elaborate on this.  I'm not sure I understand what you're getting at.

-Steve

 
 -- 
 Darren J Moffat
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] PSARC/2006/598 Swap resource control; locked memory RM improvements

2006-10-26 Thread Steve Lawrence
Comments inline.  I've snipped stuff not relevant to comments.

   4. prstat(1m) output changes to report swap reserved.
 
  INTERFACE   COMMITMENT  BINDING
  prstat(1m) output   Uncommitted   Patch
 
  This case proposes changing the SIZE column of prstat -Z zone
  output lines to SWAP.  The swap reported will be the total swap
  consumed by the zone's processes and tmpfs mounts.  This value will
  assist administrators in monitoring the swap reserved by each zone,
  allowing them to choose a reasonable zone.max-swap settings.
 
  The SIZE column will also be changed to SWAP for prstat
  options a, T, and J, for users, tasks, and projects.
 
 The reason for not changing this column in the default output would be 
 helpful.

I have a separate private interface used by prstat(1m) to get aggregate swap
reserved by users, tasks, projects, and zones.  Default prstat output is
per-process, and the information is accessed via /proc.

Currently, per-process, or per-address-space, swap reservation is not
counted or made available via /proc.  From proc(4):

 typedef struct psinfo {
...
size_t pr_size;   /* size of process image in Kbytes */
...

"size of process image" is pretty meaningless.  If we can change pr_size to
be swap reserved by process, then we could change SIZE to SWAP for all
prstat(1m) output.  Would such a change to psinfo_t be reasonable?

  Currently a global or non-global zone can consume all swap
  resources available on the system, limiting the usefulness of zones
  as an application container.  zone.max-swap provides a mechanism to
 
 I would rephrase that as "the container of an application" to avoid
 confusion with the Solaris feature set called Containers.  I assume that 
 the former was meant moreso than the latter even though Containers are 
 Solaris' implementation of an application container.

I'm not sure what you mean, but ok.  By "the Solaris feature set called
Containers", do you mean zones + RM, or do you mean zones, xen, ldoms?

  zone.max-swap will be configurable on both the global zone, and
  non-global zones.  The effect on processes in a zone reaching its
  zone.max-swap limit is the same as if all system swap is reserved.
  Callers of mmap(2) and sbrk(2) will receive EAGAIN.  Writes to
  tmpfs will return ENOSPC, which is the same errno returned when
  a tmpfs mount reaches its size mount option.  The size mount
  option limits the quantity of swap that a tmpfs mount can reserve.
 
 With S10 11/06, some zone limitations are now configurable, e.g. setting 
 the system time clock.  Similarly, the ability to modify a zone's swap 
 limit could be given to the zone's root user, which might be valuable in 
 some situations.  This would be analogous to the 'basic' privilege level.  
 It would allow an advisory limit to be placed on a zone - a limit that the 
 zone admin could modify in unusual circumstances.
 
 I realize that this opens a can of worms in that most rctls are protected 
 by the sys_res_config priv, which is not allowed in a zone even with 11/06. 
 Further, it makes sense to consistently allow or forbid rctl-modification 
 in zones.
 
 I just wanted to mention this idea so that it is not unintentionally 
 overlooked.

Currently, all zone.* rctls are not modifiable from a non-global zone.

The established mechanism for a zone admin to set rctls within the
zone is via project.* rctls set on projects within the zone.  Granted, in
the zone.max-swap case, we are not proposing a project.max-swap, due to
implementation complexity and risk.  With sufficient customer demand, we could
investigate implementing project.max-swap in the future.

Currently no zone.* rctls allow basic rctl values to be set.  The only
project.* rctl which allows basic is project.max-contracts, and perhaps
that is a bug.  A basic rctl is an unprivileged rctl that only affects the
process within the task, project, or zone which sets it.  It is pretty
useless, except for process.* rctls.

I'd be happy to address the general issues of privilege related to project
and zone rctls as a separate case.  A possible solution may be to redefine
"basic" for project and zone rctls, and/or introduce more fine-grained
privileges.  I agree that work is needed here.

  STATISTIC   DESCRIPTION
  zonenameThe name of the zone with {zoneid}
  swap reserved:  swap reserved by zone in bytes.
 
 Does swap_reserved include pages shared with other zones, e.g. text pages?

Each process mapping text reserves unique swap for that mapping.  Even though
the underlying physical page may be shared between processes/zones, each
process needs its own swap reservation.  This is because each process may
COW (copy-on-write) the page, and then may need to page the private copy to disk.

 
  max_swap_reserved:  current zone.max-swap limit