Re: [zones-discuss] code review: native brand refactoring

2008-05-28 Thread Jerry Jelinek
Ed,

My responses are in-line.

Edward Pilatowicz wrote:
 On Tue, May 27, 2008 at 08:48:17AM -0600, Jerry Jelinek wrote:
 I have updated the webrev at:

 http://cr.opensolaris.org/~gjelinek/webrev/

 This includes the changes for the feedback I have
 received so far.  I also added the zlogin.c file
 to the webrev with two bug fixes.  One of these was for
 a bug I was hitting during testing of these changes
 and there is a second bug in zlogin that came in
 which I also fixed.  So, at a minimum, it would be
 good to take a look at that additional file.

 
 more comments below.
 all that's left is sw_support.c, i'll get to that tommorow.
 ed
 
 --
 usr/src/cmd/zoneadm/zoneadm.c
 
 - isn't it kinda weird to have the brand specific cli options come
   before
   the brand name when cloning a zone?

I'm not sure that I know which code you are thinking of here, but
since these are options, the order doesn't matter.

 - why are you parsing arguments by hand in stead of using getopt()?
   this will result in non-standard argument parsing and probably break
   things that used to work like:
   zoneadm -z foo clone -mcopy -sexport/zones/[EMAIL PROTECTED] ...
   (note the lack of spaces between the flag and the option) or:
   zoneadm -z foo attach -Fu ...

I would prefer to use getopt but I don't think there is any way
to make getopt work with a mix of known and unknown options.  I
initially coded this to handle the returned ? with opterr=0 when
an option that is not in the argument string is encountered, but
getopt is then in an error state and won't process any additional
options.  So, I had to fall back to using the manual parsing I have
now which does handle unknown options, although doesn't handle all
of the permutations that getopt could, as you noted.  I am not
really happy with what is there now, so if you have any suggestions,
let me know.

 - in clone_func(), done: always calls zonecfg_release_lock_file(), but
   there are paths that can invoke goto done; without first calling
   zonecfg_grab_lock_file().

Fixed.

 - in clone_func(), it seems like the is_system_labeled() check should
   be moved out of here and into the native brand postclone callback.

Moved.

 - why is there no locking around the pre-detach hook?

That hasn't changed with this code but if I recall correctly, the idea
was that predetach is not making any changes to the zone so we don't
lock until actual changes to the state are being made.

 - in dryrun_get_brand(), use STDIN_FILENO instead of 0.

Fixed.

 - in dryrun_get_brand(), you might want to always read the manifest
   into a temporary file.  with the current code specifying a manifest
 path
   of /dev/fd/0 (which is a standard substitute for -) won't work.

Done.

 - in the case of a forced attach why don't we run the brand callback?

That is really the definition of forced.  We don't do anything when
forced except change the state of the zone to installed.

 - might it simplify zoneadm to just include locking for help request
   subcommands?

It probably does but I didn't want to lock the zone when we don't have
to and this is consistent with the original behavior where the built-in
help never locked the zone.

 - in attach_func(), if the force flag is specified it seems like we'll
   always call zonecfg_release_lock_file().  (Even if we didn't aquire
   the lock beforehand.)

We should be doing this but I think I see the problem you were wondering
about.  The test for the combination of dry-run and force flags was
wrong.  I fixed that.

 - uninstall_func() has lots of error returns that don't call
   zonecfg_release_lock_file().

It looks like most of those are pre-existing bugs, but I went ahead
and fixed them.  Thanks for spotting that.

Thanks again for all your comments on this code-review,
Jerry


___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] code review: native brand refactoring

2008-05-28 Thread Jerry Jelinek
Steve Lawrence wrote:
 Hey Jerry,
 
 Does this address this comment in 6621020:
 
 
 This appears to point out at least one bug in zlogin, namely that it
 keeps stdout_pipe[1] and stderr_pipe[1] from noninteractive_login()
 open when returning to the parent.
 
 
 Basically, I think the filer expected to see something like:
 
 
   pipe(stdout);
   pipe(stderr);
 
   if (fork() == 0) {
   /* in child, close pipe sides read by parent) */
   close(stdout[0]);
   close(stderr[0]);
 
   .. write to std*[1]...
   ...
   exit(..);
   }
   /* in parent, close pipe sides written by child */
   close(stdout([1]);
   close(stderr([1]);
 
   ... read from std*[0] ...
   ...
 
 I think the in child part is handled by the closefrom on line 1559, but
 the parent does not close the sides of the pipes that the child writes to.

Steve,

I did not make the change discussed above.  Let me go back and take
another look at that.

Thanks,
Jerry
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] code review: native brand refactoring

2008-05-28 Thread Jerry Jelinek
Edward Pilatowicz wrote:
 - why are you parsing arguments by hand in stead of using getopt()?
   this will result in non-standard argument parsing and probably break
   things that used to work like:
   zoneadm -z foo clone -mcopy -sexport/zones/[EMAIL PROTECTED] ...
   (note the lack of spaces between the flag and the option) or:
   zoneadm -z foo attach -Fu ...

Ed,

Your comment prompted me to go back and take another look
at this and now I am not sure what I thought I was seeing
before, but it seems like I can make this work, so I will
go back to using getopt.

Thanks,
Jerry
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] code review: native brand refactoring

2008-05-28 Thread Nicolas Williams
See also:

6475075 zlogin i/o loop needs work
6263984 zlogin's i/o loop can sometimes drop data on child death

On Wed, May 28, 2008 at 11:53:18AM -0500, Nicolas Williams wrote:
 On Tue, May 27, 2008 at 08:48:17AM -0600, Jerry Jelinek wrote:
  This includes the changes for the feedback I have
  received so far.  I also added the zlogin.c file
  to the webrev with two bug fixes.  One of these was for
  a bug I was hitting during testing of these changes
  and there is a second bug in zlogin that came in
  which I also fixed.  So, at a minimum, it would be
  good to take a look at that additional file.
 
 Why timeout the poll(2) call when you can have a signal handler that
 writes a byte into a pipe that you could poll() for?
 
 That's the standard way (well, one of them) to handle signals +
 polling.  (E.g., we use it in Sun_SSH.)
 
 If you do it this way you get more responsiveness and you don't waste
 time waking up the zlogin process when there's nothing to do (this last
 is important w.r.t. power consumption).
 
 Nico
 -- 
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] code review: native brand refactoring

2008-05-28 Thread Edward Pilatowicz
On Wed, May 28, 2008 at 07:02:32AM -0600, Jerry Jelinek wrote:

 - why is there no locking around the pre-detach hook?

 That hasn't changed with this code but if I recall correctly, the idea
 was that predetach is not making any changes to the zone so we don't
 lock until actual changes to the state are being made.


hm.  that differs from the description in the dtd.  the dtd says the
callback should be carefull making changes since it could be invoked
multiple times.  it doesn't say anything about being read-only or
concurrent invocations.  i'd probably just be best to include locking
around the pre-detach call.

ed
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] code review: native brand refactoring

2008-05-28 Thread Jerry Jelinek
Nicolas Williams wrote:
 On Tue, May 27, 2008 at 08:48:17AM -0600, Jerry Jelinek wrote:
 This includes the changes for the feedback I have
 received so far.  I also added the zlogin.c file
 to the webrev with two bug fixes.  One of these was for
 a bug I was hitting during testing of these changes
 and there is a second bug in zlogin that came in
 which I also fixed.  So, at a minimum, it would be
 good to take a look at that additional file.
 
 Why timeout the poll(2) call when you can have a signal handler that
 writes a byte into a pipe that you could poll() for?
 
 That's the standard way (well, one of them) to handle signals +
 polling.  (E.g., we use it in Sun_SSH.)
 
 If you do it this way you get more responsiveness and you don't waste
 time waking up the zlogin process when there's nothing to do (this last
 is important w.r.t. power consumption).

Noco,

Thanks for looking at this.  I'll take a look at
implementing your suggestion.

Thanks again,
Jerry
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] code review: native brand refactoring

2008-05-28 Thread Jerry Jelinek
Edward Pilatowicz wrote:
 On Wed, May 28, 2008 at 07:02:32AM -0600, Jerry Jelinek wrote:
 - why is there no locking around the pre-detach hook?
 That hasn't changed with this code but if I recall correctly, the idea
 was that predetach is not making any changes to the zone so we don't
 lock until actual changes to the state are being made.

 
 hm.  that differs from the description in the dtd.  the dtd says the
 callback should be carefull making changes since it could be invoked
 multiple times.  it doesn't say anything about being read-only or
 concurrent invocations.  i'd probably just be best to include locking
 around the pre-detach call.

Ed,

OK, I'll move the locking up.

Thanks,
Jerry
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] code review: native brand refactoring

2008-05-28 Thread Edward Pilatowicz
On Wed, May 28, 2008 at 07:02:32AM -0600, Jerry Jelinek wrote:

 - in the case of a forced attach why don't we run the brand callback?

 That is really the definition of forced.  We don't do anything when
 forced except change the state of the zone to installed.


hm.  that differs from my concept of a forced operation, which
i thought always meant, do all the same things you normally do, but
ignore any errors encountered along the way.

if we do a forced boot, do we invoke the brand boot callback?

if you leave it as is, could you document this behavior in the dtd
comment?

ed
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] code review: native brand refactoring

2008-05-28 Thread Jerry Jelinek
Edward Pilatowicz wrote:
 On Wed, May 28, 2008 at 07:02:32AM -0600, Jerry Jelinek wrote:
 - in the case of a forced attach why don't we run the brand callback?
 That is really the definition of forced.  We don't do anything when
 forced except change the state of the zone to installed.

 
 hm.  that differs from my concept of a forced operation, which
 i thought always meant, do all the same things you normally do, but
 ignore any errors encountered along the way.
 
 if we do a forced boot, do we invoke the brand boot callback?
 
 if you leave it as is, could you document this behavior in the dtd
 comment?

Ed,

I'll update the dtd since this behavior hasn't changed with this
update to the code and I would prefer to keep it as is.

Thanks,
Jerry


___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] code review: native brand refactoring

2008-05-28 Thread Edward Pilatowicz
On Tue, May 27, 2008 at 08:48:17AM -0600, Jerry Jelinek wrote:
 I have updated the webrev at:

 http://cr.opensolaris.org/~gjelinek/webrev/

 This includes the changes for the feedback I have
 received so far.  I also added the zlogin.c file
 to the webrev with two bug fixes.  One of these was for
 a bug I was hitting during testing of these changes
 and there is a second bug in zlogin that came in
 which I also fixed.  So, at a minimum, it would be
 good to take a look at that additional file.


here's my last batch of comments.
(for sw_support.c i only looked at the new functions you introduced.)
ed

--
general

- iirc, there is still tons of zulu code in zoneadmd.  this code creates
  the special live upgrade environment, and LU is tied to svr4
  packaging, right?  so it seems like the zulu code is brand specific
  and should also be factored out into mount callbacks.  (and iirc,
  the mount operations isn't currenly supported by any branded zones.)
  thoughts?


--
usr/src/lib/brand/native/zone/sw_support.c

- in main(),  an invalid cmd value will result in returning an
  uninitialzed err value.  makes me wonder, is this code linting
  properly?

- nit: in install_func(), status should be initialized to Z_ERR not 1.

- in install_func(), if the snprintf() overflows, you'll pass a
  a constant error value (Z_ERR) to subproc_status() (which is
  expecting a waitpid() encoded status value).

- hm.  attach_func() looks for the -F force flag, but as discussed
  previously, in the case of forced attach, the attach brand callback
  is currently not executed.
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] code review: native brand refactoring

2008-05-28 Thread Jerry Jelinek
Ed,

Responses below.

Edward Pilatowicz wrote:
 On Tue, May 27, 2008 at 08:48:17AM -0600, Jerry Jelinek wrote:
 I have updated the webrev at:

 http://cr.opensolaris.org/~gjelinek/webrev/

 This includes the changes for the feedback I have
 received so far.  I also added the zlogin.c file
 to the webrev with two bug fixes.  One of these was for
 a bug I was hitting during testing of these changes
 and there is a second bug in zlogin that came in
 which I also fixed.  So, at a minimum, it would be
 good to take a look at that additional file.

 
 here's my last batch of comments.
 (for sw_support.c i only looked at the new functions you introduced.)
 ed
 
 --
 general
 
 - iirc, there is still tons of zulu code in zoneadmd.  this code creates
   the special live upgrade environment, and LU is tied to svr4
   packaging, right?  so it seems like the zulu code is brand specific
   and should also be factored out into mount callbacks.  (and iirc,
   the mount operations isn't currenly supported by any branded zones.)
   thoughts?

I don't think there is really any zulu code in zoneadmd but I'll
take a pass through to double check.  If I see anything like
that, I'll fix it up and update the webrev.

The mount code is not really svr4 pkg specific either, but is
specific to the native brand.  I have extensions to mount which
would allow us to mount any type of brand with very little change.
I added those with the update on attach work.  Being able to
mount any zone is something I've thought would be handy for a long
time now, so I'll take a look at that as I am making the pass through
zoneadmd.

 
 --
 usr/src/lib/brand/native/zone/sw_support.c
 
 - in main(),  an invalid cmd value will result in returning an
   uninitialzed err value.  makes me wonder, is this code linting
   properly?

I am not getting any lint warning from that, which is surprising.
I fixed this.

 - nit: in install_func(), status should be initialized to Z_ERR not 1.

Fixed.

 - in install_func(), if the snprintf() overflows, you'll pass a
   a constant error value (Z_ERR) to subproc_status() (which is
   expecting a waitpid() encoded status value).

That was old code that just got moved, but it is broken and I fixed
it.

 - hm.  attach_func() looks for the -F force flag, but as discussed
   previously, in the case of forced attach, the attach brand callback
   is currently not executed.

I duplicated that when I split things up.  I have removed it.

Thanks again,
Jerry

___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] code review: native brand refactoring

2008-05-28 Thread Edward Pilatowicz
On Wed, May 28, 2008 at 12:24:07PM -0600, Jerry Jelinek wrote:
 Ed,

 Responses below.

 Edward Pilatowicz wrote:
 On Tue, May 27, 2008 at 08:48:17AM -0600, Jerry Jelinek wrote:
 I have updated the webrev at:

 http://cr.opensolaris.org/~gjelinek/webrev/

 This includes the changes for the feedback I have
 received so far.  I also added the zlogin.c file
 to the webrev with two bug fixes.  One of these was for
 a bug I was hitting during testing of these changes
 and there is a second bug in zlogin that came in
 which I also fixed.  So, at a minimum, it would be
 good to take a look at that additional file.


 here's my last batch of comments.
 (for sw_support.c i only looked at the new functions you introduced.)
 ed

 --
 general

 - iirc, there is still tons of zulu code in zoneadmd.  this code creates
   the special live upgrade environment, and LU is tied to svr4
   packaging, right?  so it seems like the zulu code is brand specific
   and should also be factored out into mount callbacks.  (and iirc,
   the mount operations isn't currenly supported by any branded zones.)
   thoughts?

 I don't think there is really any zulu code in zoneadmd but I'll
 take a pass through to double check.  If I see anything like
 that, I'll fix it up and update the webrev.

 The mount code is not really svr4 pkg specific either, but is
 specific to the native brand.  I have extensions to mount which
 would allow us to mount any type of brand with very little change.
 I added those with the update on attach work.  Being able to
 mount any zone is something I've thought would be handy for a long
 time now, so I'll take a look at that as I am making the pass through
 zoneadmd.


hm.  from what i remember mount mode actually puts the zone root at
zoneroot/a and lofs mounts stuff from the global zone at zoneroot,
all so that the svr4 packaging code can enter the zone to do packaging
operations.  i thought this dance was necessary because we wanted packaging
scripts to execute inside a zone (since for security reasons, we wouldn
t want them to be able to muck with stuff outside a zone).  but IPS doesn't
have packaging scripts.  so will this complicated setup really be necessary
for IPS?  ie, could we do away with the /a mount in mount mode for IPS?

i'm asking about this because the /a mount code in zoneadmd has lots of hard
coded paths and makes zoneadmd pretty complicated.  (just look at
mount_filesystems() and build_mounted_post_var().)  i've had to modify this
code a few times, i've broken it a couple times, and been confused by it all
the time...

ed
___
zones-discuss mailing list
zones-discuss@opensolaris.org


Re: [zones-discuss] code review: native brand refactoring

2008-05-28 Thread Edward Pilatowicz
On Wed, May 28, 2008 at 01:00:08PM -0600, Jerry Jelinek wrote:
 Edward Pilatowicz wrote:
 hm.  from what i remember mount mode actually puts the zone root at
 zoneroot/a and lofs mounts stuff from the global zone at zoneroot,
 all so that the svr4 packaging code can enter the zone to do packaging
 operations.  i thought this dance was necessary because we wanted packaging
 scripts to execute inside a zone (since for security reasons, we wouldn
 t want them to be able to muck with stuff outside a zone).  but IPS doesn't
 have packaging scripts.  so will this complicated setup really be necessary
 for IPS?  ie, could we do away with the /a mount in mount mode for IPS?

 i'm asking about this because the /a mount code in zoneadmd has lots of hard
 coded paths and makes zoneadmd pretty complicated.  (just look at
 mount_filesystems() and build_mounted_post_var().)  i've had to modify this
 code a few times, i've broken it a couple times, and been confused by it all
 the time...

 Ed,

 Yes, this description is correct but not because of the pkging
 code.  We want to be able to enter the zone while running the
 sw that is installed in the global zone, irrespective of what is
 inside the non-global zone.  All of the global zone sw is mounted
 read-only in a mounted zone so we can login into the zone and
 have a safe environment that matches the global zone.  The non-global
 zone is mounted under /a so we can safely operate on it within the zone
 using the global zones sw.  None of this has any relationship to
 the pkging code in either zone.  All the mount is doing is setting
 up a safe zone that is compatible with the global zone.  With this
 definition, you can use mount to safely do any kind of admin work,
 not just running pkging code.  The problem with the current mount
 is it lofs mounts the zone's etc and var back into the global zone
 mount area.  This obviously won't work for non-native zones or even
 older zones that we are migrating to the new host, which is why I
 had to fix this for update on attach.


ah.  ok.  thanks for clearing that up for me.  i assume all that stuff
was for zulu/lu and we would be able to rip it out someday...

ed
___
zones-discuss mailing list
zones-discuss@opensolaris.org