Quoth Steve Peng on Tue, Dec 18, 2007 at 11:07:30AM -0700:
> The incremental webrev which incorporate all the comments can be found 
> under the same location.  The original webrev has been renamed to 
> 6351623-old for your reference.  Again, send me any comment on those new 
> changes.

Good work so far.  I found some devils in your details, though:

common/svc/repcache_protocol.h
  357: This should be "SWITCH(src, dst, flag)".

  358:
    - This should explain what flag means.

    - This describes how the operation will be used.  You should at
      least precede it with a description of the operation.  Something
      like "Moves the main repository database file from src to dst."

  817: I believe the comment is supposed to correspond to the
    enum rep_protocol_requestid constant, which would just be SWITCH in
    this case.

  818-20: Do these need to be indented so much?

cmd/svc/configd/client.c
  1774: Please document the errors that repository_switch() can fail
    with.

  1789: Don't you need the changeid logic that backup_repository() has?
    Or is backend_switch() somehow idempotent?  (See 1.1 at the top of
    repcache_protocol.h about idempotency.)

cmd/svc/configd/backend.c
  1105-6: This is a lot of memory.  Can you allocate it on the heap
    rather than the stack, so that we can fail gracefully if we're out
    of memory, rather than segfaulting?

  1109: Can't this be const char *?

  1118: Can't you just make tmppath PATH_MAX + 7 to avoid this?

  1122,27: Please include strerror(errno) in these error messages.

  1128: Shouldn't you warn the administrator if unlink() fails?

  1129: Shouldn't you close(dstfd)?

  1133-7: Shouldn't this explanation either be before
    backend_switch_copy() or before 1148?

  1133: Shouldn't this explanation of sw_back be either before
    backend_switch_copy() or before we use sw_back at 1178?

  1141: What is this fstat() for?

  1142: Please include strerror(errno) in this error message.

  1143: Shouldn't you warn the administrator if unlink() fails?

  1151,63: Please include strerror(errno) in these error messages.

  1185,96,1206: Please include both paths and strerror(errno) in these
    error messages.

  1204: What's the benefit of moving to trans_db if you're going to
    move to dst right away?

  1208,15: Shouldn't you warn the administrator if these unlink()s fail?

  1223: Please explain what the objectives of backend_switch() are, and
    maybe how it accomplishes them.

  1227-8: For each error code, please explain why the function failed.

  1243: backend_lock() cannot fail with those arguments.  Please
    assert(result == REP_PROTOCOL_SUCCESS).

  1248: Why do you fail if be_readonly is set?  Won't this be true when
    we try to copy the repository off while the root filesystem is
    read-only?  If it is correct, why not invoke backend_lock() with
    writing = 1?

  1254: src has come straight from the client.  It might be
    unterminated.  In client.c:repository_switch() you should either use
    strnlen() to detect the situation and return an error, or you should
    forcibly terminate the buffer.  Same for dst.

  1256: Why is it ok to fail if src doesn't match?  What should the
    client do in that case?

  1267: I'm confused by this comment.  It says that you're going to "go
    ahead switch back to the original repository", but the code just
    returns without switching.  Can you explain this for me?

  1273: If backend_switch_copy() failed, but sw_back is true, will we
    try to sqlite_open(dst) anyway?

  1276: Since sqlite has a specific "integrity check" function, please
    use "sanity check" here.

  1278: If sqlite_open() fails, shouldn't we issue an error message?

  1287: This string looks a lot like one in backend_create().  Should
    they be shared with a macro or a common function?

  1296: If this fails, shouldn't we issue an error message?  And return
    something other than REP_PROTOCOL_SUCCESS?

  1315-7: Can these be const char *?

  1329s/uses/use/

  1332: This will be overwritten if stat(trans_db) succeeds.  Why don't
    you do it first, and only do stat(fast_db) if it fails?

  1343: As for 1276, "integrity check" means something specific in
    sqlite, so please change this to "sanity check".

  1371: If sqlite_open() failed, shouldn't we warn the administrator
    that there's an invalid repository lying around?  Or that we didn't
    use it because it is invalid?

  1348-57: This looks a lot like the code at 1283-92.  Shouldn't we put
    them into a separate function?

  1366: Shouldn't we warn the administrator if this fails?

  1370: This shouldn't be necessary if the rename() succeeded, should
    it?

lib/libscf/inc/libscf.h
  562-5: Why did you choose to put these in libscf.h rather than
    libscf_priv.h?

lib/libscf/inc/libscf_priv.h
  297-301: Why do we need a structure?  Can't _scf_repository_switch()
    accept sw_src, sw_dst, and sw_back as separate parameters?

lib/libscf/common/lowlevel.c
  6859: Please change this to something like "Request svc.configd to
    switch repository database files."  Also document what errors the
    function can fail with.

  6881: Blank line, please.

  6884-5: This can be done before the mutex is taken.

cmd/svc/svcadm/svcadm.c
  2405,16: Since you're not adding this subcommand to the usage message,
    these calls won't be helpful.

  2408-9,13-14: Why are you encoding the paths into svcadm?  Do you
    anticipate needing to switch to repositories other than
    FAST_REPOSITORY_DB and REPOSITORY_DB in the future?  If not, this
    seems like extra information which is easy to get wrong.  Couldn't
    you just set sw_back to 1 or 0, and have svc.configd know what paths
    to use?

  2428: _NOT_BOUND should be impossible, so you should abort() if it
    happens.

  2432: This should be "/* NOTREACHED */".

  2442: Please add a default case which prints that
    _scf_repository_switch() returned an unknown error.

cmd/svc/milestone/manifest-import
  449: Please change this to something like "Try moving the repository
    to tmpfs.  If it doesn't work, reset doswitch so we won't try
    switching back."

  451-75: I think you could save some lines with

        /usr/sbin/svcadm _smf_repository_switch fast
        switchret=$?
        ...
        if [ $switchret -eq 0 ]; then
                /usr/sbin/svcadm _smf_repository_switch perm || exit 1
        fi

  468: Can you explain that the existing retry mechanism is SMF
    restarting the service?  Something like, "... if switch back fails,
    we'll just exit and svc.startd should restart us."

  473: I think you left the dollar sign out of this line.


David

Reply via email to