The code responsible for the SVCCFG_CHECKHASH handling is at
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/cmd/svc/common/manifest_hash.c#632

So I rolled back the root filesystem, then modified the manifest-import
method as below:

root@solaris:~# diff -u /.zfs/snapshot/install/lib/svc/method/manifest-import /lib/svc/method/manifest-import
--- /.zfs/snapshot/install/lib/svc/method/manifest-import       2011-03-02 08:14:29.198429073 +0000
+++ /lib/svc/method/manifest-import     2011-03-02 14:40:25.849157553 +0000
@@ -66,7 +66,7 @@
 [ -n "$ALT_MFST_DIR" -a -z "$ALT_REPOSITORY" ] && usage

 function svccfg_apply {
-       $X /usr/sbin/svccfg apply $1
+       dtrace -q -n 'pid$target::mhash_test_file:entry{printf("%s ", copyinstr(arg1));} pid$target::mhash_test_file:return{printf("%s\n", arg1==0?"MHASH_NEWFILE":arg1==1?"MHASH_RECONCILED":arg1==-1?"MHASH_FAILURE": "unknown");}' -c "/usr/sbin/svccfg apply $1"
        if [ $? -ne 0 ]; then
                echo "WARNING: svccfg apply $1 failed" | tee /dev/msglog
        fi
root@solaris:~#


Then I reset the server, and this is what I got in the logs:

root@solaris:~# cat /var/svc/log/system-early-manifest-import\:default.log
svccfg: Loaded 170 smf(5) service descriptions
/etc/svc system profiles not found: upgrade system profiles
/etc/svc/profile/generic.xml MHASH_NEWFILE

/etc/svc/profile/platform.xml MHASH_NEWFILE

/etc/svc/profile/site.xml MHASH_NEWFILE



So far so good.

root@solaris:~# cat /var/svc/log/system-manifest-import\:default.log
[ Mar  2 14:41:27 Enabled. ]
[ Mar  2 14:41:44 Executing start method ("/lib/svc/method/manifest-import"). ]
[ Mar  2 14:41:44 Timeout override by svc.startd.  Using infinite timeout. ]
svccfg: Loaded 8 smf(5) service descriptions
/etc/svc/profile/generic.xml MHASH_NEWFILE

/etc/svc/profile/platform.xml MHASH_NEWFILE


[ Mar  2 14:41:53 Method "start" exited with status 0. ]
root@solaris:~#


This one looks wrong (although expected given the behaviour).


Let's get a little more detail: again roll back to @install, and this time
apply the modification below to manifest-import:

function svccfg_apply {
        dtrace -q -n '
        pid$target::mhash_test_file:entry{self->in=1;printf("%s\n ", copyinstr(arg1));}
        pid$target::mhash_test_file:return{self->in=0;printf("%s\n",
            arg1==0?"MHASH_NEWFILE":arg1==1?"MHASH_RECONCILED":arg1==-1?"MHASH_FAILURE":"unknown");}
        pid$target:a.out::return/self->in/{printf("%s %d\n", probefunc, arg1);}
        ' -c "/usr/sbin/svccfg apply $1"
        if [ $? -ne 0 ]; then
                echo "WARNING: svccfg apply $1 failed" | tee /dev/msglog
        fi
}


root@solaris:~# cat /var/svc/log/system-early-manifest-import\:default.log
svccfg: Loaded 170 smf(5) service descriptions
/etc/svc system profiles not found: upgrade system profiles
/etc/svc/profile/generic.xml
mhash_filename_to_propname 135150408
mhash_retrieve_entry 4294967295
md5_hash_file 0
MHASH_NEWFILE

/etc/svc/profile/platform.xml
mhash_filename_to_propname 135150408
mhash_retrieve_entry 4294967295
md5_hash_file 0
MHASH_NEWFILE

/etc/svc/profile/site.xml
mhash_filename_to_propname 135150408
mhash_retrieve_entry 4294967295
md5_hash_file 0
MHASH_NEWFILE


root@solaris:~#
root@solaris:~# cat /var/svc/log/system-manifest-import\:default.log
[ Mar  2 15:24:16 Enabled. ]
[ Mar  2 15:24:33 Executing start method ("/lib/svc/method/manifest-import"). ]
[ Mar  2 15:24:34 Timeout override by svc.startd.  Using infinite timeout. ]
svccfg: Loaded 8 smf(5) service descriptions
/etc/svc/profile/generic.xml
mhash_filename_to_propname 135154504
mhash_retrieve_entry 0
MHASH_NEWFILE

/etc/svc/profile/platform.xml
mhash_filename_to_propname 135154504
mhash_retrieve_entry 0
MHASH_NEWFILE


[ Mar  2 15:24:43 Method "start" exited with status 0. ]
root@solaris:~#


I believe it returns here (line #711):
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/cmd/svc/common/manifest_hash.c#692
    692         /*
    693          * Verify the meta data hash.
    694          */
    695         if (hashash && memcmp(hash, stored_hash, MD5_DIGEST_LENGTH) == 0) {
    696                 int i;
    697 
    698                 metahashok = 1;
    699                 /*
    700                  * The metadata hash matches; now we see if there was a
    701                  * content hash; if not, we will continue on and compute and
    702                  * store the updated hash.
    703                  * If there was no content hash, mhash_retrieve_entry()
    704                  * will have zero filled it.
    705                  */
    706                 for (i = 0; i < MD5_DIGEST_LENGTH; i++) {
    707                         if (stored_hash[MD5_DIGEST_LENGTH+i] != 0) {
    708                                 if (action == APPLY_LATE) {
    709                                         if (pnamep != NULL)
    710                                                 *pnamep = pname;
    711                                         ret = MHASH_NEWFILE;
    712                                 } else {
    713                                         uu_free(pname);
    714                                         ret = MHASH_RECONCILED;
    715                                 }
    716                                 return (ret);
    717                         }
    718                 }
    719         }


This code was changed as part of:

10461:d59a044cc787              07-Sep-2009     Sean Wilcox     6859248 manifest removal should automatically cause repository cleanup

http://src.opensolaris.org/source/diff/onnv/onnv-gate/usr/src/cmd/svc/common/manifest_hash.c?r2=%2Fonnv%2Fonnv-gate%2Fusr%2Fsrc%2Fcmd%2Fsvc%2Fcommon%2Fmanifest_hash.c%4012172%3A032d6e46c1ac&r1=%2Fonnv%2Fonnv-gate%2Fusr%2Fsrc%2Fcmd%2Fsvc%2Fcommon%2Fmanifest_hash.c%4010461%3Ad59a044cc787


594     if (hashash && memcmp(hash, stored_hash, MD5_DIGEST_LENGTH) == 0) {
595             int i;
596
597             metahashok = 1;
598             /*
599              * The metadata hash matches; now we see if there was a
600              * content hash; if not, we will continue on and compute and
601              * store the updated hash.
602              * If there was no content hash, mhash_retrieve_entry()
603              * will have zero filled it.
604              */
605             for (i = 0; i < MD5_DIGEST_LENGTH; i++) {
606                     if (stored_hash[MD5_DIGEST_LENGTH+i] != 0) {
607                             uu_free(pname);
608                             return (MHASH_RECONCILED);
609                     }
610             }
611     }

was replaced by:

695     if (hashash && memcmp(hash, stored_hash, MD5_DIGEST_LENGTH) == 0) {
696             int i;
697
698             metahashok = 1;
699             /*
700              * The metadata hash matches; now we see if there was a
701              * content hash; if not, we will continue on and compute and
702              * store the updated hash.
703              * If there was no content hash, mhash_retrieve_entry()
704              * will have zero filled it.
705              */
706             for (i = 0; i < MD5_DIGEST_LENGTH; i++) {
707                     if (stored_hash[MD5_DIGEST_LENGTH+i] != 0) {
708                             if (action == APPLY_LATE) {
709                                     if (pnamep != NULL)
710                                             *pnamep = pname;
711                                     ret = MHASH_NEWFILE;
712                             } else {
713                                     uu_free(pname);
714                                     ret = MHASH_RECONCILED;
715                             }
716                             return (ret);
717                     }
718             }
719     }


I'm not entirely sure what the intention was here, though...





From: [email protected]
[mailto:[email protected]] On Behalf Of Robert
Milkowski
Sent: 03 March 2011 18:21
To: [email protected]
Subject: [caiman-discuss] site.xml and EMI/LMI

Hi,


Snv_151a, x86

I put some customizations into default.xml so that some extra services are
disabled.
I confirmed that AI would generate /etc/svc/profile/site.xml->sc_profile.xml
with all the customizations I put in.

The problem is that the changes do not take effect. Well, they do but only
for a brief period.
This is what is happening:

   1. AI properly creates site.xml.
   2. During the very first boot from the local disk after the AI networked
      installation has finished, system-early-manifest-import is run
      (/lib/svc/method/manifest-import).
      It imports all the manifests and applies generic.xml, platform.xml
      and site.xml.
   3. Later in the boot process system-manifest-import is run (again
      /lib/svc/method/manifest-import).
      Although SVCCFG_CHECKHASH=1, for some reason it actually applies
      generic.xml again (I mean it takes effect), undoing my customizations
      in site.xml.
      site.xml won't be applied again, as it was removed by the
      system/install/config service.

There are a couple of problems here:

   I. Why does manifest-import end up really applying generic.xml despite
      SVCCFG_CHECKHASH=1? Looks like a bug...
      If I manually disable some services now and reboot, neither
      early-manifest-import nor manifest-import really applies generic.xml -
      good.
      It only happens during the first boot.

   II. I'm not entirely sure that system/install/config should just blindly
       remove site.xml. It does check whether it is a symbolic link, but it
       doesn't even check where it points to.
       Then shouldn't SVCCFG_CHECKHASH=1 protect here anyway? (I haven't
       looked at the code yet...)
       I guess the problem might be not with disabled/enabled services but
       with extra properties for install/config (create a user, etc.).
       Perhaps AI customizations should really go to
       /etc/svc/profile/site/AI_site.xml? I'm worried about what happens if
       a sysadmin puts their own site.xml (as a symlink) in place as part of
       a "finish script".

I commented out the removal of site.xml in system/install/config and it
doesn't really solve the problem. I end up with all the customizations,
which is fine, but sometimes services which should be disabled according
to site.xml manage to run their start methods between the time generic.xml
is applied and the time site.xml is applied, which causes some transient
errors. They shouldn't even have tried to start in the first place. It's
all down to timing; nevertheless, it happened a couple of times.

I think that the main problem is I. II is more of a concern than a
problem, and it makes things more complex than needed.



-- 
Robert Milkowski
http://milek.blogspot.com



_______________________________________________
caiman-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/caiman-discuss
