The code responsible for the SVCCFG_CHECKHASH handling is at
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/cmd/svc/common/manifest_hash.c#632
So I rolled back the root filesystem, then modified the manifest-import
method as below:
root@solaris:~# diff -u /.zfs/snapshot/install/lib/svc/method/manifest-import /lib/svc/method/manifest-import
--- /.zfs/snapshot/install/lib/svc/method/manifest-import 2011-03-02 08:14:29.198429073 +0000
+++ /lib/svc/method/manifest-import 2011-03-02 14:40:25.849157553 +0000
@@ -66,7 +66,7 @@
[ -n "$ALT_MFST_DIR" -a -z "$ALT_REPOSITORY" ] && usage
function svccfg_apply {
- $X /usr/sbin/svccfg apply $1
+ dtrace -q -n 'pid$target::mhash_test_file:entry{printf("%s ", copyinstr(arg1));} pid$target::mhash_test_file:return{printf("%s\n", arg1==0?"MHASH_NEWFILE":arg1==1?"MHASH_RECONCILED":arg1==-1?"MHASH_FAILURE":"unknown");}' -c "/usr/sbin/svccfg apply $1"
if [ $? -ne 0 ]; then
echo "WARNING: svccfg apply $1 failed" | tee /dev/msglog
fi
root@solaris:~#
Then I reset the server and this is what I got in the logs:
root@solaris:~# cat /var/svc/log/system-early-manifest-import\:default.log
svccfg: Loaded 170 smf(5) service descriptions
/etc/svc system profiles not found: upgrade system profiles
/etc/svc/profile/generic.xml MHASH_NEWFILE
/etc/svc/profile/platform.xml MHASH_NEWFILE
/etc/svc/profile/site.xml MHASH_NEWFILE
So far so good.
root@solaris:~# cat /var/svc/log/system-manifest-import\:default.log
[ Mar 2 14:41:27 Enabled. ]
[ Mar 2 14:41:44 Executing start method ("/lib/svc/method/manifest-import"). ]
[ Mar 2 14:41:44 Timeout override by svc.startd. Using infinite timeout. ]
svccfg: Loaded 8 smf(5) service descriptions
/etc/svc/profile/generic.xml MHASH_NEWFILE
/etc/svc/profile/platform.xml MHASH_NEWFILE
[ Mar 2 14:41:53 Method "start" exited with status 0. ]
root@solaris:~#
This one looks wrong (although expected given the behaviour).
Let's get a little more detail: again roll back to @install, this time with
the modification below to manifest-import:
function svccfg_apply {
    dtrace -q -n '
        pid$target::mhash_test_file:entry{self->in=1;printf("%s\n ", copyinstr(arg1));}
        pid$target::mhash_test_file:return{self->in=0;printf("%s\n", arg1==0?"MHASH_NEWFILE":arg1==1?"MHASH_RECONCILED":arg1==-1?"MHASH_FAILURE":"unknown");}
        pid$target:a.out::return/self->in/{printf("%s %d\n", probefunc, arg1);}
    ' -c "/usr/sbin/svccfg apply $1"
    if [ $? -ne 0 ]; then
        echo "WARNING: svccfg apply $1 failed" | tee /dev/msglog
    fi
}
root@solaris:~# cat /var/svc/log/system-early-manifest-import\:default.log
svccfg: Loaded 170 smf(5) service descriptions
/etc/svc system profiles not found: upgrade system profiles
/etc/svc/profile/generic.xml
mhash_filename_to_propname 135150408
mhash_retrieve_entry 4294967295
md5_hash_file 0
MHASH_NEWFILE
/etc/svc/profile/platform.xml
mhash_filename_to_propname 135150408
mhash_retrieve_entry 4294967295
md5_hash_file 0
MHASH_NEWFILE
/etc/svc/profile/site.xml
mhash_filename_to_propname 135150408
mhash_retrieve_entry 4294967295
md5_hash_file 0
MHASH_NEWFILE
root@solaris:~#
root@solaris:~# cat /var/svc/log/system-manifest-import\:default.log
[ Mar 2 15:24:16 Enabled. ]
[ Mar 2 15:24:33 Executing start method ("/lib/svc/method/manifest-import"). ]
[ Mar 2 15:24:34 Timeout override by svc.startd. Using infinite timeout. ]
svccfg: Loaded 8 smf(5) service descriptions
/etc/svc/profile/generic.xml
mhash_filename_to_propname 135154504
mhash_retrieve_entry 0
MHASH_NEWFILE
/etc/svc/profile/platform.xml
mhash_filename_to_propname 135154504
mhash_retrieve_entry 0
MHASH_NEWFILE
[ Mar 2 15:24:43 Method "start" exited with status 0. ]
root@solaris:~#
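A side note on reading the numbers in the traces above: the pid provider
prints raw return values, so "mhash_retrieve_entry 4294967295" in the early
log is most likely -1 (no stored entry yet), while "mhash_retrieve_entry 0"
in the later log means an entry was found. That reading assumes svccfg runs
as a 32-bit process and that the function returns a plain C int. A minimal
sketch of the reinterpretation (the helper name as_signed32 is made up for
illustration and is not part of svccfg):

```c
#include <stdint.h>
#include <string.h>

/*
 * Reinterpret the low 32 bits of a traced return value as a signed
 * 32-bit integer. Assumption: the traced process is 32-bit, so the
 * tracer shows the int return value zero-extended to 64 bits.
 */
static int32_t
as_signed32(uint64_t traced)
{
	uint32_t low = (uint32_t)traced;	/* keep the low 32 bits */
	int32_t out;

	/* Copy the bit pattern; int32_t is two's complement by definition. */
	memcpy(&out, &low, sizeof (out));
	return (out);
}
```

With this helper, as_signed32(4294967295ULL) is -1 and as_signed32(0) is 0.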
I believe it returns here (line #711):
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/cmd/svc/common/manifest_hash.c#692
692 /*
693 * Verify the meta data hash.
694 */
695 if (hashash && memcmp(hash, stored_hash, MD5_DIGEST_LENGTH) == 0) {
696 int i;
697
698 metahashok = 1;
699 /*
700 * The metadata hash matches; now we see if there was a
701 * content hash; if not, we will continue on and compute and
702 * store the updated hash.
703 * If there was no content hash, mhash_retrieve_entry()
704 * will have zero filled it.
705 */
706 for (i = 0; i < MD5_DIGEST_LENGTH; i++) {
707 if (stored_hash[MD5_DIGEST_LENGTH+i] != 0) {
708 if (action == APPLY_LATE) {
709 if (pnamep != NULL)
710 *pnamep = pname;
711 ret = MHASH_NEWFILE;
712 } else {
713 uu_free(pname);
714 ret = MHASH_RECONCILED;
715 }
716 return (ret);
717 }
718 }
719 }
This code was changed as part of:
10461:d59a044cc787 07-Sep-2009 Sean Wilcox 6859248 manifest removal should automatically cause repository cleanup
http://src.opensolaris.org/source/diff/onnv/onnv-gate/usr/src/cmd/svc/common/manifest_hash.c?r2=%2Fonnv%2Fonnv-gate%2Fusr%2Fsrc%2Fcmd%2Fsvc%2Fcommon%2Fmanifest_hash.c%4012172%3A032d6e46c1ac&r1=%2Fonnv%2Fonnv-gate%2Fusr%2Fsrc%2Fcmd%2Fsvc%2Fcommon%2Fmanifest_hash.c%4010461%3Ad59a044cc787
594 if (hashash && memcmp(hash, stored_hash, MD5_DIGEST_LENGTH) == 0) {
595 int i;
596
597 metahashok = 1;
598 /*
599 * The metadata hash matches; now we see if there was a
600 * content hash; if not, we will continue on and compute and
601 * store the updated hash.
602 * If there was no content hash, mhash_retrieve_entry()
603 * will have zero filled it.
604 */
605 for (i = 0; i < MD5_DIGEST_LENGTH; i++) {
606 if (stored_hash[MD5_DIGEST_LENGTH+i] != 0) {
607 uu_free(pname);
608 return (MHASH_RECONCILED);
609 }
610 }
611 }
was replaced by:
695 if (hashash && memcmp(hash, stored_hash, MD5_DIGEST_LENGTH) == 0) {
696 int i;
697
698 metahashok = 1;
699 /*
700 * The metadata hash matches; now we see if there was a
701 * content hash; if not, we will continue on and compute and
702 * store the updated hash.
703 * If there was no content hash, mhash_retrieve_entry()
704 * will have zero filled it.
705 */
706 for (i = 0; i < MD5_DIGEST_LENGTH; i++) {
707 if (stored_hash[MD5_DIGEST_LENGTH+i] != 0) {
708 if (action == APPLY_LATE) {
709 if (pnamep != NULL)
710 *pnamep = pname;
711 ret = MHASH_NEWFILE;
712 } else {
713 uu_free(pname);
714 ret = MHASH_RECONCILED;
715 }
716 return (ret);
717 }
718 }
719 }
I'm not entirely sure what the intention was here, though...
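
To make the behavioural difference concrete, here is a small standalone
model of the two versions of that branch. It is only a sketch, not the real
mhash_test_file(): the MHASH_* values are the ones used in my DTrace script
above, while MODEL_CONTINUE, model_action_t and the model_* function names
are made up for illustration (the real value of APPLY_LATE is not shown
here), and the pname/pnamep handling is left out.

```c
#include <string.h>

#define	MD5_DIGEST_LENGTH	16

/* Return codes as decoded in the DTrace script above. */
enum { MHASH_FAILURE = -1, MHASH_NEWFILE = 0, MHASH_RECONCILED = 1 };

/* Hypothetical marker: "fall through and compute the content hash". */
#define	MODEL_CONTINUE	2

/* Hypothetical stand-in for the real action argument. */
typedef enum { MODEL_APPLY_OTHER, MODEL_APPLY_LATE } model_action_t;

/* True if the metadata hash matches and a non-zero content hash is stored. */
static int
model_match(const unsigned char *hash, const unsigned char *stored_hash,
    int hashash)
{
	int i;

	if (!hashash || memcmp(hash, stored_hash, MD5_DIGEST_LENGTH) != 0)
		return (0);
	for (i = 0; i < MD5_DIGEST_LENGTH; i++) {
		if (stored_hash[MD5_DIGEST_LENGTH + i] != 0)
			return (1);
	}
	return (0);
}

/* Old behaviour (lines 594-611): a match always means "already applied". */
static int
model_before(const unsigned char *hash, const unsigned char *stored_hash,
    int hashash)
{
	return (model_match(hash, stored_hash, hashash) ?
	    MHASH_RECONCILED : MODEL_CONTINUE);
}

/* New behaviour (lines 695-719): a late apply reports a match as new. */
static int
model_after(const unsigned char *hash, const unsigned char *stored_hash,
    int hashash, model_action_t action)
{
	if (!model_match(hash, stored_hash, hashash))
		return (MODEL_CONTINUE);
	return (action == MODEL_APPLY_LATE ? MHASH_NEWFILE : MHASH_RECONCILED);
}
```

With identical stored and computed hashes, model_before() gives
MHASH_RECONCILED, while model_after() with MODEL_APPLY_LATE gives
MHASH_NEWFILE, which matches what the second trace shows for generic.xml
and platform.xml.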
From: [email protected]
[mailto:[email protected]] On Behalf Of Robert
Milkowski
Sent: 03 March 2011 18:21
To: [email protected]
Subject: [caiman-discuss] site.xml and EMI/LMI
Hi,
Snv_151a, x86
I put some customizations into default.xml so that some extra services are
disabled.
I confirmed that AI would generate /etc/svc/profile/site.xml->sc_profile.xml
with all the customizations I put in.
The problem is that the changes do not take effect. Well, they do but only
for a brief period.
This is what is happening:
1. AI properly creates site.xml
2. during the very first boot from a local disk after the AI networked
installation has finished, system-early-manifest-import is run
(/lib/svc/method/manifest-import)
It will import all the manifests and apply generic.xml, platform.xml
and the site.xml
3. later in the boot process system-manifest-import is run (again
/lib/svc/method/manifest-import)
Although SVCCFG_CHECKHASH=1, for some reason it actually applies
generic.xml again (I mean it takes effect), undoing my customizations in
site.xml
The site.xml won't be applied again as it was removed by the
system/install/config service.
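
The ordering problem in the steps above can be sketched with a toy model in
which each applied profile simply overwrites the enabled flag of one
service. Everything in it (toy_profile_t, toy_apply_all, the single
hypothetical service) is made up for illustration and says nothing about
how svccfg is really implemented:

```c
#include <stddef.h>

/* Toy profile: the value it assigns to one hypothetical service. */
typedef struct {
	const char *name;	/* profile file name, for readability only */
	int sets_enabled;	/* 1 = profile enables the service, 0 = disables */
	int skipped;		/* 1 = hash check says "already applied" */
} toy_profile_t;

/* Apply profiles in order; the last one that is not skipped wins. */
static int
toy_apply_all(int enabled, const toy_profile_t *profiles, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (!profiles[i].skipped)
			enabled = profiles[i].sets_enabled;
	}
	return (enabled);
}
```

In this model, an early pass of generic.xml (enable) followed by site.xml
(disable) leaves the service disabled. A late pass that re-applies
generic.xml with skipped = 0 enables it again, whereas a late pass with
skipped = 1, which is what I would expect from SVCCFG_CHECKHASH=1, leaves
it disabled.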
There are a couple of problems here:
I. Why does manifest-import end up really applying generic.xml
despite SVCCFG_CHECKHASH=1? Looks like a bug...
If I manually disable some services now and reboot, both
early-manifest-import and manifest-import won't really apply generic.xml -
good.
It only happens during the first boot.
II. I'm not entirely sure that system/install/config should
just blindly remove site.xml. It does check whether it is a symbolic link,
but it doesn't even check where it points to.
Then shouldn't SVCCFG_CHECKHASH=1 protect here anyway?
(I haven't looked at the code yet...)
I guess the problem might not be with disabled/enabled
services but with extra properties for install/config (create a user, etc.).
Perhaps AI customizations should really go to
/etc/svc/profile/site/AI_site.xml? I'm worried about what happens if a
sysadmin puts their own site.xml (as a symlink) in place as part of a
"finish script".
I commented out the removal of site.xml in system/install/config and it
doesn't really solve the problem. I end up with all the customizations,
which is fine, but sometimes some services which should be disabled
according to site.xml manage to run their start methods between the time
generic.xml is applied and the time site.xml is applied, which causes some
transient errors. They shouldn't even have tried to start in the first
place. It's all down to timing; nevertheless, it happened a couple of times.
I think that the main problem is I. Problem II. is more of a concern than a
problem, and it makes things more complex than needed.
--
Robert Milkowski
http://milek.blogspot.com