Jan,

It seems the problem is not with ZFS but with the device driver. If the driver fails to provide the devid, then ZFS is just a victim. I recommend that we change the synopsis to 'devid_get() fails with "Invalid argument"' and pass this to the driver folks. Do you know whether it's always the same driver?

Thanks,
George

jan damborsky wrote:
> Hi George,
>
> George Wilson wrote:
>> Jan,
>>
>> So who is working the UFS issue and how is that being tracked.
>
> In general, bugs in the OpenSolaris Caiman installer are tracked in
> Bugzilla at defect.opensolaris.org. This is preferred over filing
> bugs in Bugster.
> Speaking about this particular problem, it is tracked by the
> following bug:
>
> 4675 Fix for bug 30 causes ZFS label to be mangled - ending up in GRUB
> prompt after installing OpenSolaris
> http://defect.opensolaris.org/bz/show_bug.cgi?id=4675
>
> Sanjay Nadkarni is assigned to this bug (CCing him).
>
>> I would recommend that we keep this bug as the UFS/install issue and
>> create a new bug and send that to me.
>
> As pointed out above, Bugzilla is the preferred database for tracking
> issues in the Caiman installer.
>
> Please note that 6769487 was originally filed to track the problem
> where GRUB can't access the ZFS filesystem because 'devid' is not
> present in the ZFS label.
>
> It was later overloaded with the 'UFS' problem.
>
>> Can you move the descriptions below from this bug and add them to the
>> new one?
>
> To be honest, since the installer part of the problem related to UFS
> is tracked by 4675, I don't see why we shouldn't continue to use
> 6769487 to track the issue this bug was initially filed for, and I
> think that we might lose some context when ZFS-related information is
> moved from 6769487 to the new bug.
> That said, if you think it might be helpful, please let me know and
> I will try to capture all information from 6769487 that I think is
> relevant to the ZFS part in the new bug.
>
>> Also since you can reproduce this can you tell me exactly how or
>> point me at a system which I can login into to debug?
>
> Sure, the machine can be accessed via 'ssh', but since it is not
> directly accessible from SWAN (it is behind NAT),
> I will provide you with instructions on how to access it.
> Unfortunately it doesn't have console access.
>
> Please let me know in which state you need the machine: right after
> the installation has finished, but before reboot?
>
> Unfortunately, following the procedure itself doesn't seem to be
> sufficient for reproducing the problem :-( I tried exactly the
> same steps on other bare metal as well as in a virtual environment,
> but without success.
>
>> I want to make sure we don't lose sight of the UFS issue and this bug
>> has already gone down to root cause so let's not overload this bug
>> any further.
>
> The UFS part of the problem is being solved right now (please feel
> free to monitor bug 4675 for progress and add anything you might
> consider relevant to that issue).
>
> Thank you,
> Jan
>
>> Thanks,
>> George
>>
>> jan damborsky wrote:
>>> Hi George,
>>>
>>> there are at least two parts of this problem:
>>>
>>> [1] UFS one
>>> This is what you are referring to and it is being tracked by
>>> Bugzilla bug 4675.
>>> In that case workaround #2 helps to "solve" the problem.
>>>
>>> [2] ZFS one
>>> Please see original description #1. I am able to reproduce that at
>>> will on a system which didn't contain any UFS filesystem and thus
>>> [1] is not applicable here. 'zpool import' helps in this case.
>>>
>>> Also please see:
>>> * description #4
>>> * description #5
>>> * public comments #8
>>> * comments #6
>>>
>>> People are apparently encountering this problem in
>>> other configurations (e.g. when using a virgin disk
>>> or installing on a system containing only Windows).
>>>
>>> I am not stating that this is in fact a problem in ZFS, as it might
>>> be related for example to device driver code, but at this point it
>>> seems to me that the ZFS team is the most eligible one to move
>>> things forward, as GRUB can't read menu.lst from the ZFS
>>> filesystem.
>>>
>>> Please let me know if you have any questions or need more
>>> information.
>>>
>>> Thank you,
>>> Jan
>>>
>>> George Wilson wrote:
>>>> Jan,
>>>>
>>>> I don't understand how this is a ZFS problem. I thought from the
>>>> evaluation that the issue is that UFS and ZFS are sharing the same
>>>> block and this was being caused by the fact that the livecd had
>>>> mounted a UFS filesystem as part of the installation. Could you
>>>> clarify?
>>>>
>>>> Thanks,
>>>> George
>>>>
>>>> Jan.Damborsky at Sun.COM wrote:
>>>>> Sun Confidential: Internal only
>>>>>
>>>>> *Synopsis*: Ended up in 'grub>' prompt after installation of
>>>>> OpenSolaris 2008.11 (build 101a)
>>>>>
>>>>> CrPrint: http://bt2ws.central.sun.com/CrPrint?id=6769487
>>>>> Monaco: http://monaco.sfbay.sun.com/detail.jsf?cr=6769487
>>>>>
>>>>> Due to a change of Responsible manager requested by
>>>>> jan.damborsky at sun.com,
>>>>> david.brittle at sun.com is now the responsible manager for:
>>>>>
>>>>> Due to a change requested by jan.damborsky at sun.com,
>>>>> this CR is being redispatched:
>>>>>
>>>>> This is a high priority CR and requires your immediate attention.
>>>>> Please evaluate it as soon as possible. Thank you.
>>>>>
>>>>> CR 6769487 changed on Nov 12 2008 by jan.damborsky at sun.com
>>>>>
>>>>> === Field ============ === New Value ============= === Old Value =============
>>>>>
>>>>> Category               kernel                      opensolaris
>>>>> Comments               New Note
>>>>> Comments               New Note                    Old Note
>>>>> Comments               New Note                    Old Note
>>>>> Public Comments        New Note
>>>>> Responsible Manager    david.brittle at sun.com    eric.ray at sun.com
>>>>> Status                 1-Dispatched                5-Cause Known
>>>>> SubCategory            zfs                         livecd
>>>>> ====================== =========================== ===========================
>>>>>
>>>>> *Change Request ID*: 6769487
>>>>>
>>>>> *Synopsis*: Ended up in 'grub>' prompt after installation of
>>>>> OpenSolaris 2008.11 (build 101a)
>>>>>
>>>>> Product: solaris
>>>>> Category: kernel
>>>>> Subcategory: zfs
>>>>> Type: Defect
>>>>> Subtype: Functionality
>>>>> Status: 1-Dispatched
>>>>> Substatus:
>>>>> Priority: 1-Very High
>>>>> Introduced In Release:
>>>>> Introduced In Build:
>>>>> Responsible Manager: david.brittle at sun.com
>>>>> Responsible Engineer:
>>>>> Initial Evaluator: zfs-team at sun.com
>>>>> Keywords:
>>>>>
>>>>> === *Description*
>>>>> ============================================================
>>>>> When testing installation with recent OpenSolaris builds, we have
>>>>> found that in some cases people end up at the GRUB prompt after
>>>>> the installation; it seems that menu.lst can't be accessed for
>>>>> some reason. For now, a bunch of Bugzilla bugs seem to describe
>>>>> the same manifestation of the problem, whose root cause has not
>>>>> been identified yet:
>>>>>
>>>>> 4051 opensolaris b99b/b100a does not install on 1.5 TB disk or
>>>>>      boot fails after install
>>>>> 4591 Install failure on a Sun Fire X4240 with Opensolaris 200811
>>>>> 4161 no grub in 2008.11 Development Builds (comment #20, comment #31)
>>>>> 4760 Enter grub after installing 2008.11 RC 1
>>>>> ...
>>>>>
>>>>> I also hit that problem when testing Automated Installer (it is a
>>>>> part of the Caiman project and will replace the current jumpstart
>>>>> install technology). I was able to make GRUB find 'menu.lst' just
>>>>> by using the 'zpool import' command; please see below for the
>>>>> detailed procedure.
>>>>>
>>>>> configuration:
>>>>> --------------
>>>>> HW: Ultra 20, 1GB RAM, 1 250GB SATA drive
>>>>> SW: Opensolaris build 100, 64bit mode
>>>>>
>>>>> steps used:
>>>>> -----------
>>>>> [1] OpenSolaris 100 installed using Automated Installer
>>>>> - Solaris 2 partition created during installation
>>>>>
>>>>> * partition configuration before installation:
>>>>>
>>>>> # fdisk -W - c2t0d0p0
>>>>> ...
>>>>> * Id   Act  Bhead  Bsect  Bcyl  Ehead  Esect  Ecyl  Rsect     Numsect
>>>>>   192  0    0      1      1     254    63     1023  16065     22491000
>>>>>
>>>>> * partition configuration after installation:
>>>>>
>>>>> # fdisk -W - c2t0d0p0
>>>>> ...
>>>>> * Id   Act  Bhead  Bsect  Bcyl  Ehead  Esect  Ecyl  Rsect     Numsect
>>>>>   192  0    0      1      1     254    63     1023  16065     22491000
>>>>>   191  128  254    63     1023  254    63     1023  22507065  30000000
>>>>>
>>>>> [2] When I rebooted the system after the installation, I ended up
>>>>> at the GRUB prompt:
>>>>> grub> root
>>>>> (hd0,1,a): Filesystem type unknown, partition type 0xbf
>>>>>
>>>>> grub> cat /rpool/boot/grub/menu.lst
>>>>>
>>>>> Error 17: Cannot mount selected partition
>>>>>
>>>>> grub>
>>>>>
>>>>> [3] I rebooted into AI and did 'zpool import'
>>>>> # zdb -l /dev/rdsk/c2t0d0s0 > /tmp/zdb_before_import.txt (attached)
>>>>> # zpool import -f rpool
>>>>> # zdb -l /dev/rdsk/c2t0d0s0 > /tmp/zdb_after_import.txt (attached)
>>>>> # diff /tmp/zdb_before_import.txt /tmp/zdb_after_import.txt
>>>>> 7c7
>>>>> < txg=21
>>>>> ---
>>>>> > txg=2675
>>>>> 9c9
>>>>> < hostid=4741222
>>>>> ---
>>>>> > hostid=4247690
>>>>> 17a18
>>>>> > devid='id1,sd@f00c778e247ac7bd0000238460000/a'
>>>>> 31c32
>>>>> ...
>>>>> # reboot
>>>>>
>>>>> [4] Now GRUB can access menu.lst and Solaris is booted
>>>>>
>>>>> hypothesis
>>>>> ----------
>>>>> It seems that for some reason, when the ZFS pool was created, the
>>>>> 'devid' information was not added to the ZFS label.
>>>>>
>>>>> When 'zpool import' was called, 'devid' got populated.
>>>>>
>>>>> Looking at the GRUB ZFS plug-in, it seems that 'devid'
>>>>> (ZPOOL_CONFIG_DEVID attribute) is
>>>>> required in order to be able to access the ZFS filesystem:
>>>>>
>>>>> In grub/grub-0.95/stage2/fsys_zfs.c:
>>>>>
>>>>> vdev_get_bootpath()
>>>>> {
>>>>> ...
>>>>>     if (strcmp(type, VDEV_TYPE_DISK) == 0) {
>>>>>         if (vdev_validate(nv) != 0 ||
>>>>>             (nvlist_lookup_value(nv, ZPOOL_CONFIG_PHYS_PATH,
>>>>>             bootpath, DATA_TYPE_STRING, NULL) != 0) ||
>>>>>             (nvlist_lookup_value(nv, ZPOOL_CONFIG_DEVID,
>>>>>             devid, DATA_TYPE_STRING, NULL) != 0))
>>>>>             return (ERR_NO_BOOTPATH);
>>>>> ...
>>>>> }
>>>>>
>>>>> additional observations:
>>>>> ------------------------
>>>>> [1] If 'devid' is populated during installation after the
>>>>> 'zpool create' operation, the problem doesn't occur.
>>>>>
>>>>> [2] When following the described procedure, the problem is
>>>>> reproducible at will on the system where it was initially
>>>>> reproduced (please see above for the configuration)
>>>>>
>>>>> [3] Other people reported this problem also for the following
>>>>> configurations:
>>>>> * vmware
>>>>> * Sun Java Workstation W2100z with 2xOpteron2.4G 3G Mem
>>>>>
>>>>> [4] When installation is done into an existing Solaris2 partition
>>>>> containing a Solaris instance, 'devid' is always populated and the
>>>>> problem doesn't occur (it doesn't matter if the partition
>>>>> is marked 'active' or not).
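
A minimal sketch of the check implied by the quoted vdev_get_bootpath() excerpt: GRUB returns ERR_NO_BOOTPATH unless both the physical path and the devid can be looked up in the label. The helper below scans text shaped like 'zdb -l' output for those two keys, which may be handy for comparing labels before and after 'zpool import'. The function names label_has_key() and label_looks_bootable() are made up for illustration, and the key spellings "phys_path" and "devid" are assumed to match what zdb prints. This is not GRUB or ZFS code.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Return true if `label` (text shaped like `zdb -l` output) has a line
 * whose first non-blank token is `key=`. Hypothetical helper.
 */
bool
label_has_key(const char *label, const char *key)
{
    assert(label != NULL && key != NULL);
    size_t klen = strlen(key);

    for (const char *line = label; line != NULL && *line != '\0'; ) {
        const char *p = line;
        while (*p == ' ' || *p == '\t')     /* skip indentation */
            p++;
        if (strncmp(p, key, klen) == 0 && p[klen] == '=')
            return true;
        line = strchr(line, '\n');          /* advance to next line, if any */
        if (line != NULL)
            line++;
    }
    return false;
}

/*
 * Mirrors the two lookups in the quoted excerpt: both the physical path
 * and the devid must be present. Key names are assumptions.
 */
bool
label_looks_bootable(const char *label)
{
    return label_has_key(label, "phys_path") &&
        label_has_key(label, "devid");
}
```

Feeding it the contents of zdb_before_import.txt and zdb_after_import.txt from step [3] should, if the hypothesis holds, report false for the first and true for the second.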
>>>>>
>>>>> *** (#1 of 5): 2008-11-10 10:27:21 GMT+00:00 jan.damborsky at sun.com
>>>>>
>>>>> If the system previously ran Nevada (101a in my case), installing
>>>>> OpenSolaris hits this issue when the existing partition is kept
>>>>> rather than choosing the entire disk (I suspect this caused the
>>>>> issue). There is a diagnostic partition on the disk if Nevada was
>>>>> installed, and OpenSolaris 2008.11 simply enters grub> as this CR
>>>>> mentions. When I then used the entire disk, the system booted
>>>>> fine. But when I reinstalled again with a size smaller than the
>>>>> entire disk, GRUB had no problem, but GNOME could not start (it
>>>>> hung endlessly).
>>>>>
>>>>> *** (#2 of 5): 2008-11-10 10:45:29 GMT+00:00 robin.guo at sun.com
>>>>>
>>>>> The root cause of this problem is the continued existence of UFS
>>>>> filesystem structures on disk, even after the zfs filesystem is
>>>>> created and is live. Because ZFS did not destroy the UFS magic,
>>>>> both GRUB and Solaris think there's a (horribly damaged) UFS
>>>>> filesystem present on that slice (a WARNING is displayed at boot
>>>>> time during OpenSolaris boot informing the user that
>>>>> /mnt/solaris<N> (where <N> is a number) could not be mounted
>>>>> because of filesystem problems; in reality, that slice is where
>>>>> the zfs root is located).
>>>>>
>>>>> In GRUB, since code that attempts to mount root does so by trying
>>>>> each filesystem module in the order in which they are listed in
>>>>> the fsys_table[] array, and since UFS is listed before ZFS, GRUB
>>>>> thinks that a UFS filesystem exists in the slice actually
>>>>> containing the ZFS root filesystem (and fails trying to mount it,
>>>>> leaving it unable to locate the real root filesystem). A modified
>>>>> version of GRUB that modifies fsys_table by declaring the ZFS
>>>>> operations before the UFS operations confirms this hypothesis.
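
The probe-order argument quoted above can be illustrated with a toy model in which the first module that claims the slice wins. This is not GRUB's fsys_table[] or its mount code; every name below (toy_slice, toy_fsys, first_claimant and the two tables) is hypothetical, and the "claims" checks are reduced to booleans.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a slice: records only which signatures are present. */
struct toy_slice {
    bool ufs_magic;   /* UFS superblock magic found (possibly stale) */
    bool zfs_label;   /* valid ZFS label found */
};

struct toy_fsys {
    const char *name;
    bool (*claims)(const struct toy_slice *);
};

bool ufs_claims(const struct toy_slice *s) { return s->ufs_magic; }
bool zfs_claims(const struct toy_slice *s) { return s->zfs_label; }

/* Order as described in the report: UFS is tried before ZFS. */
const struct toy_fsys ufs_before_zfs[] = {
    { "ufs", ufs_claims }, { "zfs", zfs_claims },
};

/* Order used by the modified GRUB mentioned in the report. */
const struct toy_fsys zfs_before_ufs[] = {
    { "zfs", zfs_claims }, { "ufs", ufs_claims },
};

/* Return the first table entry that claims the slice, or NULL if none. */
const struct toy_fsys *
first_claimant(const struct toy_fsys *table, size_t n,
    const struct toy_slice *s)
{
    assert(table != NULL && s != NULL);
    for (size_t i = 0; i < n; i++) {
        if (table[i].claims(s))
            return &table[i];
    }
    return NULL;
}
```

With a slice that carries both a stale UFS magic and a valid ZFS label, the UFS-first table selects "ufs" (which then fails to mount), while either reordering the table or clearing the magic lets "zfs" win.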
>>>>>
>>>>> Therefore, a valid workaround destroys the UFS magic, preventing
>>>>> both GRUB's and Solaris's UFS modules from recognizing the slice
>>>>> as a UFS filesystem. When GRUB's UFS code fails to find a valid
>>>>> UFS filesystem, the ZFS module is subsequently tried and is able
>>>>> to successfully mount the filesystem.
>>>>>
>>>>> *** (#3 of 5): 2008-11-11 03:23:04 GMT+00:00 seth.goldberg at sun.com
>>>>> *** Last Edit: 2008-11-11 03:45:05 GMT+00:00 seth.goldberg at sun.com
>>>>>
>>>>> I think there are two separate issues here. The UFS label appears
>>>>> to be one. The signature for this bug is that, at the grub prompt,
>>>>> typing root generates the UFS filesystem info.
>>>>> However, there is a secondary bug where, after installation, one
>>>>> gets a grub prompt. Typing the root command at the grub prompt
>>>>> generates "unknown file system". In this case no UFS filesystems
>>>>> were detected or mounted. The workaround for this has been to run
>>>>> zpool import. This still needs to be investigated.
>>>>>
>>>>> *** (#4 of 5): 2008-11-12 00:04:16 GMT+00:00 sanjay.nadkarni at sun.com
>>>>>
>>>>> We were able to recreate the grub failure where typing root at the
>>>>> prompt returns unknown file system. This was on a Fujitsu LifeBook
>>>>> S7211. It had Vista installed. We then
>>>>> booted OpenSolaris and started the install. At the end of the
>>>>> installation we noted that the zfs label did not have devid
>>>>> information.
>>>>>
>>>>> We then loaded a simple program that would get the devid
>>>>> (devid_get). This failed with "Invalid argument". We then
>>>>> rebooted the liveCD again and reran this program and this time it
>>>>> printed out the device id. The disk is off a SATA controller.
>>>>> The driver that attached to this is ahci. The device is:
>>>>> 82801HBM/HEM.
>>>>> The disk is Fujitsu MHY2120BH
>>>>>
>>>>> *** (#5 of 5): 2008-11-12 02:43:18 GMT+00:00 sanjay.nadkarni at sun.com
>>>>>
>>>>>
>>>>> === *Public Comments*
>>>>> ========================================================
>>>>> Following Bugzilla bugs were closed as duplicate of this issue:
>>>>>
>>>>> 4772 Cannot install OpenSolaris 2008.11 on VMware Server 2.0
>>>>> http://defect.opensolaris.org/bz/show_bug.cgi?id=4772
>>>>>
>>>>> 4756 after reboot when finishing the installation, system can not
>>>>> boot
>>>>> http://defect.opensolaris.org/bz/show_bug.cgi?id=4756
>>>>>
>>>>> 4749 After installed opensolaris0811RC1 on Dell PowerEdge, can't
>>>>> boot from disk.
>>>>> http://defect.opensolaris.org/bz/show_bug.cgi?id=4749
>>>>>
>>>>> *** (#1 of 9): 2008-11-10 17:20:54 GMT+00:00 dave.miner at sun.com
>>>>> *** Last Edit: 2008-11-11 11:45:41 GMT+00:00 jan.damborsky at sun.com
>>>>>
>>>>> zpool import doesn't help for me, nor would I expect it to (it's a
>>>>> mystery why it seems to). Clearing the UFS magic helps.
>>>>>
>>>>> Looking further, I find that the data on disk at 8k seems to still
>>>>> be a UFS superblock, not a zfs vdev_boot_header_t, which doesn't
>>>>> make sense to me; in any ZFS initialization scheme, one would
>>>>> expect all parts of the label to be completely written.
>>>>>
>>>>> The expected vdev_boot_header_t appears at the label copy at
>>>>> 256K+8K, as expected.
>>>>>
>>>>> *** (#2 of 9): 2008-11-11 04:39:09 GMT+00:00 dan.mick at sun.com
>>>>>
>>>>> It appears that ZFS doesn't validate that first 8k (the
>>>>> vdev_boot_header), so that explains why the kernel was happy even
>>>>> with a UFS superblock where the vdev_boot_header was supposed to
>>>>> be.
>>>>>
>>>>> Also, the last few bits of the 8k block in question seem to
>>>>> contain a zio_block_tail_t (i.e. a zbt_magic and a zbt_cksum), so
>>>>> it seems this block was written by ZFS sometime in the past.
>>>>> Possible theories: 1) the ZFS initialization somehow skipped this
>>>>> 8k header, or 2) somehow the 8k superblock was rewritten over the
>>>>> block after ZFS initialized it.
>>>>>
>>>>> *** (#3 of 9): 2008-11-11 04:49:57 GMT+00:00 dan.mick at sun.com
>>>>>
>>>>> Another possible theory: could this be the superblock flush from a
>>>>> still-mounted UFS being shut down?
>>>>>
>>>>> (The block was correct until after the OpenSolaris installer said
>>>>> it was done, and waited for me to press a button to reboot. I
>>>>> suspect the original UFS was mounted and not unmounted before the
>>>>> ZFS creation, so they both think they own the device.)
>>>>>
>>>>> Supporting evidence: the "last mounted" path in the superblock is
>>>>> "/mnt/solaris0".
>>>>>
>>>>> I suspect the cause of this bug is a UFS that's mounted and should
>>>>> be unmounted by the installer before ZFS creation.
>>>>>
>>>>> What's the right category/subcategory for Caiman?
>>>>>
>>>>> *** (#4 of 9): 2008-11-11 07:34:58 GMT+00:00 dan.mick at sun.com
>>>>>
>>>>> The live CD has historically automatically mounted up any UFS file
>>>>> systems that it found, going back to Belenix. Interesting that
>>>>> this is just now a problem, but it probably is a result of
>>>>> switching to ZFS for swap, as up until build 96 we always created
>>>>> a swap slice at the start of the disk, which it appears would have
>>>>> masked this problem.
>>>>>
>>>>> *** (#5 of 9): 2008-11-11 15:02:03 GMT+00:00 dave.miner at sun.com
>>>>>
>>>>> Installer takes care of releasing the target device before Target
>>>>> Instantiation phase is launched. Among other things, it
>>>>>
>>>>> * releases all swap devices created on target disk
>>>>> * unmounts whatever is mounted on target disk
>>>>>
>>>>> For the latter, /etc/mnttab is read and if there is mounted device
>>>>> which is part of the target disk, installer tries to unmount it.
>>>>>
>>>>> The problem is that, after the fix for Bugzilla bug 30 was
>>>>> integrated, UFS filesystems are mounted with the '-o m' option,
>>>>> which causes the filesystem to be mounted without making an
>>>>> entry in /etc/mnttab. The mountpoints are then hidden, so the
>>>>> installer can't see them and doesn't unmount them.
>>>>>
>>>>> That said, this explains the UFS part of the problem (when the
>>>>> 'dd' workaround works), but doesn't seem to be related to the ZFS
>>>>> part of the issue, when the 'zpool import' workaround helped.
>>>>>
>>>>> *** (#6 of 9): 2008-11-11 16:25:09 GMT+00:00 jan.damborsky at sun.com
>>>>> *** Last Edit: 2008-11-11 16:34:29 GMT+00:00 jan.damborsky at sun.com
>>>>>
>>>>> We should probably leave this bug to resolve zpool create not
>>>>> removing evidence of the previous ufs fs, and file another one to
>>>>> chase down the other issue(s?).
>>>>>
>>>>> Chris, if you run zdb -l on your virgin device, are you missing
>>>>> any zfs properties? The reader in GRUB pretty much gives up if
>>>>> things like the devid aren't set.
>>>>>
>>>>> *** (#7 of 9): 2008-11-11 19:30:56 GMT+00:00 jan.setje-eilers at sun.com
>>>>>
>>>>> Concur that Chris' problem is different; the UFS superblock does
>>>>> not exist in the first 256kb attached to the bug. It appears as
>>>>> though phys_path and devid are present, although it's difficult to
>>>>> be sure. We should probably see if we can send a debug version of
>>>>> Grub to Chris, with installation instructions, to see why it seems
>>>>> unable to find the zfs.
>>>>>
>>>>> *** (#8 of 9): 2008-11-11 22:16:50 GMT+00:00 dan.mick at sun.com
>>>>>
>>>>> The root cause of the 'UFS part' of this problem is in 'livecd
>>>>> code' and is tracked by the following Bugzilla bug:
>>>>>
>>>>> 4675 Fix for bug 30 causes ZFS label to be mangled - ending up in
>>>>> GRUB prompt after installing OpenSolaris
>>>>>
>>>>> Please feel free to use this bug (6769487) for tracking other
>>>>> part(s) of the problem.
>>>>> Resetting category to solaris/kernel/zfs and Status to 'Dispatched'.
>>>>>
>>>>> *** (#9 of 9): 2008-11-12 12:46:21 GMT+00:00 jan.damborsky at sun.com
>>>>>
>>>>>
>>>>> === *Comments*
>>>>> ===============================================================
>>>>> Moved to public comments.
>>>>>
>>>>> *** (#1 of 6): 2008-11-10 17:04:10 GMT+00:00 jan.damborsky at sun.com
>>>>> *** Last Edit: 2008-11-10 17:20:54 GMT+00:00 dave.miner at sun.com
>>>>>
>>>>> Same situation (without zfs) on:
>>>>> White Box based on Intel DG33TL motherboard with ICH9R chipset,
>>>>> 2Gb memory, 3 SATA drives, 1 SATA CD/DVD, Intel graphics.
>>>>>
>>>>> *** (#2 of 6): 2008-11-10 22:52:23 GMT+00:00 pawel.wojcik at sun.com
>>>>>
>>>>> Workaround #1 does not cause the system to boot properly on the
>>>>> system I tried installing (that seems to be consistent with what
>>>>> others are reporting in the opensolaris defect report), but
>>>>> workaround #2 DOES.
>>>>>
>>>>> *** (#3 of 6): 2008-11-11 01:56:43 GMT+00:00 seth.goldberg at sun.com
>>>>> *** Last Edit: 2008-11-11 03:41:48 GMT+00:00 seth.goldberg at sun.com
>>>>>
>>>>> I've reproduced this on a "virgin" disk, see the SR record against
>>>>> this bug (I had to purchase a new spindle as the previous disk
>>>>> failed; the new disk was removed from the supplier packaging,
>>>>> inserted into the laptop, and then the 2008.11 CD was booted).
>>>>>
>>>>> After an email discussion with Dan Mick, the data Dan requested
>>>>> was the output of the root command at the grub prompt:
>>>>>
>>>>> (hd0,0,a): Filesystem type is zfs, partition type 0xbf
>>>>>
>>>>> Also, can you boot from the CD and collect the first 256kb of the
>>>>> disk, with
>>>>>
>>>>> dd if=<your s0 slice here> of=first.256kb bs=256k count=1
>>>>>
>>>>> This is attached.
>>>>>
>>>>> *** (#4 of 6): 2008-11-11 10:46:29 GMT+00:00
>>>>> christopher.armes at sun.com
>>>>>
>>>>> Saw this bug on several machines today which I was helping to
>>>>> install.
>>>>> One person did a reinstall and it worked fine the second
>>>>> time, as some reported.
>>>>>
>>>>> 2 other machines could use the workaround which Lin Ling pointed
>>>>> us to with this bug. That did save a couple folks from having to
>>>>> reinstall, so was very helpful. Thanks Lin! Of the installs of
>>>>> people that installed to a hard drive (i.e., not within
>>>>> VirtualBox), about 12 systems, we saw this on 3 machines, so about
>>>>> 25% of the systems in this small sampling.
>>>>>
>>>>> *** (#5 of 6): 2008-11-12 09:58:01 GMT+00:00 alan.duboff at sun.com
>>>>>
>>>>> Moved to public comments.
>>>>>
>>>>> *** (#6 of 6): 2008-11-12 12:43:18 GMT+00:00 jan.damborsky at sun.com
>>>>> *** Last Edit: 2008-11-12 12:46:43 GMT+00:00 jan.damborsky at sun.com
>>>>>
>>>>>
>>>>> === *Evaluation*
>>>>> =============================================================
>>>>> See Description.
>>>>>
>>>>> *** (#1 of 4): 2008-11-11 03:23:04 GMT+00:00 seth.goldberg at sun.com
>>>>>
>>>>> Removed misleading evaluation.
>>>>>
>>>>> *** (#2 of 4): 2008-11-11 21:45:12 GMT+00:00 lin.ling at sun.com
>>>>> *** Last Edit: 2008-11-11 23:16:07 GMT+00:00 lin.ling at sun.com
>>>>>
>>>>> What? No, read the public comments. The problem is that the UFS
>>>>> filesystem is still mounted as the installer lays down the ZFS.
>>>>> Then, on reboot, the UFS, as it's syncing, writes its superblock
>>>>> back to the filesystem it thinks it owns, over the top of the
>>>>> now-ZFS-owned space.
>>>>>
>>>>> The installer must ensure that other filesystems are not mounted
>>>>> on the slice where it's creating the ZFS rpool.
>>>>>
>>>>> *** (#3 of 4): 2008-11-11 22:11:35 GMT+00:00 dan.mick at sun.com
>>>>>
>>>>> You are right. I misunderstood.
>>>>> George Wilson just corrected me that 'zpool create' indeed clears
>>>>> the space correctly:
>>>>>
>>>>> vdev_label_init() {
>>>>>     :
>>>>>     vp = zio_buf_alloc(sizeof (vdev_phys_t));
>>>>>     bzero(vp, sizeof (vdev_phys_t));
>>>>>     :
>>>>>     bzero(vb, sizeof (vdev_boot_header_t));
>>>>>     :
>>>>> }
>>>>>
>>>>> Thanks for the clarification.
>>>>>
>>>>> *** (#4 of 4): 2008-11-11 22:49:04 GMT+00:00 lin.ling at sun.com
>>>>>
>>>>>
>>>>> === *Suggested Fix*
>>>>> ==========================================================
>>>>>
>>>>> === *Workaround*
>>>>> =============================================================
>>>>> [1] Boot LiveCD
>>>>> $ pfexec su -
>>>>> # zpool import -f rpool
>>>>>
>>>>> *** (#1 of 3): 2008-11-10 10:27:21 GMT+00:00 jan.damborsky at sun.com
>>>>>
>>>>> ZERO OUT the leftover UFS magic:
>>>>>
>>>>> For GNU dd:
>>>>> dd if=/dev/zero bs=1 count=4 seek=9564 of=/dev/dsk/<SLICE>
>>>>>
>>>>> (e.g.:
>>>>> dd if=/dev/zero bs=1 count=4 seek=9564 of=/dev/dsk/c4t0d0s0
>>>>> )
>>>>>
>>>>> *** (#2 of 3): 2008-11-11 03:36:55 GMT+00:00 seth.goldberg at sun.com
>>>>>
>>>>> I did the following in dd to work around the issue:
>>>>>
>>>>> root@opensolaris:~# dd if=/dev/zero of=/dev/dsk/c1t0d0s0 bs=1 count=4 seek=9564
>>>>> 4+0 records in
>>>>> 4+0 records out
>>>>> 4 bytes (4 B) copied, 0.0394095 s, 0.1 kB/s
>>>>> root@opensolaris:~#
>>>>>
>>>>> *** (#3 of 3): 2008-11-11 19:07:04 GMT+00:00 mary.ding at sun.com
>>>>>
>>>>>
>>>>> === *Justification*
>>>>> ==========================================================
>>>>> Priority changed from [] to [1-Very High]
>>>>> Installed OpenSolaris 2008.11 doesn't boot
>>>>> jan.damborsky at sun.com 2008-11-10 10:27:21 GMT
>>>>>
>>>>> *** (#1 of 1): 2008-11-10 10:27:21 GMT+00:00 jan.damborsky at sun.com
>>>>>
>>>>>
>>>>> === *Additional Details*
>>>>> =====================================================
>>>>> Targeted Release:
>>>>> Commit To Fix In Build:
>>>>> Fixed In Build:
>>>>> Integrated In Build:
>>>>> Verified In Build:
>>>>> See Also: 6769534
>>>>> Duplicate of:
>>>>> Hooks:
>>>>> Hook1: Hook2: Hook3:
>>>>> Hook4: Hook5: Hook6:
>>>>> Interest List:
>>>>> dan.mick at sun.com, dave.miner at sun.com, david.comay at sun.com,
>>>>> frank.batschulat at sun.com, kerberos-iteam at Sun.COM,
>>>>> lin.ling at sun.com, nick.todd at sun.com, peter.dennis at sun.com,
>>>>> plus1tb at sun.com, sdg at sun.com, si-bugs at sun.com,
>>>>> sst-prg at sun.com, tomas.hurka at sun.com
>>>>> Program Management: New Defect
>>>>> Root Cause:
>>>>> Is a Security Vulnerability?: No
>>>>> Fix Affects Documentation: No
>>>>> Fix Affects Localization: No
>>>>> Reported by:
>>>>> === *History*
>>>>> ================================================================
>>>>> Date Submitted: 2008-11-10 10:27:21 GMT+00:00
>>>>> Submitted By: jan.damborsky at sun.com
>>>>>
>>>>> Status Changed    Date Updated                   Updated By
>>>>> 3-Accepted        2008-11-10 23:59:05 GMT+00:00  lin.ling at sun.com
>>>>> 5-Cause Known     2008-11-11 03:23:04 GMT+00:00  seth.goldberg at sun.com
>>>>> 1-Dispatched      2008-11-12 12:43:18 GMT+00:00  jan.damborsky at sun.com
>>>>>
>>>>>
>>>>> === *Solution*
>>>>> ===============================================================
>>>>>
>>>>>
>>>>> === *Service Request*
>>>>> ========================================================
>>>>> ID: 1-493023606
>>>>> Customer:
>>>>> Account Name: Sun Microsystems
>>>>> Customer Contact:
>>>>> Customer Contact Role: D-Development
>>>>> Customer Contact Type: I-Internal (SMI)
>>>>> Customer Impact: Critical
>>>>> Functionality: Primary
>>>>> Severity: 1
>>>>> Synopsis:
>>>>> Product Name: solaris
>>>>> Product Release: osol_2008.11
>>>>> Product Build:
>>>>> Operating System: osol_2008.11
>>>>> Hardware: generic
>>>>> Reference Number:
>>>>> Sun Contact: jan.damborsky at sun.com
>>>>> Status: Open
>>>>> Source: BugTraq2
>>>>> Reproducible:
>>>>> Submitted By: jan.damborsky at sun.com
>>>>> Submitted Date: 2008-11-10 10:27:21 GMT+00:00
>>>>> Description:
>>>>>
>>>>> === *Service Request*
>>>>> ========================================================
>>>>> ID: 1-493053806
>>>>> Customer:
>>>>> Account Name: SUN MicroSystems
>>>>> Customer Contact:
>>>>> Customer Contact Role: D-Development
>>>>> Customer Contact Type: I-Internal (SMI)
>>>>> Customer Impact: Critical
>>>>> Functionality: Primary
>>>>> Severity: 1
>>>>> Synopsis: After installing 2008.11RC1b boot from hard disk fails
>>>>> Product Name: solaris
>>>>> Product Release: osol_2008.11
>>>>> Product Build:
>>>>> Operating System: osol_2008.11
>>>>> Hardware: x86
>>>>> Reference Number:
>>>>> Sun Contact: christopher.armes at sun.com
>>>>> Status: Open
>>>>> Source: BugTraq2
>>>>> Reproducible: Always
>>>>> Submitted By: christopher.armes at sun.com
>>>>> Submitted Date: 2008-11-10 12:54:24 GMT+00:00
>>>>> Description: Booting from the livecd and then selecting
>>>>> install works fine upon reboot with either cd in and selecting
>>>>> boot from hard disk or without cd allowing grub menu to boot,
>>>>> causes boot to fail drops system to "grub>" prompt
>>>>>
>>>>>
>>>>> === *Service Request*
>>>>> ========================================================
>>>>> ID: 1-493177108
>>>>> Customer:
>>>>> Account Name: SUN
>>>>> Customer Contact:
>>>>> Customer Contact Role: D-Development
>>>>> Customer Contact Type: I-Internal (SMI)
>>>>> Customer Impact: Critical
>>>>> Functionality: Primary
>>>>> Severity: 1
>>>>> Synopsis:
>>>>> Product Name: solaris
>>>>> Product Release: osol_2008.11
>>>>> Product Build: osol_2008.11
>>>>> Operating System: osol_2008.11
>>>>> Hardware: amd
>>>>> Reference Number:
>>>>> Sun Contact: garrett.damore at sun.com
>>>>> Status:
>>>>> Source: BugTraq2
>>>>> Reproducible:
>>>>> Submitted By: garrett.damore at sun.com
>>>>> Submitted Date: 2008-11-10 20:16:41 GMT+00:00
>>>>> Description: I hit this when updating my Ultra 20
>>>>> (original model, not M2) from b77ish to OSOL 2008.11rc1b
>>>>>
>>>>> System has 1.5GB ram, SATA hard disk.
>>>>>
>>>>>
>>>>> === *Service Request*
>>>>> ========================================================
>>>>> ID: 1-493257401
>>>>> Customer:
>>>>> Account Name: Sun Microsystems, Inc.
>>>>> Customer Contact:
>>>>> Customer Contact Role: D-Development
>>>>> Customer Contact Type: I-Internal (SMI)
>>>>> Customer Impact: Critical
>>>>> Functionality: Primary
>>>>> Severity: 1
>>>>> Synopsis:
>>>>> Product Name: solaris
>>>>> Product Release: osol_2008.11
>>>>> Product Build: osol_2008.11
>>>>> Operating System: osol_2008.11
>>>>> Hardware: generic_ibm_compatible
>>>>> Reference Number:
>>>>> Sun Contact: dana.myers at sun.com
>>>>> Status: Open
>>>>> Source: BugTraq2
>>>>> Reproducible:
>>>>> Submitted By: dana.myers at sun.com
>>>>> Submitted Date: 2008-11-10 22:34:45 GMT+00:00
>>>>> Description:
>>>>>
>>>>> === *Service Request*
>>>>> ========================================================
>>>>> ID: 1-493265801
>>>>> Customer:
>>>>> Account Name: Sun Microsystems
>>>>> Customer Contact: pawel.wojcik at sun.com
>>>>> Customer Contact Role: D-Development
>>>>> Customer Contact Type: I-Internal (SMI)
>>>>> Customer Impact: Critical
>>>>> Functionality: Primary
>>>>> Severity: 1
>>>>> Synopsis:
>>>>> Product Name: solaris
>>>>> Product Release: osol_2008.11
>>>>> Product Build: osol_2008.11
>>>>> Operating System: solaris
>>>>> Hardware: intel
>>>>> Reference Number:
>>>>> Sun Contact: pawel.wojcik at sun.com
>>>>> Status:
>>>>> Source: BugTraq2
>>>>> Reproducible:
>>>>> Submitted By: pawel.wojcik at sun.com
>>>>> Submitted Date: 2008-11-10 22:50:53 GMT+00:00
>>>>> Description:
>>>>>
>>>>> === *Activity*
>>>>> ===============================================================
>>>>>
>>>>>
>>>>> === *Multiple Release (MR) Cluster* - 0
>>>>> ======================================
>>>>>
>>>>>
>>>>> === *Escalations*
>>>>> ============================================================
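
For reference, the dd workaround quoted in the bug report writes four zero bytes at byte offset 9564 of the slice. That is consistent with the UFS superblock starting 8 KB into the slice (as noted in the public comments) plus a magic field 1372 bytes into it: 8192 + 1372 = 9564. The 1372 field offset and the 0x00011954 magic value come from the traditional UFS on-disk layout, not from this thread, so treat both as assumptions. The sketch below works on an in-memory copy of the start of a slice (for example the first.256kb dump mentioned in the comments) rather than on a live device, and its function names are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define UFS_SBLOCK_OFFSET 8192u      /* superblock starts 8 KB into the slice */
#define UFS_MAGIC_IN_SB   1372u      /* assumed offset of the magic in the superblock */
#define UFS_MAGIC_OFFSET  (UFS_SBLOCK_OFFSET + UFS_MAGIC_IN_SB)  /* 9564 */
#define UFS_MAGIC         0x00011954u /* assumed UFS magic value */

/* Return true if the image holds the UFS magic at byte 9564 (little-endian). */
bool
has_ufs_magic(const unsigned char *img, size_t len)
{
    assert(img != NULL);
    if (len < UFS_MAGIC_OFFSET + 4)
        return false;               /* image too short to contain the field */

    const unsigned char *p = img + UFS_MAGIC_OFFSET;
    uint32_t v = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
        ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    return v == UFS_MAGIC;
}

/*
 * In-memory equivalent of the quoted dd command: zero the four magic
 * bytes. Returns false if the image is too short to contain them.
 */
bool
clear_ufs_magic(unsigned char *img, size_t len)
{
    assert(img != NULL);
    if (len < UFS_MAGIC_OFFSET + 4)
        return false;
    memset(img + UFS_MAGIC_OFFSET, 0, 4);
    return true;
}
```

Checking a dump with has_ufs_magic() before and after reboot would be one way to tell the UFS case (stale magic present) from the missing-devid case discussed at the top of this mail.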
