Jan and George:

I have also seen this on an Ultra 24 and an Ultra 40 when we tested the
1.5 TB Seagate SATA disk.  So far I have only seen this on systems with
SATA drives.



jan damborsky wrote:
> George,
> 
> 
> George Wilson wrote:
>> Jan,
>>
>> It seems like the problem is not with ZFS but with the device driver. 
>> If the driver is failing to provide the devid then ZFS is just going 
>> to be a victim.
> 
> I agree with you that this is what we might be encountering
> with respect to the 'devid' problem here.
> 
> 
>> I would recommend that we change the synopsis to devid_get() fails 
>> with "Invalid argument" and pass this to the driver folks.
> 
> I will let Sanjay comment on this, since he has done
> some more investigation recently.
> 
>> Do you know if it's always the same driver?
> 
> I can only reproduce it on one system - this one has a SATA drive
> connected to a controller handled by the nv_sata(7D) driver. I think
> that Sanjay also encountered the problem on a system with a SATA disk.
> 
> Thank you,
> Jan
> 
>> Thanks,
>> George
>>
>> jan damborsky wrote:
>>> Hi George,
>>>
>>>
>>> George Wilson wrote:
>>>> Jan,
>>>>
>>>> So who is working the UFS issue and how is that being tracked?
>>> In general, bugs in the OpenSolaris Caiman installer are tracked in 
>>> Bugzilla at
>>> defect.opensolaris.org. This is preferred over filing bugs in 
>>> Bugster.
>>> Speaking of this particular problem, it is tracked by the following bug:
>>>
>>> 4675 Fix for bug 30 causes ZFS label to be mangled - ending up in 
>>> GRUB prompt after installing OpenSolaris
>>> http://defect.opensolaris.org/bz/show_bug.cgi?id=4675
>>>
>>> Sanjay Nadkarni is assigned to this bug (CCing him).
>>>
>>>> I would recommend that we keep this bug as the UFS/install issue and 
>>>> create a new bug and send that to me.
>>> As pointed out above, Bugzilla is the preferred database for tracking 
>>> issues in the Caiman installer.
>>>
>>> Please note that 6769487 was originally filed for tracking the 
>>> problem when
>>> GRUB can't access ZFS filesystem because 'devid' is not present in 
>>> ZFS label.
>>>
>>> It was overloaded later by 'UFS' problem.
>>>
>>>> Can you move the descriptions below from this bug and add them to 
>>>> the new one?
>>> To be honest, since installer part of problem related to UFS is 
>>> tracked by 4675,
>>> I don't see why we shouldn't continue to use 6769487 to track the 
>>> issue this bug
>>> was initially filed for and I think that we might lose some context when
>>> ZFS related information is moved from 6769487 to the new bug.
>>> That said, if you think it might be helpful, please let me know and
>>> I will try to capture all information from 6769487 I think is 
>>> relevant to
>>> the ZFS part in new bug.
>>>
>>>> Also since you can reproduce this can you tell me exactly how or 
>>>> point me at a system which I can login into to debug?
>>> Sure, the machine can be accessed via 'ssh', but since it is not
>>> directly accessible from SWAN (it is behind the NAT),
>>> I will provide you with instructions, how to access it.
>>> Unfortunately it doesn't have console access.
>>>
>>> Please let me know, in which state you would need to have that
>>> machine - right after the installation finished, but before reboot ?
>>>
>>> Unfortunately, following the procedure itself doesn't seem to be
>>> sufficient for reproducing the problem :-( I tried exactly the
>>> same steps on other bare metal as well as in virtual environment,
>>> but without success.
>>>
>>>
>>>> I want to make sure we don't lose sight of the UFS issue and this 
>>>> bug has already gone down to root cause so let's not overload this 
>>>> bug any further.
>>> UFS part of problem is being solved right now (please feel free to 
>>> monitor
>>> bug 4675 for progress and add anything you might consider relevant
>>> to that issue).
>>>
>>> Thank you,
>>> Jan
>>>
>>>> Thanks,
>>>> George
>>>>
>>>> jan damborsky wrote:
>>>>> Hi George,
>>>>>
>>>>> there are at least two parts of this problem:
>>>>>
>>>>> [1] UFS one
>>>>> This is what you are referring to and it is being tracked by 
>>>>> Bugzilla bug 4675.
>>>>> In that case workaround #2 helps to "solve" the problem.
>>>>>
>>>>> [2] ZFS one
>>>>> Please see original description #1. I am able to reproduce that at
>>>>> will on a system which didn't contain any UFS filesystem, and thus
>>>>> [1] is not applicable here. 'zpool import' helps in this case.
>>>>>
>>>>> Also please see:
>>>>> * description #4
>>>>> * description #5
>>>>> * public comments #8
>>>>> * comments #6
>>>>>
>>>>> People are apparently encountering this problem in
>>>>> other configurations (e.g. when using virgin disk
>>>>> or installing on system containing only Windows).
>>>>>
>>>>> I am not stating that this is in fact problem in ZFS as it might
>>>>> be related for example to device driver code, but at this point it
>>>>> seems to me that ZFS team is the most eligible one to move
>>>>> things forward, as GRUB can't read menu.lst from the ZFS
>>>>> filesystem.
>>>>>
>>>>> Please let me know if you have any questions or need more
>>>>> information.
>>>>>
>>>>> Thank you,
>>>>> Jan
>>>>>
>>>>>
>>>>> George Wilson wrote:
>>>>>> Jan,
>>>>>>
>>>>>> I don't understand how this is a ZFS problem. I thought from the 
>>>>>> evaluation that the issue is that UFS and ZFS are sharing the same 
>>>>>> block and this was being caused by the fact that the livecd had 
>>>>>> mounted a UFS filesystem as part of the installation. Could you 
>>>>>> clarify?
>>>>>>
>>>>>> Thanks,
>>>>>> George
>>>>>>
>>>>>> Jan.Damborsky at Sun.COM wrote:
>>>>>>>                         Sun Confidential: Internal only
>>>>>>>
>>>>>>> *Synopsis*: Ended up in 'grub>' prompt after installation of 
>>>>>>> OpenSolaris 2008.11 (build 101a)
>>>>>>>
>>>>>>> CrPrint: http://bt2ws.central.sun.com/CrPrint?id=6769487
>>>>>>> Monaco: http://monaco.sfbay.sun.com/detail.jsf?cr=6769487
>>>>>>>
>>>>>>> Due to a change of Responsible manager requested by 
>>>>>>> jan.damborsky at sun.com,
>>>>>>> david.brittle at sun.com is now the responsible manager for:
>>>>>>>
>>>>>>> Due to a change requested by jan.damborsky at sun.com,
>>>>>>> this CR is being redispatched:
>>>>>>>
>>>>>>> This is a high priority CR and requires your immediate attention.
>>>>>>> Please evaluate it as soon as possible.  Thank you.
>>>>>>>
>>>>>>> CR 6769487 changed on Nov 12 2008 by jan.damborsky at sun.com
>>>>>>>
>>>>>>> === Field ============ === New Value ============= === Old Value =============
>>>>>>>
>>>>>>> Category               kernel                      opensolaris
>>>>>>> Comments               New Note
>>>>>>> Comments               New Note                    Old Note
>>>>>>> Comments               New Note                    Old Note
>>>>>>> Public Comments        New Note
>>>>>>> Responsible Manager    david.brittle at sun.com     eric.ray at sun.com
>>>>>>> Status                 1-Dispatched                5-Cause Known
>>>>>>> SubCategory            zfs                         livecd
>>>>>>> ====================== =========================== ===========================
>>>>>>>
>>>>>>>      *Change Request ID*: 6769487
>>>>>>>
>>>>>>> *Synopsis*: Ended up in 'grub>' prompt after installation of 
>>>>>>> OpenSolaris 2008.11 (build 101a)
>>>>>>>
>>>>>>>   Product: solaris
>>>>>>>   Category: kernel
>>>>>>>   Subcategory: zfs
>>>>>>>   Type: Defect
>>>>>>>   Subtype: Functionality
>>>>>>>   Status: 1-Dispatched
>>>>>>>   Substatus:
>>>>>>>   Priority: 1-Very High
>>>>>>>   Introduced In Release:
>>>>>>>   Introduced In Build:
>>>>>>>   Responsible Manager: david.brittle at sun.com
>>>>>>>   Responsible Engineer:
>>>>>>>   Initial Evaluator: zfs-team at sun.com
>>>>>>>   Keywords:
>>>>>>> === *Description* 
>>>>>>> ============================================================
>>>>>>> When testing installation with recent OpenSolaris builds, we have 
>>>>>>> found that in some cases people end up at the GRUB prompt after the 
>>>>>>> installation - it seems that menu.lst can't be accessed for some 
>>>>>>> reason. For now, a number of Bugzilla bugs seem to describe the same 
>>>>>>> manifestation of the problem, whose root cause has not been 
>>>>>>> identified yet:
>>>>>>>
>>>>>>> 4051 opensolaris b99b/b100a does not install on 1.5 TB disk or 
>>>>>>> boot fails after install
>>>>>>> 4591 Install failure on a Sun Fire X4240 with Opensolaris 200811
>>>>>>> 4161 no grub in 2008.11 Development Builds (comment #20, comment 
>>>>>>> #31)
>>>>>>> 4760 Enter grub after installing 2008.11 RC 1
>>>>>>> ...
>>>>>>>
>>>>>>> I also hit that problem when testing the Automated Installer (it is 
>>>>>>> part of the Caiman project and will replace the current jumpstart 
>>>>>>> install technology). I was able to make GRUB find 'menu.lst' just by 
>>>>>>> using the 'zpool import' command - please see below for the detailed 
>>>>>>> procedure.
>>>>>>>
>>>>>>>
>>>>>>> configuration:
>>>>>>> --------------
>>>>>>> HW: Ultra 20, 1GB RAM, one 250GB SATA drive
>>>>>>> SW: Opensolaris build 100, 64bit mode
>>>>>>>
>>>>>>> steps used:
>>>>>>> -----------
>>>>>>> [1] OpenSolaris 100 installed using Automated Installer
>>>>>>>    - Solaris 2 partition created during installation
>>>>>>>
>>>>>>> * partition configuration before installation:
>>>>>>>
>>>>>>> # fdisk -W - c2t0d0p0
>>>>>>> ...
>>>>>>> * Id    Act  Bhead  Bsect  Bcyl    Ehead  Esect  Ecyl    Rsect      Numsect
>>>>>>>   192   0    0      1      1       254    63     1023    16065      22491000
>>>>>>>
>>>>>>> * partition configuration after installation:
>>>>>>>
>>>>>>> # fdisk -W - c2t0d0p0
>>>>>>> ...
>>>>>>> * Id    Act  Bhead  Bsect  Bcyl    Ehead  Esect  Ecyl    Rsect      Numsect
>>>>>>>   192   0    0      1      1       254    63     1023    16065      22491000
>>>>>>>   191   128  254    63     1023    254    63     1023    22507065   30000000
>>>>>>>
>>>>>>> [2] When I rebooted the system after the installation, I ended up 
>>>>>>> at the GRUB prompt:
>>>>>>> grub> root
>>>>>>> (hd0,1,a): Filesystem type unknown, partition type 0xbf
>>>>>>>
>>>>>>> grub> cat /rpool/boot/grub/menu.lst
>>>>>>>
>>>>>>> Error 17: Cannot mount selected partition
>>>>>>>
>>>>>>> grub>
>>>>>>>
>>>>>>> [3] I rebooted into AI and did 'zpool import'
>>>>>>> # zdb -l /dev/rdsk/c2t0d0s0 > /tmp/zdb_before_import.txt (attached)
>>>>>>> # zpool import -f rpool
>>>>>>> # zdb -l /dev/rdsk/c2t0d0s0 > /tmp/zdb_after_import.txt (attached)
>>>>>>> # diff /tmp/zdb_before_import.txt /tmp/zdb_after_import.txt
>>>>>>> 7c7
>>>>>>> <     txg=21
>>>>>>> ---
>>>>>>> >     txg=2675
>>>>>>> 9c9
>>>>>>> <     hostid=4741222
>>>>>>> ---
>>>>>>> >     hostid=4247690
>>>>>>> 17a18
>>>>>>> >         devid='id1,sd@f00c778e247ac7bd0000238460000/a'
>>>>>>> 31c32
>>>>>>> ...
>>>>>>> # reboot
>>>>>>>
>>>>>>> [4] Now GRUB can access menu.lst and Solaris is booted
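
A small C sketch of the check that the before/after comparison in step [3] amounts to: does the label dump contain a devid entry or not. The function name is hypothetical, and it assumes the entry shows up as the text "devid=" in the dump, as it does in the diff above. It is a plain substring search over text, not an nvlist parser.

```c
#include <assert.h>
#include <string.h>

/*
 * Return 1 if the label text contains a devid entry, 0 otherwise.
 * Assumption: the entry appears as "devid=" in the text dump.
 */
static int
label_text_has_devid(const char *label_text)
{
    return strstr(label_text, "devid=") != NULL;
}
```

Fed the two saved dumps, this would be expected to return 0 for the pre-import text and 1 for the post-import text.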
>>>>>>>
>>>>>>> hypothesis
>>>>>>> ----------
>>>>>>> It seems that for some reason, when ZFS pool was created, 'devid' 
>>>>>>> information was not added to the ZFS label.
>>>>>>>
>>>>>>> When 'zpool import' was called, 'devid' got populated.
>>>>>>>
>>>>>>> Looking at the GRUB ZFS plug-in, it seems that 'devid' 
>>>>>>> (ZPOOL_CONFIG_DEVID attribute) is
>>>>>>> required in order to be able to access ZFS filesystem:
>>>>>>>
>>>>>>> In grub/grub-0.95/stage2/fsys_zfs.c:
>>>>>>>
>>>>>>> vdev_get_bootpath()
>>>>>>> {
>>>>>>> ...
>>>>>>>    if (strcmp(type, VDEV_TYPE_DISK) == 0) {
>>>>>>>        if (vdev_validate(nv) != 0 ||
>>>>>>>            (nvlist_lookup_value(nv, ZPOOL_CONFIG_PHYS_PATH,
>>>>>>>            bootpath, DATA_TYPE_STRING, NULL) != 0) ||
>>>>>>>            (nvlist_lookup_value(nv, ZPOOL_CONFIG_DEVID,
>>>>>>>            devid, DATA_TYPE_STRING, NULL) != 0))
>>>>>>>            return (ERR_NO_BOOTPATH);
>>>>>>> ...
>>>>>>> }
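
To make the consequence of that condition concrete, here is a minimal C sketch. It is not the GRUB code: the struct, the function names and the return codes are hypothetical stand-ins, and the label is modelled as a flat list of name/value strings rather than a packed nvlist. It only shows that a label with phys_path but no devid is rejected, which matches the state before 'zpool import'.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one name/value entry in a vdev label. */
struct label_pair {
    const char *name;
    const char *value;
};

#define SKETCH_OK          0
#define SKETCH_NO_BOOTPATH 1

/* Return the value stored under name, or NULL if the entry is absent. */
static const char *
label_lookup(const struct label_pair *pairs, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(pairs[i].name, name) == 0)
            return pairs[i].value;
    }
    return NULL;
}

/* Mirrors the quoted condition: both phys_path and devid must be present. */
static int
sketch_get_bootpath(const struct label_pair *pairs, size_t n)
{
    if (label_lookup(pairs, n, "phys_path") == NULL ||
        label_lookup(pairs, n, "devid") == NULL)
        return SKETCH_NO_BOOTPATH;
    return SKETCH_OK;
}
```

With only a phys_path entry the sketch returns SKETCH_NO_BOOTPATH; adding a devid entry makes it return SKETCH_OK.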
>>>>>>>
>>>>>>> additional observations:
>>>>>>> ------------------------
>>>>>>> [1] If 'devid' is populated during installation after 'zpool create'
>>>>>>> operation, the problem doesn't occur.
>>>>>>>
>>>>>>> [2] When following the described procedure, the problem is reproducible
>>>>>>> at will on the system where it was initially reproduced (please see 
>>>>>>> above for the configuration).
>>>>>>>
>>>>>>> [3] Other people reported this problem also for following 
>>>>>>> configurations:
>>>>>>> * vmware
>>>>>>> * Sun Java Workstation W2100z with 2xOpteron2.4G 3G Mem
>>>>>>>
>>>>>>> [4] When installing into an existing Solaris2 partition containing 
>>>>>>> a Solaris instance, 'devid' is always populated and the problem 
>>>>>>> doesn't occur (it doesn't matter if the partition is marked 
>>>>>>> 'active' or not).
>>>>>>>
>>>>>>> *** (#1 of 5): 2008-11-10 10:27:21 GMT+00:00 jan.damborsky at sun.com
>>>>>>>
>>>>>>> If the system previously had Nevada installed (101a in my case), 
>>>>>>> installing OpenSolaris hits this issue when I keep the existing 
>>>>>>> partition rather than choosing the entire disk (I suspect this is 
>>>>>>> what causes the issue).
>>>>>>> There is a diagnostic partition on the disk when Nevada is installed, 
>>>>>>> and OpenSolaris 2008.11 simply drops to grub> as this CR mentions. 
>>>>>>> When I then used the entire disk, the system booted up okay.
>>>>>>> But when I reinstalled again with a size smaller than the 
>>>>>>> entire disk specified,
>>>>>>> grub had no problem, but GNOME could not start (it hangs endlessly).
>>>>>>>
>>>>>>> *** (#2 of 5): 2008-11-10 10:45:29 GMT+00:00 robin.guo at sun.com
>>>>>>>
>>>>>>> The root cause of this problem is the continued existence of UFS 
>>>>>>> filesystem structures on disk, even after the zfs filesystem is 
>>>>>>> created and is live.  Because ZFS did not destroy the UFS magic, 
>>>>>>> both GRUB and Solaris think there's a (horribly damaged) UFS 
>>>>>>> filesystem present on that slice (a WARNING is displayed at boot 
>>>>>>> time during OpenSolaris boot informing the user that 
>>>>>>> /mnt/solaris<N> (where <N> is a number) could not be mounted 
>>>>>>> because of filesystem problems -- in reality, that slice is where 
>>>>>>> the zfs root is located).
>>>>>>>
>>>>>>> In GRUB, since code that attempts to mount root does so by trying 
>>>>>>> each filesystem module in the order in which they are listed in 
>>>>>>> the fsys_table[] array, and since UFS is listed before ZFS, GRUB 
>>>>>>> thinks that a UFS filesystem exists in the slice actually 
>>>>>>> containing the ZFS root filesystem (and fails trying to mount it, 
>>>>>>> leaving it unable to locate the real root filesystem).  A 
>>>>>>> modified version of GRUB that modifies fsys_table by declaring 
>>>>>>> the ZFS operations before the UFS operations confirms this 
>>>>>>> hypothesis.
>>>>>>>
>>>>>>> Therefore, a valid workaround destroys the UFS magic, preventing 
>>>>>>> both GRUB's and Solaris's UFS modules from recognizing the slice 
>>>>>>> as a UFS filesystem.  When GRUB's UFS code fails to find a valid 
>>>>>>> UFS filesystem, the ZFS module is subsequently tried and is able 
>>>>>>> to successfully mount the filesystem.
>>>>>>>
>>>>>>> *** (#3 of 5): 2008-11-11 03:23:04 GMT+00:00 seth.goldberg at sun.com
>>>>>>> *** Last Edit: 2008-11-11 03:45:05 GMT+00:00 seth.goldberg at sun.com
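
The ordering argument in the comment above can be sketched in a few lines of C. This is a toy model, not GRUB's fsys_table[]: the struct layout, the probe functions and the two flags are hypothetical. It only illustrates "first module that claims the slice wins", so a stale UFS magic hides the ZFS label whenever UFS is listed first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of what a probe can see on a slice. */
struct slice_state {
    int has_ufs_magic;   /* stale UFS magic still on disk */
    int has_zfs_label;   /* valid ZFS label present */
};

/* One table entry: a name plus a probe returning nonzero to claim the slice. */
struct fs_entry {
    const char *name;
    int (*probe)(const struct slice_state *);
};

static int probe_ufs(const struct slice_state *s) { return s->has_ufs_magic; }
static int probe_zfs(const struct slice_state *s) { return s->has_zfs_label; }

/* Walk the table in order; the first module that claims the slice wins. */
static const char *
first_claimant(const struct fs_entry *table, size_t n,
    const struct slice_state *s)
{
    for (size_t i = 0; i < n; i++) {
        if (table[i].probe(s))
            return table[i].name;
    }
    return NULL;
}
```

In this model both remedies from the comment behave the same way: listing ZFS before UFS, or clearing the stale magic, makes "zfs" the first claimant.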
>>>>>>>
>>>>>>> I think there are two separate issues here.  The UFS label 
>>>>>>> appears to be one. The signature for this bug is that typing root 
>>>>>>> at the grub prompt reports UFS filesystem info.
>>>>>>> However, there is a second bug where one gets a grub prompt after 
>>>>>>> installation, and typing the root command at the grub prompt 
>>>>>>> reports an unknown filesystem. In this case no UFS filesystems 
>>>>>>> were detected or mounted.  The workaround for this has been to 
>>>>>>> run zpool import.   This still needs to be investigated.
>>>>>>>
>>>>>>> *** (#4 of 5): 2008-11-12 00:04:16 GMT+00:00 sanjay.nadkarni at sun.com
>>>>>>>
>>>>>>> We were able to recreate the grub failure where typing root at 
>>>>>>> the prompt returns unknown file system. This was on a Fujitsu 
>>>>>>> LifeBook S7211 that had Vista installed.  We 
>>>>>>> then booted OpenSolaris and started the install. At the end of 
>>>>>>> the installation we noted that the zfs label did not have devid 
>>>>>>> information.
>>>>>>>
>>>>>>> We then loaded a simple program that would get the devid 
>>>>>>> (devid_get).  This failed with "Invalid argument".  We then 
>>>>>>> rebooted the liveCD again and reran this program and this time it 
>>>>>>> printed out the device id.  The disk is off a SATA controller.  
>>>>>>> The driver that attached to this is ahci.  The device is: 
>>>>>>> 82801HBM/HEM. The disk is Fujitsu MHY2120BH
>>>>>>>
>>>>>>> *** (#5 of 5): 2008-11-12 02:43:18 GMT+00:00 sanjay.nadkarni at sun.com
>>>>>>>
>>>>>>>
>>>>>>> === *Public Comments* 
>>>>>>> ========================================================
>>>>>>> Following Bugzilla bugs were closed as duplicate of this issue:
>>>>>>>
>>>>>>> 4772 Cannot install OpenSolaris 2008.11 on VMware Server 2.0
>>>>>>> http://defect.opensolaris.org/bz/show_bug.cgi?id=4772
>>>>>>>
>>>>>>> 4756 after reboot when finishing the installation, system can not 
>>>>>>> boot
>>>>>>> http://defect.opensolaris.org/bz/show_bug.cgi?id=4756
>>>>>>>
>>>>>>> 4749 After installed opensolaris0811RC1 on Dell PowerEdge, can't 
>>>>>>> boot from disk.
>>>>>>> http://defect.opensolaris.org/bz/show_bug.cgi?id=4749
>>>>>>>
>>>>>>> *** (#1 of 9): 2008-11-10 17:20:54 GMT+00:00 dave.miner at sun.com
>>>>>>> *** Last Edit: 2008-11-11 11:45:41 GMT+00:00 jan.damborsky at sun.com
>>>>>>>
>>>>>>> zpool import doesn't help for me, nor would I expect it to (it's 
>>>>>>> a mystery
>>>>>>> why it seems to).  Clearing the UFS magic helps.
>>>>>>>
>>>>>>> Looking further, I find that the data on disk at 8k seems to still
>>>>>>> be a UFS superblock, not a zfs vdev_boot_header_t, which doesn't 
>>>>>>> make
>>>>>>> sense to me; in any ZFS initialization scheme, one would expect 
>>>>>>> all parts
>>>>>>> of the label to be completely written.
>>>>>>>
>>>>>>> The expected vdev_boot_header_t appears at the label copy at 
>>>>>>> 256K+8K, as
>>>>>>> expected.
>>>>>>>
>>>>>>> *** (#2 of 9): 2008-11-11 04:39:09 GMT+00:00 dan.mick at sun.com
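
The two offsets mentioned in that comment (8k and 256K+8K) follow from simple arithmetic, sketched below in C. The constant and function names are hypothetical, and the sketch assumes, as the comment implies, that each front label copy is 256K long with its boot header 8K in. It covers only the first two label copies.

```c
#include <assert.h>
#include <stdint.h>

#define LABEL_SIZE       (256u * 1024u)  /* assumed size of one label copy */
#define BOOT_HEADER_SKEW (8u * 1024u)    /* boot header starts 8K into a label */

/* Byte offset of the boot header for front label copy 0 or 1. */
static uint64_t
front_boot_header_offset(unsigned label_index)
{
    return (uint64_t)label_index * LABEL_SIZE + BOOT_HEADER_SKEW;
}
```

Copy 0 lands at byte 8192, which is where the stale UFS superblock was found; copy 1 lands at byte 270336, where the expected header was still intact.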
>>>>>>>
>>>>>>> It appears that ZFS doesn't validate that first 8k (the 
>>>>>>> vdev_boot_header), so
>>>>>>> that explains why the kernel was happy even with a UFS superblock 
>>>>>>> where the
>>>>>>> vdev_boot_header was supposed to be.
>>>>>>>
>>>>>>> Also, the last few bits of the 8k block in question seem to 
>>>>>>> contain a
>>>>>>> zio_block_tail_t (i.e. a zbt_magic and a zbt_cksum), so it seems 
>>>>>>> this block
>>>>>>> was written by ZFS sometime in the past.
>>>>>>> Possible theories:  1) the ZFS initialization somehow skipped 
>>>>>>> this 8k header,
>>>>>>> or 2) somehow the 8k superblock was rewritten over the block 
>>>>>>> after ZFS initialized it.
>>>>>>>
>>>>>>> *** (#3 of 9): 2008-11-11 04:49:57 GMT+00:00 dan.mick at sun.com
>>>>>>>
>>>>>>> Another possible theory: could this be the superblock flush from 
>>>>>>> a still-mounted UFS being shut down?
>>>>>>>
>>>>>>> (The block was correct until after the OpenSolaris installer said 
>>>>>>> it was done,
>>>>>>> and waited for me to press a button to reboot.  I suspect
>>>>>>> the original UFS was mounted and not unmounted before the ZFS 
>>>>>>> creation,
>>>>>>> so they both think they own the device.)
>>>>>>>
>>>>>>> Supporting evidence: the "last mounted" path in the superblock is 
>>>>>>> "/mnt/solaris0".
>>>>>>>
>>>>>>> I suspect the cause of this bug is a UFS that's mounted and 
>>>>>>> should be
>>>>>>> unmounted by the installer before ZFS creation.
>>>>>>>
>>>>>>> What's the right category/subcategory for Caiman?
>>>>>>>
>>>>>>> *** (#4 of 9): 2008-11-11 07:34:58 GMT+00:00 dan.mick at sun.com
>>>>>>>
>>>>>>> The live CD has historically automatically mounted up any UFS 
>>>>>>> file systems that it found, going back to Belenix.  Interesting 
>>>>>>> that this is just now a problem, but it probably is a result of 
>>>>>>> switching to ZFS for swap, as up until build 96 we always created 
>>>>>>> a swap slice at the start of the disk, which it appears would 
>>>>>>> have masked this problem.
>>>>>>>
>>>>>>> *** (#5 of 9): 2008-11-11 15:02:03 GMT+00:00 dave.miner at sun.com
>>>>>>>
>>>>>>> Installer takes care of releasing the target device before Target 
>>>>>>> Instantiation
>>>>>>> phase is launched. Among other things, it
>>>>>>>
>>>>>>> * releases all swap devices created on target disk
>>>>>>> * unmounts whatever is mounted on target disk
>>>>>>>
>>>>>>> For the latter, /etc/mnttab is read and if there is mounted 
>>>>>>> device which is part of
>>>>>>> the target disk, installer tries to unmount it.
>>>>>>>
>>>>>>> The problem is that after the fix for Bugzilla bug 30 was integrated, 
>>>>>>> UFS filesystems are mounted with the '-o m' option, which mounts the 
>>>>>>> filesystem without making an entry in /etc/mnttab. The mountpoints 
>>>>>>> are then hidden, so the installer can't see them and doesn't 
>>>>>>> unmount them.
>>>>>>>
>>>>>>> That said, this explains the UFS part of the problem (when the 'dd' 
>>>>>>> workaround works), but doesn't seem to be related to the ZFS part 
>>>>>>> of the issue, when the 'zpool import' workaround helped.
>>>>>>>
>>>>>>> *** (#6 of 9): 2008-11-11 16:25:09 GMT+00:00 jan.damborsky at sun.com
>>>>>>> *** Last Edit: 2008-11-11 16:34:29 GMT+00:00 jan.damborsky at sun.com
>>>>>>>
>>>>>>> We should probably leave this bug to resolve zpool create 
>>>>>>> not removing evidence of the
>>>>>>> previous ufs fs, and file another one to chase down the other 
>>>>>>> issue(s?).
>>>>>>>
>>>>>>>  Chris, if you run zdb -l on your virgin device, are you missing 
>>>>>>> any zfs properties? The reader
>>>>>>> in GRUB pretty much gives up if things like the devid aren't set.
>>>>>>>
>>>>>>> *** (#7 of 9): 2008-11-11 19:30:56 GMT+00:00 
>>>>>>> jan.setje-eilers at sun.com
>>>>>>>
>>>>>>> Concur that Chris' problem is different; the UFS superblock does 
>>>>>>> not exist in
>>>>>>> the first 256kb attached to the bug.  It appears as though 
>>>>>>> phys_path and devid
>>>>>>> are present, although it's difficult to be sure.  We should 
>>>>>>> probably see if we can
>>>>>>> send a debug version of Grub to Chris, with installation 
>>>>>>> instructions, to see
>>>>>>> why it seems unable to find the zfs.
>>>>>>>
>>>>>>> *** (#8 of 9): 2008-11-11 22:16:50 GMT+00:00 dan.mick at sun.com
>>>>>>>
>>>>>>> The root cause of 'UFS part' of this problem is in 'livecd code' 
>>>>>>> and is tracked by
>>>>>>> following Bugzilla bug:
>>>>>>>
>>>>>>> 4675 Fix for bug 30 causes ZFS label to be mangled - ending up in 
>>>>>>> GRUB prompt after installing OpenSolaris
>>>>>>>
>>>>>>> Please feel free to use this bug (6769487) for tracking other 
>>>>>>> part(s) of the problem.
>>>>>>> Resetting category to solaris/kernel/zfs and Status to 'Dispatched'.
>>>>>>>
>>>>>>> *** (#9 of 9): 2008-11-12 12:46:21 GMT+00:00 jan.damborsky at sun.com
>>>>>>>
>>>>>>>
>>>>>>> === *Comments* 
>>>>>>> ===============================================================
>>>>>>> Moved to public comments.
>>>>>>>
>>>>>>> *** (#1 of 6): 2008-11-10 17:04:10 GMT+00:00 jan.damborsky at sun.com
>>>>>>> *** Last Edit: 2008-11-10 17:20:54 GMT+00:00 dave.miner at sun.com
>>>>>>>
>>>>>>> Same situation (without zfs) on:
>>>>>>> White Box based on Intel DG33TL motherboard with ICH9R chipset, 
>>>>>>> 2Gb memory, 3 SATA drives, 1 SATA CD/DVD, Intel graphics.
>>>>>>>
>>>>>>> *** (#2 of 6): 2008-11-10 22:52:23 GMT+00:00 pawel.wojcik at sun.com
>>>>>>>
>>>>>>> Workaround #1 does not cause the system to boot properly on the 
>>>>>>> system I tried installing (that seems to be consistent with what 
>>>>>>> others are reporting in the opensolaris defect report), but 
>>>>>>> workaround #2 DOES.
>>>>>>>
>>>>>>> *** (#3 of 6): 2008-11-11 01:56:43 GMT+00:00 seth.goldberg at sun.com
>>>>>>> *** Last Edit: 2008-11-11 03:41:48 GMT+00:00 seth.goldberg at sun.com
>>>>>>>
>>>>>>> I've reproduced this on a "virgin" disk, see the SR record against 
>>>>>>> this bug (I had to purchase a new spindle as the previous disk 
>>>>>>> failed; the new disk was removed from the supplier packaging and 
>>>>>>> inserted into the laptop, and then the 2008.11 CD was booted).
>>>>>>>
>>>>>>> After a discussion with Dan Mick over email, the data Dan 
>>>>>>> requested was the output of the root command at the grub prompt:
>>>>>>>
>>>>>>> (hd0,0,a): Filesystem type is zfs, partition type 0xbf
>>>>>>>
>>>>>>> Also, can you boot from the CD and collect the first 256kb of the 
>>>>>>> disk, with
>>>>>>>
>>>>>>> dd if=<your s0 slice here> of=first.256kb bs=256k count=1
>>>>>>>
>>>>>>> This is attached.
>>>>>>>
>>>>>>> *** (#4 of 6): 2008-11-11 10:46:29 GMT+00:00 
>>>>>>> christopher.armes at sun.com
>>>>>>>
>>>>>>> Saw this bug on several machines today which I was helping to 
>>>>>>> install. One person did a reinstall and it worked fine the second 
>>>>>>> time as some reported.
>>>>>>>
>>>>>>> 2 other machines could use the workaround which Lin Ling pointed 
>>>>>>> us to with this bug. That did save a couple folks from having to 
>>>>>>> reinstall, so was very helpful. Thanks Lin! Of the installs of 
>>>>>>> people that installed to a hard drive (i.e., not within 
>>>>>>> VirtualBox), about 12 systems, we saw this on 3 machines, so 
>>>>>>> about 25% of the systems in this small sampling.
>>>>>>>
>>>>>>> *** (#5 of 6): 2008-11-12 09:58:01 GMT+00:00 alan.duboff at sun.com
>>>>>>>
>>>>>>> Moved to public comments.
>>>>>>>
>>>>>>> *** (#6 of 6): 2008-11-12 12:43:18 GMT+00:00 jan.damborsky at sun.com
>>>>>>> *** Last Edit: 2008-11-12 12:46:43 GMT+00:00 jan.damborsky at sun.com
>>>>>>>
>>>>>>>
>>>>>>> === *Evaluation* 
>>>>>>> =============================================================
>>>>>>> See Description.
>>>>>>>
>>>>>>> *** (#1 of 4): 2008-11-11 03:23:04 GMT+00:00 seth.goldberg at sun.com
>>>>>>>
>>>>>>> Removed misleading evaluation.
>>>>>>>
>>>>>>> *** (#2 of 4): 2008-11-11 21:45:12 GMT+00:00 lin.ling at sun.com
>>>>>>> *** Last Edit: 2008-11-11 23:16:07 GMT+00:00 lin.ling at sun.com
>>>>>>>
>>>>>>> What?  No, read the public comments.  The problem is that the UFS 
>>>>>>> filesystem is still mounted as the installer lays down the ZFS.  
>>>>>>> Then, on reboot, the UFS, as
>>>>>>> it's syncing, writes its superblock back to the filesystem it 
>>>>>>> thinks it owns,
>>>>>>> over the top of the now-ZFS-owned space.
>>>>>>>
>>>>>>> The installer must ensure that other filesystems are not mounted 
>>>>>>> on the slice
>>>>>>> where it's creating the ZFS rpool.
>>>>>>>
>>>>>>> *** (#3 of 4): 2008-11-11 22:11:35 GMT+00:00 dan.mick at sun.com
>>>>>>>
>>>>>>> You are right. I misunderstood.
>>>>>>> George Wilson just corrected me that 'zpool create' indeed clears 
>>>>>>> the space correctly:
>>>>>>>
>>>>>>> vdev_label_init() {
>>>>>>>     :
>>>>>>>         vp = zio_buf_alloc(sizeof (vdev_phys_t));
>>>>>>>         bzero(vp, sizeof (vdev_phys_t));
>>>>>>>     :
>>>>>>>         bzero(vb, sizeof (vdev_boot_header_t));
>>>>>>>     :
>>>>>>> }
>>>>>>>
>>>>>>> Thanks for the clarification.
>>>>>>>
>>>>>>> *** (#4 of 4): 2008-11-11 22:49:04 GMT+00:00 lin.ling at sun.com
>>>>>>>
>>>>>>>
>>>>>>> === *Suggested Fix* 
>>>>>>> ==========================================================
>>>>>>>
>>>>>>> === *Workaround* 
>>>>>>> =============================================================
>>>>>>> [1] Boot LiveCD
>>>>>>> $ pfexec su -
>>>>>>> # zpool import -f rpool
>>>>>>>
>>>>>>> *** (#1 of 3): 2008-11-10 10:27:21 GMT+00:00 jan.damborsky at sun.com
>>>>>>>
>>>>>>> Zero out the leftover UFS magic:
>>>>>>>
>>>>>>> For GNU dd:
>>>>>>> dd if=/dev/zero of=/dev/dsk/<SLICE> bs=1 count=4 seek=9564
>>>>>>>
>>>>>>> (e.g.:
>>>>>>> dd if=/dev/zero of=/dev/dsk/c4t0d0s0 bs=1 count=4 seek=9564
>>>>>>> )
>>>>>>>
>>>>>>> *** (#2 of 3): 2008-11-11 03:36:55 GMT+00:00 seth.goldberg at sun.com
>>>>>>>
>>>>>>> I did the following in dd to work around the issue:
>>>>>>>
>>>>>>> root@opensolaris:~# dd if=/dev/zero of=/dev/dsk/c1t0d0s0 bs=1 count=4 seek=9564
>>>>>>> 4+0 records in
>>>>>>> 4+0 records out
>>>>>>> 4 bytes (4 B) copied, 0.0394095 s, 0.1 kB/s
>>>>>>> root@opensolaris:~#
>>>>>>>
>>>>>>> *** (#3 of 3): 2008-11-11 19:07:04 GMT+00:00 mary.ding at sun.com
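
For reference, the dd workaround above overwrites 4 bytes at byte offset 9564 of the slice, which is 1372 bytes past the 8K mark where the stale UFS superblock was found. The C sketch below does the same thing to an in-memory copy of the start of a slice. The names are hypothetical, and the split of 9564 into 8192 + 1372 is my reading of the numbers in this thread, not something taken from the UFS headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define UFS_SB_OFFSET    8192u  /* superblock location reported in this thread */
#define UFS_MAGIC_FIELD  1372u  /* assumed position of the magic inside it */
#define UFS_MAGIC_OFFSET (UFS_SB_OFFSET + UFS_MAGIC_FIELD)  /* 9564, as in dd */
#define UFS_MAGIC_LEN    4u

/*
 * Zero the 4 bytes at offset 9564 in a buffer holding the start of a slice.
 * Returns 0 on success, -1 if the buffer is too short to contain them.
 */
static int
clear_ufs_magic(unsigned char *slice, size_t len)
{
    if (len < UFS_MAGIC_OFFSET + UFS_MAGIC_LEN)
        return -1;
    memset(slice + UFS_MAGIC_OFFSET, 0, UFS_MAGIC_LEN);
    return 0;
}
```

Only those four bytes are touched, so the rest of the first label copy is left as it was.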
>>>>>>>
>>>>>>>
>>>>>>> === *Justification* 
>>>>>>> ==========================================================
>>>>>>> Priority changed from [] to [1-Very High]
>>>>>>> Installed OpenSolaris 2008.11 doesn't boot
>>>>>>> jan.damborsky at sun.com 2008-11-10 10:27:21 GMT
>>>>>>>
>>>>>>> *** (#1 of 1): 2008-11-10 10:27:21 GMT+00:00 jan.damborsky at sun.com
>>>>>>>
>>>>>>>
>>>>>>> === *Additional Details* 
>>>>>>> =====================================================
>>>>>>>         Targeted Release:
>>>>>>>         Commit To Fix In Build:
>>>>>>>         Fixed In Build:
>>>>>>>         Integrated In Build:
>>>>>>>         Verified In Build:
>>>>>>>   See Also: 6769534
>>>>>>>   Duplicate of:
>>>>>>>   Hooks:
>>>>>>>         Hook1:
>>>>>>>         Hook2:
>>>>>>>         Hook3:
>>>>>>>         Hook4:
>>>>>>>         Hook5:
>>>>>>>         Hook6:
>>>>>>>   Interest List: dan.mick at sun.com, dave.miner at sun.com, 
>>>>>>> david.comay at sun.com, frank.batschulat at sun.com, 
>>>>>>> kerberos-iteam at Sun.COM, lin.ling at sun.com, nick.todd at sun.com, 
>>>>>>> peter.dennis at sun.com, plus1tb at sun.com, sdg at sun.com, 
>>>>>>> si-bugs at sun.com, sst-prg at sun.com, tomas.hurka at sun.com
>>>>>>>   Program Management: New Defect
>>>>>>>   Root Cause:
>>>>>>>   Is a Security Vulnerability?: No
>>>>>>>   Fix Affects Documentation: No
>>>>>>>   Fix Affects Localization: No
>>>>>>>   Reported by:
>>>>>>> === *History* 
>>>>>>> ================================================================
>>>>>>>         Date Submitted: 2008-11-10 10:27:21 GMT+00:00
>>>>>>>         Submitted By: jan.damborsky at sun.com
>>>>>>>
>>>>>>>         Status Changed    Date Updated                  Updated By
>>>>>>>         3-Accepted        2008-11-10 23:59:05 GMT+00:00 lin.ling at sun.com
>>>>>>>         5-Cause Known     2008-11-11 03:23:04 GMT+00:00 seth.goldberg at sun.com
>>>>>>>         1-Dispatched      2008-11-12 12:43:18 GMT+00:00 jan.damborsky at sun.com
>>>>>>>
>>>>>>>
>>>>>>> === *Solution* 
>>>>>>> ===============================================================
>>>>>>>
>>>>>>>
>>>>>>> === *Service Request* 
>>>>>>> ========================================================
>>>>>>>         ID: 1-493023606
>>>>>>>         Customer:
>>>>>>>         Account Name: Sun Microsystems
>>>>>>>         Customer Contact:
>>>>>>>         Customer Contact Role: D-Development
>>>>>>>         Customer Contact Type: I-Internal (SMI) Customer
>>>>>>>         Impact: Critical
>>>>>>>         Functionality: Primary
>>>>>>>         Severity: 1
>>>>>>>         Synopsis:
>>>>>>>         Product Name: solaris
>>>>>>>         Product Release: osol_2008.11
>>>>>>>         Product Build:
>>>>>>>         Operating System: osol_2008.11
>>>>>>>         Hardware: generic
>>>>>>>         Reference Number:
>>>>>>>         Sun Contact: jan.damborsky at sun.com
>>>>>>>         Status: Open
>>>>>>>         Source: BugTraq2
>>>>>>>         Reproducible:
>>>>>>>         Submitted By: jan.damborsky at sun.com
>>>>>>>         Submitted Date: 2008-11-10 10:27:21 GMT+00:00
>>>>>>>         Description:
>>>>>>>
>>>>>>> === *Service Request* 
>>>>>>> ========================================================
>>>>>>>         ID: 1-493053806
>>>>>>>         Customer:
>>>>>>>         Account Name: SUN MicroSystems
>>>>>>>         Customer Contact:
>>>>>>>         Customer Contact Role: D-Development
>>>>>>>         Customer Contact Type: I-Internal (SMI) Customer
>>>>>>>         Impact: Critical
>>>>>>>         Functionality: Primary
>>>>>>>         Severity: 1
>>>>>>>         Synopsis: After installing 2008.11RC1b boot from hard disk fails
>>>>>>>         Product Name: solaris
>>>>>>>         Product Release: osol_2008.11
>>>>>>>         Product Build:
>>>>>>>         Operating System: osol_2008.11
>>>>>>>         Hardware: x86
>>>>>>>         Reference Number:
>>>>>>>         Sun Contact: christopher.armes at sun.com
>>>>>>>         Status: Open
>>>>>>>         Source: BugTraq2
>>>>>>>         Reproducible: Always
>>>>>>>         Submitted By: christopher.armes at sun.com
>>>>>>>         Submitted Date: 2008-11-10 12:54:24 GMT+00:00
>>>>>>>         Description: Booting from the LiveCD and then selecting
>>>>>>> install works fine. Upon reboot, either with the CD in and
>>>>>>> selecting boot from hard disk, or without the CD and letting the
>>>>>>> GRUB menu boot, the boot fails and drops the system to the
>>>>>>> "grub>" prompt.
>>>>>>>
>>>>>>>
>>>>>>> === *Service Request* 
>>>>>>> ========================================================
>>>>>>>         ID: 1-493177108
>>>>>>>         Customer:
>>>>>>>         Account Name: SUN
>>>>>>>         Customer Contact:
>>>>>>>         Customer Contact Role: D-Development
>>>>>>>         Customer Contact Type: I-Internal (SMI) Customer
>>>>>>>         Impact: Critical
>>>>>>>         Functionality: Primary
>>>>>>>         Severity: 1
>>>>>>>         Synopsis:
>>>>>>>         Product Name: solaris
>>>>>>>         Product Release: osol_2008.11
>>>>>>>         Product Build: osol_2008.11
>>>>>>>         Operating System: osol_2008.11
>>>>>>>         Hardware: amd
>>>>>>>         Reference Number:
>>>>>>>         Sun Contact: garrett.damore at sun.com
>>>>>>>         Status:
>>>>>>>         Source: BugTraq2
>>>>>>>         Reproducible:
>>>>>>>         Submitted By: garrett.damore at sun.com
>>>>>>>         Submitted Date: 2008-11-10 20:16:41 GMT+00:00
>>>>>>>         Description: I hit this when updating my Ultra 20 
>>>>>>> (original model, not M2) from b77ish to OSOL 2008.11rc1b
>>>>>>>
>>>>>>> System has 1.5GB ram, SATA hard disk.
>>>>>>>
>>>>>>>
>>>>>>> === *Service Request* 
>>>>>>> ========================================================
>>>>>>>         ID: 1-493257401
>>>>>>>         Customer:
>>>>>>>         Account Name: Sun Microsystems, Inc.
>>>>>>>         Customer Contact:
>>>>>>>         Customer Contact Role: D-Development
>>>>>>>         Customer Contact Type: I-Internal (SMI) Customer
>>>>>>>         Impact: Critical
>>>>>>>         Functionality: Primary
>>>>>>>         Severity: 1
>>>>>>>         Synopsis:
>>>>>>>         Product Name: solaris
>>>>>>>         Product Release: osol_2008.11
>>>>>>>         Product Build: osol_2008.11
>>>>>>>         Operating System: osol_2008.11
>>>>>>>         Hardware: generic_ibm_compatible
>>>>>>>         Reference Number:
>>>>>>>         Sun Contact: dana.myers at sun.com
>>>>>>>         Status: Open
>>>>>>>         Source: BugTraq2
>>>>>>>         Reproducible:
>>>>>>>         Submitted By: dana.myers at sun.com
>>>>>>>         Submitted Date: 2008-11-10 22:34:45 GMT+00:00
>>>>>>>         Description:
>>>>>>>
>>>>>>> === *Service Request* 
>>>>>>> ========================================================
>>>>>>>         ID: 1-493265801
>>>>>>>         Customer:
>>>>>>>         Account Name: Sun Microsystems
>>>>>>>         Customer Contact: pawel.wojcik at sun.com
>>>>>>>         Customer Contact Role: D-Development
>>>>>>>         Customer Contact Type: I-Internal (SMI) Customer
>>>>>>>         Impact: Critical
>>>>>>>         Functionality: Primary
>>>>>>>         Severity: 1
>>>>>>>         Synopsis:
>>>>>>>         Product Name: solaris
>>>>>>>         Product Release: osol_2008.11
>>>>>>>         Product Build: osol_2008.11
>>>>>>>         Operating System: solaris
>>>>>>>         Hardware: intel
>>>>>>>         Reference Number:
>>>>>>>         Sun Contact: pawel.wojcik at sun.com
>>>>>>>         Status:
>>>>>>>         Source: BugTraq2
>>>>>>>         Reproducible:
>>>>>>>         Submitted By: pawel.wojcik at sun.com
>>>>>>>         Submitted Date: 2008-11-10 22:50:53 GMT+00:00
>>>>>>>         Description:
>>>>>>>
>>>>>>> === *Activity* 
>>>>>>> ===============================================================
>>>>>>>
>>>>>>>
>>>>>>> === *Multiple Release (MR) Cluster* - 0 
>>>>>>> ======================================
>>>>>>>
>>>>>>>
>>>>>>> === *Escalations* 
>>>>>>> ============================================================
>>>>>>>
>>>>>>>   
> 
> _______________________________________________
> caiman-discuss mailing list
> caiman-discuss at opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/caiman-discuss

