Re: [zfs-discuss] [mdb-discuss] onnv_142 - vfs_mountroot: cannot mount root
Hi,

The issue here was ZFS's use of DKIOCGMEDIAINFOEXT, introduced in changeset 12208. Forcing DKIOCGMEDIAINFO solved it.

On Tue, Sep 7, 2010 at 4:35 PM, Gavin Maltby wrote:
> On 09/07/10 23:26, Piotr Jasiukajtis wrote:
>>
>> Hi,
>>
>> After upgrading from snv_138 to snv_142 or snv_145 I'm unable to boot
>> the system. Here is what I get.
>>
>> Any idea why it's not able to import rpool?
>>
>> I have also seen this issue on older builds on different machines.
>
> This sounds (based on the presence of cpqary) not unlike:
>
> 6972328 Installation of snv_139+ on HP BL685c G5 fails due to panic
> during auto install process
>
> which was introduced into onnv_139 by the fix for:
>
> 6927876 For 4k sector support, ZFS needs to use DKIOCGMEDIAINFOEXT
>
> The fix is in onnv_148, after the external push switch-off, fixed via:
>
> 6967658 sd_send_scsi_READ_CAPACITY_16() needs to handle SBC-2 and SBC-3
> response formats
>
> I experienced this on data pools rather than the rpool, but I suspect on
> the rpool you'd get the vfs_mountroot panic you see when the rpool
> import fails. My workaround was to compile a zfs with the fix for
> 6927876 changed to force the default physical block size of 512 and
> drop that into the BE before booting to it. There was no simpler
> workaround available.
>
> Gavin

--
Piotr Jasiukajtis | estibi | SCA OS0072
http://estseg.blogspot.com
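For anyone checking whether a system is exposed: the sector size ZFS settled on is recorded as the vdev ashift in the pool configuration, which zdb can print. A minimal sketch (ashift 9 means 512-byte sectors, 12 means 4K; the output line here is only illustrative):

  # zdb -C rpool | grep ashift
              ashift: 9

Bug 6927876 changed where ZFS gets that size from, so a pool whose vdevs show an unexpected ashift on snv_142+ would be consistent with the ioctl regression Gavin describes.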
Re: [zfs-discuss] dedup testing?
AFAIK that part of the dedup code has not changed in b147.

On Sat, Sep 25, 2010 at 6:54 PM, Roy Sigurd Karlsbakk wrote:
> Hi all
>
> Has anyone done any testing with dedup on OI? On OpenSolaris there is a
> nifty "feature" that allows the system to hang for hours or days if you
> attempt to delete a dataset on a deduped pool. This is said to be fixed,
> but I haven't seen that myself, so I'm just wondering...
>
> I'll get a 10TB test box released for testing OI in a few weeks, but
> before then, has anyone tested this?
>
> Vennlige hilsener / Best regards
>
> roy
> --
> Roy Sigurd Karlsbakk
> (+47) 97542685
> r...@karlsbakk.net
> http://blogg.karlsbakk.net/
> --
> In all pedagogy it is essential that the curriculum be presented
> intelligibly. It is an elementary imperative for all pedagogues to avoid
> excessive use of idioms of foreign origin. In most cases, adequate and
> relevant synonyms exist in Norwegian.

--
Piotr Jasiukajtis | estibi | SCA OS0072
http://estseg.blogspot.com
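The hang, when it happens, is easy to probe for on a scratch pool before trusting real data to it. A rough recipe (a sketch only; the pool name, device, and sizes are placeholders, and the copies exist because random data only dedups against itself):

  # scratch pool on a spare disk -- the device name is a placeholder
  zpool create -O dedup=on testpool c9t0d0
  zfs create testpool/ddt
  # seed with random data, then duplicate it so the DDT actually fills
  dd if=/dev/urandom of=/testpool/ddt/seed bs=1024k count=1024
  cp /testpool/ddt/seed /testpool/ddt/copy1
  cp /testpool/ddt/seed /testpool/ddt/copy2
  cp /testpool/ddt/seed /testpool/ddt/copy3
  sync
  zpool get dedupratio testpool
  # the reported hang is in dataset destruction, so time that step
  time zfs destroy testpool/ddt

If the destroy takes minutes (or never returns) instead of seconds, you are looking at the same DDT-walk problem Roy describes.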
Re: [zfs-discuss] [mdb-discuss] mdb -k - I/O usage
ADDRESS       TYPE   STAGE            WAITER
ff05664b0328  NULL   CHECKSUM_VERIFY  ff051bb13b00
ff05628fa680  WRITE  VDEV_IO_START    -
ff0567d15370  WRITE  VDEV_IO_START    -
ff0567409ce0  WRITE  VDEV_IO_START    -
ff0566cbf968  WRITE  VDEV_IO_START    -
ff056748cca8  WRITE  VDEV_IO_START    -
ff055b184028  WRITE  VDEV_IO_START    -
ff0567482328  WRITE  VDEV_IO_START    -
ff0562f73658  WRITE  VDEV_IO_START    -
ff04eb660060  NULL   OPEN             -
ff04e96f9c88  NULL   OPEN             -
ff05207bd658  NULL   CHECKSUM_VERIFY  ff001fe7fc60
ff055bc67060  WRITE  VDEV_IO_START    -
ff0568160048  WRITE  VDEV_IO_START    -
ff05661fbca8  WRITE  VDEV_IO_START    -
ff0566edacc0  WRITE  VDEV_IO_START    -
ff05665d5018  WRITE  VDEV_IO_START    -
ff05667c3008  WRITE  VDEV_IO_START    -
ff05664b39c0  WRITE  VDEV_IO_START    -
ff051cea6010  WRITE  VDEV_IO_START    -
ff051d70      WRITE  VDEV_IO_START    -
ff0521255048  WRITE  VDEV_IO_START    -

This is not the full output:

> ::walk zio_root | ::zio -r ! wc -l
7099

I am hitting this issue on 2 machines, both running b128. The system is
not responsive (ping still works), so I bet there is some kind of
deadlock within ZFS. Were there any known ZFS-related bugs similar to
this one in b128?

On Mon, Sep 6, 2010 at 12:13 PM, Jason Banham wrote:
> On 06/09/2010 10:56, Piotr Jasiukajtis wrote:
>>
>> Hi,
>>
>> I am looking for ideas on how to check whether the machine was under
>> high I/O pressure before it panicked (the panic was triggered manually
>> by an NMI). By I/O I mean the disks and the ZFS stack.
>
> Do you believe ZFS was a key component in the I/O pressure?
> I've CC'd zfs-discuss@opensolaris.org on my reply.
>
> If you think there was a lot of I/O happening, you could run:
>
>   ::walk zio_root | ::zio -r
>
> This should give you an idea of the amount of ZIO going through ZFS.
> I would also be curious to look at the state of the pool(s) and the
> ZFS memory usage:
>
>   ::spa -ev
>   ::arc
>
> Kind regards,
>
> Jason

--
Piotr Jasiukajtis | estibi | SCA OS0072
http://estseg.blogspot.com
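To chase the suspected deadlock further, the usual next step (a minimal sketch, against the live kernel or a crash dump, assuming a build whose mdb has ::stacks) is to cluster the kernel threads and see where the waiters are parked:

  # mdb -k
  > ::stacks -m zfs
        (unique kernel stacks for threads currently in zfs code)
  > ::stacks -c zio_wait
        (only the threads blocked waiting for a zio to complete)
  > ff051bb13b00::findstack -v
        (full stack for one of the WAITER addresses in the ::zio output above)

A genuine deadlock usually shows up as two stacks each holding a lock the other wants; writes all parked in VDEV_IO_START with nothing completing points more at the vdev/driver layer.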
Re: [zfs-discuss] [mdb-discuss] mdb -k - I/O usage
I don't have any errors from fmdump or syslog. The machine is a SUN FIRE
X4275; I don't use the mpt or lsi drivers. It could still be a bug in a
driver, since I see this on 2 identical machines.

On Fri, Sep 10, 2010 at 9:51 PM, Carson Gaspar wrote:
> On 9/10/10 4:16 PM, Piotr Jasiukajtis wrote:
>>
>> Ok, now I know it's not related to I/O performance, but to ZFS itself.
>>
>> At some point all 3 pools were locked up in this way:
>>
>>                     extended device statistics                  errors
>>   r/s  w/s  kr/s  kw/s  wait  actv  wsvc_t  asvc_t  %w   %b  s/w  h/w  trn  tot  device
>>   0.0  0.0   0.0   0.0   0.0   0.0     0.0     0.0   0    0    0    1    0    1  c8t0d0
>>   0.0  0.0   0.0   0.0   0.0   8.0     0.0     0.0   0  100    0    0    0    0  c7t0d0
>
> Nope, most likely your disks or disk controller/driver. Note that you
> have 8 outstanding I/O requests that aren't being serviced. Look in your
> syslog, and I bet you'll see I/O timeout errors. I have seen this before
> with Western Digital disks attached to an LSI controller using the mpt
> driver. There was a lot of work diagnosing it; see the list archives. An
> /etc/system change fixed it for me (set xpv_psm:xen_support_msi = -1),
> but I was using a xen kernel. Note that replacing my disks with larger
> Seagate ones made the problem go away as well.

--
Piotr Jasiukajtis | estibi | SCA OS0072
http://estseg.blogspot.com
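For the archives, the checks that would confirm or rule out a driver problem here (a sketch; the grep patterns are only suggestions):

  # per-device soft/hard/transport error counters since boot
  iostat -En
  # FMA error telemetry, verbose
  fmdump -eV | egrep -i 'timeout|reset|retry'
  # anything the sd driver managed to log
  egrep -i 'timeout|retry' /var/adm/messages

Clean counters everywhere while the disks sit at 100% busy with zero throughput would strengthen the case that commands are being lost below ZFS rather than failing outright.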
Re: [zfs-discuss] [mdb-discuss] mdb -k - I/O usage
Ok, now I know it's not related to I/O performance, but to ZFS itself.

At some point all 3 pools were locked up in this way:

                    extended device statistics                  errors
  r/s  w/s  kr/s  kw/s  wait  actv  wsvc_t  asvc_t  %w   %b  s/w  h/w  trn  tot  device
  0.0  0.0   0.0   0.0   0.0   0.0     0.0     0.0   0    0    0    1    0    1  c8t0d0
  0.0  0.0   0.0   0.0   0.0   8.0     0.0     0.0   0  100    0    0    0    0  c7t0d0
  0.0  0.0   0.0   0.0   0.0   8.0     0.0     0.0   0  100    0    0    0    0  c7t1d0
  0.0  0.0   0.0   0.0   0.0   4.0     0.0     0.0   0  100    0    0    0    0  c7t2d0
  0.0  0.0   0.0   0.0   0.0   4.0     0.0     0.0   0  100    0    0    0    0  c7t3d0
  0.0  0.0   0.0   0.0   0.0   4.0     0.0     0.0   0  100    0    0    0    0  c7t4d0
  0.0  0.0   0.0   0.0   0.0   4.0     0.0     0.0   0  100    0    0    0    0  c7t5d0
  0.0  0.0   0.0   0.0   0.0   4.0     0.0     0.0   0  100    0    0    0    0  c7t10d0
  0.0  0.0   0.0   0.0   0.0   3.0     0.0     0.0   0  100    0    0    0    0  c7t11d0
^C

# zpool status
  pool: data
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        data        ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            c7t2d0  ONLINE       0     0     0
            c7t3d0  ONLINE       0     0     0
          mirror-1  ONLINE       0     0     0
            c7t4d0  ONLINE       0     0     0
            c7t5d0  ONLINE       0     0     0

errors: No known data errors

  pool: rpool
 state: ONLINE
status: The pool is formatted using an older on-disk format. The pool can
        still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'. Once this is done, the
        pool will no longer be accessible on older software versions.
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        rpool         ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            c7t0d0s0  ONLINE       0     0     0
            c7t1d0s0  ONLINE       0     0     0

errors: No known data errors

  pool: tmp_data
 state: ONLINE
status: One or more devices is currently being resilvered. The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h1m, 0.74% done, 2h21m to go
config:

        NAME         STATE     READ WRITE CKSUM
        tmp_data     ONLINE       0     0     0
          mirror-0   ONLINE       0     0     0
            c7t11d0  ONLINE       0     0     0
            c7t10d0  ONLINE       0     0     0  2.07G resilvered

errors: No known data errors

The resilver of tmp_data is not related; I ran zpool attach manually.

On Tue, Sep 7, 2010 at 12:39 PM, Piotr Jasiukajtis wrote:
> This is snv_128 x86.
>
>> ::arc
> hits                      = 39811943
> misses                    = 630634
> demand_data_hits          = 29398113
> demand_data_misses        = 490754
> demand_metadata_hits      = 10413660
> demand_metadata_misses    = 133461
> prefetch_data_hits        = 0
> prefetch_data_misses      = 0
> prefetch_metadata_hits    = 170
> prefetch_metadata_misses  = 6419
> mru_hits                  = 2933011
> mru_ghost_hits            = 43202
> mfu_hits                  = 36878818
> mfu_ghost_hits            = 45361
> deleted                   = 1299527
> recycle_miss              = 46526
> mutex_miss                = 355
> evict_skip                = 25539
> evict_l2_cached           = 0
> evict_l2_eligible         = 77011188736
> evict_l2_ineligible       = 76253184
> hash_elements             = 278135
> hash_elements_max         = 279843
> hash_collisions           = 1653518
> hash_chains               = 75135
> hash_chain_max            = 9
> p                         = 4787 MB
> c                         = 5722 MB
> c_min                     = 715 MB
> c_max                     = 5722 MB
> size                      = 5428 MB
> hdr_size                  = 56535840
> data_size                 = 5158287360
> other_size                = 477726560
> l2_hits                   = 0
> l2_misses                 = 0
> l2_feeds                  = 0
> l2_rw_clash               = 0
> l2_read_bytes             = 0
> l2_write_bytes            = 0
> l2_writes_sent            = 0
> l2_writes_done            = 0
> l2_writes_error           = 0
> l2_writes_hdr_miss        = 0
> l2_evict_lock_retry       = 0
> l2_evict_reading          = 0
> l2_free_on_write          = 0
> l2_abort_lowmem           = 0
> l2_cksum_bad
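When the lockup is reproducible, capturing the device and ARC state together over time makes the post-mortem much easier than a single snapshot. A minimal capture loop (a sketch; the interval and log path are arbitrary):

  #!/bin/sh
  # snapshot iostat and the ARC every 10 seconds until interrupted
  while true; do
      date
      iostat -xn 1 2            # second sample reflects current activity
      echo ::arc | mdb -k       # ARC size and hit/miss counters
      sleep 10
  done >> /var/tmp/io-capture.log 2>&1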
[zfs-discuss] ZFS dedup issue
Hi,

Let's take a look:

# zpool list
NAME    SIZE   USED  AVAIL   CAP   DEDUP  HEALTH  ALTROOT
rpool    68G  13.9G  54.1G   20%  42.27x  ONLINE  -

# zfs get all rpool/export/data
NAME               PROPERTY              VALUE                  SOURCE
rpool/export/data  type                  filesystem             -
rpool/export/data  creation              Mon Nov 2 16:11 2009   -
rpool/export/data  used                  46.7G                  -
rpool/export/data  available             38.7M                  -
rpool/export/data  referenced            46.7G                  -
rpool/export/data  compressratio         1.00x                  -
rpool/export/data  mounted               yes                    -
rpool/export/data  quota                 none                   default
rpool/export/data  reservation           none                   default
rpool/export/data  recordsize            128K                   default
rpool/export/data  mountpoint            /export/data           inherited from rpool/export
rpool/export/data  sharenfs              off                    default
rpool/export/data  checksum              on                     default
rpool/export/data  compression           off                    default
rpool/export/data  atime                 on                     default
rpool/export/data  devices               on                     default
rpool/export/data  exec                  on                     default
rpool/export/data  setuid                on                     default
rpool/export/data  readonly              off                    default
rpool/export/data  zoned                 off                    default
rpool/export/data  snapdir               hidden                 default
rpool/export/data  aclmode               groupmask              default
rpool/export/data  aclinherit            restricted             default
rpool/export/data  canmount              on                     default
rpool/export/data  shareiscsi            off                    default
rpool/export/data  xattr                 on                     default
rpool/export/data  copies                1                      default
rpool/export/data  version               4                      -
rpool/export/data  utf8only              off                    -
rpool/export/data  normalization         none                   -
rpool/export/data  casesensitivity       sensitive              -
rpool/export/data  vscan                 off                    default
rpool/export/data  nbmand                off                    default
rpool/export/data  sharesmb              off                    default
rpool/export/data  refquota              none                   default
rpool/export/data  refreservation        none                   default
rpool/export/data  primarycache          all                    default
rpool/export/data  secondarycache        all                    default
rpool/export/data  usedbysnapshots       0                      -
rpool/export/data  usedbydataset         46.7G                  -
rpool/export/data  usedbychildren        0                      -
rpool/export/data  usedbyrefreservation  0                      -
rpool/export/data  logbias               latency                default
rpool/export/data  dedup                 on                     local
rpool/export/data  org.opensolaris.caiman:install  ready        inherited from rpool

# df -h
Filesystem               Size  Used  Avail  Use%  Mounted on
rpool/ROOT/os_b123_dev   2.4G  2.4G    40M   99%  /
swap                     9.1G  336K   9.1G    1%  /etc/svc/volatile
/usr/lib/libc/libc_hwcap1.so.1
                         2.4G  2.4G    40M   99%  /lib/libc.so.1
swap                     9.1G     0   9.1G    0%  /tmp
swap                     9.1G   40K   9.1G    1%  /var/run
rpool/export              40M   25K    40M    1%  /export
rpool/export/home         40M   30K    40M    1%  /export/home
rpool/export/home/admin  460M  421M    40M   92%  /export/home/admin
rpool                     40M   83K    40M    1%  /rpool
rpool/expo
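Given a 42.27x dedup ratio on a 68G pool holding a 46.7G dataset with only 38.7M available, the dedup table itself is worth a look; zdb can summarize it. A sketch (run against the live pool):

  # DDT summary: unique vs. duplicated blocks, on-disk and in-core sizes
  zdb -D rpool
  # full histogram, bucketed by reference count
  zdb -DD rpool
  # what dedup would save on a pool that doesn't have it enabled yet
  zdb -S rpool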