Re: [zfs-discuss] HP Proliant DL360 G7

2013-01-08 Thread mark

 
 On Jul 2, 2012, at 7:57 PM, Richard Elling wrote:
 

 
 FYI, HP also sells an 8-port IT-style HBA (SC-08Ge), but it is hard to locate 
 with their configurators. There might be a more modern equivalent cleverly
 hidden somewhere difficult to find.
  -- richard
 

Richard,

Do you know if the HBAs in HP controllers can be swapped out with any well-
characterized (by Nexenta) HBAs like the 9211-8e, or do they require a specific
'controller HBA' like the SC-08Ge?  I.e., does it void the warranty if you open up
the controller and stick a third-party card in there?  Did you ever try to
'bypass' the controllers at all and just plug into an expander?  I prefer HP
hardware also, but the controller is getting in the way.

I'll be asking HP the same questions in the next few weeks with any luck, but your
opinion and experiences are on another level compared to HP's pre-sales
department... not that they're bad, but in this realm you're the man :)

Thanks,
Mark



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] HP Proliant DL360 G7

2013-01-08 Thread Mark -
Good call, Saso.  Sigh... I guess I'll wait to hear from HP on supported IT-mode
HBAs in their D2000s or other JBODs.


On Tue, Jan 8, 2013 at 11:40 AM, Sašo Kiselkov skiselkov...@gmail.com wrote:

 On 01/08/2013 04:27 PM, mark wrote:
  On Jul 2, 2012, at 7:57 PM, Richard Elling wrote:
 
  FYI, HP also sells an 8-port IT-style HBA (SC-08Ge), but it is hard to
 locate
  with their configurators. There might be a more modern equivalent
 cleverly
  hidden somewhere difficult to find.
   -- richard
 
 
  Richard,
 
  Do you know if the HBAs in HP controllers can be swapped out with any well-
  characterized (by Nexenta) HBAs like the 9211-8e, or do they require a
  specific 'controller HBA' like the SC-08Ge?  I.e., does it void the warranty
  if you open up the controller and stick a third-party card in there?  Did you
  ever try to 'bypass' the controllers at all and just plug into an expander?
  I prefer HP hardware also, but the controller is getting in the way.

  I'll be asking HP the same questions in the next few weeks with any luck, but
  your opinion and experiences are on another level compared to HP's pre-sales
  department... not that they're bad, but in this realm you're the man :)

 I know you didn't ask me, but I can tell you my experience: it depends
 on what you mean by warranty. If you mean warranty on the sale of
 goods (as mandated by law), then no, sticking a different HBA in your
 servers does not void your warranty (unless this is expressly labeled on
 the product - in which case manufacturers typically also put protective
 labels over the screws).

 When it comes to support services, though, such as phone support and
 firmware updates, then yes, using a third-party HBA can make these
 difficult and/or impossible. HP storage enclosure and drive firmware,
 for example, can only be flashed through an HP-branded SmartArray card.

 Depending on what software you are running on the machines, it can make
 no difference at all, or a lot of difference. For instance, if you're
 running proprietary storage controller software on the server (think
 something like NexentaStor, but from the HW vendor), then your custom
 HBA might simply be flat-out unsupported, and the only response you'll
 get from the vendor support team is "stick the card we shipped it with
 back in." OTOH, if you're running something not HW vendor-specific (like
 the aforementioned NexentaStor, or any other Illumos variant), and the
 HW vendor at least gives lip service to supporting your platform (always
 tell the support folk you're running Solaris), then chances are that
 your support contract will be just as valid as before. I've had drives
 fail on Dell machines, and each time support was happy when I just told
 them "drive dead, running Solaris, here's the log output, please send a
 new one."

 Cheers,
 --
 Saso

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Repairing corrupted ZFS pool

2012-11-19 Thread Mark Shellenbaum

On 11/16/12 17:15, Peter Jeremy wrote:

I have been tracking down a problem with zfs diff that reveals
itself variously as a hang (unkillable process), panic or error,
depending on the ZFS kernel version but seems to be caused by
corruption within the pool.  I am using FreeBSD but the issue looks to
be generic ZFS, rather than FreeBSD-specific.

The hang and panic are related to the rw_enter() in
opensolaris/uts/common/fs/zfs/zap.c:zap_get_leaf_byblk()



There is probably nothing wrong with the snapshots.  This is a bug in
ZFS diff.  The ZPL parent pointer is only guaranteed to be correct for
directory objects.  What you probably have is a file that was hard-linked
multiple times, and the parent pointer (i.e. the directory) was recycled
and is now a file.




The error is:
Unable to determine path or stats for object 2128453 in 
tank/beckett/home@20120518: Invalid argument

A scrub reports no issues:
root@FB10-64:~ # zpool status
   pool: tank
  state: ONLINE
status: The pool is formatted using a legacy on-disk format.  The pool can
        still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
        pool will no longer be accessible on software that does not support
        feature flags.
   scan: scrub repaired 0 in 3h24m with 0 errors on Wed Nov 14 01:58:36 2012
config:

 NAMESTATE READ WRITE CKSUM
 tankONLINE   0 0 0
   ada2  ONLINE   0 0 0

errors: No known data errors

But zdb says that object is the child of a plain file - which isn't sane:

root@FB10-64:~ # zdb -vvv tank/beckett/home@20120518 2128453
Dataset tank/beckett/home@20120518 [ZPL], ID 605, cr_txg 8379, 143G, 2026419 objects, 
rootbp DVA[0]=0:266a0efa00:200  DVA[1]=0:31b07fbc00:200  [L0 DMU objset] 
fletcher4 lzjb LE contiguous unique double size=800L/200P birth=8375L/8375P fill=2026419 
cksum=1acdb1fbd9:93bf9c61e94:1b35c72eb8adb:389743898e4f79

    Object  lvl   iblk   dblk  dsize  lsize   %full  type
   2128453    1    16K  1.50K  1.50K  1.50K  100.00  ZFS plain file
                                        264   bonus  ZFS znode
        dnode flags: USED_BYTES USERUSED_ACCOUNTED
        dnode maxblkid: 0
        path    ???<object#2128453>
        uid     1000
        gid     1000
        atime   Fri Mar 23 16:34:52 2012
        mtime   Sat Oct 22 16:13:42 2011
        ctime   Sun Oct 23 21:09:02 2011
        crtime  Sat Oct 22 16:13:42 2011
        gen     2237174
        mode    100444
        size    1089
        parent  2242171
        links   1
        pflags  4080004
        xattr   0
        rdev    0x

root@FB10-64:~ # zdb -vvv tank/beckett/home@20120518 2242171
Dataset tank/beckett/home@20120518 [ZPL], ID 605, cr_txg 8379, 143G, 2026419 objects, 
rootbp DVA[0]=0:266a0efa00:200  DVA[1]=0:31b07fbc00:200  [L0 DMU objset] 
fletcher4 lzjb LE contiguous unique double size=800L/200P birth=8375L/8375P fill=2026419 
cksum=1acdb1fbd9:93bf9c61e94:1b35c72eb8adb:389743898e4f79

    Object  lvl   iblk   dblk  dsize  lsize   %full  type
   2242171    3    16K   128K  25.4M  25.5M  100.00  ZFS plain file
                                        264   bonus  ZFS znode
        dnode flags: USED_BYTES USERUSED_ACCOUNTED
        dnode maxblkid: 203
        path    /jashank/Pictures/sch/pdm-a4-11/stereo-pair-2.png
        uid     1000
        gid     1000
        atime   Fri Mar 23 16:41:53 2012
        mtime   Mon Oct 24 21:15:56 2011
        ctime   Mon Oct 24 21:15:56 2011
        crtime  Mon Oct 24 21:15:37 2011
        gen     2286679
        mode    100644
        size    26625731
        parent  7001490
        links   1
        pflags  4080004
        xattr   0
        rdev    0x

root@FB10-64:~ # zdb -vvv tank/beckett/home@20120518 7001490
Dataset tank/beckett/home@20120518 [ZPL], ID 605, cr_txg 8379, 143G, 2026419 objects, 
rootbp DVA[0]=0:266a0efa00:200  DVA[1]=0:31b07fbc00:200  [L0 DMU objset] 
fletcher4 lzjb LE contiguous unique double size=800L/200P birth=8375L/8375P fill=2026419 
cksum=1acdb1fbd9:93bf9c61e94:1b35c72eb8adb:389743898e4f79

    Object  lvl   iblk   dblk  dsize  lsize   %full  type
   7001490    1    16K    512     1K    512  100.00  ZFS directory
                                        264   bonus  ZFS znode
        dnode flags: USED_BYTES USERUSED_ACCOUNTED
        dnode maxblkid: 0
        path    /jashank/Pictures/sch/pdm-a4-11
        uid     1000
        gid     1000
        atime   Thu May 17 03:38:32 2012
        mtime   Mon Oct 24 21:15:37 2011
        ctime   Mon Oct 24 21:15:37 2011
        crtime  Fri Oct 14 22:17:44 2011
        gen     2088407
        mode    40755
        size    6
        parent  6370559
        links   2
        pflags  4080144
        xattr   0
        rdev    0x
        microzap: 512 bytes, 4 entries

 

Re: [zfs-discuss] Repairing corrupted ZFS pool

2012-11-19 Thread Mark Shellenbaum

On 11/19/12 1:14 PM, Jim Klimov wrote:

On 2012-11-19 20:58, Mark Shellenbaum wrote:

There is probably nothing wrong with the snapshots.  This is a bug in
ZFS diff.  The ZPL parent pointer is only guaranteed to be correct for
directory objects.  What you probably have is a file that was hard-linked
multiple times, and the parent pointer (i.e. the directory) was recycled
and is now a file.


Interesting... do the ZPL files in ZFS keep pointers to parents?



The parent pointer for hard linked files is always set to the last link 
to be created.


$ mkdir dir.1
$ mkdir dir.2
$ touch dir.1/a
$ ln dir.1/a dir.2/a.linked
$ rm -rf dir.2

Now the parent pointer for 'a' will reference a removed directory.
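
(If you want to see that on a live pool, the stale pointer shows up in the znode
dump -- dataset name and object number hypothetical:

  # zdb -vvv tank/fs <object-number-of-a>

and look at the 'parent' field.)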

The parent pointer is a single 64 bit quantity that can't track all the 
possible parents a hard linked file could have.


Now when the original dir.2 object number is recycled, you could have a
situation where the parent pointer for 'a' points to a non-directory.


The ZPL never uses the parent pointer internally.  It is only used by 
zfs diff and other utility code to translate object numbers to full 
pathnames.  The ZPL has always set the parent pointer, but it is more 
for debugging purposes.



How in the COW transactiveness could the parent directory be
removed, and not the pointer to it from the files inside it?
Is this possible in current ZFS, or could this be a leftover
in the pool from its history with older releases?

Thanks,
//Jim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Trick to keeping NFS file references in kernel memory for Dtrace?

2012-10-03 Thread Mark

Hey all,

So I have a couple of storage boxes (NexentaCore and Illumian) and have
been playing with some DTrace scripts to monitor NFS usage.  Initially I
ran into the (seemingly common) problem of basically everything showing
up as 'Unknown', and then after some searching online I found a
workaround: do a 'find' on the file system from the remote end, which
refreshes the kernel's knowledge of the files.  This works... however,
it doesn't stay for good.  It seems to sometimes last a couple
of hours (and sometimes much less), and then we are back to receiving
'Unknown's.
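
(A minimal sketch of the kind of probe involved, using the nfsv3 provider; the
noi_curpath argument is what shows up as 'Unknown' when the kernel can't
resolve the path:)

  # dtrace -n 'nfsv3:::op-read-start,nfsv3:::op-write-start { @[args[1]->noi_curpath] = count(); }'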


Has anyone else come across something similar?  Does anyone know what
may be causing the kernel to lose the references?  There is plenty of
memory in the main system (72GB, with the ARC sitting at ~53GB and 11GB 'free'),
so I don't think an OOM situation is causing it.



Otherwise, does anyone have any other tips for monitoring usage?  I
wonder how they have it all working in Fishworks gear, as some of the
analytics demos show you being able to drill down through file
activity in real time.



Any advice or suggestions greatly appreciated.

Cheers,
Mark
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Zpool recovery after too many failed disks

2012-08-27 Thread Mark Wolek
RAIDz set, lost a disk, replaced it... lost another disk during resilver.  
Replaced it, ran another resilver, and now it shows all disks with too many 
errors.

Safe to say this is getting rebuilt and restored, or is there hope to recover 
some of the data?  I assume this is the case because rpool/filemover has 
errors, is that fixable?



# zpool status -v
  pool: rpool
state: DEGRADED
status: One or more devices has experienced an error resulting in data
corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
scrub: resilver completed after 4h51m with 190449 errors on Sat Aug 25 05:45:12 2012
config:

NAME  STATE READ WRITE CKSUM
rpool DEGRADED  455K 0 0
  raidz1  DEGRADED  455K 0 0
c3t0d0DEGRADED 0 0 0  too many errors
c2t1d0DEGRADED 0 0 0  too many errors
replacing UNAVAIL  0 0 0  insufficient replicas
  c2t0d0s0/o  FAULTED  0 0 0  too many errors
  c2t0d0  FAULTED  0 0 0  too many errors
c3t1d0DEGRADED 0 0 0  too many errors
c4t0d0DEGRADED 0 0 0  too many errors
c4t1d0DEGRADED 0 0 0  too many errors

errors: Permanent errors have been detected in the following files:

rpool/filemover:0x1

# zfs list
NAME  USED  AVAIL  REFER  MOUNTPOINT
rpool6.64T  0  29.9K  /rpool
rpool/filemover  6.64T   323G  6.32T  -

Thanks
Mark

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Problem with ESX NFS store on ZFS

2012-03-01 Thread Mark Wolek
Thank you, it was the NFS ACL I had wrong!  Fixed now and working on all 3
nodes.  I changed the below and it works now; very simple, can't believe I
missed that.

zfs get sharenfs
pool1/nas/vol1 sharenfs  rw,nosuid,root=192.168.1.52  local

zfs get sharenfs
pool1/nas/vol1 sharenfs  rw,nosuid,root=192.168.1.52:192.168.1.51:192.168.1.53  local
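
For the archives, the change amounted to resetting the sharenfs property on that
dataset, i.e. something along the lines of:

zfs set sharenfs='rw,nosuid,root=192.168.1.52:192.168.1.51:192.168.1.53' pool1/nas/vol1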

-Original Message-
From: Jim Klimov [mailto:jimkli...@cos.ru] 
Sent: Wednesday, February 29, 2012 1:44 PM
To: Mark Wolek
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Problem with ESX NFS store on ZFS

2012-02-29 21:15, Mark Wolek wrote:
 Running Solaris 11 with ZFS and the VM's on this storage can only be 
 opened and run on 1 ESX host, if I move the files to another host I 
 get access denied, even though root has full permissions to the files.

 Any ideas or does it ring any bells for anyone before I contact VMware 
 or something?



Probably, NFS UID mapping is faulty, or the NFS server ACL does not allow
for another server.

For UID mapping, in particular see the domain name settings:
/etc/defaultdomain
/etc/resolv.conf (search, domain lines)
/etc/default/nfs or appropriate SMF settings (NFSMAPID_DOMAIN)
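
A quick way to check the mapping domain (the sharectl form applies on releases
where the NFS settings have moved into SMF):

# grep NFSMAPID_DOMAIN /etc/default/nfs
# sharectl get -p nfsmapid_domain nfs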

For NFS ACL see the sharenfs property:

# zfs set sharenfs='rw=esx:cvs:.domain.com:.jumbo.domain.com:@192.168.127.0/24,root=esx:cvs:192.168.127.99' pool/esxfiles

Critical fields are the 'rw', 'ro' and 'root' lists of hosts or subnets of
clients which have the appropriate types of access.
For hosts not in the 'root' list, their allowed 'ro' or 'rw'
access as the root user will be remapped to nobody.
You might also want 'anon=0,sec=sys', which seems to be appended by default on my
installations of Solaris; not sure if it is the default in Sol11.

Note that clients' hostnames can be resolved via /etc/hosts, DNS or LDAP, as
configured in your /etc/nsswitch.conf, and sometimes via /etc/inet/ipnodes as
a fallback mechanism.
Your server only gets one shot at resolving the client's name, and if it is not
literally the same as in the NFS ACL, access is denied. You might want to fall
back to domain-based or subnet-based ACLs (which may require the @ character).

For pointers to server-side ACL denials see the server's dmesg with entries 
resembling this:

Feb 29 19:35:01 thumper mountd[10782]: [ID 770583 daemon.error] 
esx.demo.domain.com denied access to /esxfiles/vm5

In particular, the entry produces the client's hostname as the server resolved 
it, so you can see if your ACL (or naming service) was misconfigured.

HTH,
//Jim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Problem with ESX NFS store on ZFS

2012-02-29 Thread Mark Wolek
Running Solaris 11 with ZFS, and the VMs on this storage can only be opened and
run on one ESX host; if I move the files to another host I get access denied,
even though root has full permissions on the files.

Any ideas or does it ring any bells for anyone before I contact VMware or 
something?

Thanks
Mark
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup

2011-12-08 Thread Mark Musante

You can see the original ARC case here:

http://arc.opensolaris.org/caselog/PSARC/2009/557/20091013_lori.alt

On 8 Dec 2011, at 16:41, Ian Collins wrote:

 On 12/ 9/11 12:39 AM, Darren J Moffat wrote:
 On 12/07/11 20:48, Mertol Ozyoney wrote:
 Unfortunately the answer is no. Neither L1 nor L2 cache is dedup-aware.
 
 The only vendor I know that can do this is NetApp.
 
 In fact, most of our functions, like replication, are not dedup-aware.
 For example, technically it's possible to optimize our replication so that
 it does not send data chunks if a chunk with the same checksum already
 exists on the target, without enabling dedup on target and source.
 We already do that with 'zfs send -D':
 
   -D
 
   Perform dedup processing on the stream. Deduplicated
   streams  cannot  be  received on systems that do not
   support the stream deduplication feature.
 
 
 
 
 Is there any more published information on how this feature works?
 
 -- 
 Ian.
 
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] First zone creation - getting ZFS error

2011-12-06 Thread Mark Creamer
I'm running OI 151a. I'm trying to create a zone for the first time, and am
getting an error about zfs. I'm logged in as me, then su - to root before
running these commands.

I have a pool called datastore, mounted at /datastore

Per the wiki document http://wiki.openindiana.org/oi/Building+in+zones, I
first created the zfs file system (note that the command syntax in the
document appears to be wrong, so I did the options I wanted separately):

zfs create datastore/zones
zfs set compression=on datastore/zones
zfs set mountpoint=/zones datastore/zones

zfs list shows:

NAME USED  AVAIL  REFER  MOUNTPOINT
datastore   28.5M  7.13T  57.9K  /datastore
datastore/dbdata28.1M  7.13T  28.1M  /datastore/dbdata
datastore/zones 55.9K  7.13T  55.9K  /zones
rpool   27.6G   201G45K  /rpool
rpool/ROOT  2.89G   201G31K  legacy
rpool/ROOT/openindiana  2.89G   201G  2.86G  /
rpool/dump  12.0G   201G  12.0G  -
rpool/export5.53M   201G32K  /export
rpool/export/home   5.50M   201G32K  /export/home
rpool/export/home/mcreamer  5.47M   201G  5.47M  /export/home/mcreamer
rpool/swap  12.8G   213G   137M  -

Then I went about creating the zone:

zonecfg -z zonemaster
create
set autoboot=true
set zonepath=/zones/zonemaster
set ip-type=exclusive
add net
set physical=vnic0
end
exit

That all goes fine, then...

zoneadm -z zonemaster install

which returns...

ERROR: the zonepath must be a ZFS dataset.
The parent directory of the zonepath must be a ZFS dataset so that the
zonepath ZFS dataset can be created properly.

Since the zfs dataset datastore/zones is created, I don't understand what
the error is trying to get me to do. Do I have to do:

zfs create datastore/zones/zonemaster

before I can create a zone in that path? That's not in the documentation,
so I didn't want to do anything until someone can point out my error for
me. Thanks for your help!

-- 
Mark
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Log disk with all ssd pool?

2011-10-28 Thread Mark Wolek
Still kicking around this idea and didn't see it addressed in any of the 
threads before the forum closed.

If one made an all-SSD pool, would a log/cache drive just slow you down?  Would
ZIL slow you down?  Thinking of rotating MLC drives with SandForce controllers every
few years to avoid losing a drive to "sorry, no more writes allowed" scenarios.

Thanks
Mark


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Log disk with all ssd pool?

2011-10-28 Thread Mark Wolek
Having the log disk slowed it down a lot in your tests (when it wasn't an SSD):
30MB/s vs 7.  Is this also a 100% write / 100% sequential workload?  Forcing
sync?

It's gotten to the point where I can buy a 120G SSD for less than or the same price
as a 146G SAS disk... Sure, the MLC drives have a limited lifetime, but at $150
(and dropping) just replace them every few years to be safe, work out a
rotation/rebuild cycle; it's tempting...  I suppose if we do end up buying all
SSDs it becomes really easy to test whether we should use a log or not!


From: zfs-discuss-boun...@opensolaris.org 
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Neil Perrin
Sent: Friday, October 28, 2011 11:38 AM
To: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Log disk with all ssd pool?

On 10/28/11 00:54, Neil Perrin wrote:

On 10/28/11 00:04, Mark Wolek wrote:
Still kicking around this idea and didn't see it addressed in any of the 
threads before the forum closed.

If one made an all-SSD pool, would a log/cache drive just slow you down?  Would
ZIL slow you down?  Thinking of rotating MLC drives with SandForce controllers every
few years to avoid losing a drive to "sorry, no more writes allowed" scenarios.

Thanks
Mark

Interesting question. I don't think there's a straightforward answer. Oracle
uses write-optimised log devices and read-optimised cache devices in its
appliances. However, assuming all the SSDs are the same, then I suspect neither
a log nor a cache device would help:

Log
If there is a log then it is solely used, and can be written to in parallel 
with periodic TXG commit writes to the other pool devices.  If that log were 
part of the pool then the ZIL code will spread the load among all pool devices, 
but will compete with TXG commit writes.  My gut feeling is that this would be 
the higher performing option though.  I think, a long time ago, I experimented 
with designating one disk out of the pool as a log and saw degradation in
synchronous performance. That seems to be the equivalent of your SSD question.

Cache
Similarly, for cache devices the reads would compete with TXG commit writes, but
otherwise performance ought to be higher.

Neil.
Did some quick tests with disks to check if my memory was correct.
'sb' is a simple program that spawns a number of threads to fill a file of a
certain size with non-zero writes of a specified size. Bandwidth is also important.

1. Simple 2 disk system.
   32KB synchronous writes filling 1GB with 20 threads

zpool create whirl  2 disks; zfs set recordsize=32k whirl
st1 -n /whirl/f -f 1073741824 -b 32768 -t 20
Elapsed time 95s  10.8MB/s

zpool create whirl disk log disk ; zfs set recordsize=32k whirl
st1 -n /whirl/f -f 1073741824 -b 32768 -t 20
Elapsed time 151s  6.8MB/s

2. Higher end 6 disk system.
   32KB synchronous writes filling 1GB with 100 threads

zpool create whirl 6 disks; zfs set recordsize=32k whirl
st1 -n /whirl/f -f 1073741824 -b 32768 -t 100
Elapsed time 33s  31MB/s

zpool create whirl 5 disks  log 1disk; zfs set recordsize=32k whirl
st1 -n /whirl/f -f 1073741824 -b 32768 -t 100
Elapsed time 147s  7.0MB/s

and for interest:
zpool create whirl 5 disk log SSD; zfs set recordsize=32k whirl
st1 -n /whirl/f -f 1073741824 -b 32768 -t 100
 Elapsed time 8s  129MB/s

3. Higher end smaller writes
   2K synchronous writes filling 128MB with 100 threads

zpool create whirl 6 disks; zfs set recordsize=1k whirl
st1 -n /whirl/f -f 134217728 -b 2048 -t 100
Elapsed time 16s  8.2MB/s

zpool create whirl 5 disks  log 1 disk
zfs set recordsize=1k whirl
ds8 -n /whirl/f -f 134217728 -b 2048 -t 100
Elapsed time 24s  5.5MB/s

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] File contents changed with no ZFS error

2011-10-22 Thread Mark Sandrock
Why don't you see which byte differs, and how?
Maybe that would suggest the failure mode. Is it the
same byte data in all affected files, for instance?
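
For example, comparing the current copy against a snapshot copy with cmp would
give the offset and the old/new octal values (paths and snapshot name
hypothetical):

  $ cmp -l /tank/data/file.bin /tank/data/.zfs/snapshot/monthly/file.bin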

Mark

Sent from my iPhone

On Oct 22, 2011, at 2:08 PM, Robert Watzlavick rob...@watzlavick.com wrote:

 On Oct 22, 2011, at 13:14, Edward Ned Harvey 
 opensolarisisdeadlongliveopensola...@nedharvey.com wrote:
 
 How can you outrule the possibility of something changed the file.
 Intentionally, not as a form of filesystem corruption.
 
 I suppose that's possible but seems unlikely. One byte on a file changed on 
 the disk with no corresponding change in the mod time seems unlikely. I did 
 access that file for read sometime in the past few months but again, if it 
 had accidentally been written to, the time would have been updated. 
 
 If you have snapshots on your ZFS filesystem, you can use zhist (or whatever
 technique you want) to see in which snapshot(s) it changed, and find all the
 unique versions of it.  'Course that will only give you any valuable
 information if you have different versions of the file in different
 snapshots.
 
 I only have one or two snapshots but I'll look. 
 
 Thanks,
 -Bob
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread Mark Sandrock

On Oct 18, 2011, at 11:09 AM, Nico Williams wrote:

 On Tue, Oct 18, 2011 at 9:35 AM, Brian Wilson wrote:
 I just wanted to add something on fsck on ZFS - because for me that used to
 make ZFS 'not ready for prime-time' in 24x7 5+ 9s uptime environments.
 Where ZFS doesn't have an fsck command - and that really used to bug me - it
 does now have a -F option on zpool import.  To me it's the same
 functionality for my environment - the ability to try to roll back to a
 'hopefully' good state and get the filesystem mounted up, leaving the
 corrupted data objects corrupted.  [...]
 
 Yes, that's exactly what it is.  There's no point calling it fsck
 because fsck fixes individual filesystems, while ZFS fixups need to
 happen at the volume level (at volume import time).
 
 It's true that this should have been in ZFS from the word go.  But
 it's there now, and that's what matters, IMO.

Doesn't a scrub do more than what
'fsck' does?

 
 It's also true that this was never necessary with hardware that
 doesn't lie, but it's good to have it anyways, and is critical for
 personal systems such as laptops.

IIRC, fsck was seldom needed at
my former site once UFS journalling
became available. Sweet update.

Mark
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Mirror Gone

2011-09-27 Thread Mark Musante

On 27 Sep 2011, at 18:29, Edward Ned Harvey wrote:

 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Tony MacDoodle
 
 
 Now:
 mirror-0  ONLINE   0 0 0
 c1t2d0  ONLINE   0 0 0
 c1t3d0  ONLINE   0 0 0
   c1t4d0ONLINE   0 0 0
   c1t5d0ONLINE   0 0 0
 
 There is only one way for this to make sense:  You did not have mirror-1 in
 the first place.  

An easy way to tell is to take a look at the 'zpool history' output for this pool.
What does that show?
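
(i.e. something like:

  # zpool history tank

with the actual pool name substituted; it lists the zpool commands run against
the pool since it was created.)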

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] All drives intact but vdev UNAVAIL in raidz1

2011-09-06 Thread Mark J Musante

On Tue, 6 Sep 2011, Tyler Benster wrote:

It seems quite likely that all of the data is intact, and that something 
different is preventing me from accessing the pool. What can I do to 
recover the pool? I have downloaded the Solaris 11 express livecd if 
that would be of any use.


Try running zdb -l on the disk and see if the labels are still there.
Also, could you show us the output of 'zpool status'? Normally ZFS would
not hang if one disk of a raidz group is missing, but it might do that if
one toplevel is missing.


If the zdb command shows all four labels to be correct, then you can try a 
zpool scrub and see if that resilvers the data for you.
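
A rough sketch of those steps, with hypothetical device and pool names:

  # zdb -l /dev/dsk/c0t0d0s0
  # zpool status
  # zpool scrub tank    (only if all four labels look intact)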

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool replace

2011-08-15 Thread Mark J Musante


Hi Doug,

The vms pool was created in a non-redundant way, so there is no way to 
get the data off of it unless you can put back the original c0t3d0 disk.


If you can still plug in the disk, you can always do a zpool replace on it 
afterwards.
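
A hedged sketch of that path, once the original disk is physically reconnected
(the 'zpool clear' follows the action line in the status output below):

  # zpool clear vms
  # zpool replace -f vms c0t3d0 c0t5d0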


If not, you'll need to restore from backup, preferably to a pool with 
raidz or mirroring so zfs can repair faults automatically.



On Mon, 15 Aug 2011, Doug Schwabauer wrote:


Help - I've got a bad disk in a zpool and need to replace it.  I've got an 
extra drive that's not being used, although it's still marked as if it's in a
pool.
So I need to get the xvm pool destroyed, c0t5d0 marked as available, and 
replace c0t3d0 with c0t5d0.

root@kc-x4450a # zpool status -xv
  pool: vms
 state: UNAVAIL
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
   see: http://www.sun.com/msg/ZFS-8000-HC
 scrub: none requested
config:

    NAME    STATE READ WRITE CKSUM
    vms UNAVAIL  0 3 0  insufficient replicas
  c0t2d0    ONLINE   0 0 0
  c0t3d0    UNAVAIL  0 6 0  experienced I/O failures
  c0t4d0    ONLINE   0 0 0

errors: Permanent errors have been detected in the following files:

    vms:0x5
    vms:0xb
root@kc-x4450a # zpool replace -f vms c0t3d0 c0t5d0
cannot replace c0t3d0 with c0t5d0: pool I/O is currently suspended
root@kc-x4450a # zpool import
  pool: xvm
    id: 14176680653869308477
 state: DEGRADED
status: The pool was last accessed by another system.
action: The pool can be imported despite missing or damaged devices.  The
    fault tolerance of the pool may be compromised if imported.
   see: http://www.sun.com/msg/ZFS-8000-EY
config:

    xvm DEGRADED
  mirror-0  DEGRADED
    c0t4d0  FAULTED  corrupted data
    c0t5d0  ONLINE

Thanks!

-Doug




Regards,
markm
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Large scale performance query

2011-08-06 Thread Mark Sandrock
Shouldn't the choice of RAID type also
be based on the I/O requirements?

Anyway, with RAID-10, even a second
failed disk is not catastrophic, so long
as it is not the counterpart of the first
failed disk, no matter the no. of disks.
(With 2-way mirrors.)

But that's why we do backups, right?

Mark

Sent from my iPhone

On Aug 6, 2011, at 7:01 AM, Orvar Korvar knatte_fnatte_tja...@yahoo.com wrote:

 Ok, so mirrors resilver faster.
 
 But, it is not uncommon that another disk shows problem during resilver (for 
 instance r/w errors), this scenario would mean your entire raid is gone, 
 right? If you are using mirrors, and one disk crashes and you start resilver. 
 Then the other disk shows r/w errors because of the increased load - then you 
 are screwed? Because large disks take long time to resilver, possibly weeks?
 
 In that case, it would be preferable to use mirrors with 3 disks in each 
 vdev. Trimorrs. Each vdev should be one raidz3.
 -- 
 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] [illumos-Developer] zfs refratio property

2011-06-06 Thread Mark Musante

minor quibble: compressratio uses a lowercase x for the description text 
whereas the new prop uses an uppercase X


On 6 Jun 2011, at 21:10, Eric Schrock wrote:

 Webrev has been updated:
 
 http://dev1.illumos.org/~eschrock/cr/zfs-refratio/
 
 - Eric
 
 -- 
 Eric Schrock
 Delphix
 
 275 Middlefield Road, Suite 50
 Menlo Park, CA 94025
 http://www.delphix.com
 
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] NFS acl inherit problem

2011-06-01 Thread Mark Shellenbaum

On 6/1/11 12:51 AM, lance wilson wrote:

The problem is that NFS clients that connect to my Solaris 11 Express server
are not inheriting the ACLs that are set for the share. They create files that
don't have any ACL assigned to them, just the normal Unix file permissions. Can
someone please provide some additional things to test so that I can get this
sorted out.

This is the output of a normal ls -al

drwxrwxrwx+ 5 root root 11 2011-05-31 11:14 acltest

The compact version is ls -Vd

drwxrwxrwx+ 5 root root 11 May 31 11:14 /smallstore/acltest
user:root:rwxpdDaARWcCos:fd-:allow
everyone@:rwxpdDaARWc--s:fd-:allow

The parent share has the following permissions
drwxr-xr-x+ 5 root root 5 May 30 22:26 /smallstore/
user:root:rwxpdDaARWcCos:fd-:allow
everyone@:r-x---a-R-c---:fd-:allow
owner@:rwxpdDaARWcCos:fd-:allow

This is the acl for the files created by a ubuntu client. There is no acl 
inheritance occurring.

-rw-r--r-- 1 1000 1000 0 May 31 22:20 /smallstore/acltest/ubuntu_file
owner@:rw-p--aARWcCos:---:allow
group@:r-a-R-c--s:---:allow
everyone@:r-a-R-c--s:---:allow


Looks like the Linux client did a chmod(2) after creating the file.

What happens when you create a file locally in that directory on the
Solaris system?




This is the acl for files created by a user from a windows client. There is 
full acl inheritance.
-rwxrwxrwx+ 1 ljw staff 0 May 31 22:22 /smallstore/acltest/windows_file
user:root:rwxpdDaARWcCos:--I:allow
everyone@:rwxpdDaARWc--s:--I:allow

The acl inheritance is on at both the share and directory levels so it should 
be passing them to files that are created.

smallstore aclinherit restricted default
smallstore/acltest aclinherit passthrough local

Again any help would be most appreciated.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Another zfs issue

2011-06-01 Thread Mark Musante

Yeah, this is a known problem. The DTL on the toplevel shows an outage, and is 
preventing the removal of the spare even though removing the spare won't make 
the outage worse.

Unfortunately, for OpenSolaris anyway, there is no workaround.

You could try doing a full scrub, replacing any disks that show errors, and 
waiting for the resilver to complete. That may clean up the DTL enough to 
detach the spare.
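
A minimal sketch of that sequence, using the names from your output:

  # zpool scrub dbpool
  # zpool status -v dbpool     (wait for the scrub and any resilver to finish)
  # zpool detach dbpool c4t44d0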

On 1 Jun 2011, at 20:20, Roy Sigurd Karlsbakk wrote:

 Hi all
 
 I have this pool that has been suffering from some bad backplanes etc. 
 Currently it's showing up ok, but after a resilver, a spare is stuck.
 
  raidz2-5 ONLINE   0 0 4
c4t1d0 ONLINE   0 0 0
c4t2d0 ONLINE   1 0 0
c4t3d0 ONLINE   0 0 0
c4t4d0 ONLINE   0 0 0
spare-4ONLINE   0 0 0
  c4t5d0   ONLINE   0 0 0
  c4t44d0  ONLINE   0 0 0
c4t6d0 ONLINE   0 0 0
c4t7d0 ONLINE   0 0 0
 
 So, the VDEV seems OK; the pool reports two data errors, which is sad, but
 not a showstopper. However, trying to detach the spare from that vdev doesn't
 seem too easy.
 
 roy@dmz-backup:~$ sudo zpool detach dbpool c4t44d0
 cannot detach c4t44d0: no valid replicas
 
 iostat -en shows some issues with drives in that pool, but none on the two in 
 the spare mirror
 
0   0   0   0 c4t1d0
0  82 131 213 c4t2d0
0   0   0   0 c4t3d0
0   0   0   0 c4t4d0
0   0   0   0 c4t5d0
0   0   0   0 c4t6d0
0   0   0   0 c4t7d0
0   0   0   0 c4t44d0
 
 Is there a good explaination why I can't detach this mirror from the VDEV?
 
 Vennlige hilsener / Best regards
 
 roy
 --
 Roy Sigurd Karlsbakk
 (+47) 97542685
 r...@karlsbakk.net
 http://blogg.karlsbakk.net/
 --
 I all pedagogikk er det essensielt at pensum presenteres intelligibelt. Det 
 er et elementært imperativ for alle pedagoger å unngå eksessiv anvendelse av 
 idiomer med fremmed opprinnelse. I de fleste tilfeller eksisterer adekvate og 
 relevante synonymer på norsk.
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Recommended eSATA PCI cards

2011-05-06 Thread Mark Danico

 Hi Rich,
With the Ultra 20M2 there is a very cheap/easy alternative
that might work for you (until you need to expand past 2
more external devices anyway)

Pick up an eSATA pci bracket cable adapter, something like this-
http://www.newegg.com/Product/Product.aspx?Item=N82E16812226003cm_re=eSATA-_-12-226-003-_-Product
(I haven't used this specific product but it was the first example I found)

The U20M2 has slots for just 2 internal SATA drives but the
motherboard has a total of 4 SATA connectors so there are
two that normally go unused. Connect these to the bracket
and connect your external eSATA enclosures to these. You'll
get two eSATA ports without needing to use any PCI slots
and I believe that if you use the very bottom PCI slot opening
you won't even block any of the actual PCI slots from future use.

-Mark D.



On 05/ 6/11 12:04 PM, Rich Teer wrote:

Hi all,

I'm looking at replacing my old D1000 array with some new external drives,
most likely these: http://www.g-technology.com/products/g-drive.cfm .  In
the immediate term, I'm planning to use USB 2.0 connections, but the drive
I'm considering also supports eSATA, which is MUCH faster than USB, but
also (I think, please correct me if I'm wrong) more reliable.

Neither of the machines I'll be using as my server (currently an SB1000 but
will be an Ultra 20 M2 soon; this is my home network, very light workload)
has an integrated eSATA port, so I must turn to add-on PCI cards.  What are
people recommending?  I need to attach at least two drives (I'll be mirroring
them), preferably three or more.

The machines are currently running SXCE snv_b130, with an upgrade to Solaris
Express 11 not too far away.

Thanks!



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] X4540 no next-gen product?

2011-04-08 Thread Mark Sandrock

On Apr 8, 2011, at 2:37 AM, Ian Collins i...@ianshome.com wrote:

 On 04/ 8/11 06:30 PM, Erik Trimble wrote:
 On 4/7/2011 10:25 AM, Chris Banal wrote:
 While I understand everything at Oracle is top secret these days.
 
 Does anyone have any insight into a next-gen X4500 / X4540? Does some other 
 Oracle / Sun partner make a comparable system that is fully supported by 
 Oracle / Sun?
 
 http://www.oracle.com/us/products/servers-storage/servers/previous-products/index.html
  
 
 What do X4500 / X4540 owners use if they'd like more comparable zfs based 
 storage and full Oracle support?
 
 I'm aware of Nexenta and other cloned products but am specifically asking 
 about Oracle supported hardware. However, does anyone know if these type of 
 vendors will be at NAB this year? I'd like to talk to a few if they are...
 
 
 The move seems to be to the Unified Storage (aka ZFS Storage) line, which is 
 a successor to the 7000-series OpenStorage stuff.
 
 http://www.oracle.com/us/products/servers-storage/storage/unified-storage/index.html
  
 
 Which is not a lot of use to those of us who use X4540s for what they were 
 intended: storage appliances.

Can you elaborate briefly on what exactly the problem is?

I don't follow. What else would an X4540 or a 7xxx box
be used for, other than a storage appliance?

Guess I'm slow. :-)

Mark
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] X4540 no next-gen product?

2011-04-08 Thread Mark Sandrock

On Apr 8, 2011, at 3:29 AM, Ian Collins i...@ianshome.com wrote:

 On 04/ 8/11 08:08 PM, Mark Sandrock wrote:
 On Apr 8, 2011, at 2:37 AM, Ian Collinsi...@ianshome.com  wrote:
 
 On 04/ 8/11 06:30 PM, Erik Trimble wrote:
 On 4/7/2011 10:25 AM, Chris Banal wrote:
 While I understand everything at Oracle is top secret these days.
 
 Does anyone have any insight into a next-gen X4500 / X4540? Does some 
 other Oracle / Sun partner make a comparable system that is fully 
 supported by Oracle / Sun?
 
 http://www.oracle.com/us/products/servers-storage/servers/previous-products/index.html
 
 What do X4500 / X4540 owners use if they'd like more comparable zfs based 
 storage and full Oracle support?
 
 I'm aware of Nexenta and other cloned products but am specifically asking 
 about Oracle supported hardware. However, does anyone know if these type 
 of vendors will be at NAB this year? I'd like to talk to a few if they 
 are...
 
 The move seems to be to the Unified Storage (aka ZFS Storage) line, which 
 is a successor to the 7000-series OpenStorage stuff.
 
 http://www.oracle.com/us/products/servers-storage/storage/unified-storage/index.html
 
 Which is not a lot of use to those of us who use X4540s for what they were 
 intended: storage appliances.
 Can you elaborate briefly on what exactly the problem is?
 
 I don't follow? What else would an X4540 or a 7xxx box
 be used for, other than a storage appliance?
 
 Guess I'm slow. :-)
 
 No, I just wasn't clear - we use ours as storage/application servers.  They 
 run Samba, Apache and various other applications and P2V zones that access 
 the large pool of data.  Each also acts as a fail over box (both data and 
 applications) for the other.

You have built-in storage failover with an AR cluster;
and they do NFS, CIFS, iSCSI, HTTP and WebDAV
out of the box.

And you have fairly unlimited options for application servers,
once they are decoupled from the storage servers.

It doesn't seem like much of a drawback -- although it
may be for some smaller sites. I see AR clusters going in
at local high schools and small universities.

Anything's a fraction of the price of a SAN, isn't it? :-)

Mark
 
 They replaced several application servers backed by a SAN for a fraction the 
 price of a new SAN.
 
 -- 
 Ian.
 
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] X4540 no next-gen product?

2011-04-08 Thread Mark Sandrock

On Apr 8, 2011, at 7:50 AM, Evaldas Auryla evaldas.aur...@edqm.eu wrote:

 On 04/ 8/11 01:14 PM, Ian Collins wrote:
 You have built-in storage failover with an AR cluster;
 and they do NFS, CIFS, iSCSI, HTTP and WebDav
 out of the box.
 
 And you have fairly unlimited options for application servers,
 once they are decoupled from the storage servers.
 
 It doesn't seem like much of a drawback -- although it
 may be for some smaller sites. I see AR clusters going in
 in local high schools and small universities.
 
 Which is all fine and dandy if you have a green field, or are willing to
 re-architect your systems.  We just wanted to add a couple more x4540s!
 
 
 Hi, same here, it's a sad news that Oracle decided to stop x4540s production 
 line. Before, ZFS geeks had choice - buy 7000 series if you want quick out 
 of the box storage with nice GUI, or build your own storage with x4540 line, 
 which by the way has brilliant engineering design, the choice is gone now.

Okay, so what is the great advantage
of an X4540 versus an x86 server plus
disk array(s)?

Mark
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] X4540 no next-gen product?

2011-04-08 Thread Mark Sandrock

On Apr 8, 2011, at 9:39 PM, Ian Collins i...@ianshome.com wrote:

 On 04/ 9/11 03:20 AM, Mark Sandrock wrote:
 On Apr 8, 2011, at 7:50 AM, Evaldas Aurylaevaldas.aur...@edqm.eu  wrote:
 On 04/ 8/11 01:14 PM, Ian Collins wrote:
 You have built-in storage failover with an AR cluster;
 and they do NFS, CIFS, iSCSI, HTTP and WebDav
 out of the box.
 
 And you have fairly unlimited options for application servers,
 once they are decoupled from the storage servers.
 
 It doesn't seem like much of a drawback -- although it
 may be for some smaller sites. I see AR clusters going in
 in local high schools and small universities.
 
 Which is all fine and dandy if you have a green field, or are willing to
 re-architect your systems.  We just wanted to add a couple more x4540s!
 Hi, same here, it's a sad news that Oracle decided to stop x4540s 
 production line. Before, ZFS geeks had choice - buy 7000 series if you want 
 quick out of the box storage with nice GUI, or build your own storage 
 with x4540 line, which by the way has brilliant engineering design, the 
 choice is gone now.
 Okay, so what is the great advantage
 of an X4540 versus X86 server plus
 disk array(s)?
 
 One less x86 box (even more of an issue now we have to mortgage the children 
 for support), a lot less $.
 
 Not to mention an existing infrastructure built using X4540s and me looking a 
 fool explaining to the client they can't get any more so the systems we have 
 spent two years building up are a dead end.
 
 One size does not fit all, choice is good for business.

I'm not arguing. If it were up to me,
we'd still be selling those boxes.

Mark
 
 -- 
 Ian.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] X4540 no next-gen product?

2011-04-08 Thread Mark Sandrock

On Apr 8, 2011, at 11:19 PM, Ian Collins i...@ianshome.com wrote:

 On 04/ 9/11 03:53 PM, Mark Sandrock wrote:
 I'm not arguing. If it were up to me,
 we'd still be selling those boxes.
 
 Maybe you could whisper in the right ear?

I wish. I'd have a long list if I could do that.

Mark

 :)
 
 -- 
 Ian.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] trouble replacing spare disk

2011-04-05 Thread Mahabir, Mark I.

Hi,

I have a SunFire X4540 with 19TB in a RAID-Z configuration; here's my zpool 
status:

  pool: raid
 state: UNAVAIL
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
   see: http://www.sun.com/msg/ZFS-8000-HC
 scrub: resilver in progress for 84h11m, 99.47% done, 0h27m to go
config:

NAME STATE READ WRITE CKSUM
raid UNAVAIL  0 0   451  insufficient replicas
 raidz1 UNAVAIL  0 0   902  insufficient replicas
   c0t3d0   ONLINE   0 0 0
   c1t3d0   ONLINE   0 0 0
   c2t3d0   ONLINE   0 0 0
   c3t3d0   ONLINE   0 0 0
   c4t3d0   UNAVAIL47294 0  cannot open
   c5t3d0   ONLINE   0 0 0
   c0t7d0   ONLINE   0 0 0
   c1t7d0   ONLINE   0 0 0
   c2t7d0   ONLINE   0 0 0
   c3t7d0   ONLINE   0 0 0
   c0t2d0   ONLINE   0 0 0
   c1t2d0   ONLINE   0 0 0
   c2t2d0   ONLINE   0 0 0
   c3t2d0   ONLINE   0 0 0
   c4t2d0   ONLINE   0 0 0
   c4t6d0   ONLINE   0 0 0
   spare    DEGRADED 7 0 66.8M
 c5t2d0 FAULTED 11 2 0  too many errors
 replacing  DEGRADED 0 0 0
   c5t7d0   FAULTED 13 0 0  too many errors
   c5t6d0   ONLINE   0 0 0  202G resilvered
   c0t6d0   ONLINE   0 0 0
   c1t6d0   ONLINE   0 0 0
   c2t6d0   ONLINE   0 0 0
   c3t6d0   ONLINE   0 0 0
   spare    DEGRADED 0 0 0
 c0t1d0 FAULTED  0 0 0  too many errors
 c4t7d0 ONLINE   0 0 0
   c1t1d0   ONLINE   0 0 0
   c2t1d0   ONLINE   0 0 0
   c3t1d0   ONLINE   0 0 0
   c4t1d0   ONLINE   0 0 0
   c4t5d0   ONLINE   0 0 0
   c0t5d0   ONLINE   0 0 0
   c1t5d0   ONLINE   0 0 0
   c2t5d0   ONLINE   0 0 0
   c3t5d0   ONLINE   0 0 0
   c5t1d0   ONLINE   0 0 0
   c5t5d0   ONLINE   0 0 0
   c0t4d0   ONLINE   0 0 0
   c2t0d0   ONLINE   0 0 0
   c3t0d0   ONLINE   0 0 0
   c4t0d0   ONLINE   0 0 0
   c5t0d0   ONLINE   0 0 0
   c1t4d0   ONLINE   0 0 0
   c2t4d0   ONLINE   0 0 0
   c3t4d0   ONLINE   0 0 0
   c4t4d0   ONLINE   0 0 0
spares
 c4t7d0 INUSE currently in use
 c5t7d0 INUSE currently in use
 c5t6d0 INUSE currently in use
 c5t4d0 AVAIL

errors: 911 data errors, use '-v' for a list

It looks like the resilver has got stuck; Oracle have sent out a replacement 
disk today and are asking me to replace c5t7d0.

If I am understanding the documentation correctly, I believe I need to do the 
following:

zpool offline raid c5t7d0
cfgadm -c unconfigure c5::dsk/c5t7d0

before physically replacing the disk. However, I get the following messages 
when trying to do this:

# zpool offline raid c5t7d0
cannot offline c5t7d0: device is reserved as a hot spare
# cfgadm -c unconfigure c5::dsk/c5t7d0
cfgadm: Hardware specific failure: failed to unconfigure SCSI device: Device 
busy

I also tried a detach:

# zpool detach raid c5t7d0
cannot detach c5t7d0: pool I/O is currently suspended

And I also tried using the last available spare to try and free up the disk I 
need to replace:

# zpool replace raid c5t2d0 c5t4d0
Cannot replace c5t2d0 with c5t4d0: device has already been replaced with a spare

I am new to ZFS, how would I go about safely removing the affected drive in the 
software, before physically replacing it?

I'm also not sure at exactly which juncture to do a 'zpool clear' and 'zpool 
scrub'?

I'd appreciate any guidance - thanks in advance,

Mark


Mark Mahabir
Systems Manager, X-Ray and Observational Astronomy

Dept. of Physics  Astronomy, University of Leicester, LE1 7RH
Tel: +44(0)116 252 5652
email: mark.maha...@leicester.ac.uk

Elite Without Being Elitist
Times Higher Awards Winner 2007, 2008, 2009, 2010
Follow us on Twitter http://twitter.com/uniofleicsnews

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Any use for extra drives?

2011-03-25 Thread Mark Sandrock

On Mar 24, 2011, at 7:23 AM, Anonymous wrote:

 Generally, you choose your data pool config based on data size,
 redundancy, and performance requirements.  If those are all satisfied with
 your single mirror, the only thing left for you to do is think about
 splitting your data off onto a separate pool due to better performance
 etc.  (Because there are things you can't do with the root pool, such as
 striping and raidz) 
 
 That's all there is to it.  To split, or not to split.
 
 Thanks for the update. I guess there's not much to do for this box since
 it's a development machine and doesn't have much need for extra redundancy
 although if I would have had some extra 500s I would have liked to stripe
 the root pool. I see from your answer that's not possible anyway. Cheers.

If you plan to generate a lot of data, why use the root pool? You can put the 
/home
and /proj filesystems (/export/...) on a separate pool, thus off-loading the 
root pool.

My two cents,
Mark


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Any use for extra drives?

2011-03-24 Thread Mark Sandrock

On Mar 24, 2011, at 5:42 AM, Edward Ned Harvey wrote:

 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Nomen Nescio
 
 Hi ladies and gents, I've got a new Solaris 10 development box with ZFS
 mirror root using 500G drives. I've got several extra 320G drives and I'm
 wondering if there's any way I can use these to good advantage in this
 box. I've got enough storage for my needs with the 500G pool. At this
 point
 I would be looking for a way to speed things up if possible or add
 redundancy if necessary but I understand I can't use these smaller drives
 to
 stripe the root pool, so what would you suggest? Thanks.
 
 Generally, you choose your data pool config based on data size, redundancy,
 and performance requirements.  If those are all satisfied with your single
 mirror, the only thing left for you to do is think about splitting your data
 off onto a separate pool due to better performance etc.  (Because there are
 things you can't do with the root pool, such as striping and raidz)
 
 That's all there is to it.  To split, or not to split.

I'd just put /export/home on this second set of drives, as a striped mirror.

Same as I would have done in the old days under SDS. :-)
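
A minimal sketch of that layout, with hypothetical pool and device names:

  # zpool create homepool mirror c2t0d0 c2t1d0 mirror c2t2d0 c2t3d0
  # zfs create -o mountpoint=/export/home homepool/home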

Mark
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] cannot replace c10t0d0 with c10t0d0: device is too small

2011-03-04 Thread Mark J Musante


The fix for 6991788 would probably let the drive that is 40MB smaller work, but
it would depend on the asize of the pool.


On Fri, 4 Mar 2011, Cindy Swearingen wrote:


Hi Robert,

We integrated some fixes that allowed you to replace disks of equivalent
sizes, but 40 MB is probably beyond that window.

Yes, you can do #2 below and the pool size will be adjusted down to the
smaller size. Before you do this, I would check the sizes of both
spares.

If both spares are equivalent smaller sizes, you could use those to
build the replacement pool with the larger disks and then put the extra
larger disks on the shelf.

Thanks,

Cindy



On 03/04/11 09:22, Robert Hartzell wrote:
In 2007 I bought 6 WD1600JS 160GB sata disks and used 4 to create a raidz 
storage pool and then shelved the other two for spares. One of the disks 
failed last night so I shut down the server and replaced it with a spare. 
When I tried to zpool replace the disk I get:


zpool replace tank c10t0d0
cannot replace c10t0d0 with c10t0d0: device is too small


The 4 original disk partition tables look like this:

Current partition table (original):
Total disk sectors available: 312560317 + 16384 (reserved sectors)

Part      Tag    Flag     First Sector        Size        Last Sector
  0        usr    wm                34    149.04GB          312560350
  1 unassigned    wm                 0           0                  0
  2 unassigned    wm                 0           0                  0
  3 unassigned    wm                 0           0                  0
  4 unassigned    wm                 0           0                  0
  5 unassigned    wm                 0           0                  0
  6 unassigned    wm                 0           0                  0
  8   reserved    wm         312560351      8.00MB          312576734


Spare disk partition table looks like this:

Current partition table (original):
Total disk sectors available: 312483549 + 16384 (reserved sectors)

Part      Tag    Flag     First Sector        Size        Last Sector
  0        usr    wm                34    149.00GB          312483582
  1 unassigned    wm                 0           0                  0
  2 unassigned    wm                 0           0                  0
  3 unassigned    wm                 0           0                  0
  4 unassigned    wm                 0           0                  0
  5 unassigned    wm                 0           0                  0
  6 unassigned    wm                 0           0                  0
  8   reserved    wm         312483583      8.00MB          312499966
 So it seems that two of the disks are slightly different models and are
about 40MB smaller than the original disks.
I know I can just add a larger disk but I would rather use the hardware I
have if possible.

1) Is there anyway to replace the failed disk with one of the spares?
2) Can I recreate the zpool using 3 of the original disks and one of the 
slightly smaller spares? Will zpool/zfs adjust its size to the smaller 
disk?
3) If #2 is possible would I still be able to use the last still shelved 
disk as a spare?


If #2 is possible I would probably recreate the zpool as raidz2 instead of 
the current raidz1.


Any info/comments would be greatly appreciated.

Robert
  --  Robert Hartzell
b...@rwhartzell.net
 RwHartzell.Net, Inc.



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss




Regards,
markm
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Investigating a hung system

2011-02-25 Thread Mark Logan

Hi,

I'm investigating a hung system. The machine is running snv_159 and was 
running a full build of Solaris 11. You cannot get any response from the 
console and you cannot ssh in, but it responds to ping.


The output from ::arc shows:
arc_meta_used =  3836 MB
arc_meta_limit=  3836 MB
arc_meta_max  =  3951 MB
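
For reference, those counters come from the ARC dcmd in mdb, e.g.:

  # echo ::arc | mdb -k         (live system)
  # mdb unix.0 vmcore.0         (or against a crash dump of the hung box)
  > ::arc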

Is it normal for arc_meta_used == arc_meta_limit?
Does this explain the hang?

Thanks,
Mark
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS and Virtual Disks

2011-02-14 Thread Mark Creamer
Hi, I wanted to get some expert advice on this. I have an ordinary hardware
SAN from Promise Tech that presents the LUNs via iSCSI. I would like to use
that if possible with my VMware environment where I run several Solaris /
OpenSolaris virtual machines. My question is regarding the virtual disks.

1. Should I create individual iSCSI LUNs and present those to the VMware
ESXi host as iSCSI storage, and then create virtual disks from there on each
Solaris VM?

 - or -

2. Should I (assuming this is possible) let the Solaris VM mount the iSCSI
LUNs directly (that is, NOT show them as VMware storage, but let the VM
connect to the iSCSI target across the network)?

Part of the issue is I have no idea if having a hardware RAID 5 or 6 disk
set will create a problem if I then create a bunch of virtual disks and then
use ZFS to create RAIDZ for the VM to use. Seems like that might be asking
for trouble.

This environment is completely available to mess with (no data at risk), so
I'm willing to try any option you guys would recommend.

Thanks!

-- 
Mark
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and spindle speed (7.2k / 10k / 15k)

2011-02-02 Thread Mark Sandrock

On Feb 2, 2011, at 8:10 PM, Eric D. Mudama wrote:

  All other
 things being equal, the 15k and the 7200 drive, which share
 electronics, will have the same max transfer rate at the OD.

Is that true? So the only difference is in the access time?

Mark
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best choice - file system for system

2011-01-31 Thread Mark Sandrock
Why do you say fssnap has the same problem?

If it write locks the file system, it is only for a matter of seconds, as I 
recall.

Years ago, I used it on a daily basis to do ufsdumps of large fs'es.

Mark

On Jan 30, 2011, at 5:41 PM, Torrey McMahon wrote:

 On 1/30/2011 5:26 PM, Joerg Schilling wrote:
 Richard Elling <richard.ell...@gmail.com>  wrote:
 
 ufsdump is the problem, not ufsrestore. If you ufsdump an active
 file system, there is no guarantee you can ufsrestore it. The only way
 to guarantee this is to keep the file system quiesced during the entire
 ufsdump.  Needless to say, this renders ufsdump useless for backup
 when the file system also needs to accommodate writes.
 This is why there is a ufs snapshot utility.
 
 You'll have the same problem. fssnap_ufs(1M) write locks the file system when 
 you run the lock command. See the notes section of the man page.
 
 http://download.oracle.com/docs/cd/E19253-01/816-5166/6mbb1kq1p/index.html#Notes
 
 
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best choice - file system for system

2011-01-31 Thread Mark Sandrock
iirc, we would notify the user community that the FS'es were going to hang 
briefly.

Locking the FS'es is the best way to quiesce it, when users are worldwide, imo.

Mark

On Jan 31, 2011, at 9:45 AM, Torrey McMahon wrote:

 A matter of seconds is a long time for a running Oracle database. The point 
 is that if you have to keep writing to a UFS filesystem - when the file 
 system also needs to accommodate writes - you're still out of luck. If you 
 can quiesce the apps, great, but if you can't then you're still stuck.  In 
 other words, fssnap_ufs doesn't solve the quiesce problem.
 
 On 1/31/2011 10:24 AM, Mark Sandrock wrote:
 Why do you say fssnap has the same problem?
 
 If it write locks the file system, it is only for a matter of seconds, as I 
 recall.
 
 Years ago, I used it on a daily basis to do ufsdumps of large fs'es.
 
 Mark
 
 On Jan 30, 2011, at 5:41 PM, Torrey McMahon wrote:
 
 On 1/30/2011 5:26 PM, Joerg Schilling wrote:
 Richard Elling <richard.ell...@gmail.com>   wrote:
 
 ufsdump is the problem, not ufsrestore. If you ufsdump an active
 file system, there is no guarantee you can ufsrestore it. The only way
 to guarantee this is to keep the file system quiesced during the entire
 ufsdump.  Needless to say, this renders ufsdump useless for backup
 when the file system also needs to accommodate writes.
 This is why there is a ufs snapshot utility.
 You'll have the same problem. fssnap_ufs(1M) write locks the file system 
 when you run the lock command. See the notes section of the man page.
 
 http://download.oracle.com/docs/cd/E19253-01/816-5166/6mbb1kq1p/index.html#Notes

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] A few questions

2010-12-20 Thread Mark Sandrock

On Dec 18, 2010, at 12:23 PM, Lanky Doodle wrote:

 Now this is getting really complex, but can you have server failover in ZFS, 
 much like DFS-R in Windows - you point clients to a clustered ZFS namespace 
 so if a complete server failed nothing is interrupted.

This is the purpose of an Amber Road dual-head cluster (7310C, 7410C, etc.) -- 
not only the storage pool fails over,
but also the server IP address fails over, so that NFS, etc. shares remain 
active, when one storage head goes down.

Amber Road uses ZFS, but the clustering and failover are not related to the 
filesystem type.

Mark
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] A few questions

2010-12-20 Thread Mark Sandrock
Erik,

just a hypothetical what-if ...

In the case of resilvering on a mirrored disk, why not take a snapshot, and then
resilver by doing a pure block copy from the snapshot? It would be sequential,
so long as the original data was unmodified; and random access in dealing with
the modified blocks only, right.

After the original snapshot had been replicated, a second pass would be done,
in order to update the clone to 100% live data.

Not knowing enough about the inner workings of ZFS snapshots, I don't know why
this would not be doable. (I'm biased towards mirrors for busy filesystems.)

I'm supposing that a block-level snapshot is not doable -- or is it?

Mark

On Dec 20, 2010, at 1:27 PM, Erik Trimble wrote:

 On 12/20/2010 9:20 AM, Saxon, Will wrote:
 -Original Message-
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Edward Ned Harvey
 Sent: Monday, December 20, 2010 11:46 AM
 To: 'Lanky Doodle'; zfs-discuss@opensolaris.org
 Subject: Re: [zfs-discuss] A few questions
 
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Lanky Doodle
 
 I believe Oracle is aware of the problem, but most of
 the core ZFS team has left. And of course, a fix for
 Oracle Solaris no longer means a fix for the rest of
 us.
 OK, that is a bit concerning then. As good as ZFS may be, I'm not sure I want
 to commit to a file system that is 'broken' and may not be fully fixed,
 if at all.
 
 ZFS is not broken.  It is, however, a weak spot, that resilver is very
 inefficient.  For example:
 
 On my server, which is made up of 10krpm SATA drives, 1TB each...  My
 drives
 can each sustain 1Gbit/sec sequential read/write.  This means, if I needed
 to resilver the entire drive (in a mirror) sequentially, it would take ...
 8,000 sec = 133 minutes.  About 2 hours.  In reality, I have ZFS mirrors,
 and disks are around 70% full, and resilver takes 12-14 hours.
 
 So although resilver is broken by some standards, it is bounded, and you
 can limit it to something which is survivable, by using mirrors instead of
 raidz.  For most people, even using 5-disk, or 7-disk raidzN will still be
 fine.  But you start getting unsustainable if you get up to 21-disk raidz3
 for example.
 This argument keeps coming up on the list, but I don't see where anyone has 
 made a good suggestion about whether this can even be 'fixed' or how it 
 would be done.
 
 As I understand it, you have two basic types of array reconstruction: in a 
 mirror you can make a block-by-block copy and that's easy, but in a parity 
 array you have to perform a calculation on the existing data and/or existing 
 parity to reconstruct the missing piece. This is pretty easy when you can 
 guarantee that all your stripes are the same width, start/end on the same 
 sectors/boundaries/whatever and thus know a piece of them lives on all 
 drives in the set. I don't think this is possible with ZFS since we have 
 variable stripe width. A failed disk d may or may not contain data from 
 stripe s (or transaction t). This information has to be discovered by 
 looking at the transaction records. Right?
 
 Can someone speculate as to how you could rebuild a variable stripe width 
 array without replaying all the available transactions? I am no filesystem 
 engineer but I can't wrap my head around how this could be handled any 
 better than it already is. I've read that resilvering is throttled - 
 presumably to keep performance degradation to a minimum during the process - 
 maybe this could be a tunable (e.g. priority: low, normal, high)?
 
 Do we know if resilvers on a mirror are actually handled differently from 
 those on a raidz?
 
 Sorry if this has already been explained. I think this is an issue that 
 everyone who uses ZFS should understand completely before jumping in, 
 because the behavior (while not 'wrong') is clearly NOT the same as with 
 more conventional arrays.
 
 -Will
 the problem is NOT the checksum/error correction overhead. that's 
 relatively trivial.  The problem isn't really even variable width (i.e. 
 variable number of disks one crosses) slabs.
 
 The problem boils down to this:
 
 When ZFS does a resilver, it walks the METADATA tree to determine what order 
 to rebuild things from. That means, it resilvers the very first slab ever 
 written, then the next oldest, etc.   The problem here is that slab age has 
 nothing to do with where that data physically resides on the actual disks. If 
 you've used the zpool as a WORM device, then, sure, there should be a strict 
 correlation between increasing slab age and locality on the disk.  However, 
 in any reasonable case, files get deleted regularly. This means that for a slab B, 
 written immediately after slab A, there is a high probability that it WON'T be 
 physically near slab A.
 
 In the end, the problem is that using metadata order, while reducing the 
 total amount of work to do in the resilver

Re: [zfs-discuss] A few questions

2010-12-20 Thread Mark Sandrock

On Dec 20, 2010, at 2:05 PM, Erik Trimble wrote:

 On 12/20/2010 11:56 AM, Mark Sandrock wrote:
 Erik,
 
  just a hypothetical what-if ...
 
 In the case of resilvering on a mirrored disk, why not take a snapshot, and 
 then
 resilver by doing a pure block copy from the snapshot? It would be 
 sequential,
 so long as the original data was unmodified; and random access in dealing 
 with
 the modified blocks only, right.
 
 After the original snapshot had been replicated, a second pass would be done,
 in order to update the clone to 100% live data.
 
 Not knowing enough about the inner workings of ZFS snapshots, I don't know 
 why
 this would not be doable. (I'm biased towards mirrors for busy filesystems.)
 
 I'm supposing that a block-level snapshot is not doable -- or is it?
 
 Mark
 Snapshots on ZFS are true snapshots - they take a picture of the current 
 state of the system. They DON'T copy any data around when created. So, a ZFS 
 snapshot would be just as fragmented as the ZFS filesystem was at the time.

But if one does a raw (block) copy, there isn't any fragmentation -- except for 
the COW updates.

If there were no updates to the snapshot, then it becomes a 100% sequential 
block copy operation.

But even with COW updates, presumably the large majority of the copy would 
still be sequential i/o.

Maybe for the 2nd pass, the filesystem would have to be locked, so that the 
operation could actually complete,
but if this is fairly short in relation to the overall resilvering time, then 
it could still be a win in many cases.

I'm probably not explaining it well, and may be way off, but it seemed an 
interesting notion.

Mark

 
 
 The problem is this:
 
 Let's say I write block A, B, C, and D on a clean zpool (what kind, it 
 doesn't matter).  I now delete block C.  Later on, I write block E.   There 
 is a probability (increasing dramatically as times goes on), that the on-disk 
 layout will now look like:
 
 A, B, E, D
 
 rather than
 
 A, B, [space], D, E
 
 
 So, in the first case, I can do a sequential read to get A  B, but then must 
 do a seek to get D, and a seek to get E.
 
 The fragmentation problem is mainly due to file deletion, NOT to file 
 re-writing.  (though, in ZFS, being a C-O-W filesystem, re-writing generally 
 looks like a delete-then-write process, rather than a modify process).
 
 
 -- 
 Erik Trimble
 Java System Support
 Mailstop:  usca22-123
 Phone:  x17195
 Santa Clara, CA
 Timezone: US/Pacific (GMT-0800)
 

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] A few questions

2010-12-20 Thread Mark Sandrock
It well may be that different methods are optimal for different use cases.

Mechanical disk vs. SSD; mirrored vs. raidz[123]; sparse vs. populated; etc.

It would be interesting to read more in this area, if papers are available.

I'll have to take a look. ... Or does someone have pointers?

Mark


On Dec 20, 2010, at 6:28 PM, Edward Ned Harvey wrote:

 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Erik Trimble
 
 In the case of resilvering on a mirrored disk, why not take a snapshot,
 and
 then
 resilver by doing a pure block copy from the snapshot? It would be
 sequential,
 
 So, a
 ZFS snapshot would be just as fragmented as the ZFS filesystem was at
 the time.
 
 I think Mark was suggesting something like dd copy device 1 onto device 2,
 in order to guarantee a first-pass sequential resilver.  And my response
 would be:  Creative thinking and suggestions are always a good thing.  In
 fact, the above suggestion is already faster than the present-day solution
 for what I'm calling typical usage, but there are an awful lot of use
 cases where the dd solution would be worse... Such as a pool which is
 largely sequential already, or largely empty, or made of high IOPS devices
 such as SSD.  However, there is a desire to avoid resilvering unused blocks,
 so I hope a better solution is possible... 
 
 The fundamental requirement for a better optimized solution would be a way
 to resilver according to disk ordering...  And it's just a question for
 somebody that actually knows the answer ... How terrible is the idea of
 figuring out the on-disk order?
 

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Problem with a failed replace.

2010-12-07 Thread Mark J Musante

On Mon, 6 Dec 2010, Curtis Schiewek wrote:


Hi Mark,

I've tried running zpool attach media ad24 ad12 (ad12 being the new 
disk) and I get no response.  I tried leaving the command run for an 
extended period of time and nothing happens.


What version of solaris are you running?
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Zfs ignoring spares?

2010-12-05 Thread Mark Musante







On 5 Dec 2010, at 16:06, Roy Sigurd Karlsbakk r...@karlsbakk.net wrote:

 Hot spares are dedicated spares in the ZFS world. Until you replace
 the actual bad drives, you will be running in a degraded state. The
 idea is that spares are only used in an emergency. You are degraded
 until your spares are no longer in use. 
 
 --Tim 
 
 Thanks for the clarification. Wouldn't it be nice if ZFS could fail over
 to a spare and then allow the replacement as the new spare, as with what
 is done with most commercial hardware RAIDs?

If you use zpool detach to remove the disk that went bad, the spare is 
promoted to a proper member of the pool. Then, when you replace the bad disk, 
you can use zpool add to add it into the pool as a new spare.

Admittedly, this is all a manual procedure. It's unclear if you were asking for 
this to be fully automated.
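
For example, a minimal sketch with purely illustrative pool and device names:

   # zpool detach tank c2t5d0        # drop the failed disk; the in-use spare is promoted
   # zpool add tank spare c2t9d0     # once the bad disk is replaced, add it back as the new spare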


 
 Vennlige hilsener / Best regards 
 
 roy 
 -- 
 Roy Sigurd Karlsbakk 
 (+47) 97542685 
 r...@karlsbakk.net 
 http://blogg.karlsbakk.net/ 
 -- 
 I all pedagogikk er det essensielt at pensum presenteres intelligibelt. Det 
 er et elementært imperativ for alle pedagoger å unngå eksessiv anvendelse av 
 idiomer med fremmed opprinnelse. I de fleste tilfeller eksisterer adekvate og 
 relevante synonymer på norsk. 
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Problem with a failed replace.

2010-12-03 Thread Mark J Musante

On Fri, 3 Dec 2010, Curtis Schiewek wrote:


NAME           STATE     READ WRITE CKSUM
media          DEGRADED     0     0     0
  raidz1       ONLINE       0     0     0
    ad8        ONLINE       0     0     0
    ad10       ONLINE       0     0     0
    ad4        ONLINE       0     0     0
    ad6        ONLINE       0     0     0
  raidz1       DEGRADED     0     0     0
    ad22       ONLINE       0     0     0
    ad26       ONLINE       0     0     0
    replacing  UNAVAIL      0 66.4K     0  insufficient replicas
      ad24     FAULTED      0 75.1K     0  corrupted data
      ad18     FAULTED      0 75.2K     0  corrupted data
    ad24       ONLINE       0     0     0


What happens if you try zpool detach media ad24?
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Problem with a failed replace.

2010-12-03 Thread Mark J Musante



On Fri, 3 Dec 2010, Curtis Schiewek wrote:


cannot detach ad24: no valid replicas


I bet that's an instance of CR 6909724.  If you have another disk you can 
spare, you can do a 'zpool attach media ad24 newdisk', wait for it to 
finish resilvering, and then zfs should automatically clean up ad24 and ad18 
for you.
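
For example (the replacement disk name below, ad30, is purely illustrative):

   # zpool attach media ad24 ad30    # attach a spare disk to break the stuck replace
   # zpool status media              # wait for the resilver to report completion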




On Fri, Dec 3, 2010 at 1:38 PM, Mark J Musante mark.musa...@oracle.comwrote:


On Fri, 3 Dec 2010, Curtis Schiewek wrote:

NAME           STATE     READ WRITE CKSUM
media          DEGRADED     0     0     0
  raidz1       ONLINE       0     0     0
    ad8        ONLINE       0     0     0
    ad10       ONLINE       0     0     0
    ad4        ONLINE       0     0     0
    ad6        ONLINE       0     0     0
  raidz1       DEGRADED     0     0     0
    ad22       ONLINE       0     0     0
    ad26       ONLINE       0     0     0
    replacing  UNAVAIL      0 66.4K     0  insufficient replicas
      ad24     FAULTED      0 75.1K     0  corrupted data
      ad18     FAULTED      0 75.2K     0  corrupted data
    ad24       ONLINE       0     0     0



What happens if you try zpool detach media ad24?






___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Faster than 1G Ether... ESX to ZFS

2010-11-19 Thread Mark Little


On Fri, 19 Nov 2010 07:16:20 PST, Günther wrote:

i have the same problem with my 2HE supermicro server (24x2,5,
connected via 6x mini SAS 8087) and no additional mounting
possibilities for 2,5 or 3,5 drives.
on those machines i use one sas port (4 drives) of an old adaptec
3805 (i have used them in my pre zfs-times) to build a raid-1 + 
hotfix

for esxi to boot from. the other 20 slots are connected to 3 lsi sas
controller for pass-through - so i have 4 sas controller in these
machines.
maybe the new ssd-drives mounted on a pci-e (e.g. ocz revo drive) may
be an alternative. has anyone used them already with esxi?
gea



Hey - just as a side note..

Depending on what motherboard you use, you may be able to use this:  
MCP-220-82603-0N - Dual 2.5 fixed HDD tray kit for SC826 (for E-ATX X8 
DP MB)


I haven't used one yet myself but am currently planning a SMC build and 
contacted their support as I really did not want to have my system 
drives hanging off the controller.  As far as I can tell from a picture 
they sent, it mounts on top of the motherboard itself somewhere where 
there is normally open space, and it can hold two 2.5 drives.  So maybe 
give in touch with their support and see if you can use something 
similar.



Cheers,
Mark


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Excruciatingly slow resilvering on X4540 (build 134)

2010-11-15 Thread Mark Sandrock

On Nov 2, 2010, at 12:10 AM, Ian Collins wrote:

 On 11/ 2/10 08:33 AM, Mark Sandrock wrote:
 
 
 I'm working with someone who replaced a failed 1TB drive (50% utilized),
 on an X4540 running OS build 134, and I think something must be wrong.
 
 Last Tuesday afternoon, zpool status reported:
 
 scrub: resilver in progress for 306h0m, 63.87% done, 173h7m to go
 
 and a week being 168 hours, that put completion at sometime tomorrow night.
 
 However, he just reported zpool status shows:
 
 scrub: resilver in progress for 447h26m, 65.07% done, 240h10m to go
 
 so it's looking more like 2011 now. That can't be right.

 
 
 How is the pool configured?

Both 10 and 12 disk RAIDZ-2. That, plus too much other io
must be the problem. I'm thinking 5 x (7-2) would be better,
assuming he doesn't want to go RAID-10.

Thanks much for all the helpful replies.

Mark
 
 I look after a very busy x5400 with 500G drives configured as 8 drive raidz2 
 and these take about 100 hours to resilver.  The workload on this box is 
 probably worst case for resivering, it receives a steady stream of snapshots.
 
 -- 
 Ian.
 
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Faster than 1G Ether... ESX to ZFS

2010-11-15 Thread Mark Sandrock
Edward,

I recently installed a 7410 cluster, which had added Fiber Channel HBAs.

I know the site also has Blade 6000s running VMware, but no idea if they
were planning to run fiber to those blades (or even had the option to do so).

But perhaps FC would be an option for you?

Mark

On Nov 12, 2010, at 9:03 AM, Edward Ned Harvey wrote:

 Since combining ZFS storage backend, via nfs or iscsi, with ESXi heads, I’m 
 in love.  But for one thing.  The interconnect between the head  storage.
  
 1G Ether is so cheap, but not as fast as desired.  10G ether is fast enough, 
 but it’s overkill and why is it so bloody expensive?  Why is there nothing in 
 between?  Is there something in between?  Is there a better option?  I mean … 
 sata is cheap, and it’s 3g or 6g, but it’s not suitable for this purpose.  
 But the point remains, there isn’t a fundamental limitation that *requires* 
 10G to be expensive, or *requires* a leap directly from 1G to 10G.  I would 
 very much like to find a solution which is a good fit… to attach ZFS storage 
 to vmware.
  
 What are people using, as interconnect, to use ZFS storage on ESX(i)?
  
 Any suggestions?
  
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool split how it works?

2010-11-10 Thread Mark J Musante

On Wed, 10 Nov 2010, Darren J Moffat wrote:


On 10/11/2010 11:18, sridhar surampudi wrote:

I was wondering how zpool split works or implemented.

Or are you really asking about the implementation details ?  If you want 
to know how it is implemented then you need to read the source code.


Also or you can read the blog entry I wrote up after it was put back:

http://blogs.sun.com/mmusante/entry/seven_years_of_good_luck
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Excruciatingly slow resilvering on X4540 (build 134)

2010-11-01 Thread Mark Sandrock
Hello,

I'm working with someone who replaced a failed 1TB drive (50% utilized),
on an X4540 running OS build 134, and I think something must be wrong.

Last Tuesday afternoon, zpool status reported:

scrub: resilver in progress for 306h0m, 63.87% done, 173h7m to go

and a week being 168 hours, that put completion at sometime tomorrow night.

However, he just reported zpool status shows:

scrub: resilver in progress for 447h26m, 65.07% done, 240h10m to go

so it's looking more like 2011 now. That can't be right.

I'm hoping for a suggestion or two on this issue.

I'd search the archives, but they don't seem searchable. Or am I wrong about 
that?

Thanks.
Mark (subscription pending)


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZPOOL_CONFIG_IS_HOLE

2010-10-15 Thread Mark Musante
You should only see a HOLE in your config if you removed a slog after having 
added more stripes.  Nothing to do with bad sectors.

On 14 Oct 2010, at 06:27, Matt Keenan wrote:

 Hi,
 
 Can someone shed some light on what this ZPOOL_CONFIG is exactly.
 At a guess is it a bad sector of the disk, non writable and thus ZFS marks it 
 as a hole ?
 
 cheers
 
 Matt
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs unmount versus umount?

2010-09-30 Thread Mark J Musante

On Thu, 30 Sep 2010, Linder, Doug wrote:

Is there any technical difference between using zfs unmount to unmount 
a ZFS filesystem versus the standard unix umount command?  I always 
use zfs unmount but some of my colleagues still just use umount.  Is 
there any reason to use one over the other?


No, they're identical.  If you use 'zfs umount' the code automatically 
maps it to 'unmount'.  It also maps 'recv' to 'receive' and '-?' to call 
into the usage function.  Here's the relevant code from main():


/*
 * The 'umount' command is an alias for 'unmount'
 */
if (strcmp(cmdname, "umount") == 0)
        cmdname = "unmount";

/*
 * The 'recv' command is an alias for 'receive'
 */
if (strcmp(cmdname, "recv") == 0)
        cmdname = "receive";

/*
 * Special case '-?'
 */
if (strcmp(cmdname, "-?") == 0)
        usage(B_TRUE);


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs unmount versus umount?

2010-09-30 Thread Mark J Musante

On Thu, 30 Sep 2010, Darren J Moffat wrote:


* It can be applied recursively down a ZFS hierarchy


True.


* It will unshare the filesystems first


Actually, because we use the zfs command to do the unmount, we end up 
doing the unshare on the filesystem first.


See the opensolaris code for details:

http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libzfs/common/libzfs_mount.c#zfs_unmount
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cannot access dataset

2010-09-20 Thread Mark J Musante

On Mon, 20 Sep 2010, Valerio Piancastelli wrote:


After a crash i cannot access one of my datasets anymore.

ls -v cts
brwxrwxrwx+  2 root root   0,  0 ott 18  2009 cts

zfs list sas/mail-cts
NAME   USED  AVAIL  REFER  MOUNTPOINT
sas/mail-cts   149G   250G   149G  /sas/mail-cts

as you can see, the space is referenced by this dataset, but i cannot access 
the directory /sas/mail-cts


Is the dataset mounted?  i.e. what does 'zfs get mounted sas/mail-cts' 
show?

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cannot access dataset

2010-09-20 Thread Mark J Musante

On Mon, 20 Sep 2010, Valerio Piancastelli wrote:


Yes, it is mounted

r...@disk-00:/volumes/store# zfs get mounted sas/mail-cts
NAME  PROPERTY  VALUESOURCE
sas/mail-cts  mounted   yes  -


OK - so the next question would be where the data is.  I assume when you 
say you cannot access the dataset, it means when you type ls -l 
/sas/mail-cts it shows up as an empty directory.  Is that true?


With luck, the data will still be in a snapshot.  Given that the dataset 
has 149G referenced, it could be all there.  Does 'zfs list -rt snapshot 
sas/mail-cts' list any?  If so, you can try using the most recent snapshot 
by looking in /sas/mail-cts/.zfs/snapshot/<snapshot name> and seeing if 
all your data are there.  If it looks good, you can zfs rollback to that 
snapshot.
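
Roughly, and with an illustrative snapshot name, that would look like:

   # zfs list -rt snapshot sas/mail-cts
   # ls /sas/mail-cts/.zfs/snapshot/daily-2010-09-19   # inspect the snapshot contents
   # zfs rollback sas/mail-cts@daily-2010-09-19        # add -r if newer snapshots exist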

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] moving rppol in laptop to spare SSD drive.

2010-09-19 Thread Mark Farmer
Hi Steve,

Couple of options.  

Create a new boot environment on the SSD, and this will copy the data over.

Or 

zfs send -R rp...@backup | zfs recv altpool

I'd use the alt boot environment, rather than the send and receive.
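
A rough sketch of the boot-environment route, assuming beadm on this build supports 
creating a BE in another pool with -p, and using illustrative slice and BE names:

   # zpool create altpool c2t0d0s0        # pool on the SSD slice
   # beadm create -p altpool osol-ssd     # copy the current BE into the new pool
   # beadm activate osol-ssd              # make it the default on the next boot
   # installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c2t0d0s0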

Cheers,
-Mark.



On 19/09/2010, at 5:37 PM, Steve Arkley wrote:

 Hello folks,
 
 I ordered a bunch of 128Gb SSD's the other day, placed 2 in PC, another in a 
 windoz laptop and I thought I'd place one in my opensolaris laptop, should be 
 straightforward or so I thought.
 
 The problem I seem to be running into is that the partition the rpool is on 
 is 130Gb, SSD once sliced up is only about 120Gb.
 
 I pulled the main disk from the latop and put it in a caddy, put the new ssd 
 in the drive bay and booted from cdrom.
 
 I imported the rpool and created an altpool on the ssd drive.
 
 zfs pool list shows both pools.
 altpool size 119G avail 119G
 rpool size 130G used 70G
 
 I created a snapshot of the rpool and tried to send it to the other disk but 
 it fails with file too large.
 
 zfs send -R rp...@backup  altpool
 warning: cannot send 'rpool/bu...@backup': file too large.
 
 is there any way to get the data over onto the other drive at all?
 
 Thanks Steve.
 -- 
 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss




Mark Farmer | Sales Consultant
Phone: +61730317106  | Mobile: +61414999143 
Oracle Systems

ORACLE Australia | 300 Ann St | Brisbane
 Oracle is committed to developing practices and products that help protect the 
environment

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] performance leakage when copy huge data

2010-09-09 Thread Mark Little
On Thu, 9 Sep 2010 14:05:51 +, Markus Kovero
markus.kov...@nebula.fi wrote:
 On Sep 9, 2010, at 8:27 AM, Fei Xu twinse...@hotmail.com wrote:
 
 
 This might be the dreaded WD TLER issue. Basically the drive keeps retrying 
 a read operation over and over after a bit error trying to recover from a  
 read error themselves. With ZFS one really needs to disable this and have 
 the drives fail immediately.
 
 Check your drives to see if they have this feature, if so think about 
 replacing the drives in the source pool that have long service times and 
 make sure this feature is disabled on the destination pool drives.
 
 -Ross
 
 
 It might be due tler-issues, but I'd try to pin greens down to
 SATA1-mode (use jumper, or force via controller). It might help a bit
 with these disks, although these are not really suitable disks for any
 use in any raid configurations due tler issue, which cannot be
 disabled in later firmware versions.
 
 Yours
 Markus Kovero
 

Just to clarify - do you mean TLER should be off or on?  TLER = Time
Limited Error Recovery so the drive only takes a max time (eg: 7
seconds) to retrieve data or returns an error.  So you say 'cannot be
disabled' but I think you mean 'cannot be ENABLED' ?

I've been doing a lot of research for a new storage box at work, and
from reading a lot of the info available in the Storage forum on
hardforum.com, the experts there seem to recommend NOT having TLER
enabled when using ZFS as ZFS can be configured for its timeouts, etc,
and the main reason to use TLER is when using those drives with hardware
RAID cards which will kick a drive out of the array if it takes longer
than 10 seconds.
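
As a hedged illustration only: where the drive and controller pass SCT commands 
through, a smartmontools build with SCT ERC support can report and adjust the 
timeout (the device path and -d option below are assumptions about your setup):

   # smartctl -d sat -l scterc /dev/rdsk/c5t1d0s0        # show the current read/write ERC timers
   # smartctl -d sat -l scterc,70,70 /dev/rdsk/c5t1d0s0  # set both timers to 7.0 seconds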

Can anyone else here comment if they have had experience with the WD
drives and ZFS and if they have TLER enabled or disabled?

Cheers,
Mark
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] onnv_142 - vfs_mountroot: cannot mount root

2010-09-07 Thread Mark J Musante


Did you run installgrub before rebooting?

On Tue, 7 Sep 2010, Piotr Jasiukajtis wrote:


Hi,

After upgrade from snv_138 to snv_142 or snv_145 I'm unable to boot the system.
Here is what I get.

Any idea why it's not able to import rpool?

I saw this issue also on older builds on a different machines.

--
Piotr Jasiukajtis | estibi | SCA OS0072
http://estseg.blogspot.com



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] new labelfix needed

2010-09-02 Thread Mark J Musante

On Wed, 1 Sep 2010, Benjamin Brumaire wrote:


your point has only a rhetorical meaning.


I'm not sure what you mean by that.  I was asking specifically about your 
situation.  You want to run labelfix on /dev/rdsk/c0d1s4 - what happened 
to that slice that requires a labelfix?  Is there something that zfs might 
be doing to cause the problem?  Is there something that zfs could be doing 
to mitigate the problem?



BTW zfsck would be a great improvement to ZFS.


What specifically would zfsck do that is not done by scrub?
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to rebuild raidz after system reinstall

2010-09-02 Thread Mark J Musante


What does 'zpool import' show?  If that's empty, what about 'zpool import 
-d /dev'?

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to rebuild raidz after system reinstall

2010-09-02 Thread Mark J Musante

On Thu, 2 Sep 2010, Dominik Hoffmann wrote:


I think, I just destroyed the information on the old raidz members by doing

zpool create BackupRAID raidz /dev/disk0s2 /dev/disk1s2 /dev/disk2s2


It should have warned you that two of the disks were already formatted 
with a zfs pool.  Did it not do that?  If so, perhaps these aren't the 
same disks you were using in your pool.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] new labelfix needed

2010-08-31 Thread Mark J Musante

On Mon, 30 Aug 2010, Benjamin Brumaire wrote:

As this feature didn't make it into zfs it would be nice to have it 
again.


Better to spend time fixing the problem that requires a 'labelfix' as a 
workaround, surely.  What's causing the need to fix vdev labels?

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] pool died during scrub

2010-08-30 Thread Mark J Musante

On Mon, 30 Aug 2010, Jeff Bacon wrote:




All of this would be ok... except THOSE ARE THE ONLY DEVICES THAT WERE 
PART OF THE POOL. How can it be missing a device that didn't exist?


The device(s) in question are probably the logs you refer to here:

I can't obviously use b134 to import the pool without logs, since that 
would imply upgrading the pool first, which is hard to do if it's not 
imported.


The stack trace you show is indicative of a memory corruption that may 
have gotten out to disk.  In other words, ZFS wrote data to ram, ram was 
corrupted, then the checksum was calculated and the result was written 
out.


Do you have a core dump from the panic?  Also, what kind of DRAM does this 
system use?


If you're lucky, then there's no corruption and instead it's a stale 
config that's causing the problem.  Try removing /etc/zfs/zpool.cache and 
then doing a 'zpool import -a'.
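
Something along these lines, backing the file up rather than deleting it outright:

   # mv /etc/zfs/zpool.cache /etc/zfs/zpool.cache.bak
   # zpool import -a                 # rescan devices and rebuild the cache from the on-disk labels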

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool status and format/kernel disagree about root disk

2010-08-27 Thread Mark J Musante

On Fri, 27 Aug 2010, Rainer Orth wrote:

zpool status thinks rpool is on c1t0d0s3, while format (and the kernel)
correctly believe it's c11t0d0(s3) instead.

Any suggestions?


Try removing the symlinks or using 'devfsadm -C' as suggested here:

https://defect.opensolaris.org/bz/show_bug.cgi?id=14999


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] VM's on ZFS - 7210

2010-08-27 Thread Mark
Hey thanks for the replies everyone.

Sadly most of those options will not work, since we are using a SUN Unified 
Storage 7210, the only option is to buy the SUN SSD's for it, which is about 
$15k USD for a pair.   We also don't have the ability to shut off ZIL or any of 
the other options that one might have under OpenSolaris itself :(

It sounds like I do want to change to a RAID10 mirror instead of RAIDz.   It 
sounds like enabling the write cache without the ZIL in place might work but would 
lead to corruption should something crash.

So the question is with a proper ZIL SSD from SUN, and a RAID10... would I be 
able to support all the VM's or would it still be pushing the limits a 44 disk 
pool?

Today there are 30 VM's, 25 are Windows 2008 and 5 are Cent OS 5.   A couple 
are DB servers that see very light load.  The only thing that sees any real 
load is a build server which we get a lot of complaints about.

I did some testing and posted my results a month ago, using OpenSolaris and 5 
disks with my personal Intel SSD and saw good results, but I don't know how it 
will scale :(
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] VM's on ZFS - 7210

2010-08-27 Thread Mark
It does, its on a pair of large APC's.

Right now we're using NFS for our ESX Servers.  The only iSCSI LUN's I have are 
mounted inside a couple Windows VM's.   I'd have to migrate all our VM's to 
iSCSI, which I'm willing to do if it would help and not cause other issues.   
So far the 7210 Appliance has been very stable.

I like the zilstat script.  I emailed a support tech I am working with on 
another issue to ask if one of the built in Analytics DTrace scripts will get 
that data.   

I found one called L2ARC Eligibility:  3235 true, 66 false.  This makes it 
sound like we would benefit from a READZilla, not quite what I had expected...  
I'm sure I don't know what I'm looking at anyways :)
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] VM's on ZFS - 7210

2010-08-26 Thread Mark
We are using a 7210, 44 disks I believe, 11 stripes of RAIDz sets.  When I 
installed I selected the best bang for the buck on the speed vs capacity chart.

We run about 30 VM's on it, across 3 ESX 4 servers.  Right now, it's all running 
NFS, and it sucks... sooo slow.

iSCSI was no better.   

I am wondering how I can increase the performance, because they want to add more 
vm's... the good news is most are idleish, but even idle vm's create a lot of 
random chatter to the disks!

So a few options maybe... 

1) Change to iSCSI mounts to ESX, and enable write-cache on the LUN's since the 
7210 is on a UPS.
2) get a Logzilla SSD mirror.  (do ssd's fail, do I really need a mirror?)
3) reconfigure the NAS to a RAID10 instead of RAIDz

Obviously all 3 would be ideal , though with a SSD can I keep using NFS for the 
same performance since the R_SYNC's would be satisfied with the SSD?

I am dreading getting the OK to spend $$,$$$ on SSDs and then not getting the 
performance increase we want.

How would you weigh these?  I noticed in testing on a 5-disk OpenSolaris box that 
changing from a single RAIDz pool to RAID10 netted a larger IOPS increase than 
adding an Intel SSD as a Logzilla.  That's not going to scale the same though 
with a 44 disk, 11 raidz striped RAID set.

Some thoughts?  Would simply moving to write-cache enabled iSCSI LUN's without 
a SSD speed things up a lot by itself?
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Narrow escape with FAULTED disks

2010-08-23 Thread Mark Bennett
Well I do have a plan.

Thanks to the portability of ZFS boot disks, I'll make two new OS disks on 
another machine with the next Nexenta release, export the data pool and swap 
in the new ones.

That way, I can at least manage a zfs scrub without killing the performance and 
get the Intel SSD's I have been testing to work properly.

On the other hand, I could just use the spare 7210 Appliance boot disk I have 
lying about.

Mark.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cant't detach spare device from pool

2010-08-18 Thread Mark Musante
You need to let the resilver complete before you can detach the spare.  This is 
a known problem, CR 6909724.

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6909724



On 18 Aug 2010, at 14:02, Dr. Martin Mundschenk wrote:

 Hi!
 
 I had trouble with my raidz in the way, that some of the blockdevices where 
 not found by the OSOL Box the other day, so the spare device was hooked on 
 automatically.
 
 After fixing the problem, the missing device came back online, but I am 
 unable to detach the spare device, even though all devices are online and 
 functional.
 
 m...@iunis:~# zpool status tank
   pool: tank
  state: ONLINE
 status: One or more devices is currently being resilvered.  The pool will
         continue to function, possibly in a degraded state.
 action: Wait for the resilver to complete.
  scrub: resilver in progress for 1h5m, 1,76% done, 61h12m to go
 config:

         NAME           STATE     READ WRITE CKSUM
         tank           ONLINE       0     0     0
           raidz1-0     ONLINE       0     0     0
             c9t0d1     ONLINE       0     0     0
             c9t0d3     ONLINE       0     0     0  15K resilvered
             c9t0d0     ONLINE       0     0     0
             spare-3    ONLINE       0     0     0
               c9t0d2   ONLINE       0     0     0  37,5K resilvered
               c16t0d0  ONLINE       0     0     0  14,1G resilvered
         cache
           c18t0d0      ONLINE       0     0     0
         spares
           c16t0d0      INUSE     currently in use
 
 errors: No known data errors
 
 m...@iunis:~# zpool detach tank c16t0d0
 cannot detach c16t0d0: no valid replicas
 
 How can I solve the Problem?
 
 Martin
 
 
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS pool and filesystem version list, OpenSolaris builds list

2010-08-16 Thread Mark J Musante


I keep the pool version information up-to-date here:

http://blogs.sun.com/mmusante/entry/a_zfs_taxonomy

On Sun, 15 Aug 2010, Haudy Kazemi wrote:


Hello,

This is a consolidated list of ZFS pool and filesystem versions, along with 
the builds and systems they are found in. It is based on multiple online 
sources. Some of you may find it useful in figuring out where things are at 
across the spectrum of systems supporting ZFS including FreeBSD and FUSE. At 
the end of this message there is a list of the builds OpenSolaris releases 
and some OpenSolaris derivatives are based on. The list is sort-of but not 
strictly comma delimited, and of course may contain errata.


-hk


Solaris Nevada xx = snv_xx = onnv_xx ~= testing builds for Solaris 11
SXCE = Solaris Express Community Edition

ZFS Pool Version, Where found (multiple), Notes about this version
1, Nevada/SXCE 36, Solaris 10 6/06, Initial ZFS on-disk format integrated on 
10/31/05. During the next six months of internal use, there were a few 
on-disk format changes that did not result in a version number change, but 
resulted in a flag day since earlier versions could not read the newer 
changes. For '6389368 fat zap should use 16k blocks (with backwards 
compatibility)' and '6390677 version number checking makes upgrades 
challenging'
2, Nevada/SXCE 38, Solaris 10 10/06 (build 9), Ditto blocks (replicated 
metadata) for '6410698 ZFS metadata needs to be more highly replicated (ditto 
blocks)'
3, Nevada/SXCE 42, Solaris 10 11/06 (build 3), Hot spares and double parity 
RAID-Z for '6405966 Hot Spare support in ZFS' and '6417978 double parity 
RAID-Z a.k.a. RAID6' and '6288488 du reports misleading size on RAID-Z'
4, Nevada/SXCE 62, Solaris 10 8/07, zpool history for '6529406 zpool history 
needs to bump the on-disk version' and '6343741 want to store a command 
history on disk'
5, Nevada/SXCE 62, Solaris 10 10/08, gzip compression algorithm for '6536606 
gzip compression for ZFS'
6, Nevada/SXCE 62, Solaris 10 10/08, FreeBSD 7.0, 7.1, 7.2, bootfs pool 
property for '4929890 ZFS boot support for the x86 platform' and '6479807 
pools need properties'
7, Nevada/SXCE 68, Solaris 10 10/08, Separate intent log devices for '6339640 
Make ZIL use NVRAM when available'
8, Nevada/SXCE 69, Solaris 10 10/08, Delegated administration for '6349470 
investigate non-root restore/backup'
9, Nevada/SXCE 77, Solaris 10 10/08, refquota and refreservation properties 
for '6431277 want filesystem-only quotas' and '6483677 need immediate 
reservation' and '6617183 CIFS Service - PSARC 2006/715'
10, Nevada/SXCE 78, OpenSolaris 2008.05, Solaris 10 5/09 (Solaris 10 10/08 
supports ZFS version 10 except for cache devices), Cache devices for '6536054 
second tier (external) ARC'
11, Nevada/SXCE 94, OpenSolaris 2008.11, Solaris 10 10/09, Improved 
scrub/resilver performance for '6343667 scrub/resilver has to start over when 
a snapshot is taken'
12, Nevada/SXCE 96, OpenSolaris 2008.11, Solaris 10 10/09, added Snapshot 
properties for '6701797 want user properties on snapshot'
13, Nevada/SXCE 98, OpenSolaris 2008.11, Solaris 10 10/09, FreeBSD 7.3+, 
FreeBSD 8.0-RELEASE, Linux ZFS-FUSE 0.5.0, added usedby properties for 
'6730799 want user properties on snapshots' and 'PSARC/2008/518 ZFS space 
accounting enhancements'
14, Nevada/SXCE 103, OpenSolaris 2009.06, Solaris 10 10/09, FreeBSD 8-STABLE, 
8.1-RELEASE, 9-CURRENT, added passthrough-x aclinherit property support for 
'6765166 Need to provide mechanism to optionally inherit ACE_EXECUTE' and 
'PSARC 2008/659 New ZFS passthrough-x ACL inheritance rules'
15, Nevada/SXCE 114, added quota property support for '6501037 want 
user/group quotas on ZFS' and 'PSARC 2009/204 ZFS user/group quotas  space 
accounting'
16, Nevada/SXCE 116, Linux ZFS-FUSE 0.6.0, added stmf property support for 
'6736004 zvols need an additional property for comstar support'
17, Nevada/SXCE 120, added triple-parity RAID-Z for '6854612 triple-parity 
RAID-Z'
18, Nevada/SXCE 121, Linux zfs-0.4.9, added ZFS snapshot holds for '6803121 
want user-settable refcounts on snapshots'
19, Nevada/SXCE 125, added ZFS log device removal option for '6574286 
removing a slog doesn't work'
20, Nevada/SXCE 128, added zle compression to support dedupe in version 21 
for 'PSARC/2009/571 ZFS Deduplication Properties'
21, Nevada/SXCE 128, added deduplication properties for 'PSARC/2009/571 ZFS 
Deduplication Properties'
22, Nevada/SXCE 128a, Nexenta Core Platform Beta 2, Beta 3, added zfs receive 
properties for 'PSARC/2009/510 ZFS Received Properties'
23, Nevada 135, Linux ZFS-FUSE 0.6.9, added slim ZIL support for '6595532 ZIL 
is too talkative'
24, Nevada 137, added support for system attributes for '6716117 ZFS needs 
native system attribute infrastructure' and '6516171 zpl symlinks should have 
their own object type'

25, Nevada ??, Nexenta Core Platform RC1
26, Nevada 141, Linux zfs-0.5.0


ZFS Pool Version, OpenSolaris, Solaris 10, Description
1 snv_36 Solaris 10 6/06 

Re: [zfs-discuss] Replaced pool device shows up in zpool status

2010-08-16 Thread Mark J Musante

On Mon, 16 Aug 2010, Matthias Appel wrote:


Can anybody tell me how to get rid of c1t3d0 and heal my zpool?


Can you do a zpool detach performance c1t3d0/o?  If that works, then 
zpool replace performance c1t3d0 c1t0d0 should replace the bad disk with 
the new hot spare.  Once the resilver completes, do a zpool detach 
performance c1t3d0 to remove the bad disk and promote the hot spare to a 
full member of the pool.
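
Spelled out as commands, that sequence would be:

   # zpool detach performance c1t3d0/o     # drop the stale old-device entry
   # zpool replace performance c1t3d0 c1t0d0
   # zpool status performance              # wait for the resilver to finish
   # zpool detach performance c1t3d0       # remove the bad disk; the spare becomes permanent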


Or, if that doesn't work, try the same thing with c1t3d0 and c1t3d0/o 
swapped around.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How do I Import rpool to an alternate location?

2010-08-16 Thread Mark Musante

On 16 Aug 2010, at 22:30, Robert Hartzell wrote:

 
 cd /mnt ; ls
 bertha export var
 ls bertha
 boot etc
 
 where are the rest of the file systems and data?

By default, root filesystems are not mounted.  Try doing a zfs mount 
bertha/ROOT/snv_134
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Opensolaris is apparently dead

2010-08-14 Thread Mark Bennett
That's a very good question actually. I would think that COMSTAR would
stay because its used by the Fishworks appliance... however, COMSTAR is
a competitive advantage for DIY storage solutions. Maybe they will rip
it out of S11 and make it an add-on or something. That would suck.


I guess the only real reason you can't yank COMSTAR is because its now
the basis for iSCSI Target support. But again, there is nothing saying
that Target support has to be part of the standard OS offering.

Scary to think about. :)

benr.

That would be the sensible commercial decision, and kill off the competition in 
the storage market that uses OpenSolaris-based products.

I haven't found a linux that can reliably spin the 100Tb I currently have 
behind OpenSolaris and ZFS.
Luckily b134 doesn't seem to have any major issues, and I'm currently looking 
into a USB boot/raidz root combination for 1U storage.

I ran Red Hat 9 with updated packages for quite a few years.
As long as the kernel is stable, and you can work through the hurdles, it can 
still do the job.


Mark.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS development moving behind closed doors

2010-08-14 Thread Mark Bennett
On 8/13/10 8:56 PM -0600 Eric D. Mudama wrote:
 On Fri, Aug 13 at 19:06, Frank Cusack wrote:
 Interesting POV, and I agree. Most of the many distributions of
 OpenSolaris had very little value-add. Nexenta was the most interesting
 and why should Oracle enable them to build a business at their expense?

 These distributions are, in theory, the gateway drug where people
 can experiment inexpensively to try out new technologies (ZFS, dtrace,
 crossbow, comstar, etc.) and eventually step up to Oracle's big iron
 as their business grows.

I've never understood how OpenSolaris was supposed to get you to Solaris.
OpenSolaris is for enthusiasts and great folks like Nexenta.
Solaris lags so far behind it's not really an upgrade path.

Fedora is a great beta test arena for what eventually becomes a commercial 
Enterprise offering. OpenSolaris was the Solaris equivalent.

Losing the free bleeding edge testing community will no doubt impact on the 
Solaris code quality.

It is now even more likely Solaris will revert to its niche on SPARC over the 
next few years.

Mark.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs replace problems please please help

2010-08-11 Thread Mark J Musante

On Tue, 10 Aug 2010, seth keith wrote:


# zpool status
  pool: brick
 state: UNAVAIL
status: One or more devices could not be used because the label is missing
        or invalid.  There are insufficient replicas for the pool to continue
        functioning.
action: Destroy and re-create the pool from a backup source.
   see: http://www.sun.com/msg/ZFS-8000-5E
 scrub: none requested
config:

        NAME           STATE     READ WRITE CKSUM
        brick          UNAVAIL      0     0     0  insufficient replicas
          raidz1       UNAVAIL      0     0     0  insufficient replicas
            c13d0      ONLINE       0     0     0
            c4d0       ONLINE       0     0     0
            c7d0       ONLINE       0     0     0
            c4d1       ONLINE       0     0     0
            replacing  UNAVAIL      0     0     0  insufficient replicas
              c15t0d0  UNAVAIL      0     0     0  cannot open
              c11t0d0  UNAVAIL      0     0     0  cannot open
            c12d0      FAULTED      0     0     0  corrupted data
            c6d0       ONLINE       0     0     0

What I want is to remove c15t0d0 and c11t0d0 and replace with the original 
c6d1. Suggestions?


Do the labels still exist on c6d1?  e.g. what do you get from zdb -l 
/dev/rdsk/c6d1s0?


If the label still exists, and the pool guid is the same as the labels on 
the other disks, you could try doing a zpool detach brick c15t0d0 (or 
c11t0d0), then export  try re-importing.  ZFS may find c6d1 at that 
point.  There's no way to guarantee that'll work.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs replace problems please please help

2010-08-11 Thread Mark J Musante

On Wed, 11 Aug 2010, Seth Keith wrote:



When I do a zdb -l /dev/rdsk/any device I get the same output for all my 
drives in the pool, but I don't think it looks right:

# zdb -l /dev/rdsk/c4d0


What about /dev/rdsk/c4d0s0?
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs replace problems please please help

2010-08-11 Thread Mark J Musante

On Wed, 11 Aug 2010, seth keith wrote:


   NAME  STATE READ WRITE CKSUM
   brick DEGRADED 0 0 0
 raidz1  DEGRADED 0 0 0
   c13d0 ONLINE   0 0 0
   c4d0  ONLINE   0 0 0
   c7d0  ONLINE   0 0 0
   c4d1  ONLINE   0 0 0
   14607330800900413650  UNAVAIL  0 0 0  was 
/dev/dsk/c15t0d0s0
   c11t1d0   ONLINE   0 0 0
   c6d0  ONLINE   0 0 0


OK, that's good - your missing disk can be replaced with a brand new disk 
using zpool replace brick 14607330800900413650 disk name.  Then wait 
for the resilver to complete and do a full scrub to be on the safe side.
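
For example, with an illustrative name standing in for the new 2TB disk:

   # zpool replace brick 14607330800900413650 c11t2d0   # c11t2d0 is hypothetical
   # zpool status brick                                 # wait for the resilver to complete
   # zpool scrub brick                                  # then verify the rest of the pool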



errors: 352808 data errors, use '-v' for a list

Is there some way I can take the original zpool label from the first 500GB 
drive I replaced and use it to fix up the other drives in the pool?


No.  The files with errors can only be restored from any backups you made. 
If there is an original disk that's not part of your pool, you might want 
to try making a backup of it, plug it in, and see if a zpool export/zpool 
import will find it.  But it will only find it if zdb -l shows four valid 
labels.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs replace problems please please help

2010-08-10 Thread Mark J Musante

On Tue, 10 Aug 2010, seth keith wrote:


first off I don't have the exact failure messages here, and I did not take good 
notes of the failures, so I will do the best I can. Please try and give me 
advice anyway.

I have a 7 drive raidz1 pool with 500G drives, and I wanted to replace them all 
with 2TB drives. Immediately I ran into trouble. If I tried:

  zpool offline brick device


Were you doing an in-place replace?  i.e. pulling out the old disk and 
putting in the new one?



I got a message like: insufficient replicas


This means that there was a problem with the pool already.  When ZFS opens 
a pool, it looks at the disks that are part of that pool.  For raidz1, if 
more than one disk is unopenable, then the pool will report that there are 
no valid replicas, which is probably the error message you saw.


If that's the case, then your pool already had one failed drive in, and 
you were attempting to disable a second drive.  Do you have a copy of the 
output from zpool status brick from before you tried your experiment?




I tried to

   zpool replace brick old device new device

and I got something like: new device must be a single disk


Unfortunately, this just means that we got back an EINVAL from the kernel, 
which could mean any one of a number of things, but probably there was an 
issue with calculating the drive size.  I'd try plugging it in separately and 
using 'format' to see how big Solaris thinks the drive is.




I finally got replace and offline to work by:

   zpool export brick
   [reboot]
   zpool import brick


Probably didn't need to reboot there.


now

   zpool offline brick old device
   zpool replace brick old device new device


If you use this form for the replace command, you don't need to offline 
the old disk first.  You only need to offline a disk if you're going to 
pull it out.  And then you can do an in-place replace just by issuing 
zpool replace brick device-you-swapped
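
In other words, for a disk you are about to pull and swap in place (using the device 
name from your earlier status output):

   # zpool offline brick c15t0d0     # only needed because the disk is being physically removed
     (swap in the new disk at the same slot)
   # zpool replace brick c15t0d0     # single-device form: resilver onto whatever now sits there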


This worked. zpool status showed replacing in progress, and then after 
about 26 hours of resilvering, everything looked fine. The old device 
was gone, and no errors in the pool. Now I tried to do it again with the 
next device. I missed the zpool offline part however. Immediately, I 
started getting disk errors on both the drive I was replacing and the 
first drive I replaced.


Read errors?  Write errors?  Checksum errors?  Sounds like a full scrub 
would have been a good idea prior to replacing the second disk.


I have the two original drives; they are in good shape and should still 
have all the data on them. Can I somehow put my original zpool back? 
How? Please help!


You can try exporting the pool, plugging in the original drives, and then 
do a recovery on it.  See the zpool manpage under zpool import for the 
recovery options and what the flags mean.
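
Roughly, and only if your build is new enough to have the recovery support 
(the -F option appeared around build 128), the flow would be:

# zpool export brick
(reconnect the original drives)
# zpool import -F -n brick     (dry run: reports whether recovery looks possible)
# zpool import -F brick        (performs the recovery, discarding the last few transactions if necessary)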

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to identify user-created zfs filesystems?

2010-08-04 Thread Mark J Musante


You can use 'zpool history -l syspool' to show the username of the person 
who created the dataset.  The history is in a ring buffer, so if too many 
pool operations have happened since the dataset was created, the 
information is lost.
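
For example (syspool here, but any pool name works; the long format appends the 
invoking user and host in brackets after each command):

# zpool history -l syspool | grep 'zfs create'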



On Wed, 4 Aug 2010, Peter Taps wrote:


Folks,

In my application, I need to present user-created filesystems. For my test, I created a 
zfs pool called mypool and two file systems called cifs1 and cifs2. However, when I run 
zfs list, I see a lot more entries:

# zfs list
NAME                     USED  AVAIL  REFER  MOUNTPOINT
mypool                  1.31M  1.95G    33K  /volumes/mypool
mypool/cifs1            1.12M  1.95G  1.12M  /volumes/mypool/cifs1
mypool/cifs2              44K  1.95G    44K  /volumes/mypool/cifs2
syspool                 3.58G  4.23G  35.5K  legacy
syspool/dump             716M  4.23G   716M  -
syspool/rootfs-nmu-000  1.85G  4.23G  1.36G  legacy
syspool/rootfs-nmu-001  53.5K  4.23G  1.15G  legacy
syspool/swap            1.03G  5.19G  71.4M  -

I just need to present cifs1 and cifs2 to the user. Is there a property on the 
filesystem that I can use to determine user-created filesystems?

Thank you in advance for your help.

Regards,
Peter
--
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss




Regards,
markm
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Splitting root mirror to prep for re-install

2010-08-04 Thread Mark Musante

You can also use the zpool split command and save yourself having to do the 
zfs send|zfs recv step - all the data will be preserved.

zpool split rpool preserve does essentially everything up to and including 
the zpool export preserve commands you listed in your original email.  Just 
don't try to boot off it.
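
A rough sketch with the device names from your mail (untested, so treat it as a 
sketch; listing c6d0s0 explicitly tells split which half of the mirror to take):

# zpool split rpool preserve c6d0s0     (c6d0s0 leaves the mirror and becomes the exported pool 'preserve')
(reinstall svn_134 on the other disk)
# zpool import preserve                 (from the new OS, to copy your data back)
# zpool destroy preserve
# zpool attach rpool c7d0s0 c6d0s0      (re-mirror once you're done)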

On 4 Aug 2010, at 20:58, Edward Ned Harvey wrote:

 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Chris Josephes
 
 I have a host running svn_133 with a root mirror pool that I'd like to
 rebuild with a fresh install on new hardware; but I still have data on
 the pool that I would like to preserve.
 
 So, after rebuilding, you don't want to restore the same OS that you're
 currently running.  But there are some files you'd like to save for after
 you reinstall.  Why not just copy them off somewhere, in a tarball or
 something like that?
 
 
 Given a rpool with disks c7d0s0 and c6d0s0, I think the following
 process will do what I need:
 
 1. Run these commands
 
 # zpool detach rpool c6d0s0
 # zpool create preserve c6d0s0
 
 The only reason you currently have the rpool in a slice (s0) is because
 that's a requirement for booting.  If you aren't planning to boot from the
 device after breaking it off the mirror ... Maybe just use the whole device
 instead of the slice.
 
 zpool create preserve c6d0
 
 
 # zfs create export/home
 # zfs send rpool/export/home | zfs receive preserve/home
 # zfs send (other filesystems)
 # zpool export preserve
 
 These are not right.  It should be something more like this:
 zfs create -o readonly=on preserve/rpool_export_home
 zfs snapshot rpool/export/h...@fubarsnap
 zfs send rpool/export/h...@fubarsnap | zfs receive -F
 preserve/rpool_export_home
 
 And finally
 zpool export preserve
 
 
 2. Build out new host with svn_134, placing new root pool on c6d0s0 (or
 whatever it's called on the new SATA controller)
 
 Um ... I assume that's just a typo ... 
 Yes, install fresh.  No, don't overwrite the existing preserve disk.
 
 For that matter, why break the mirror at all?  Just install the OS again,
 onto a single disk, which implicitly breaks the mirror.  Then when it's all
 done, use zpool import to import the other half of the mirror, which you
 didn't overwrite.
 
 
 3. Run zpool import against preserve, copy over data that should be
 migrated.
 
 4. Rebuild the mirror by destroying the preserve pool and attaching
 c7d0s0 to the rpool mirror.
 
 Am I missing anything?
 
 If you blow away the partition table of the 2nd disk (as I suggested above,
 but now retract) then you'll have to recreate the partition table of the
 second disk.  So you only attach s0 to s0.
 
 After attaching, and resilvering, you'll want to installgrub on the 2nd
 disk, or else it won't be bootable after the first disk fails.  See the ZFS
 Troubleshooting Guide for details.
 
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] snapshot question

2010-07-29 Thread Mark
I'm trying to understand how snapshots work in terms of how I can use them for 
recovering and/or duplicating virtual machines, and how I should set up my file 
system.

I want to use OpenSolaris as a storage platform with NFS/ZFS for some 
development VMs; that is, the VMs use the OpenSolaris box as their NAS for 
shared access.

Should I set up a separate ZFS file system for each VM so I can individually 
snapshot each one on a regular basis, or does it matter? The goal would be to 
be able to take an individual VM back to a previous point in time without 
changing the others.
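
If it helps, here is a rough sketch of what a filesystem-per-VM layout makes 
possible (pool and VM names are hypothetical):

# zfs create tank/vms
# zfs create tank/vms/vm01
# zfs create tank/vms/vm02
# zfs snapshot tank/vms/vm01@before-patch
# zfs rollback tank/vms/vm01@before-patch                     (only vm01 goes back in time)
# zfs clone tank/vms/vm01@before-patch tank/vms/vm01-copy     (duplicate a VM from its snapshot)

With a single shared filesystem you could still snapshot, but a rollback would 
affect every VM stored in it.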

Thanks
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] root pool expansion

2010-07-28 Thread Mark J Musante

On Wed, 28 Jul 2010, Gary Gendel wrote:


Right now I have a machine with a mirrored boot setup.  The SAS drives are 43Gs 
and the root pool is getting full.

I do a backup of the pool nightly, so I feel confident that I don't need to 
mirror the drive and can break the mirror and expand the pool with the detached 
drive.

I understand how to do this on a normal pool, but are there any restrictions for 
doing this on the root pool?  Are there any grub issues?


You cannot stripe a root pool.  Best you could do in this instance is to 
create a new pool from the detached mirror.  You may want to consider 
keeping the redundancy of the mirror config so that zfs can automatically 
repair any corruption it detects.
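
As a rough sketch (hypothetical device names; c0t1d0s0 is the half you detach):

# zpool detach rpool c0t1d0s0
# zpool create datapool c0t1d0s0          (new, separate pool on the freed disk)
# zpool attach rpool c0t0d0s0 c0t1d0s0    (later, to re-mirror, after destroying datapool)

Remember the root pool loses its self-healing ability while it is a single disk.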

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] VMGuest IOMeter numbers

2010-07-25 Thread Mark
Hello, first time posting.  I've been working with ZFS on and off, with limited 
*nix experience, for a year or so now, and have read a lot of things by a lot of 
you, I'm sure.  There is still a ton I don't understand or know, I'm sure.

We've been having awful IO latencies on our 7210 running about 40 VMs spread 
over 3 hosts, with no SSDs / intent logs.  I am trying to get some, but the 
price...  so I had to work up some sort of PoC to show it would help.  It so 
happens I just purchased 3 X25-Ms for my own use, and could spare one for a 
few weeks (though I hate to think how many cells I burned testing), and we also 
happen to have a couple of home-built small ZFS servers around to test with.

Resources are pretty limited: we have a home-built box with 5 250GB 7200rpm SATA 
disks, each connected to the Intel server board's built-in SATA ports.  I 
reduced the RAM to 2GB for the tests.  The OS is on a SATA disk in its own 
single-disk pool.  The X25-M was used as a ZIL log for the SSD tests.

I created 5 VMs on a single ESX host (Dell 1950) with a datastore connected 
to the mini-thumper running 2009.06 snv_111b via NFS over a single gigabit link.

Each VM runs Windows 2003 R2 on a 4GB C:\ vmdk.  Dynamo runs 1 worker on the 
local C: vmdk on each guest and reports to my workstation, so the numbers below 
are totals of the dynamo on all 5 guests.

Each test consisted of an 8k transfer with a 67% read, and 70% random pattern.  
The tests were run for 5 minutes each.

Queue Depth, IOPS, Avg Latency (ms)

RAID0 - 5 Disk
1,   326,  15.3
2,   453,  22
4,   516,  38.7
8,   503,  72.3
16,  526,  152
32,  494,  323

RAID0 - 4 Disk + SSD
1,   494,  10.1
2,   579,  17.2
4,   580,  34.4
8,   603,  66.3
16,  598,  133.6
32,  600,  266

RAIDz - 5 Disk
1,   144,  34
2,   162,  60
4,   184,  108
8,   183,  218
16,  175,  455
32,  185,  864

RAIDz - 4 Disk + SSD
1,   222,  22
2,   201,  50
4,   221,  90
8,   219,  181
16,  228,  348
32,  228,  700

RAID10 - 4 Disk
1,   159,  31
2,   206,  48
4,   236,  84
8,   194,  205
16,  243,  328
32,  219,  728

RAID10 - 4 Disk + SSD
1,   270,  18
2,   332,  30
4,   363,  54
8,   320,  124
16,  325,  245
32,  333,  479

(wonders how the formatting will turn out)

It's interesting that going from a 5-disk RAIDz to a 4-disk mirror (both with no 
log device) gives a bigger increase than using the X25-M log with a 4-disk RAIDz.

The increase in IOs from adding the X25-M to the mirror setup is nice, but smaller 
than I had expected; the halving of the latencies is even nicer.

I am curious how this would scale with a lot more disks; the SSD didn't 
increase performance as much as I had hoped, but it's still nice to see...  I'm 
thinking that's mostly due to my limit of 4-5 disks.  I'm not sure how much 
difference there is between the X25-M and the Sun SSDs for the 7000 series.

From what I've read so far, the X25-E needs to have its write cache forced off 
to function properly, whereas the X25-M seems to obey the flush commands?  I was 
also curious whether I would have seen a bigger increase with an SLC drive instead 
of the MLC...  searching turns up so much old info.

Comments welcome!
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] invalid vdev configuration meltdown

2010-07-15 Thread Mark J Musante

On Thu, 15 Jul 2010, Tim Castle wrote:


j...@opensolaris:~# zpool import -d /dev

...shows nothing after 20 minutes


OK, then one other thing to try is to create a new directory, e.g. /mydev, 
and create in it symbolic links to only those drives that are part of your 
pool.


Based on your label output, I see:

path='/dev/ad6'
path='/dev/ad4'
path='/dev/ad16'
path='/dev/ad18'
path='/dev/ad8'
path='/dev/ad10'

I'm guessing /dev has many more entries in it, and the zpool import command 
is hanging in its attempt to open each one of them.


So try doing:

# mkdir /mydev
# ln -s /dev/ad6 /mydev/ad6
...
# ln -s /dev/ad10 /mydev/ad10

This way, you can issue zpool import -d /mydev and the import code 
should *only* see the devices that are part of the pool.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] invalid vdev configuration meltdown

2010-07-14 Thread Mark J Musante


What does 'zpool import -d /dev' show?

On Wed, 14 Jul 2010, Tim Castle wrote:


My raidz1 (ZFSv6) had a power failure, and a disk failure. Now:


j...@opensolaris:~# zpool import
 pool: files
   id: 3459234681059189202
state: UNAVAIL
status: One or more devices contains corrupted data.
action: The pool cannot be imported due to damaged devices or data.
  see: http://www.sun.com/msg/ZFS-8000-5E
config:

        files          UNAVAIL  insufficient replicas
          raidz1       UNAVAIL  insufficient replicas
            c8d1s8     UNAVAIL  corrupted data
            c9d0p0     ONLINE
            /dev/ad16  OFFLINE
            c9d1s8     UNAVAIL  corrupted data
            /dev/ad8   UNAVAIL  corrupted data
            c8d0p0     ONLINE
j...@opensolaris:~# zpool import files
cannot import 'files': pool may be in use from other system
use '-f' to import anyway
j...@opensolaris:~# zpool import -f files
cannot import 'files': invalid vdev configuration


ad16 is the dead drive.
ad8 is fine but disconnected. I can only connect 4 SATA drives to OpenSolaris: 
my PCI SATA card isn't compatible.
I created and used the pool with FreeNAS, which gives me the same error when 
all 5 drives are connected.

So why do c8d1s8 and c9d1s8 show up as slices? c9d0p0, c8d0p0, and ad8 (when 
connected) show up as partitions.

zdb -l returns the same thing for all 5 drives. Labels 0 and 1 are fine. 2 and 
3 fail to unpack.


j...@opensolaris:~# zdb -l /dev/dsk/c8d1s8

LABEL 0

   version=6
   name='files'
   state=0
   txg=2123835
   pool_guid=3459234681059189202
   hostid=0
   hostname='freenas.local'
   top_guid=18367164273662411813
   guid=7276810192259058351
   vdev_tree
   type='raidz'
   id=0
   guid=18367164273662411813
   nparity=1
   metaslab_array=14
   metaslab_shift=32
   ashift=9
   asize=6001199677440
   children[0]
   type='disk'
   id=0
   guid=7276810192259058351
   path='/dev/ad6'
   devid='ad:STF602MR3GHBZP'
   whole_disk=0
   DTL=1012
   children[1]
   type='disk'
   id=1
   guid=5425645052930513342
   path='/dev/ad4'
   devid='ad:STF602MR3EZ0WP'
   whole_disk=0
   DTL=1011
   children[2]
   type='disk'
   id=2
   guid=4766543340687449042
   path='/dev/ad16'
   devid='ad:GTA000PAG7PGGA'
   whole_disk=0
   DTL=1010
   offline=1
   children[3]
   type='disk'
   id=3
   guid=16172918065436695818
   path='/dev/ad18'
   devid='ad:WD-WCAU42121120'
   whole_disk=0
   DTL=1009
   children[4]
   type='disk'
   id=4
   guid=3693181954889803829
   path='/dev/ad8'
   devid='ad:STF602MR3EYWJP'
   whole_disk=0
   DTL=1008
   children[5]
   type='disk'
   id=5
   guid=5419080715831351987
   path='/dev/ad10'
   devid='ad:STF602MR3ESPYP'
   whole_disk=0
   DTL=1007

LABEL 1

   version=6
   name='files'
   state=0
   txg=2123835
   pool_guid=3459234681059189202
   hostid=0
   hostname='freenas.local'
   top_guid=18367164273662411813
   guid=7276810192259058351
   vdev_tree
   type='raidz'
   id=0
   guid=18367164273662411813
   nparity=1
   metaslab_array=14
   metaslab_shift=32
   ashift=9
   asize=6001199677440
   children[0]
   type='disk'
   id=0
   guid=7276810192259058351
   path='/dev/ad6'
   devid='ad:STF602MR3GHBZP'
   whole_disk=0
   DTL=1012
   children[1]
   type='disk'
   id=1
   guid=5425645052930513342
   path='/dev/ad4'
   devid='ad:STF602MR3EZ0WP'
   whole_disk=0
   DTL=1011
   children[2]
   type='disk'
   id=2
   guid=4766543340687449042
   path='/dev/ad16'
   devid='ad:GTA000PAG7PGGA'
   whole_disk=0
   DTL=1010
   offline=1
   children[3]
   type='disk'
   id=3
   guid=16172918065436695818
   path='/dev/ad18'
   devid='ad:WD-WCAU42121120'
   whole_disk=0
   DTL=1009
   children[4]
   type='disk'
   id=4
   

[zfs-discuss] ZFS crash

2010-07-07 Thread Mark Christooph
I had an interesting dilemma recently, and I'm wondering if anyone here can 
shed some light on why this happened.

I have a number of pools, including the root pool, on on-board disks on the 
server. I also have one pool on a SAN disk, outside the system. Last night the 
SAN crashed, and shortly thereafter, the ZFS system executed a number of cron 
jobs, most of which involved running functions on the pool that was on the SAN. 
This caused a number of problems, most notably that when the SAN eventually 
came up, those cron jobs finished, and then crashed the system again.

Only by running 'zfs destroy' on the newly created zfs file systems that the cron 
jobs created was the system able to boot up again. As long as those corrupted 
zfs file systems remained on the SAN disk, not even the rpool would boot up 
correctly. None of the zfs file systems would mount, and most services were 
disabled. Once I destroyed the newly created zfs file systems, everything 
instantly mounted and all services started.

Question: why would those zfs file systems prevent ALL pools from mounting, 
even when they are on different disks and file systems, and prevent all 
services from starting? I thought ZFS was more resistant to this sort of thing. 
I will have to edit my scripts and add SAN-checking to make sure it is up 
before they execute to prevent this from happening again. Luckily I still had 
all the raw data that the cron jobs were working with, so I was able to quickly 
re-create what the cron jobs did originally.   Although this happened with 
Solaris 10, perhaps the discussion could be applicable to OpenSolaris as well 
(I use both).
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS fsck?

2010-07-06 Thread Mark J Musante

On Tue, 6 Jul 2010, Roy Sigurd Karlsbakk wrote:


Hi all

With several messages in here about troublesome zpools, would there be a 
good reason to be able to fsck a pool? As in, check the whole thing 
instead of having to boot into live CDs and whatnot?


You can do this with zpool scrub.  It visits every allocated block and 
verifies that everything is correct.  It's not the same as fsck in that 
scrub can detect and repair problems with the pool still online and all 
datasets mounted, whereas fsck cannot handle mounted filesystems.


If you really want to use it on an exported pool, you can use zdb, 
although it might take some time.  Here's an example on a small empty 
pool:


# zpool create -f mypool raidz c4t1d0s0 c4t2d0s0 c4t3d0s0 c4t4d0s0 c4t5d0s0
# zpool list mypool
NAME     SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
mypool   484M   280K   484M     0%  1.00x  ONLINE  -
# zpool export mypool
# zdb -ebcc mypool

Traversing all blocks to verify checksums and verify nothing leaked ...

No leaks (block sum matches space maps exactly)

bp count:  48
bp logical:378368  avg:   7882
bp physical:39424  avg:821 compression:   9.60
bp allocated:  185344  avg:   3861 compression:   2.04
bp deduped: 0ref1:  0   deduplication:   1.00
SPA allocated: 185344 used:  0.04%

#
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS fsck?

2010-07-06 Thread Mark J Musante

On Tue, 6 Jul 2010, Roy Sigurd Karlsbakk wrote:

what I'm saying is that there are several posts in here where the only 
solution is to boot onto a live cd and then do an import, due to 
metadata corruption. This should be doable from the installed system


Ah, I understand now.

A couple of things worth noting:

- if the root filesystem in a boot pool cannot be mounted, it's 
problematic to access the tools necessary to repair it.  So going to a 
livecd (or a network boot for that matter) is the best way forward.


- if the tools available to failsafe are insufficient to repair a pool, 
then booting off a livecd/network is the only way forward.


It is also worth pointing out here that the 134a build has the pool 
recovery code built-in.  The -F option to zpool import only became 
available after build 128 or 129.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS on external iSCSI storage

2010-07-01 Thread Mark
I'm new with ZFS, but I have had good success using it with raw physical disks. 
One of my systems has access to an iSCSI storage target. The underlying 
physical array is in a propreitary disk storage device from Promise. So the 
question is, when building a OpenSolaris host to store its data on an external 
iSCSI device, is there anything conceptually wrong with creating a raidz pool 
from a group of raw LUNs carved from the iSCSI device?

Thanks for your advice.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Zpool import not working

2010-06-12 Thread Mark Musante

I'm guessing that the virtualbox VM is ignoring write cache flushes.  See this 
for more ifno:
http://forums.virtualbox.org/viewtopic.php?f=8t=13661
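
If that turns out to be the cause, the thread above points at VirtualBox's 
extradata switch for honouring flushes. From memory it is something like the 
following - treat the exact key path as an assumption to verify against the 
VirtualBox manual, since the controller name (piix3ide vs ahci) and LUN number 
depend on how the virtual disk is attached:

VBoxManage setextradata "MyVM" "VBoxInternal/Devices/piix3ide/0/LUN#0/Config/IgnoreFlush" 0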

On 12 Jun, 2010, at 5.30, zfsnoob4 wrote:

 Thanks, that works. But it only works when I do a proper export first.
 
 If I export the pool then I can import with:
 zpool import -d /
 (test files are located in /)
 
 but if I destroy the pool, then I can no longer import it back, even though 
 the files are still there. Is this normal?
 
 
 Thanks for your help.
 -- 
 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] NOTICE: spa_import_rootpool: error 5

2010-06-07 Thread Mark S Durney

IHAC

Who has an x4500 (x86 box) with a ZFS root filesystem. They installed patches 
today, the latest Solaris 10 x86 recommended patch cluster, and the patching 
seemed to complete successfully. Then when they tried to reboot the box, the 
machine would not boot. They get the following error:


NOTICE: spa_import_rootpool: error 5, Inc. All rights reserved.
Cannot mount root on
/p...@0,0/pci8086,2...@4/pci111d,8...@0/pci111d,8...@4/pci108e,2...@0/d...@0,0:a 

/p...@0,0/pci8086,2...@4/pci111d,8...@0/pci111d,8...@4/pci108e,2...@0/d...@1,0:a 


fstype zfs

panic[cpu0]/thread=fbc28820: vfs_mountroot: cannot mount root

fbc4b190 genunix:vfs_mountroot+323 ()
fbc4b1d0 genunix:main+a9 ()
fbc4b1e0 unix:_start+95 ()

skipping system dump - no dump device configured
rebooting...


The customer states that he backed out the kernel patch 142901-12 and then
the x4500 boots successfully???  Has anyone seen this? It almost seems like
the zfs root pool is not being seen upon reboot??

Any help on this would be greatly appreciated.





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and IBM SDD Vpaths

2010-05-29 Thread Mark Musante

Can you find the devices in /dev/rdsk?  I see there is a path in /pseudo at 
least, but the zpool import command only looks in /dev.  One thing you can try 
is doing this:

# mkdir /tmpdev
# ln -s /pseudo/vpat...@1:1 /tmpdev/vpath1a

And then see if 'zpool import -d /tmpdev' finds the pool.


On 29 May, 2010, at 19.53, morris hooten wrote:

 I have 6 zfs pools, and after rebooting (init 6) the vpath device path names 
 have changed for some unknown reason. But I can't detach, remove, and reattach 
 to the new device names... ANY HELP, please!
 
 pjde43m01  -  -  -  -  FAULTED  -
 pjde43m02  -  -  -  -  FAULTED  -
 pjde43m03  -  -  -  -  FAULTED  -
 poas43m01  -  -  -  -  FAULTED  -
 poas43m02  -  -  -  -  FAULTED  -
 poas43m03  -  -  -  -  FAULTED  -
 
 
 One pool listed below as example
 
 pool: poas43m01
 state: UNAVAIL
 status: One or more devices could not be opened.  There are insufficient
replicas for the pool to continue functioning.
 action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-3C
 scrub: none requested
 config:
 
 NAME        STATE     READ WRITE CKSUM
 poas43m01   UNAVAIL      0     0     0  insufficient replicas
   vpath4c   UNAVAIL      0     0     0  cannot open
 
 
 
 before
 
 30. vpath1a IBM-2145- cyl 8190 alt 2 hd 64 sec 256
  /pseudo/vpat...@1:1
  31. vpath2a IBM-2145- cyl 13822 alt 2 hd 64 sec 256
  /pseudo/vpat...@2:2
  32. vpath3a IBM-2145- cyl 13822 alt 2 hd 64 sec 256
  /pseudo/vpat...@3:3
  33. vpath4a IBM-2145- cyl 13822 alt 2 hd 64 sec 256
  /pseudo/vpat...@4:4
  34. vpath5a IBM-2145- cyl 27646 alt 2 hd 64 sec 256
  /pseudo/vpat...@5:5
  35. vpath6a IBM-2145- cyl 27646 alt 2 hd 64 sec 256
  /pseudo/vpat...@6:6
  36. vpath7a IBM-2145- cyl 27646 alt 2 hd 64 sec 256
  /pseudo/vpat...@7:7
 
 
 after
 
 30. vpath1a IBM-2145- cyl 8190 alt 2 hd 64 sec 256
  /pseudo/vpat...@1:1
  31. vpath8a IBM-2145- cyl 13822 alt 2 hd 64 sec 256
  /pseudo/vpat...@8:8
  32. vpath9a IBM-2145- cyl 13822 alt 2 hd 64 sec 256
  /pseudo/vpat...@9:9
  33. vpath10a IBM-2145- cyl 13822 alt 2 hd 64 sec 256
  /pseudo/vpat...@10:10
  34. vpath11a IBM-2145- cyl 27646 alt 2 hd 64 sec 256
  /pseudo/vpat...@11:11
  35. vpath12a IBM-2145- cyl 27646 alt 2 hd 64 sec 256
  /pseudo/vpat...@12:12
  36. vpath13a IBM-2145- cyl 27646 alt 2 hd 64 sec 256
  /pseudo/vpat...@13:13
 
 
 
 
 {usbderp...@root} zpool detach poas43m03 vpath2c
 cannot open 'poas43m03': pool is unavailable
 -- 
 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool vdev's

2010-05-28 Thread Mark Musante

On 28 May, 2010, at 17.21, Vadim Comanescu wrote:

 In a stripe zpool configuration (no redundancy), is each disk regarded as 
 an individual vdev, or do all the disks in the stripe represent a single 
 vdev? In a raidz configuration I'm aware that every group of raidz disks is 
 regarded as a top-level vdev, but I was wondering how it is in the case I 
 mentioned earlier. Thanks.

In a stripe config, each disk is considered a top-level vdev.
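
To illustrate with hypothetical disks:

# zpool create stripepool c1t0d0 c1t1d0 c1t2d0     (three top-level vdevs, one per disk)
# zpool create rzpool raidz c1t0d0 c1t1d0 c1t2d0   (for comparison: one top-level raidz vdev with three children)

In 'zpool status' output the difference shows up as the stripe's disks sitting 
directly under the pool name, while the raidz disks are nested under a raidz1-0 
line.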


Regards,
markm
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] cannot import pool from another system, device-ids different! please help!

2010-05-24 Thread Mark J Musante

On Mon, 24 May 2010, h wrote:

I had 6 disks in a raidz1 pool that I replaced from 1TB drives to 2TB 
drives. I have installed the older 1TB drives in another system and 
would like to import the old pool to access some files I accidentally 
deleted from the new pool.


Did you use the 'zpool replace' command to do the replace?  If so, once 
the replace completes, the ZFS label on the original disk is overwritten 
to make it available for new pools.
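
One hedged way to check what is left on the old drives once they are in the 
other system (device name hypothetical):

# zdb -l /dev/dsk/c5t0d0s0     (all four labels must unpack for the old pool to be importable)
# zpool import                 (lists any pools ZFS can still find on attached disks)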



Regards,
markm
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs mount -a kernel panic

2010-05-20 Thread Mark J Musante

On Wed, 19 May 2010, John Andrunas wrote:


ff001f45e830 unix:die+dd ()
ff001f45e940 unix:trap+177b ()
ff001f45e950 unix:cmntrap+e6 ()
ff001f45ea50 zfs:ddt_phys_decref+c ()
ff001f45ea80 zfs:zio_ddt_free+55 ()
ff001f45eab0 zfs:zio_execute+8d ()
ff001f45eb50 genunix:taskq_thread+248 ()
ff001f45eb60 unix:thread_start+8 ()


This shows you're using some recent bits that include dedup.  How recent 
is your build?  The stack you show here is similar to that in CR 6915314, 
which we haven't been able to root-cause yet.


Let me know if you get a chance to upload the core as Lori Alt outlined, 
and I can update our bug tracking system to reflect that.



Regards,
markm
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Very serious performance degradation

2010-05-20 Thread Mark J Musante

On Thu, 20 May 2010, Edward Ned Harvey wrote:

Also, since you've got s0 on there, it means you've got some 
partitions on that drive.  You could manually wipe all that out via 
format, but the above is pretty brainless and reliable.


The s0 on the old disk is a bug in the way we're formatting the output. 
This was fixed in CR 6881631.



Regards,
markm
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs mount -a kernel panic

2010-05-19 Thread Mark J Musante


Do you have a coredump?  Or a stack trace of the panic?

On Wed, 19 May 2010, John Andrunas wrote:


Running ZFS on a Nexenta box, I had a mirror get broken, and apparently
the metadata is corrupt now.  If I try to mount vol2 it works, but if
I try 'mount -a' or mount vol2/vm2, it instantly kernel panics and
reboots.  Is it possible to recover from this?  I don't care if I lose
the file listed below, but the other data in the volume would be
really nice to get back.  I have scrubbed the volume to no avail.  Any
other thoughts?


zpool status -xv vol2
 pool: vol2
state: ONLINE
status: One or more devices has experienced an error resulting in data
   corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
   entire pool from backup.
  see: http://www.sun.com/msg/ZFS-8000-8A
scrub: none requested
config:

    NAME        STATE     READ WRITE CKSUM
    vol2        ONLINE       0     0     0
      mirror-0  ONLINE       0     0     0
        c3t3d0  ONLINE       0     0     0
        c3t2d0  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

   vol2/v...@snap-daily-1-2010-05-06-:/as5/as5-flat.vmdk

--
John
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss




Regards,
markm
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] MPT issues strikes back

2010-04-27 Thread Mark Ogden
Bruno Sousa on Tue, Apr 27, 2010 at 09:16:08AM +0200 wrote:
 Hi all,
 
 Yet another story regarding mpt issues, and in order to make a long
 story short, every time a Dell R710 running snv_134 logs the information
  scsi: [ID 107833 kern.warning] WARNING:
 /p...@0,0/pci8086,3...@4/pci1028,1...@0 (mpt0): , the system freezes and
 only a hard reset fixes the issue.
 
 Is there any sort of parameter to be used to minimize/avoid this issue?


We had the same problem on an X4600; it turned out to be a bad
SSD and/or connection at the location listed in the error message. 

Since removing that drive, we have not encountered that issue. 

You might want to look at

http://bugs.opensolaris.org/bugdatabase/view_bug.do;jsessionid=7acda35c626180d9cda7bd1df451?bug_id=6894775
 too.


-Mark

 Machine specs :
 
 Dell R710, 16 GB memory, 2 Intel Quad-Core E5506
 SunOS san01 5.11 snv_134 i86pc i386 i86pc Solaris
 Dell Integrated SAS 6/i Controller ( mpt0 Firmware version v0.25.47.0
 (IR) ) with 2 disks attached without raid
 
 
 Thanks in advance,
 Bruno
 
 
 
 -- 
 This message has been scanned for viruses and
 dangerous content by MailScanner, and is
 believed to be clean.
 
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re-attaching zpools after machine termination [amazon ebs ec2]

2010-04-23 Thread Mark Musante

On 23 Apr, 2010, at 7.06, Phillip Oldham wrote:
 
 I've created an OpenSolaris 2009.06 x86_64 image with the zpool structure 
 already defined. Starting an instance from this image, without attaching the 
 EBS volume, shows the pool structure exists and that the pool state is 
 UNAVAIL (as expected). Upon attaching the EBS volume to the instance the 
 status of the pool changes to ONLINE, the mount-point/directory is 
 accessible and I can write data to the volume.
 
 Now, if I terminate the instance, spin-up a new one, and connect the same 
 (now unattached) EBS volume to this new instance the data is no longer there 
 with the EBS volume showing as blank. 

Could you share with us the zpool commands you are using?


Regards,
markm___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re-attaching zpools after machine termination [amazon ebs ec2]

2010-04-23 Thread Mark Musante

On 23 Apr, 2010, at 7.31, Phillip Oldham wrote:

 I'm not actually issuing any when starting up the new instance. None are 
 needed; the instance is booted from an image which has the zpool 
 configuration stored within, so it simply starts and sees that the devices 
 aren't available; they become available after I've attached the EBS device.
 

Forgive my ignorance with EC2/EBS, but why doesn't the instance remember that 
there were EBS volumes attached?  Why aren't they automatically attached prior 
to booting Solaris within the instance?  The error output from zpool status 
that you're seeing matches what I would expect if we are attempting to import 
the pool at boot, and the disks aren't present.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re-attaching zpools after machine termination [amazon ebs ec2]

2010-04-23 Thread Mark Musante

On 23 Apr, 2010, at 8.38, Phillip Oldham wrote:

 The instances are ephemeral; once terminated they cease to exist, as do all 
 their settings. Rebooting an image keeps any EBS volumes attached, but this 
 isn't the case I'm dealing with - its when the instance terminates 
 unexpectedly. For instance, if a reboot operation doesn't succeed or if 
 there's an issue with the data-centre.

OK, I think if this issue can be addressed, it would be by people familiar with 
how EC2  EBS interact.  The steps I see are:

- start a new instance
- attach the EBS volumes to it
- log into the instance and zpool online the disks

I know the last step can be automated with a script inside the instance, but 
I'm not sure about the other two steps.
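
For what it's worth, the script I had in mind would be a small loop along these 
lines (untested sketch; the pool and device names are hypothetical, and it 
assumes the EBS volumes have already been attached by whatever launches the 
instance):

#!/bin/sh
# Bring a pool with EBS-backed disks back online once the volumes appear.
POOL=tank
DISKS="c7d1 c7d2"
until zpool status $POOL | grep -q 'state: ONLINE'; do
        for d in $DISKS; do
                zpool online $POOL $d 2>/dev/null
        done
        sleep 5
done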


Regards,
markm

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

