Re: [zfs-discuss] previously mentioned J4000 released

2008-07-11 Thread Ross
Yes, but pricing that's so obviously disconnected from cost leads customers to 
feel they're being ripped off.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] x4500 panic report.

2008-07-11 Thread Jorgen Lundman

Today we had another panic; at least it was during work time :) Just a 
shame the 999GB UFS takes 80+ mins to fsck. (Yes, it is mounted 'logging'.)







panic[cpu3]/thread=ff001e70dc80:
free: freeing free block, dev:0xb60024, block:13144, ino:1737885, fs:/export/saba1


ff001e70d500 genunix:vcmn_err+28 ()
ff001e70d550 ufs:real_panic_v+f7 ()
ff001e70d5b0 ufs:ufs_fault_v+1d0 ()
ff001e70d6a0 ufs:ufs_fault+a0 ()
ff001e70d770 ufs:free+38f ()
ff001e70d830 ufs:indirtrunc+260 ()
ff001e70dab0 ufs:ufs_itrunc+738 ()
ff001e70db60 ufs:ufs_trans_itrunc+128 ()
ff001e70dbf0 ufs:ufs_delete+3b0 ()
ff001e70dc60 ufs:ufs_thread_delete+da ()
ff001e70dc70 unix:thread_start+8 ()

syncing file systems...

panic[cpu3]/thread=ff001e70dc80:
panic sync timeout

dumping to /dev/dsk/c6t0d0s1, offset 65536, content: kernel


  $c
vpanic()
vcmn_err+0x28(3, f783a128, ff001e70d678)
real_panic_v+0xf7(0, f783a128, ff001e70d678)
ufs_fault_v+0x1d0(ff04facf65c0, f783a128, ff001e70d678)
ufs_fault+0xa0()
free+0x38f(ff001e70d8d0, a6a7358, 2000, 89)
indirtrunc+0x260(ff001e70d8d0, a6a42b8, , 0, 89)
ufs_itrunc+0x738(ff0550b9fde0, 0, 81, fffec0594db0)
ufs_trans_itrunc+0x128(ff0550b9fde0, 0, 81, fffec0594db0)
ufs_delete+0x3b0(fffed20e2a00, ff0550b9fde0, 1)
ufs_thread_delete+0xda(64704840)
thread_start+8()

  ::panicinfo
  cpu3
   thread ff001e70dc80
  message
free: freeing free block, dev:0xb60024, block:13144, ino:1737885, fs:/export/saba1
      rdi  f783a128
      rsi  ff001e70d678
      rdx  f783a128
      rcx  ff001e70d678
       r8  f783a128
       r9  0
      rax  3
      rbx  0
      rbp  ff001e70d4d0
      r10  fffec3d40580
      r11  ff001e70dc80
      r12  f783a128
      r13  ff001e70d678
      r14  3
      r15  f783a128
   fsbase  0
   gsbase  fffec3d40580
       ds  4b
       es  4b
       fs  0
       gs  1c3
   trapno  0
      err  0
      rip  fb83c860
       cs  30
   rflags  246
      rsp  ff001e70d488
       ss  38
   gdt_hi  0
   gdt_lo  81ef
   idt_hi  0
   idt_lo  7fff
      ldt  0
     task  70
      cr0  8005003b
      cr2  fed0e010
      cr3  2c0
      cr4  6f8





Jorgen Lundman wrote:
 On Saturday the X4500 system paniced, and rebooted. For some reason the 
 /export/saba1 UFS partition was corrupt, and needed fsck. This is why 
 it did not come back online. /export/saba1 is mounted logging,noatime, 
 so fsck should never (-ish) be needed.
 
 SunOS x4500-01.unix 5.11 snv_70b i86pc i386 i86pc
 
 /export/saba1 on /dev/zvol/dsk/zpool1/saba1 
 read/write/setuid/devices/intr/largefiles/logging/quota/xattr/noatime/onerror=panic/dev=2d80024
  
 on Sat Jul  5 08:48:54 2008
 
 
 One possible related bug:
 
 http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=4884138
 
 
 What would be the best solution? Go back to latest Solaris 10 and pass 
 it on to Sun support, or find a patch for this problem?
 
 
 
 Panic dump follows:
 
 
 -rw-r--r--   1 root root 2529300 Jul  5 08:48 unix.2
 -rw-r--r--   1 root root 10133225472 Jul  5 09:10 vmcore.2
 
 
 # mdb unix.2 vmcore.2
 Loading modules: [ unix genunix specfs dtrace cpu.AuthenticAMD.15 uppc 
 pcplusmp scsi_vhci ufs md ip hook neti sctp arp usba uhci s1394 qlc fctl 
 nca lofs zfs random cpc crypto fcip fcp logindmux nsctl sdbc ptm sv ii 
 sppp rdc nfs ]
 
   $c
 vpanic()
 vcmn_err+0x28(3, f783ade0, ff001e737aa8)
 real_panic_v+0xf7(0, f783ade0, ff001e737aa8)
 ufs_fault_v+0x1d0(fffed0bfb980, f783ade0, ff001e737aa8)
 ufs_fault+0xa0()
 dqput+0xce(1db26ef0)
 dqrele+0x48(1db26ef0)
 ufs_trans_dqrele+0x6f(1db26ef0)
 ufs_idle_free+0x16d(ff04f17b1e00)
 ufs_idle_some+0x152(3f60)
 ufs_thread_idle+0x1a1()
 thread_start+8()
 
 
   ::cpuinfo
 ID ADDR          FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD        PROC
  0 fbc2fc10       1b    0    0  60   no    no    t-0 ff001e737c80  sched
  1 fffec3a0a000   1f    1    0  -1   no    no    t-0 ff001e971c80  (idle)
  2 fffec3a02ac0   1f    0    0  -1   no    no    t-1 ff001e9dbc80  (idle)
  3 fffec3d60580   1f    0    0  -1   no

Re: [zfs-discuss] X4540

2008-07-11 Thread Ross
Well, I'm not holding out much hope of Sun working with these suppliers any 
time soon.  I asked Vmetro why they don't work with Sun considering how well 
ZFS seems to fit with their products, and this was the reply I got:

Micro Memory has a long history of working with Sun, and I worked at Sun for 
almost 10 years developing Solaris x86.  We have tried to get various Sun 
Product Managers responsible for these servers (Thumper) to work with us on 
this and they have said no.  We have tried to get Sun's integration group to 
work with us (where they would integrate upon customer request, charging the 
customer for integration and support), and they have also said no.  They don't 
feel there is an adequate business case to justify it as all of the 
opportunities are so small.

This is an incredibly frustrating response for all the Sun customers who could 
have really benefited from these cards.  Why develop the ability to move the 
ZIL to nvram devices, benchmark the Thumper on one of them, and then refuse to 
work with the manufacturer to offer the card to customers?

I appreciate Sun are working on their own flash memory solutions, but surely 
it's to their benefit and ours to take advantage of the technology already on 
the market with years of tried & tested use behind it?
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Mirroring - Scenario

2008-07-11 Thread Ross
Essentially yes, the entire pool dies.  If you think of each mirror as an 
individual disk, you've just striped them together so the pool goes offline if 
any mirror fails, and each mirror can only guard against one half of the mirror 
failing.

If you want to guard against any two trays failing, you need to use some kind 
of dual parity protection.  Either three-way mirrors, or raid-z2.  Given that you 
only have 8 LUN's, raid-z2 would seem to be the best option.

If you really need to use mirroring for performance, is there any way you can 
split those trays to generate two LUN's each?  That gives you 16 LUN's in 
total, enough for five three-way mirror sets (using 3 LUN's each), plus one acting 
as a hot spare.
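
For illustration only (the LUN names below are just placeholders for however your 
8 or 16 LUNs show up), the two layouts would be created something like:

# zpool create somepool raidz2 lun0 lun1 lun2 lun3 lun4 lun5 lun6 lun7

or, with the 16-LUN split:

# zpool create somepool mirror lunA0 lunB0 lunC0 mirror lunA1 lunB1 lunC1 ...

zpool status would then show a single raidz2 vdev (or several three-way mirror 
vdevs) under the pool.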
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] please help with raid / failure / rebuild calculations

2008-07-11 Thread Ross
Without checking your math, I believe you may be confusing the risk of *any* 
data corruption with the risk of a total drive failure, but I do agree that the 
calculation should just be for the data on the drive, not the whole array.

My feeling on this from the various analyses I've read on the web is that 
you're reasonably likely to find some corruption on a drive during a rebuild, 
but raid-6 protects you from this nicely.  From memory, I think the stats were 
something like a 5% chance of an error on a 500GB drive, which would mean 
something like a 10% chance with your 1TB drives.  That would tie in with your 
figures if you took out the multiplier for the whole raid's data.  Instead of a 
guaranteed failure, you've calculated around 1 in 10 odds.
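
As a rough illustration of where such figures come from (assuming a typical 
consumer-drive spec of one unrecoverable read error per 10^14 bits, which may not 
match your drives): reading a full 1TB drive is about 8 x 10^12 bits, so you'd 
expect roughly 8 x 10^12 / 10^14 = 0.08 errors, i.e. somewhere in the 8-10% range 
for hitting at least one unreadable sector while reading the whole drive.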

So, during any rebuild you've around a 1 in 10 chance of the rebuild 
encountering *some* corruption, but that's very likely going to be just a few 
bits of data, which can be easily recovered using raid-6 and the rest of the 
rebuild can carry on as normal.  

Of course there's always a risk of a second drive failing, which is why we have 
backups, but I believe that risk is minuscule in comparison, and also offset by 
the ability to regularly scrub your data, which helps to ensure that any 
problems with drives are caught early on.  Early replacement of failing drives 
means it's far less likely that you'll ever have two fail together.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS/Install related question

2008-07-11 Thread Ross
Get a cheap 5th SATA drive to act as your boot drive, install Solaris on that, 
and then let ZFS use the whole of the remaining 4 drives.

That gives you performance benefits, and it means it's very easy to recover if 
your boot drive fails - just re-install Solaris and zpool import the raid 
array.  The raid data is stored on the drives so you can even take those 4 
drives and fit them to another machine if you need the data quick.  ZFS doesn't 
even care what order the drives are attached in.
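
Something like this is all it takes (the cXtYdZ names are examples; check your 
own device names with the format command):

# zpool create tank raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0
# zfs create tank/data

and after re-installing the boot drive:

# zpool import tank

(zpool import with no arguments just lists whatever pools it finds on the 
attached disks.)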

To install Solaris, just boot from the DVD and follow the prompts.  I managed 
that as a windows admin with no Linux or Solaris experience so you should be 
fine :-)
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Case study/recommended ZFS setup for home file server

2008-07-11 Thread Ross
It was posted in the CIFS forum a couple of days ago:
http://www.opensolaris.org/jive/forum.jspa?forumID=214

Thread: HEADS-UP: Please skip snv_93 if you use CIFS server:
http://www.opensolaris.org/jive/thread.jspa?threadID=65996tstart=0
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Case study/recommended ZFS setup for home file server

2008-07-11 Thread Brandon High
On Thu, Jul 10, 2008 at 1:15 AM, Fajar A. Nugraha [EMAIL PROTECTED] wrote:
 Another alternative is to use an IDE to Compact Flash adapter, and
 boot off of flash.

 Just curious, what will that flash contain?
 e.g. will it be similar to linux's /boot, or will it contain the full
 solaris root?
 How do you manage redundancy (e.g. mirror) for that boot device?

4GB is enough to hold a minimal system install. /var will go to a file
system on the raidz pool.

ZFS mirroring can be used on boot devices for redundancy.
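
Roughly (the device names here are made up, and this assumes a ZFS root on a
recent build), attaching a second device to an existing root pool looks like:

# zpool attach -f rpool c1d0s0 c1d1s0
# installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c1d1s0

with the second step putting GRUB on the new half so the box can still boot if
the first device dies.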

-B

-- 
Brandon High [EMAIL PROTECTED]
The good is the enemy of the best. - Nietzsche
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS send/receive questions

2008-07-11 Thread Darren J Moffat
Will Murnane wrote:
 On Thu, Jul 10, 2008 at 12:43, Glaser, David [EMAIL PROTECTED] wrote:
 I guess what I was wondering was if there was a direct method rather than the 
 overhead of ssh.
 On receiving machine:
 nc -l 12345 | zfs recv mypool/[EMAIL PROTECTED]
 and on sending machine:
 zfs send sourcepool/[EMAIL PROTECTED] | nc othermachine.umich.edu 12345
 You'll need to build your own netcat, but this is fairly simple.  If

Why ?

Pathname: /usr/bin/nc
Type: regular file
Expected mode: 0555
Expected owner: root
Expected group: bin
Expected file size (bytes): 31428
Expected sum(1) of contents: 5207
Expected last modification: Jun 16 05:58:18 2008
Referenced by the following packages:
 SUNWnetcat
Current status: installed
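
(For anyone checking their own box, something along the lines of

# pkgchk -l -p /usr/bin/nc

or just 'which nc' should tell you whether it's already installed.)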


-- 
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] X4540

2008-07-11 Thread David Magda

On Jul 10, 2008, at 12:42, Tim wrote:

 It's the same reason you don't see HDS or EMC rushing to adjust the  
 price of
 the SYM or USP-V based on Sun releasing the thumpers.

"No one ever got fired for buying EMC/HDS/NTAP."

I know my company has corporate standards for various aspects of  
IT, and if someone purchases something outside of that (which is  
frowned upon) then you're on your own. If you open a service /  
trouble ticket for it they'll just close it saying "not supported".


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Recovering an array on Mac

2008-07-11 Thread Lee Fyock

So, does anybody have an approach to recovering this filesystem?

Is there a way to relabel the drives so that ZFS will recognize them,  
without losing the data?


Thanks,
Lee

On Jul 5, 2008, at 1:24 PM, Lee Fyock wrote:


Hi--

Here's the scoop, in probably too much detail:

I'm a sucker for new filesystems and new tech in general. For you  
old-time Mac people, I installed Sequoia when it was first seeded,  
and had to reformat my drive several times as it grew to the final  
release. I flipped the journaled flag before I even knew what it  
meant. I installed the pre-Leopard ZFS seed and have been using it  
for, what, a year?


So, I started with two 500 GB drives in a single pool, not mirrored.  
I bought a 1 TB drive and added it to the pool. I bought another 1  
TB drive, and finally had enough storage (~1.5 TB) to mirror my  
disks and be all set for the foreseeable future.


In order to migrate my data from a single pool of 500 GB + 500 GB +  
1 TB to a mirrored 500GB/500GB + 1TB/1TB pool, I was planning on  
doing this:


1) Copy everything to the New 1 TB drive (slopping what wouldn't fit  
onto another spare drive)

2) Upgrade to the latest ZFS for Mac release (117)
3) Destroy the existing pool
4) Create a pool with the two 500 GB drives
5) Copy everything from the New drive to the 500 GB x 2 pool
6) Create a mirrored pool with the two 1 TB drives
7) Copy everything from the 500 GB x 2 pool to the mirrored 1 TB pool
8) Destroy the 500 GB x 2 pool, and create it as a 500GB/500GB  
mirrored pair and add it to the 1TB/1TB pool


During step 7, while I was at work, the power failed at home,  
apparently long enough to drain my UPS.


When I rebooted my machine, both pools refused to mount: the 500+500  
pool and the 1TB/1TB mirrored pool. Just about all my data is lost.  
This was my media server containing my DVD rips, so everything is  
recoverable in that I can re-rip 1+TB, but I'd rather not.


diskutil list says this:
/dev/disk1
   #:   TYPE NAME                  SIZE        IDENTIFIER
   0:   FDisk_partition_scheme    *465.8 Gi    disk1
   1:                              465.8 Gi    disk1s1

/dev/disk2
   #:   TYPE NAME                  SIZE        IDENTIFIER
   0:   FDisk_partition_scheme    *465.8 Gi    disk2
   1:                              465.8 Gi    disk2s1

/dev/disk3
   #:   TYPE NAME                  SIZE        IDENTIFIER
   0:   FDisk_partition_scheme    *931.5 Gi    disk3
   1:                              931.5 Gi    disk3s1

/dev/disk4
   #:   TYPE NAME                  SIZE        IDENTIFIER
   0:   FDisk_partition_scheme    *931.5 Gi    disk4
   1:                              931.5 Gi    disk4s1


During step 2, I created the pools using "zpool create media mirror 
/dev/disk3 /dev/disk4" then "zpool upgrade", since I got warnings 
that the filesystem version was out of date. Note that I created 
zpools referring to the entire disk, not just a slice. I had 
labelled the disks using

diskutil partitiondisk /dev/disk2 GPTFormat ZFS %noformat% 100%

but now the disks indicate that they're FDisk_partition_scheme.

Googling for FDisk_partition_scheme yields http://lists.macosforge.org/pipermail/zfs-discuss/2008-March/000240.html 
, among other things, but no hint of where to go from here.


zpool import -D reports no pools available to import.

All of this is on a Mac Mini running Mac OS X 10.5.3, BTW. I own  
Parallels if using an OpenSolaris build would be of use.


So, is the data recoverable?

Thanks!
Lee

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] please help with raid / failure / rebuild calculations

2008-07-11 Thread User Name
Hello relling,

Thanks for your comments.  FWIW, I am building an actual hardware array, so even 
though I _may_ put ZFS on top of the hardware array's 22TB drive that the OS 
sees (I may not), I am focusing purely on the controller rebuild.

So, setting aside ZFS for the moment, am I still correct in my intuition that 
there is no way a _controller_ needs to touch a disk more times than there are 
bits on the entire disk, and that this calculation people are doing is faulty?

I will check out that blog - thanks.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] raid or mirror

2008-07-11 Thread dick hoogendijk
I'm still confused.
What is a -SAFE- way with two drives if you prepare for hardware
failure? That is: one drive fails and the system does not go down
because the other drive takes over. Do I need raid or mirror?

-- 
Dick Hoogendijk -- PGP/GnuPG key: 01D2433D
++ http://nagual.nl/ + SunOS sxce snv91 ++
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] raid or mirror

2008-07-11 Thread Johan Hartzenberg
Hi Dick

You want Mirroring.  A Sun system with mirrored disks can be configured to
not go down due to one disk failing.  For this to be valid, you need to also
make sure that the device used for SWAP is mirrored - you won't believe how
many times I've seen this mistake being made.

To be even MORE safe, you want the two disks to be on separate controllers,
so that you can survive a controller failure too.

Note: technically, mirroring is RAID; to be specific, it is RAID level 1.
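
As a very rough sketch (the disk names are placeholders, and the details differ
between an SVM root and a ZFS root), on a ZFS-root system the whole lot, swap
included, can live in one mirrored pool:

# zpool create -f rpool mirror c0t0d0s0 c0t1d0s0
# zfs create -V 2g rpool/swap
# swap -a /dev/zvol/dsk/rpool/swap

so losing either disk leaves both the root filesystem and swap intact.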

  _Johan


On Fri, Jul 11, 2008 at 2:37 PM, dick hoogendijk [EMAIL PROTECTED] wrote:

 I'm still confused.
 What is a -SAFE- way with two drives if you prepare for hardware
 failure? That is: one drive fails and the system does not go down
 because the other drive takes over. Do I need raid or mirror?

 --
 Dick Hoogendijk -- PGP/GnuPG key: 01D2433D
 ++ http://nagual.nl/ + SunOS sxce snv91 ++
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss




-- 
Any sufficiently advanced technology is indistinguishable from magic.
Arthur C. Clarke

Afrikaanse Stap Website: http://www.bloukous.co.za

My blog: http://initialprogramload.blogspot.com

ICQ = 193944626, YahooIM = johan_hartzenberg, GoogleTalk =
[EMAIL PROTECTED], AIM = JohanHartzenberg
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Mirroring - Scenario

2008-07-11 Thread Johan Hartzenberg
Sorry, but I'm stuck at 6540.

There are so many options in how you would practically configure these that
there is no way to give a sensible answer to your question.  But the most
basic questions are: Do the racks have power from separate PDUs?  Are they
in physically remote locations?  Do your fabric switches have redundant
power from separate PDUs?

Do you want mirroring here purely for performance reasons?  Because these
systems have so much internal redundancy that I can not see why you would
want to mirror across them.

Striping would give you better performance.

On Thu, Jul 10, 2008 at 11:01 PM, Robb Snavely [EMAIL PROTECTED] wrote:

 I have a scenario (tray failure) that I am trying to predict how zfs
 will behave and am looking for some input .  Coming from the world of
 svm, ZFS is WAY different ;)

 If we have 2 racks, containing 4 trays each, 2 6540's that present 8D
 Raid5 luns to the OS/zfs and through zfs we setup a mirror config such
 that: I'm oversimplifying here but...

 Rack 1 - Tray 1 = lun 0Rack 2 - Tray 1  =  lun 4
 Rack 1 - Tray 2 = lun 1Rack 2 - Tray 2  =  lun 5
 Rack 1 - Tray 3 = lun 2Rack 2 - Tray 3  =  lun 6
 Rack 1 - Tray 4 = lun 3Rack 2 - Tray 4  =  lun 7

 so the zpool command would be:

 zpool create somepool mirror 0 4 mirror 1 5 mirror 2 6 mirror 3 7
 ---(just for ease of explanation using the supposed lun numbers)

 so a status output would look similar to:

 somepool
  mirror
  0
  4
  mirror
  1
  5
  mirror
  3
  6
  mirror
  4
  7

 Now in the VERY unlikely event that we lost the first tray in each rack
 which contain 0 and 4 respectively...

 somepool
  mirror---
  0   |
  4   |   Bye Bye
 ---
  mirror
  1
  5
  mirror
  3
  6
  mirror
  4
  7


 Would the entire somepool zpool die?  Would it affect ALL users in
 this pool or a portion of the users?  Is there a way in zfs to be able
 to tell what individual users are hosed (my group is a bunch of control
 freaks ;)?  How would zfs react to something like this?  Also any
 feedback on a better way to do this is more then welcome

 Please keep in mind I am a ZFS noob so detailed explanations would be
 awesome.

 Thanks in advance

 Robb
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss




-- 
Any sufficiently advanced technology is indistinguishable from magic.
Arthur C. Clarke

Afrikaanse Stap Website: http://www.bloukous.co.za

My blog: http://initialprogramload.blogspot.com

ICQ = 193944626, YahooIM = johan_hartzenberg, GoogleTalk =
[EMAIL PROTECTED], AIM = JohanHartzenberg
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] X4540

2008-07-11 Thread Moore, Joe
Bob Friesenhahn
 I expect that Sun is realizing that it is already 
 undercutting much of 
 the rest of its product line.  These minor updates would allow the 
 X4540 to compete against much more expensive StorageTek SAN hardware. 

Assuming, of course, that the requirements for the more expensive SAN
hardware don't include, for example, surviving a controller or
motherboard failure (or gracefully handling a RAM chip failure) without
requiring extensive downtime for replacement, or other extended downtime
because there's only one set of chips that can talk to those disks.

"Real" SAN storage is dual-ported to dual controller nodes so that you
can replace a motherboard without taking down access to the disks.  Or
install a new OS version without waiting for the system to POST.

 How can other products remain profitable when competing 
 against such a 
 star performer?

Features.  RAS.  Simplicity.  Corporate Inertia (having storage admins
who don't know OpenSolaris).  Executive outings with StorageTek-logo'd
golfballs.  The last 2 aren't something I'd build a business case
around, but they're a reality.

--Joe
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] X4540

2008-07-11 Thread Tim
On Fri, Jul 11, 2008 at 9:25 AM, Moore, Joe [EMAIL PROTECTED] wrote:



 Features.  RAS.  Simplicity.  Corporate Inertia (having storage admins
 who don't know OpenSolaris).  Executive outings with StorageTek-logo'd
 golfballs.  The last 2 aren't something I'd build a business case
 around, but they're a reality.

 --Joe



Why not?  There's several in the market today whom I suspect have done just
that :D  I won't name names, but for anyone in the industry I doubt I have
to.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS send/receive questions

2008-07-11 Thread Will Murnane
On Fri, Jul 11, 2008 at 05:23, Darren J Moffat [EMAIL PROTECTED] wrote:
 Why ?
 Referenced by the following packages:
SUNWnetcat

Is this in 10u5?  Weird, it's not on my media.

Will
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS problem mirror

2008-07-11 Thread BG
Hi, thanks for your help. In the help forum I also got an answer, which I am 
going to try. But your suggestion is also an angle I will investigate. Is there 
maybe some diagnostic tool in OpenSolaris I can use, or should I use the Solaris 
bootable CD that checks whether my HW is fully compatible?

thanks !
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS send/receive questions

2008-07-11 Thread Darren J Moffat
Will Murnane wrote:
 On Fri, Jul 11, 2008 at 05:23, Darren J Moffat [EMAIL PROTECTED] wrote:
 Why ?
 Referenced by the following packages:
SUNWnetcat
 
 Is this in 10u5?  Weird, it's not on my media.

No but this is an opensolaris.org alias not a Solaris 10 support forum. 
  So the assumption unless people say otherwise is that you are running 
a recent build of SX:CE or OpenSolaris 2008.05 (including updates).


-- 
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Mirroring - Scenario

2008-07-11 Thread Bob Friesenhahn
On Fri, 11 Jul 2008, Ross wrote:

 If you want to guard against any two trays failing, you need to use 
 some kind of dual parity protection.  Either three-way mirrors, or 
 raid-z2.  Given that you only have 8 LUN's, raid-z2 would seem to be 
 the best option.

System reliability will be dominated by the reliability of the weakest 
VDEV.  If all the VDEVs have the same reliability then the reliability 
of the entire load-shared pool will be the reliability of one VDEV 
divided by the number of VDEVs.  Given sufficient individual VDEV 
reliability, it can be seen that it takes quite a lot of VDEVs in the 
load-shared pool before the number of VDEVs becomes very significant 
in the pool reliability calculation.
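
(To put rough numbers on it, purely for illustration: if a single VDEV has an 
MTBF of 1,000,000 hours, a load-shared pool of 8 such VDEVs behaves like roughly 
1,000,000 / 8 = 125,000 hours, which is still large; the divisor only starts to 
hurt once the VDEV count gets big or the individual VDEVs are weak.)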

Bob
==
Bob Friesenhahn
[EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS send/receive questions

2008-07-11 Thread Will Murnane
On Fri, Jul 11, 2008 at 11:44, Darren J Moffat [EMAIL PROTECTED] wrote:
 No but this is an opensolaris.org alias not a Solaris 10 support forum.  So
 the assumption unless people say otherwise is that you are running a recent
 build of SX:CE or OpenSolaris 2008.05 (including updates).
Luckily, the OP mentioned he's running 10u5 in his first post ;)

Will
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Recovering corrupted root pool

2008-07-11 Thread Rainer Orth
Yesterday evening, I tried Live Upgrade on a Sun Fire V60x running SX:CE 90
to SX:CE 93 with ZFS root (mirrored root pool called root).  The LU itself
ran without problems, but before rebooting the machine, I wanted to add
some space to the root pool that had previously been in use for a UFS BE.

Both disks (c0t0d0 and c0t1d0) were partitioned as follows:

Part      Tag    Flag     Cylinders         Size            Blocks
  0       root    wm       1 - 18810       25.91GB    (18810/0/0) 54342090
  1 unassigned    wm   18811 - 24618        8.00GB    (5808/0/0)  16779312
  2     backup    wm       0 - 24618       33.91GB    (24619/0/0) 71124291
  3 unassigned    wu       0                   0      (0/0/0)            0
  4 unassigned    wu       0                   0      (0/0/0)            0
  5 unassigned    wu       0                   0      (0/0/0)            0
  6 unassigned    wu       0                   0      (0/0/0)            0
  7 unassigned    wu       0                   0      (0/0/0)            0
  8       boot    wu       0 -     0        1.41MB    (1/0/0)         2889
  9 unassigned    wu       0                   0      (0/0/0)            0

Slice 0 is used by the root pool, slice 1 was used by the UFS BE.  To
achieve this, I ludeleted the now unused UFS BE and used 

# NOINUSE_CHECK=1 format

to extend slice 0 by the size of slice 1, deleting the latter afterwards.
I'm pretty sure that I've done this successfully before, even on a live
system, but this time something went wrong: I remember an FMA message about
one side of the root pool mirror being broken (something about an
inconsistent label, unfortunately I didn't write down the exact message).
Nonetheless, I rebooted the machine after luactivate sol_nv_93 (the new ZFS
BE), but the machine didn't come up:

SunOS Release 5.11 Version snv_93 32-bit
Copyright 1983-2008 Sun Microsystems, Inc.  All rights reserved.
Use is subject to license terms.
NOTICE:
spa_import_rootpool: error 22


panic[cpu0]/thread=fec1cfe0: cannot mount root path /[EMAIL 
PROTECTED],0/pci8086,[EMAIL PROTECTED]/pci8086,[EMAIL PROTECTED]/pci8086,[EMAIL 
PROTECTED],1/[EMAIL PROTECTED],0:a /[EMAIL PROTECTED],0/pci8086,[EMAIL 
PROTECTED]/pci8086,[EMAIL PROTECTED]/pci8086,[EMAIL PROTECTED],1/[EMAIL 
PROTECTED],0:a

fec351ac genunix:rootconf+10b (c0f040, 1, fec1c750)
fec351d0 genunix:vfs_mountroot+54 (fe800010, fec30fd8,)
fec351e4 genunix:main+b4 ()

panic: entering debugger (no dump device, continue to reboot)
skipping system dump - no dump device configured
rebooting...

I've managed a failsafe boot (from the same pool), and zpool import reveals

  pool: root
id: 14475053522795106129
 state: UNAVAIL
status: The pool was last accessed by another system.
action: The pool cannot be imported due to damaged devices or data.
   see: http://www.sun.com/msg/ZFS-8000-EY
config:

root  UNAVAIL  insufficient replicas
  mirror  UNAVAIL  corrupted data
c0t1d0s0  ONLINE
c0t0d0s0  ONLINE

Even restoring slice 1 on both disks to its old size and shrinking slice 0
accordingly doesn't help.  I'm sure I've done this correctly since I could
boot from the old sol_nv_b90_ufs BE, which was still on c0t0d0s1.

I didn't have much success to find out what's going on here: I tried to
remove either of the disks in case both sides of the mirror are
inconsistent, but to no avail.  I didn't have much luck with zdb either.
Here's the output of zdb -l /dev/rdsk/c0t0d0s0 and /dev/rdsk/c0t1d0s0:

c0t0d0s0:


LABEL 0

version=10
name='root'
state=0
txg=14643945
pool_guid=14475053522795106129
hostid=336880771
hostname='erebus'
top_guid=17627503873514720747
guid=6121143629633742955
vdev_tree
type='mirror'
id=0
guid=17627503873514720747
whole_disk=0
metaslab_array=13
metaslab_shift=28
ashift=9
asize=36409180160
is_log=0
children[0]
type='disk'
id=0
guid=1526746004928780410
path='/dev/dsk/c0t1d0s0'
devid='id1,[EMAIL PROTECTED]/a'
phys_path='/[EMAIL PROTECTED],0/pci8086,[EMAIL 
PROTECTED]/pci8086,[EMAIL PROTECTED]/pci8086,[EMAIL PROTECTED],1/[EMAIL 
PROTECTED],0:a'
whole_disk=0
DTL=160
children[1]
type='disk'
id=1
guid=6121143629633742955
path='/dev/dsk/c0t0d0s0'
devid='id1,[EMAIL PROTECTED]/a'
phys_path='/[EMAIL PROTECTED],0/pci8086,[EMAIL 
PROTECTED]/pci8086,[EMAIL PROTECTED]/pci8086,[EMAIL PROTECTED],1/[EMAIL 
PROTECTED],0:a'
whole_disk=0
DTL=272

LABEL 1

version=10
name='root'
state=0

Re: [zfs-discuss] please help with raid / failure / rebuild calculations

2008-07-11 Thread Richard Elling
User Name wrote:
 Hello relling,

 Thanks for your comments.  FWIW, I am building an actual hardware array, so 
  even though I _may_ put ZFS on top of the hardware array's 22TB drive that 
 the OS sees (I may not) I am focusing purely on the controller rebuild.

 So, setting aside ZFS for the moment, am I still correct in my intuition that 
 there is no way a _controller_ needs to touch a disk more times than there 
 are bits on the entire disk, and that this calculation people are doing is 
 faulty ?
   

I think the calculation is correct, at least for the general case.
At FAST this year there was an interesting paper which tried to
measure this exposure in a large field sample by using checksum
verifications.  I like this paper and it validates what we see in the
field -- the most common failure mode is unrecoverable read.
http://www.usenix.org/event/fast08/tech/full_papers/bairavasundaram/bairavasundaram.pdf

I should also point out that ZFS is already designed to offer some
diversity which should help guard against spatially clustered
media failures.  hmmm... another blog topic in my queue...
 -- richard


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS problem mirror

2008-07-11 Thread Ross
There's nothing I know of I'm afraid, I'm too new to Solaris to have looked 
into things that deeply.

If you have access to any spare parts, the easiest way to test is to swap 
things over and see if the problem is reproducible.  It could even be something 
as simple as a struggling power supply.

Running a compatibility check does sound like a good first step though.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Largest (in number of files) ZFS instance tested

2008-07-11 Thread Sean Cochrane - Storage Architect


I need to find out what is the largest ZFS file system - in numbers of 
files, NOT CAPACITY that has been tested.


Looking to scale to billions of files and would like to know if anyone 
has tested anything close and what the performance ramifications are.


Has anyone tested a ZFS file system with at least 100 million + files?
What were the performance characteristics?

Thanks!

Sean

--
http://www.sun.com  * Sean Cochrane *
Global Storage Architect
*Sun Microsystems, Inc.*
525 South 1100 East
Salt Lake City, UT 84102 US
Phone +1877 255 5756
Mobile +1801-949-4799
Fax +1877.255.5756
Email [EMAIL PROTECTED]

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Largest (in number of files) ZFS instance tested

2008-07-11 Thread Rich Teer
On Fri, 11 Jul 2008, Sean Cochrane - Storage Architect wrote:

 I need to find out what is the largest ZFS file system - in numbers of files,
 NOT CAPACITY that has been tested.
 
 Looking to scale to billions of files and would like to know if anyone has
 tested anything close and what the performance ramifications are.

Wow.  Just curious, what sort of application is this?

-- 
Rich Teer, SCSA, SCNA, SCSECA

CEO,
My Online Home Inventory

URLs: http://www.rite-group.com/rich
  http://www.linkedin.com/in/richteer
  http://www.myonlinehomeinventory.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] previously mentioned J4000 released

2008-07-11 Thread Miles Nordin
 bf == Bob Friesenhahn [EMAIL PROTECTED] writes:

bf since the dawn of time

since the dawn of time Sun has been playing these games with hard
drive ``sleds''.  I still have sparc32 stuff on the shelf with
missing/extra sleds.

bf POTS line
bf cell phone
bf You are free to select products from a different vendor.

what?  So, this means he *shouldn't* feel like he's being ripped off
if he buys from Sun?  blinks

bf Sun's pricing likely reflects the high cost of product
bf development, warranty, service, and quality control.

You are talking about cost here, but the pricing reflects ``market
forces''.

The blog makes it sound like Sun engineers have come up with this
sneaky plan to achieve a certain tier of reliability at a tier below
in cost, but what they really mean by low cost is, low cost _to Sun_,
not to customers.  The price you pay is determined by what other
vendors charge for the same tier of reliability---knowing this, while
reading the blog you would already be thinking, ``oh fantastic, a tiny
~$10 chip and a plastic carrier that's practically free, but has
incredible market value.  They've come up with a scheme for ripping me
off.  What smooth and adept capitalists they are!  What merit, what
admiration I have for their schemes!  too bad it helps them, not me.''

If you're a stockholder, get excited about the blogs, but for
customers, without Sun's price list and their competitors' price lists
in front of you, there's apparently not much point in discussing
anything (except maybe, whether we can swap drives out of the tray and
have the thing still work or whether there is some ``sled DRM'' in the
closed-source LSI Logic SATA driver, and how much we save by not
buying a support contract which I assume is pointless after said
swapping).


pgpvZssYVw8bz.pgp
Description: PGP signature
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] raid or mirror

2008-07-11 Thread Miles Nordin
 jh == Johan Hartzenberg [EMAIL PROTECTED] writes:

jh To be even MORE safe, you want the two disks to be on separate
jh controllers, so that you can survive a controller failure too.

or a controller-driver-failure.  At least on Linux, when a disk goes
bad, Linux starts resetting controllers and xATA busses and stuff, and
often takes out any nearby drives.  It's often hard to determine which
drive is actually bad.  Depending on how well-integrated your hardware
is with Solaris and how the drive fails, I suspect this sort of thing
could imagineably happen there, too.


pgpgboNQVlAQg.pgp
Description: PGP signature
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] please help with raid / failure / rebuild calculations

2008-07-11 Thread Akhilesh Mritunjai
 Thanks for your comments.  FWIW, I am building an
 actual hardware array, so even though I _may_ put ZFS
 on top of the hardware array's 22TB drive that the
 OS sees (I may not) I am focusing purely on the
 controller rebuild.

Not letting ZFS handle (at least one level of) redundancy is a bad idea. Don't 
do that!
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS problem mirror

2008-07-11 Thread Akhilesh Mritunjai
Hi

I too strongly suspect that some HW component is failing. It is rare to see all 
drives (in your case both drives in the mirror and the boot drive) reporting errors 
at the same time.

zpool clear just resets the error counters. You still have got errors in there.
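
(For example, zpool status -v <pool> shows the per-device read/write/checksum 
error counters and any affected files, while zpool clear <pool> simply zeroes 
those counters without fixing anything underneath.)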

Start with following components (in this order):

1. Memory: Use memtest86+ (use any live CD.. it is very common)
2. Power supply - search the forums, it is very common
3. Your mobo/disk controller - (??? try another one maybe)

Have you also experienced any kernel panics or strange random software crashes 
on this box ?
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Mirroring - Scenario

2008-07-11 Thread Robb Sanvely
Thank you for all the feedback!  It's appreciated!

@hartz
Do the racks have power from separate PDUs?

Yes

Are they in physically remote locations?

No, the racks are side by side

Do your fabric switches have redundant power from separate PDUs?

yes

Do you want mirroring here purely for performance reasons?

Goals would be data integrity, and REDUNDANCY - while not throwing performance 
completely out of the window.

Because these systems have so much internal redundancy that I can not see why 
you would want to mirror across them.

This is due to the fact that we have data that we just can't afford to lose.  
Our hardware/power setup is pretty good and we have good backups, but we want to 
try to make sure all of our customers' data is protected from every 
angle...and as I said in the initial post, I know this scenario is possible but 
not probable, especially with the redundant power etc.  So, in short, mirroring 
was put in there to account for one of the racks failing (again, unlikely in 
our setup..hope for the best...prepare for the worst).

Striping would give you better performance.

So how would this be set up in ZFS?

zpool create somepool 0 1 2 3 4 5 6 7

So essentially a raid 0 on the zfs side, and leave all the redundancy on the 
hardware? *shakes nervously*  Am I understanding that correctly?
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Recovering an array on Mac

2008-07-11 Thread Akhilesh Mritunjai
This shouldn't have happened. Do you have zdb on the Mac? If yes, you can try it. 
It is (intentionally?) undocumented, so you'll need to search for various 
scripts on blogs.sun.com and here. Something might just work. But do check what 
Apple is actually shipping. You may want to use dtrace to find out why it can't 
find any pools. I doubt it is due to a labelling mistake, as that should have 
been flushed long back if you were copying data when you lost power. ZFS's 
transactional property guarantees that.
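
For example (device names taken from the diskutil output earlier in the thread, 
and assuming Apple's port ships zdb at all):

# zdb -l /dev/disk3
# zdb -l /dev/disk3s1

should dump any ZFS labels still present on the whole disk or the slice, which at 
least tells you whether the labels survived.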
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS problem mirror

2008-07-11 Thread BG
Hi 

I'm running all kinds of tools now, even a tool for my HD from WD, so we will see 
what the results are.
I ordered another mobo this morning, and if that doesn't work then I will ask a 
fellow sysop to put my disks in his Solaris array.

No, I didn't notice any kernel panics; the only thing I noticed was this 
line popping up when I did a shutdown.
The machine itself is just used as a storage array, nothing else is running on 
it, and I use CIFS to share, and that works great.


gzip: kernel/misc/qlc/qlc_fw_2400: I/O error 

I'll keep you posted, thanks for everything already :)
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS problem mirror

2008-07-11 Thread Ross
Trying the disks in another machine is a great step, it will eliminate those 
quickly.  Use your own cables too so you can eliminate them from suspicion.

If this is hardware related, from my own experience I would say it's most 
likely to be (in order):
- Power Supply
- Memory  (especially if ever handled without anti-static precautions)
- Bad driver / disk controller
- Bad cpu / motherboard
- other component

When you get your new board, just set it up for troubleshooting with the bare 
minimum components:
- Power supply
- Motherboard
- CPU
- Memory
- Disks
- Power button

Don't even connect the reset switch or the case LED's.  It's by far the 
quickest way to eliminate items from suspicion.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Largest (in number of files) ZFS instance tested

2008-07-11 Thread Bob Friesenhahn
On Fri, 11 Jul 2008, Sean Cochrane - Storage Architect wrote:

 I need to find out what is the largest ZFS file system - in numbers of files, 
 NOT CAPACITY that has been tested.

In response to an earlier such question (from you?) I created a 
directory with a million files.  I forgot about it since then so the 
million files are still there without impacting anything for a month 
now.

The same simple script (with a small enhancement) could be used to 
create a million directories containing a million files but it might 
take a while to complete.  It seems that a Storage Architect should be 
willing to test this for himself and see what happens.
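
Something like this trivial loop is all it takes (a sketch only, not necessarily 
the exact script I used; the path is a placeholder):

#!/bin/ksh
# create one million empty files in a single directory
mkdir -p /tank/milliontest
cd /tank/milliontest || exit 1
i=0
while [ $i -lt 1000000 ]; do
    touch f$i
    i=$((i + 1))
done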

 Looking to scale to billions of files and would like to know if anyone has 
 tested anything close and what the performance ramifications are.

There are definitely issues with programs like 'ls' when listing a 
directory with a million files since 'ls' sorts its output by default. 
My Windows system didn't like it at all when accessing it with CIFS 
and the file browser since it wants to obtain all file information 
before doing anything else.  System backup with hundreds of millions 
of files sounds like fun.

 Has anyone tested a ZFS file system with at least 100 million + files?
 What were the performance characteristics?

I think that there are more issues with file fragmentation over a long 
period of time than the sheer number of files.

Bob
==
Bob Friesenhahn
[EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] please help with raid / failure / rebuild calculations

2008-07-11 Thread Bob Friesenhahn
On Fri, 11 Jul 2008, Akhilesh Mritunjai wrote:

 Thanks for your comments.  FWIW, I am building an
 actual hardware array, so even though I _may_ put ZFS
 on top of the hardware array's 22TB drive that the
 OS sees (I may not) I am focusing purely on the
 controller rebuild.

 Not letting ZFS handle (at least one level of) redundancy is a bad 
 idea. Don't do that!

Agreed.

A further issue to consider is mean time to recover/restore.  This has 
quite a lot to do with actual uptime.  For example, if you decide to 
create two huge 22TB LUNs and mirror across them, if ZFS needs to 
resilver one of the LUNs it will take a *long* time.  A good design 
will try to keep any storage area which needs to be resilvered small 
enough that it may be restored quickly and risk of secondary failure 
is minimized.
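
For example (LUN names purely illustrative), rather than mirroring two huge LUNs 
with

# zpool create tank mirror bigA bigB

carving each array into several smaller LUNs and doing

# zpool create tank mirror a1 b1 mirror a2 b2 mirror a3 b3

means any single resilver only has to walk one small mirror rather than the 
whole 22TB.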

Bob
==
Bob Friesenhahn
[EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] previously mentioned J4000 released

2008-07-11 Thread Ian Collins
Will Murnane wrote:
 If the prices on disks were lower on these, they would be interesting
 for low-end businesses or even high-end home users.  The chassis is
 within reach of reasonable, but the disk prices look ludicrously high
 from where I sit.  An empty one only costs $3k, sure, but fill it with
 twelve disks and it's up to $20k.  Are there some extra electronics
 required for larger disks that help explain this steep slope of cost?
 I can't think of any reasons off the top of my head (other than the
 understandable profit motive).

   
I guess most large customers only compare storage costs against other
storage vendors.  Most shops I've worked with only buy fully populated
shelves and none of them pay list!

Ian

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] previously mentioned J4000 released

2008-07-11 Thread David Schairer
The admin user doesn't have any access to customer data; just could  
kill off sessions, etc.

---

World-class email, DNS, web- and app-hosting services, www.concentric.com 
.



On Jul 11, 2008, at 2:05 PM, Ian Collins wrote:

 Will Murnane wrote:
 If the prices on disks were lower on these, they would be interesting
 for low-end businesses or even high-end home users.  The chassis is
 within reach of reasonable, but the disk prices look ludicrously high
 from where I sit.  An empty one only costs $3k, sure, but fill it  
 with
 twelve disks and it's up to $20k.  Are there some extra electronics
 required for larger disks that help explain this steep slope of cost?
 I can't think of any reasons off the top of my head (other than the
 understandable profit motive).


 I guess most large customers only compare storage costs against other
 storage vendors.  Most shops I've worked with only buy fully populated
 shelves and none of them pay list!

 Ian

 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] X4540

2008-07-11 Thread Ian Collins
Richard Elling wrote:

 The best news, for many folks, is that you can boot from an
 (externally pluggable) CF card, so that you don't have to burn
 two disks for the OS.
   
Can these be mirrored?  I've been bitten by these cards failing (in a
camera).

Ian

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] previously mentioned J4000 released

2008-07-11 Thread Bob Friesenhahn
On Fri, 11 Jul 2008, Tim wrote:

 20k list gets you into a decked out storevault with FCP/iSCSI/NFS...  For
 being just a jbod this thing is ridiculously overpriced, sorry.

 I'm normally the first one to defend Sun when it come to decisions made due
 to an enterprise customer base, but this will not be one of those
 situations.

You are not required to purchase a Sun product.  Just purchase a 
similar IBM or Adaptec JBOD product.  They will work fine with ZFS. 
If Sun's product is over-priced, they will find out soon enough and 
adjust their prices.  It may be that Sun initially sets the prices 
very high so that after they start shipping they can reduce the price 
and advertise the new bargain.

Bob
==
Bob Friesenhahn
[EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] previously mentioned J4000 released

2008-07-11 Thread Ian Collins
Tim wrote:
 On Fri, Jul 11, 2008 at 4:05 PM, Ian Collins [EMAIL PROTECTED]
 mailto:[EMAIL PROTECTED] wrote:

 Will Murnane wrote:
  If the prices on disks were lower on these, they would be
 interesting
  for low-end businesses or even high-end home users.  The chassis is
  within reach of reasonable, but the disk prices look ludicrously
 high
  from where I sit.  An empty one only costs $3k, sure, but fill
 it with
  twelve disks and it's up to $20k.  Are there some extra electronics
  required for larger disks that help explain this steep slope of
 cost?
  I can't think of any reasons off the top of my head (other than the
  understandable profit motive).
 
 
 I guess most large customers only compare storage costs against other
 storage vendors.  Most shops I've worked with only buy fully populated
 shelves and none of them pay list!

 20k list gets you into a decked out storevault with FCP/iSCSI/NFS... 
 For being just a jbod this thing is ridiculously overpriced, sorry.

 I'm normally the first one to defend Sun when it come to decisions
 made due to an enterprise customer base, but this will not be one of
 those situations.

OK, one client of mine has just installed an IBM DS3200 shelf.   Pop
over to IBM's site
(http://www-03.ibm.com/systems/storage/disk/ds3000/ds3200/browse.html)
and compare prices with a J4200.  For starters, the IBM sourced 1TB
drives are $249 more...

Ian
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] X4540

2008-07-11 Thread Richard Elling
Ian Collins wrote:
 Richard Elling wrote:
   
 The best news, for many folks, is that you can boot from an
 (externally pluggable) CF card, so that you don't have to burn
 two disks for the OS.
   
 
 Can these be mirrored?  I've been bitten by these cards failing (in a
 camera).
   

Yes, of course.  But there is only one CF slot.

If you are worried about data loss, zfs set copies=2.
If you are worried about CF loss, mirror to something else.
 -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Largest (in number of files) ZFS instance tested

2008-07-11 Thread Jonathan Edwards

On Jul 11, 2008, at 4:59 PM, Bob Friesenhahn wrote:


 Has anyone tested a ZFS file system with at least 100 million +  
 files?
 What were the performance characteristics?

 I think that there are more issues with file fragmentation over a long
 period of time than the sheer number of files.

actually it's a similar problem .. with a maximum blocksize of 128KB  
and the COW nature of the filesytem you get indirect block pointers  
pretty quickly on a large ZFS filesystem as the size of your tree  
grows .. in this case a large constantly modified file (eg: /u01/data/ 
*.dbf) is going to behave over time like a lot of random access to  
files spread across the filesystem .. the only real difference is that  
you won't walk it every time someone does a getdirent() or an lstat64()

so ultimately the question could be framed as what's the maximum  
manageable tree size you can get to with ZFS while keeping in mind  
that there's no real re-layout tool (by design) .. the number i'm  
working with until i hear otherwise is probably about 20M, but in the  
relativistic sense - it *really* does depend on how balanced your tree  
is and what your churn rate is .. we know on QFS we can go up to 100M,  
but i trust the tree layout a little better there, can separate the  
metadata out if i need to and have planned on it, and know that we've  
got some tools to relayout the metadata or dump/restore for a tape  
backed archive

jonathan

(oh and btw - i believe this question is a query for field data ..  
architect != crash test dummy .. but some days it does feel like it)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Largest (in number of files) ZFS instance tested

2008-07-11 Thread Peter Tribble
On Fri, Jul 11, 2008 at 5:33 PM, Sean Cochrane - Storage Architect
[EMAIL PROTECTED] wrote:

 I need to find out what is the largest ZFS file system - in numbers of
 files, NOT CAPACITY that has been tested.

 Looking to scale to billions of files and would like to know if anyone has
 tested anything close and what the performance ramifications are.

 Has anyone tested a ZFS file system with at least 100 million + files?

I've got a thumper with a pool that has over a hundred million files. I think
the most in a single filesystem is currently just under 30 million (we've got
plenty of those). It just works, although it's going to get a lot bigger before
we're done.

 What were the performance characteristics?

Not brilliant...

Although I suspect raid-z isn't exactly the ideal choice. Still, performance
generally is adequate for our needs, although backup performance isn't.

(The backup problem is the real stumbling block. And backup is an area ripe
for disruptive innovation.)

-- 
-Peter Tribble
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Largest (in number of files) ZFS instance tested

2008-07-11 Thread Ian Collins
Peter Tribble wrote:
 On Fri, Jul 11, 2008 at 5:33 PM, Sean Cochrane - Storage Architect
 [EMAIL PROTECTED] wrote:
 What were the performance characteristics?
 

 Not brilliant...

 Although I suspect raid-z isn't exactly the ideal choice. Still, performance
 generally is adequate for our needs, although backup performance isn't.

 (The backup problem is the real stumbling block. And backup is an area ripe
 for disruptive innovation.)

   
Is it down to the volume of data, or many small files? 

I'm looking into a problem with slow backup of a filesystem with many
thousands of small files.  We see high CPU load and miserable
performance on restores, and I've been wondering if we can tune the
filesystem, or just zip the files.

I guess working with many small files and tape is more of an issue with
filesystem-aware backups than block-device ones (ufsdump).

Ian.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Largest (in number of files) ZFS instance tested

2008-07-11 Thread Mike Gerdts
On Fri, Jul 11, 2008 at 3:59 PM, Bob Friesenhahn
[EMAIL PROTECTED] wrote:
 There are definitely issues with programs like 'ls' when listing a
 directory with a million files since 'ls' sorts its output by default.
 My Windows system didn't like it at all when accessing it with CIFS
 and the file browser since it wants to obtain all file information
 before doing anything else.  System backup with hundreds of millions
 of files sounds like fun.

Millions of files in a directory has historically been the path to big
performance problems.  Even if zfs can handle millions, other tools
(ls, backup programs, etc.) will choke.  Create a hierarchy and you
will be much happier.

FWIW, I created 10+ million files and the necessary directories to
make it so that no directory had more than 10 entries (dirs or files)
in it.  I found the creation time to be quite steady at about 2500
file/directory creations per second over the entire exercise.  I saw
the kernel memory usage (kstat -p unix:0:system_pages:pp_kernel)
slowly and steadily increase while arc_c slowly decreased.  Out of
curiosity I crashed the box then ran ::findleaks to find that there
was just over 32KB leaked.  I've not dug in further to see where the
rest of the memory was used.
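
A sketch of the idea (not my actual script, and deliberately unoptimized; it 
forks several commands per file):

#!/bin/ksh
# Spread files over a tree where no directory gets more than 10 entries,
# using the zero-padded decimal digits of a counter as directory names.
base=/tank/manyfiles
i=0
while [ $i -lt 10000000 ]; do
    n=$(printf '%08d' $i)
    dir=$base/$(echo $n | cut -c1-7 | sed 's|.|&/|g')
    mkdir -p $dir
    touch $dir/f$(echo $n | cut -c8)
    i=$((i + 1))
done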

In the past when I was observing file creations on UFS, VxFS, and NFS
with millions of files in a single directory, the file operation time
was measured in seconds per operation, rather than operations per
second.  This was with several (100) processes contending for reading
directory contents, file creations, and file deletions.  This is where
I found the script that thought that touch $dir/test.$$ (followed by
rm) was the right way to check whether a directory is writable.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] scrub failing to initialise

2008-07-11 Thread Jeff Bonwick
If the cabling outage was transient, the disk driver would simply retry
until they came back.  If it's a hotplug-capable bus and the disks were
flagged as missing, ZFS would by default wait until the disks came back
(see zpool get failmode pool), and complete the I/O then.  There would
be no missing disk writes, hence nothing to resilver.
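
For reference (the pool name is a placeholder):

# zpool get failmode tank
# zpool set failmode=continue tank

where wait (the default) blocks I/O until the devices come back, continue returns 
EIO to new writes but still allows reads from healthy devices, and panic does 
exactly what it says.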

Jeff

On Mon, Jul 07, 2008 at 06:55:02PM +0200, Justin Vassallo wrote:
 Hi,
 
  
 
  I've got a zpool made up of 2 mirrored vdevs. For one moment I had a cabling
  problem and lost all disks... I reconnected and onlined the disks. No
  resilvering kicked in, so I tried to force a scrub, but nothing's happening.
  I issue the command and it's as if I never did.
 
  
 
 Any suggestions?
 
  
 
 Thanks
 
 justin
 



 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss