Re: [zfs-discuss] [ZIL device brainstorm] intel x25-M G2 has ram cache?

2010-05-24 Thread Fred Liu
Yeah. It is also not so easy to capture the possible data loss during such an event.
There is no reliable way to figure it out.

Thanks.

Fred.

-Original Message-
From: rwali...@washdcmail.com [mailto:rwali...@washdcmail.com] 
Sent: Tuesday, May 25, 2010 11:42
To: Erik Trimble
Cc: Fred Liu; ZFS Discussions
Subject: Re: [zfs-discuss] [ZIL device brainstorm] intel x25-M G2 has ram cache?

On May 24, 2010, at 4:28 AM, Erik Trimble wrote:

> yes, both the X25-M (both G1 and G2) plus the X25-E have a DRAM buffer on the 
> controller, and neither has a supercapacitor (or other battery) to back it 
> up, so there is the potential for data loss (but /not/ data corruption) in a 
> power-loss scenario.
> 
> Sadly, we're pretty much at the point where no current retail-available SSD 
> has battery backup for its on-controller DRAM cache (and, they /all/ use 
> DRAM caches).

I haven't seen where anyone has tested this, but the MemoRight SSD (sold by 
RocketDisk in the US) seems to claim all the right things:

http://www.rocketdisk.com/vProduct.aspx?ID=1

pdf specs:

http://www.rocketdisk.com/Local/Files/Product-PdfDataSheet-1_MemoRight%20SSD%20GT%20Specification.pdf

They claim to support the cache flush command, and with respect to DRAM cache 
backup they say (p. 14/section 3.9 in that pdf):

> The MemoRight NSSD has an on-drive backup power system. It stores energy 
> while the power supply is applied to the drive. When a power-off occurs, the 
> stored energy is released to keep the drive working for a while. The 
> stored energy ensures the data in the cache can be flushed to the nonvolatile 
> flash media, which prevents data loss.
> It takes about 5 seconds to store enough energy for a discharge of at least 1 
> second. The write cache is disabled automatically until the backup 
> power system has stored enough energy.

Which certainly sounds like an on-board capacitor to flush the cache and that 
the cache is disabled while charging the capacitor.  But I can't see where 
anyone has tested this on ZFS.

--Ware
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] [ZIL device brainstorm] intel x25-M G2 has ram cache?

2010-05-24 Thread rwalists
On May 24, 2010, at 4:28 AM, Erik Trimble wrote:

> yes, both the X25-M (both G1 and G2) plus the X25-E have a DRAM buffer on the 
> controller, and neither has a supercapacitor (or other battery) to back it 
> up, so there is the potential for data loss (but /not/ data corruption) in a 
> power-loss scenario.
> 
> Sadly, we're pretty much at the point where no current retail-available SSD 
> has battery backup for its on-controller DRAM cache (and, they /all/ use 
> DRAM caches).

I haven't seen where anyone has tested this, but the MemoRight SSD (sold by 
RocketDisk in the US) seems to claim all the right things:

http://www.rocketdisk.com/vProduct.aspx?ID=1

pdf specs:

http://www.rocketdisk.com/Local/Files/Product-PdfDataSheet-1_MemoRight%20SSD%20GT%20Specification.pdf

They claim to support the cache flush command, and with respect to DRAM cache 
backup they say (p. 14/section 3.9 in that pdf):

> The MemoRight NSSD has an on-drive backup power system. It stores energy 
> while the power supply is applied to the drive. When a power-off occurs, the 
> stored energy is released to keep the drive working for a while. The 
> stored energy ensures the data in the cache can be flushed to the nonvolatile 
> flash media, which prevents data loss.
> It takes about 5 seconds to store enough energy for a discharge of at least 1 
> second. The write cache is disabled automatically until the backup 
> power system has stored enough energy.

Which certainly sounds like an on-board capacitor to flush the cache and that 
the cache is disabled while charging the capacitor.  But I can't see where 
anyone has tested this on ZFS.
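(One rough way to exercise it on ZFS, sketched here only as an idea -- the
mount point and file names are placeholders: write numbered files over an NFS
mount backed by the SSD-as-slog pool, cut power to the drive mid-run, then
compare the last write the client saw acknowledged with what survived.)

    i=0
    while :; do
        i=$((i+1))
        # each create+write+close must be committed by the NFS server before
        # it returns, so every file acknowledged here should survive power loss
        echo $i > /mnt/nfs-test/seq.$i || break
        echo "acknowledged $i"
    done
    # after recovery: ls /mnt/nfs-test | sort -t. -k2 -n | tail -1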

--Ware
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] aliase for MPxIO path

2010-05-24 Thread Fred Liu
Hi,

   1): Is it possible to do it?
   2): What are the backplane hardware requirements for "luxadm led_blink" to
work, i.e. to put a disk LED into blink mode?
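(For what it's worth, on enclosures that luxadm knows how to drive, the
invocation looks roughly like the following; the device path is only a
placeholder, and whether it works at all is exactly question 2 above:

    # luxadm led_blink /dev/rdsk/c2t20d0s2
)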

Thanks.

Fred



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] New SSD options

2010-05-24 Thread Thomas Burgess
>
>
>
> From earlier in the thread, it sounds like none of the SF-1500 based
> drives even have a supercap, so it doesn't seem that they'd necessarily
> be a better choice than the SLC-based X-25E at this point unless you
> need more write IOPS...
>
> Ray
>

I think the upcoming OCZ Vertex 2 Pro will have a supercap.

I just bought an ocz vertex le. It doesn't have a supercap but it DOES have
some awesome specs otherwise...
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] questions about zil

2010-05-24 Thread Thomas Burgess
>
>
> Not familiar with that model
>
>
It's a sandforce sf-1500 model but without a supercap... here's some info on
it:



Maximum Performance

   - Max Read: up to 270MB/s
   - Max Write: up to 250MB/s
   - Sustained Write: up to 235MB/s
   - Random Write 4k: 15,000 IOPS
   - Max 4k IOPS: 50,000



per
http://www.ocztechnology.com/products/solid-state-drives/2-5--sata-ii/performance-enterprise-solid-state-drives/ocz-vertex-limited-edition-sata-ii-2-5--ssd.html


>
>
> Wow.  That's a pretty huge improvement. :-)
>
> - Garrett (newly of Nexenta)
>
>
>
yes, i love it.  I'm really impressed with this ssd for the money... 160 usd
(180 - 20 rebate)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] questions about zil

2010-05-24 Thread Thomas Burgess
>
>
>  ZFS is always consistent on-disk, by design. Loss of the ZIL will result
> in loss of the data in the ZIL which hasn't been flushed out to the hard
> drives, but otherwise, the data on the hard drives is consistent and
> uncorrupted.
>
>
>
> This is what i thought.  I have read this list on and off for a while now
but i'm not a guru... I see a lot of stuff about the intel ssd and disabling
the write cache... so i just wasn't sure. This is good news.





>
>  It avoids the scenario of losing data in your ZIL due to power loss (and,
> of course, the rest of your system).  So, yes, if you actually care about
> your system, I'd recommend at least a minimal UPS to allow for quick
> shutdown after a power loss.
>
>
> yes, i have a nice little UPS.  I've tested it a few times and it seems to
work well.  It gives me about 20 minutes of power and can even send commands
via a script to shut down the system before the battery goes dry.




> That's going to pretty much be the best-case use for the ZIL - NFS writes
> being synchronous.  Of course, using the rest of the SSD for L2ARC is likely
> to be almost (if not more) helpful for performance for a wider variety of
> actions.
>
>
> yes, i have another machine without a zil (i bought a kingston 64 gb ssd on
sale and intended to try it as a zil but ultimately decided to just use it
as l2arc because of the performance numbers...)  but the l2arc helps a ton
for my uses.  I did slice this ssd... i used 9 gb for zil and the rest for
l2arc (about 36 gb).  I'm really impressed with this ssd... for only 160
dollars (180 - 20 mail-in rebate) it's a killer deal.

it can do 235 MB/s sustained writes and has something like 15,000 iops
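(As a rough sketch of that layout -- pool and device names are placeholders,
and the two slices are assumed to exist already from format(1M):

    # zpool add tank log c5t0d0s0      <- the ~9 gb slice as the slog
    # zpool add tank cache c5t0d0s1    <- the remaining ~36 gb as l2arc
)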





> --
> Erik Trimble
> Java System Support
> Mailstop:  usca22-123
> Phone:  x17195
> Santa Clara, CA
>
>
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] questions about zil

2010-05-24 Thread Garrett D'Amore

On 5/24/2010 2:48 PM, Thomas Burgess wrote:

I recently got a new SSD (ocz vertex LE 50gb)


Not familiar with that model



It seems to work really well as a ZIL performance wise.  My question 
is, how safe is it?  I know it doesn't have a supercap, so let's say
data loss occurs... is it just data loss or is it pool loss?


It should just be dataloss -- you'd potentially lose any filesystem 
changes made since the last time data was committed to the actual 
storage device (not the ZIL).  It still sounds like losing the ZIL is a
bad idea if you really care about data integrity, though.





also, does the fact that i have a UPS matter?



It might!  If you can guarantee that the actual drive won't lose power 
then you have much more security.  *But*, if the drive is powered from 
the computer's power supply or internal bus, then you could have an 
issue -- power loss to the drive can occur in this situation when the 
system reboots, even though the mains still had power available to them 
all the time.


(Hmmm... do the typical SATA power plugs from computer power supplies
have power removed when the system takes a full reset?  I would not
*think* so, but with software-controlled power supplies I'm not 100%
certain.  It might be an interesting test to measure the power from some
common power supplies -- as applied to the SATA power connector --
during a system panic or full reset (reboot -p for example).  It's
possible that power is supplied continuously to the peripherals even in
the event of this kind of failure.)


DDRdrive X1 solves this with an external power brick.  I'm not sure what 
other storage devices do here.


The supercap solution is nice because it doesn't need the external 
power, but can supply power to the media even when the host removes 
power from the device itself.





the numbers i'm seeing are really nice... these are some nfs tar times
before zil:


Wow.  That's a pretty huge improvement. :-)

- Garrett (newly of Nexenta)



real 2m21.498s

user 0m5.756s

sys 0m8.690s


real 2m23.870s

user 0m5.756s

sys 0m8.739s



and these are the same ones after.




real 0m32.739s

user 0m5.708s

sys 0m8.515s



real 0m35.580s

user 0m5.707s

sys 0m8.526s




I also sliced it... i have 16 gb ram so i used a 9 gb slice for zil
and the rest for L2ARC




this is for a single 10 drive raidz2 vdev so far... i'm really
impressed with the performance gains




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
   


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] questions about zil

2010-05-24 Thread Erik Trimble

On 5/24/2010 2:48 PM, Thomas Burgess wrote:

I recently got a new SSD (ocz vertex LE 50gb)

It seems to work really well as a ZIL performance wise.  My question 
is, how safe is it?  I know it doesn't have a supercap, so let's say
data loss occurs... is it just data loss or is it pool loss?


ZFS is always consistent on-disk, by design. Loss of the ZIL will result 
in loss of the data in the ZIL which hasn't been flushed out to the hard 
drives, but otherwise, the data on the hard drives is consistent and 
uncorrupted.






also, does the fact that i have a UPS matter?

It avoids the scenario of losing data in your ZIL due to power loss 
(and, of course, the rest of your system).  So, yes, if you actually 
care about your system, I'd recommend at least a minimal UPS to allow 
for quick shutdown after a power loss.





the numbers i'm seeing are really nice... these are some nfs tar times
before zil:



real 2m21.498s

user 0m5.756s

sys 0m8.690s


real 2m23.870s

user 0m5.756s

sys 0m8.739s



and these are the same ones after.




real 0m32.739s

user 0m5.708s

sys 0m8.515s



real 0m35.580s

user 0m5.707s

sys 0m8.526s




I also sliced it... i have 16 gb ram so i used a 9 gb slice for zil
and the rest for L2ARC




this is for a single 10 drive raidz2 vdev so far... i'm really
impressed with the performance gains




That's going to pretty much be the best-case use for the ZIL - NFS 
writes being synchronous.  Of course, using the rest of the SSD for 
L2ARC is likely to be almost (if not more) helpful for performance for a 
wider variety of actions.



--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] questions about zil

2010-05-24 Thread Nicolas Williams
On Mon, May 24, 2010 at 05:48:56PM -0400, Thomas Burgess wrote:
> I recently got a new SSD (ocz vertex LE 50gb)
> 
> It seems to work really well as a ZIL performance wise.  My question is, how
> safe is it?  I know it doesn't have a supercap, so let's say data loss
> occurs... is it just data loss or is it pool loss?

Just dataloss.

> also, does the fact that i have a UPS matter?

Relative to power loss, yes.

Nico
-- 
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] questions about zil

2010-05-24 Thread Thomas Burgess
I recently got a new SSD (ocz vertex LE 50gb)

It seems to work really well as a ZIL performance wise.  My question is, how
safe is it?  I know it doesn't have a supercap, so let's say data loss
occurs... is it just data loss or is it pool loss?


also, does the fact that i have a UPS matter?


the numbers i'm seeing are really nice... these are some nfs tar times
before zil:


real 2m21.498s

user 0m5.756s

sys 0m8.690s


real 2m23.870s

user 0m5.756s

sys 0m8.739s



and these are the same ones after.




real 0m32.739s

user 0m5.708s

sys 0m8.515s



real 0m35.580s

user 0m5.707s

sys 0m8.526s




I also sliced it... i have 16 gb ram so i used a 9 gb slice for zil and the
rest for L2ARC



this is for a single 10 drive raidz2 vdev so far... i'm really impressed
with the performance gains
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS no longer working with FC devices.

2010-05-24 Thread Andrew Daugherity
I had a similar problem with a RAID shelf (switched to JBOD mode, with each 
physical disk presented as a LUN) connected via FC (qlc driver, but no MPIO).  
Running a scrub would eventually generate I/O errors and many messages like 
this:

Sep  6 15:12:53 imsfs scsi: [ID 107833 kern.warning] WARNING: 
/p...@0,0/pci10de,5...@e/pci1077,1...@0/f...@0,0/d...@w21000004d960cdec,e (sd4):
Sep  6 15:12:53 imsfs   Request Sense couldn't get sense data

and eventually one or more disks would get marked as faulted by ZFS.  This was 
under s10u6 (10/08, I think)  but I imagine it still holds for u8.  I did not 
have these problems with just one or two LUNs presented from the array, but I 
prefer to run ZFS in the recommended configuration where it manages the disks.

My storage vendor (3rd-party, not Sun) recommended that in /etc/system I add 
'set ssd:ssd_max_throttle = 23' or less and 'set ssd:ssd_io_time = 0x60' or 
0x78.   The default 0x20 (in what version of Solaris?) is apparently not enough 
in many cases.

In my case (x64) I discovered I needed sd:sd_max_throttle, etc. (not ssd, which 
is apparently only for sparc), and that the default sd_io_time on recent 
Solaris 10 already is 0x60.  Apparently the general rule for max_throttle is 
256/# of LUNs, but my vendor found that 23 was the maximum reliable setting for 
16 LUNs.

This may or may not help you but it's something to try.  Without the 
max_throttle setting, I would get errors somewhere between 30 minutes and 4 
hours into a scrub, and with it scrubs run successfully.
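(For reference, a sketch of what those entries look like in /etc/system on the
x64 box described above; adjust the driver prefix and values for your own
setup, and a reboot is needed for them to take effect:

* throttle outstanding commands per LUN and lengthen the I/O timeout
set sd:sd_max_throttle = 23
set sd:sd_io_time = 0x60
)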

-Andrew


>>> Demian Phillips  5/23/2010 8:01 AM >>> 
On Sat, May 22, 2010 at 11:33 AM, Bob Friesenhahn
 wrote:
> On Fri, 21 May 2010, Demian Phillips wrote:
>
>> For years I have been running a zpool using a Fibre Channel array with
>> no problems. I would scrub every so often and dump huge amounts of
>> data (tens or hundreds of GB) around and it never had a problem
>> outside of one confirmed (by the array) disk failure.
>>
>> I upgraded to sol10x86 05/09 last year and since then I have
>> discovered any sufficiently high I/O from ZFS starts causing timeouts
>> and off-lining disks. This leads to failure (once rebooted and cleaned
>> all is well) long term because you can no longer scrub reliably.
>
> The problem could be with the device driver, your FC card, or the array
> itself.  In my case, issues I thought were to blame on my motherboard or
> Solaris were due to a defective FC card and replacing the card resolved the
> problem.
>
> If the problem is that your storage array is becoming overloaded with
> requests, then try adding this to your /etc/system file:
>
> * Set device I/O maximum concurrency
> *
> http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Device_I.2FO_Queue_Size_.28I.2FO_Concurrency.29
> set zfs:zfs_vdev_max_pending = 5
>
> Bob
> --
> Bob Friesenhahn
> bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
> GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
>

I've gone back to Solaris 10 11/06.
It's working fine, but I notice some differences in performance that
are I think key to the problem.

With the latest Solaris 10 (u8) throughput according to zpool iostat
was hitting about 115MB/sec sometimes a little higher.

With 11/06 it maxes out at 40MB/sec.

Both setups are using mpio devices as far as I can tell.

Next is to go back to u8 and see if the tuning you suggested will
help. It really looks to me that the OS is asking too much of the FC
chain I have.

The really puzzling thing is I just got told about a brand new Dell
Solaris x86 production box using current and supported FC devices and
a supported SAN get the same kind of problems when a scrub is run. I'm
going to investigate that and see if we can get a fix from Oracle as
that does have a support contract. It may shed some light on the issue
I am seeing on the older hardware.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] zfs/lofi/share panic

2010-05-24 Thread Frank Middleton

Many many moons ago, I submitted a CR into bugs about a
highly reproducible panic that occurs if you try to re-share
a  lofi mounted image. That CR has AFAIK long since
disappeared - I even forget what it was called.

This server is used for doing network installs. Let's say
you have a 64 bit iso lofi-mounted and shared. You do the
install, and then wish to switch to a 32 bit iso. You unshare,
umount, delete the loopback, and then lofiadm the new iso,
mount it and then share it. Panic, every time.
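(Spelled out, the sequence is roughly the following -- second.iso is just a
placeholder name, and lofiadm -a is assumed to hand back /dev/lofi/1 again:

    # unshare /mnt
    # umount /mnt
    # lofiadm -d /dev/lofi/1
    # lofiadm -a second.iso
    # mount -F hsfs /dev/lofi/1 /mnt
    # share /mnt          <- panics here
)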

Is this such a rare use-case that no one is interested? I have
the backtrace and cores if anyone wants them, although
such were submitted with the original CR. This is pretty
frustrating since you start to run out of ideas for mountpoint
names after a while unless you forget and get the panic.

FWIW (even on a freshly booted system after a panic)
# lofiadm zyzzy.iso /dev/lofi/1
# mount -F hsfs /dev/lofi/1 /mnt
mount: /dev/lofi/1 is already mounted or /mnt is busy
# mount -O -F hsfs /dev/lofi/1 /mnt
# share /mnt
#

If you unshare /mnt and then do this again, it will panic.
This has been a bug since before Open Solaris came out.

It doesn't happen if the iso is originally on UFS, but
UFS really isn't an option any more.  FWIW the dataset
containing the isos has the sharenfs attribute set,
although it doesn't have to be actually mounted by
any remote NFS client for this panic to occur.

Suggestions for a workaround most welcome!

Thanks

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] zpool export takes too long time in build-134

2010-05-24 Thread autumn Wang
Hi,
I did the zpool import/export performance testing on opensolaris build-134:
1). Create 100 zfs and 100 snapshots, then do zpool export/import
 export takes about 5 seconds
 import takes about 5 seconds
2). Create 200 zfs and 200 snapshots, then do zpool export/import
 export takes about 80 seconds
 import takes about 12 seconds
3). Create 300 zfs and 300 snapshots, then do zpool export/import
 export takes about 110 seconds
 import takes about 12 seconds

But on Solaris 10 with the latest recommended patch (zfs is also V4), even with
400 zfs and 400 snapshots, the export time is still within 5 seconds, and import
time is still within 20 seconds.
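(A rough sketch of the kind of loop used for these tests -- pool and dataset
names are placeholders:

    i=0
    while [ $i -lt 300 ]; do
        i=$((i+1))
        zfs create testpool/fs$i
        zfs snapshot testpool/fs$i@snap1
    done
    time zpool export testpool
    time zpool import testpool
)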

It seems zpool import/export performance has dropped quite a lot on the new
OpenSolaris build?

Regards,
Autumn Wang.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Removing disks from a ZRAID config?

2010-05-24 Thread Carson Gaspar

Forrest Aldrich wrote:

I've seen this product mentioned before - the problem is, we use
Veritas heavily on a public network and adding yet another software
dependency would be a hard sell.  :(


Be very certain that you need synchronous replication before you do 
this. For some ACID systems it really is required. And, sadly, it's also 
required for rebootless client NFS failover (devs, will we ever get a 
zfs send/recv + nfs server that can preserve nfs file handles?)


For most use cases async replication is just fine, and is far less 
fragile (block level replication happily replicates corruption). At 
$office we have 2 systems replicating on SRDF, everything else is async 
replication via NetApp SnapMirror or ZFS send/recv.


To answer your original question, no, you can't yet shrink a pool (it's 
been coming Real Soon Now for a couple of years - no clue how much 
longer before it actually arrives). So you'd need enough temporary 
storage to replicate your data while you destroy and re-create your pool.
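(A minimal sketch of that replicate-and-recreate step, assuming a scratch pool
with enough space -- all names here are placeholders:

    # zfs snapshot -r tank@migrate
    # zfs send -R tank@migrate | zfs receive -Fd scratchpool
    ... destroy and re-create tank with the smaller layout ...
    # zfs send -R scratchpool@migrate | zfs receive -Fd tank
)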


--
Carson
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Removing disks from a ZRAID config?

2010-05-24 Thread iMx


> > Thanks for the pointer, I will look into it.
> >
> > The first thing that comes to mind is a possible performance hit,
> > somewhere with the VxFS code. I could be wrong, tho.


No worries, certainly worth looking into though - if performance is acceptable, 
it could be a good solution.  Let me know how it goes ;)

--
iMx
i...@streamvia.net
www.slashdevslashnull.com


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Removing disks from a ZRAID config?

2010-05-24 Thread Forrest Aldrich
I've seen this product mentioned before - the problem is, we use Veritas 
heavily on a public network and adding yet another software dependency would be 
a hard sell.  :(
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] New SSD options

2010-05-24 Thread Ray Van Dolson
On Mon, May 24, 2010 at 11:30:20AM -0700, Ray Van Dolson wrote:
> This thread has grown giant, so apologies for screwing up threading
> with an out of place reply. :)
> 
> So, as far as SF-1500 based SSD's, the only ones currently in existence
> are the Vertex 2 LE and Vertex 2 EX, correct (I understand the Vertex 2
> Pro was never mass produced)?
> 
> Both of these are based on MLC and not SLC -- why isn't that an issue
> for longevity?
> 
> Any other SF-1500 options out there?
> 
> We continue to use UPS-backed Intel X-25E's for ZIL.

From earlier in the thread, it sounds like none of the SF-1500 based
drives even have a supercap, so it doesn't seem that they'd necessarily
be a better choice than the SLC-based X-25E at this point unless you
need more write IOPS...

Ray
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] New SSD options

2010-05-24 Thread Ray Van Dolson
This thread has grown giant, so apologies for screwing up threading
with an out of place reply. :)

So, as far as SF-1500 based SSD's, the only ones currently in existence
are the Vertex 2 LE and Vertex 2 EX, correct (I understand the Vertex 2
Pro was never mass produced)?

Both of these are based on MLC and not SLC -- why isn't that an issue
for longevity?

Any other SF-1500 options out there?

We continue to use UPS-backed Intel X-25E's for ZIL.

Ray
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] iSCSI confusion

2010-05-24 Thread Scott Meilicke
VMware will properly handle sharing a single iSCSI volume across multiple ESX 
hosts. We have six ESX hosts sharing the same iSCSI volumes - no problems.

-Scott
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Removing disks from a ZRAID config?

2010-05-24 Thread iMx


> > Can you elaborate?
> >
> > Veritas has its own filesystem -- we need the block-level
> > replication functionality to backup our data (live) over the WAN to
> > a disaster
> > recover location. Therefore, you wouldn't be able to use Veritas
> > with ZFS filesystem.



zfs create -V 10G test/testvol
newfs /dev/zvol/rdsk/test/testvol

Or format the zvol however you need to for Veritas; the above would create ufs.
The key is that you pass it the device rather than mounting the file system;
this would in theory then allow you to snapshot the volume locally, etc.  Check
out the article below:

http://www.markround.com/archives/37-ZFS-as-a-volume-manager.html

"While browsing the ZFS man page recently, I made an interesting discovery: ZFS 
can export block devices from a zpool, which means you can separate "ZFS the 
volume manager" from "ZFS the filesystem". This may well be old news to many; 
however I haven't seen many references to this on the web, so thought I'd post 
a quick blog update. "
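(And since the original goal was block-level replication over a WAN, the zvol
itself can then be shipped with snapshots, e.g. -- names and the remote host
are placeholders:

    # zfs snapshot test/testvol@rep1
    # zfs send test/testvol@rep1 | ssh dr-host zfs receive drpool/testvol
    ... and later only the changed blocks ...
    # zfs snapshot test/testvol@rep2
    # zfs send -i rep1 test/testvol@rep2 | ssh dr-host zfs receive drpool/testvol
)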
 
--
iMx
i...@streamvia.net
www.slashdevslashnull.com


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] New SSD options

2010-05-24 Thread Miles Nordin
> "d" == Don   writes:
> "hk" == Haudy Kazemi  writes:

 d> You could literally split a sata cable and add in some
 d> capacitors for just the cost of the caps themselves.

no, this is no good.  The energy only flows in and out of the
capacitor when the voltage across it changes.  In this respect they
are different from batteries.  It's normal to use (non-super)
capacitors as you describe for filters next to things drawing power in
a high-frequency noisy way, but to use them for energy storage across
several seconds you need a switching supply to drain the energy from
it.  the step-down and voltage-pump kinds of switchers are
non-isolated and might do fine, and are cheaper than full-fledged
DC-DC that are isolated (meaning the input and output can float wrt
each other).

you can charge from 12V and supply 5V if that's cheaper.  :)

hope it works.

hk> "okay, we've waited 5 seconds for additional data to arrive to
hk> be written.  None has arrived in the last 5 seconds, so we're
hk> going to write what we already have to better ensure data
hk> integrity,

yeah, I am worried about corner cases like this.  ex: input power to
the SSD becomes scratchy or sags, but power to the host and controller
remain fine.  Writes arrive continuously.  The SSD sees nothing wrong
with its power and continues to accept and acknowledge writes.
Meanwhile you burn through your stored power hiding the sagging supply
until you can't, then the SSD loses power suddenly and drops a bunch
of writes on the floor.  That is why I drew that complicated state
diagram in which the pod disables and holds-down the SATA connection
once it's running on reserve power.  Probably y'all don't give a fuck
about such corners though, nor do many of the manufacturers selling
this stuff, so, whatever.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Removing disks from a ZRAID config?

2010-05-24 Thread Richard Elling
On May 24, 2010, at 10:47 AM, Forrest Aldrich wrote:

> We have a Sun thumper 34 terabyte, with 24T free.   I've been asked to find 
> out whether we can remove some disks from the zpool/ZRAID config (say about 
> 10T) and install Veritas volumes on those, then migrate some data to it for 
> block-level replication over a WAN.
> 
> I know, horrifying - but the problem is there is no network/block-level 
> replication inherent in (production Solaris) ZFS and we need it.

http://www.oracle.com/us/products/servers-storage/storage/storage-software/031764.htm
 -- richard

-- 
ZFS and NexentaStor training, Rotterdam, July 13-15, 2010
http://nexenta-rotterdam.eventbrite.com/






___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Removing disks from a ZRAID config?

2010-05-24 Thread iMx


- Original Message -
> From: "Forrest Aldrich" 
> To: zfs-discuss@opensolaris.org
> Sent: Monday, 24 May, 2010 6:47:40 PM
> Subject: [zfs-discuss] Removing disks from a ZRAID config?

> We have a Sun thumper 34 terabyte, with 24T free. I've been asked to
> find out whether we can remove some disks from the zpool/ZRAID config
> (say about 10T) and install Veritas volumes on those, then migrate
> some data to it for block-level replication over a WAN.
> 
> I know, horrifying - but the problem is there is no
> network/block-level replication inherent in (production Solaris) ZFS
> and we need it. I hate Veritas with a passion. But that's another
> discussion :-)
> 
> Thanks...

I don't know much about Veritas, but couldn't you install it on a zvol?

--
iMx
i...@streamvia.net



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Removing disks from a ZRAID config?

2010-05-24 Thread Forrest Aldrich
We have a Sun thumper 34 terabyte, with 24T free.   I've been asked to find out 
whether we can remove some disks from the zpool/ZRAID config (say about 10T) 
and install Veritas volumes on those, then migrate some data to it for 
block-level replication over a WAN.

I know, horrifying - but the problem is there is no network/block-level 
replication inherent in (production Solaris) ZFS and we need it.   I hate 
Veritas with a passion.  But that's another discussion :-)

Thanks...
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs recordsize change improves performance

2010-05-24 Thread Miles Nordin
> "ai" == Asif Iqbal  writes:

 >>  If you disable the ZIL for locally run Oracle and you have an
 >> unscheduled outage, then it is highly probable that you will
 >> lose data.

ai> yep. that is why I am not doing it until we replace the
ai> battery

no, wait please, you still need the ZIL to be on, even with the battery.

disabling the cache flush command is what the guide says is allowed
and sometimes helpful for people who have NVRAM's, but disabling the
cache flush command and disabling the ZIL are different.  Disabling
the ZIL means the write can be cached in DRAM until the next txg flush
and not issued to the disks at all, so even if you have a disk array
with an NVRAM that effectively writes everything as if it were sync,
the disk array will not even see the write until txg commit time with
ZIL disabled.

If you have working NVRAM, I think disabling the ZIL is likely not to
give much speed-up, so if you are going to try disabling it, now when
your battery is dead is the time to do it.  Once the battery's fixed
theory says your testing will probably show things are just as fast
with ZIL enabled.

AIUI if you disable the ZIL, the database should still come back in a
crash-consistent state after a cord-yank, but it will be an older state
than it should be, so if you have several RDBMS behind some kind of
tiered middleware the different databases won't be in sync with each
other so you can lose integrity.  If you have only one RDBMS I think
you will lose only durability through this monkeybusiness, and
integrity will survive.  I'm not an expert of anything, but that's my
understanding for now.
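(For reference, the two knobs being contrasted here are separate /etc/system
settings, per the evil tuning guide of that era; both need a reboot and carry
the usual warnings:

* turns the ZIL off entirely -- sync writes reach disk only at the next txg commit
set zfs:zil_disable = 1

* keeps the ZIL but stops ZFS from issuing cache-flush commands -- only sensible
* behind NVRAM-protected storage
set zfs:zfs_nocacheflush = 1
)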


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] cannot import pool from another system, device-ids different! please help!

2010-05-24 Thread Mark J Musante



On Mon, 24 May 2010, h wrote:


but... wait... that can't be.
i disconnected the 1TB drives and plugged in the 2TB's before doing the replace
command. no information could be written to the 1TBs at all since they were
physically offline.


Do the labels still exist?  What does 'zdb -l /dev/rdsk/' show?
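(e.g., against one of the old 1TB drives -- the device name below is only a
placeholder:

    # zdb -l /dev/rdsk/c1d0s0
)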



Regards,
markm
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] cannot import pool from another system, device-ids different! please help!

2010-05-24 Thread hmmmm
but... wait... that can't be.
i disconnected the 1TB drives and plugged in the 2TB's before doing the replace
command. no information could be written to the 1TBs at all since they were
physically offline.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] hybrid drive: flash and platters

2010-05-24 Thread David Magda
Seagate is planning on releasing a disk that's part spinning rust and  
part flash:


http://www.theregister.co.uk/2010/05/21/seagate_momentus_xt/

The design will have the flash be transparent to the operating system,  
but I wish they would have some way to access the two components  
separately. ZFS could certainly make use of it, and Linux is also  
working on a capability:


http://kernelnewbies.org/KernelProjects/DmCache
http://lwn.net/Articles/385442/

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] can you recover a pool if you lose the zil (b134+)

2010-05-24 Thread R. Eulenberg
I even have this problem on my (production) backup server. I lost my system hdd
and my separate ZIL device when the system crashed, and now I'm in trouble. The
old system was running under the latest version of osol/dev with zfs v22.
Ten days ago, right after the server crash, I was very optimistic about solving
the problems the same day. That was a long time ago.
After setting up a new system (osol 2009.06 and updating to the latest version
of osol/dev with deduplication) I tried to import my backup zpool, but I can't.
# zpool import
  pool: tank1
id: 5048704328421749681
 state: UNAVAIL
status: The pool was last accessed by another system.
action: The pool cannot be imported due to damaged devices or data.
   see: http://www.sun.com/msg/ZFS-8000-EY
config:

tank1        UNAVAIL  missing device
  raidz2-0   ONLINE
c7t5d0   ONLINE
c7t0d0   ONLINE
c7t6d0   ONLINE
c7t3d0   ONLINE
c7t1d0   ONLINE
c7t4d0   ONLINE
c7t2d0   ONLINE

# zpool import -f tank1
cannot import 'tank1': one or more devices is currently unavailable
Destroy and re-create the pool from
a backup source

None of the other options (-F, -X, -V, -D), nor any combination of them, helps.
I can not add / attach / detach / remove a vdev or the ZIL device either.
In the last ten days I have read a lot of threads, guides to solving problems
with ZFS and so on, but have not found a solution for my problem. I created a
fake zpool with a separate ZIL device, trying to combine its ZIL file with my
old zpool for importing, but it doesn't work because of the different GUID and
checksum (the name I modified with a binary editor).
The output of:
e...@opensolaris:~# zdb -e tank1

Configuration for import:
vdev_children: 2
version: 22
pool_guid: 5048704328421749681
name: 'tank1'
state: 0
hostid: 946038
hostname: 'opensolaris'
vdev_tree:
type: 'root'
id: 0
guid: 5048704328421749681
children[0]:
type: 'raidz'
id: 0
guid: 16723866123388081610
nparity: 2
metaslab_array: 23
metaslab_shift: 30
ashift: 9
asize: 7001340903424
is_log: 0
create_txg: 4
children[0]:
type: 'disk'
id: 0
guid: 6858138566678362598
phys_path: 
'/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@0,0:a'
whole_disk: 1
DTL: 4345
create_txg: 4
path: '/dev/dsk/c7t5d0s0'
devid: 
'id1,s...@sata_samsung_hd103uj___s13pj1bq709050/a'
children[1]:
type: 'disk'
id: 1
guid: 16136237447458434520
phys_path: 
'/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@1,0:a'
whole_disk: 1
DTL: 4344
create_txg: 4
path: '/dev/dsk/c7t0d0s0'
devid: 
'id1,s...@sata_samsung_hd103uj___s13pjdwq317311/a'
children[2]:
type: 'disk'
id: 2
guid: 10876853602231471126
phys_path: 
'/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@2,0:a'
whole_disk: 1
DTL: 4343
create_txg: 4
path: '/dev/dsk/c7t6d0s0'
devid: 
'id1,s...@sata_hitachi_hdt72101__stf604mh14s56w/a'
children[3]:
type: 'disk'
id: 3
guid: 2384677379114262201
phys_path: 
'/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@3,0:a'
whole_disk: 1
DTL: 4342
create_txg: 4
path: '/dev/dsk/c7t3d0s0'
devid: 
'id1,s...@sata_samsung_hd103uj___s13pj1nq811135/a'
children[4]:
type: 'disk'
id: 4
guid: 15143849195434333247
phys_path: 
'/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@4,0:a'
whole_disk: 1
DTL: 4341
create_txg: 4
path: '/dev/dsk/c7t1d0s0'
devid: 
'id1,s...@sata_hitachi_hdt72101__stf604mh16v73w/a'
children[5]:
type: 'disk'
id: 5
guid: 11627603446133164653
phys_path: 
'/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@5,0:a'
whole_disk: 1
DTL: 4340
create_txg: 4
path: '/d

Re: [zfs-discuss] ZFS no longer working with FC devices.

2010-05-24 Thread Richard Elling
On May 24, 2010, at 4:06 AM, Demian Phillips wrote:
> On Sun, May 23, 2010 at 12:02 PM, Torrey McMahon  wrote:
>>  On 5/23/2010 11:49 AM, Richard Elling wrote:
>>> 
>>> FWIW, the A5100 went end-of-life (EOL) in 2001 and end-of-service-life
>>> (EOSL) in 2006. Personally, I  hate them with a passion and would like to
>>> extend an offer to use my tractor to bury the beast:-).
>> 
>> I'm sure I can get some others to help. Can I smash the gbics? Those were my
>> favorite. :-)
>> ___
>> zfs-discuss mailing list
>> zfs-discuss@opensolaris.org
>> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
>> 
> 
> I'd be more than happy to take someone up on the offer but I'd need a
> good deal on more current FC array. Since this is my home environment
> I am limited by my insignificant pay and the wife factor (who does
> indulge me from time to time). Without a corporate IT budget I make do
> with everything from free to what I can afford used.
> 
> To be honest I'd rather be using an IBM DS4K series array.
> 
> Current stress test is creating 700 (50% of array capacity) 1GB files
> from /dev/urandom and then I will scrub.

Unfortunately, /dev/urandom is too slow for direct stress testing. It can be
used as a seed for random data files that are then used for stress testing.
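(A rough sketch of that approach -- paths and sizes are placeholders:

    # build one random seed file...
    dd if=/dev/urandom of=/pool/seed.bin bs=1024k count=1024
    # ...then fan it out for the stress run, which is much faster than pulling
    # every byte from /dev/urandom directly
    i=0
    while [ $i -lt 700 ]; do
        i=$((i+1))
        cp /pool/seed.bin /pool/stress/file$i
    done
)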
 -- richard

-- 
ZFS and NexentaStor training, Rotterdam, July 13-15, 2010
http://nexenta-rotterdam.eventbrite.com/






___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] cannot import pool from another system, device-ids different! please help!

2010-05-24 Thread hmmmm
yes i used "zpool replace".
why is one drive recognized? 
shouldn't the labels be wiped on all of them?

am i screwed?
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] cannot import pool from another system, device-ids different! please help!

2010-05-24 Thread Mark J Musante

On Mon, 24 May 2010, h wrote:

i had 6 disks in a raidz1 pool that i replaced from 1TB drives to 2TB 
drives. i have installed the older 1TB drives in another system and 
would like to import the old pool to access some files i accidentally 
deleted from the new pool.


Did you use the 'zpool replace' command to do the replace?  If so, once 
the replace completes, the ZFS label on the original disk is overwritten 
to make it available for new pools.



Regards,
markm
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] cannot import pool from another system, device-ids different! please help!

2010-05-24 Thread hmmmm
Hi!
i had 6 disks in a raidz1 pool that i replaced from 1TB drives to 2TB drives.
i have installed the older 1TB drives in another system and would like to import
the old pool to access some files i accidentally deleted from the new pool.

the first system (with the 2TB's) is an OpenSolaris system and the other is
running
EON solaris (based on snv 130)

I think the problem is that in the EON system, the drives get different IDs,
and when i replaced the 1TB drives i didn't export the pool.
only one drive shows up as online; is this because it is the only one
connected "in the right order"? i don't remember which order the drives were
connected to the controller in the OpenSolaris system.


what can i do to import this pool
HELP!!!

eon:1:~#uname -a
SunOS eon 5.11 snv_130 i86pc i386 i86pc

eon:2:~#format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
   0. c1d0 
  /p...@0,0/pci-...@d/i...@0/c...@0,0
   1. c2d0 
  /p...@0,0/pci-...@d/i...@1/c...@0,0
   2. c3d0 
  /p...@0,0/pci-...@d,1/i...@0/c...@0,0
   3. c4d0 
  /p...@0,0/pci-...@d,1/i...@1/c...@0,0
   4. c5d0 
  /p...@0,0/pci-...@d,2/i...@0/c...@0,0
   5. c6d0 
  /p...@0,0/pci-...@d,2/i...@1/c...@0,0
Specify disk (enter its number):

eon:3:~#zpool import
  pool: videodrome
id: 5063071388564101079
 state: UNAVAIL
status: The pool was last accessed by another system.
action: The pool cannot be imported due to damaged devices or data.
   see: http://www.sun.com/msg/ZFS-8000-EY
config:

videodrome   UNAVAIL  insufficient replicas
  raidz1-0   UNAVAIL  insufficient replicas
c1t0d0   UNAVAIL  cannot open
c1t1d0   UNAVAIL  cannot open
c10t0d0  UNAVAIL  cannot open
c0t1d0   UNAVAIL  cannot open
c11t0d0  UNAVAIL  cannot open
c1d0 ONLINE
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS no longer working with FC devices.

2010-05-24 Thread Demian Phillips
On Sun, May 23, 2010 at 12:02 PM, Torrey McMahon  wrote:
>  On 5/23/2010 11:49 AM, Richard Elling wrote:
>>
>> FWIW, the A5100 went end-of-life (EOL) in 2001 and end-of-service-life
>> (EOSL) in 2006. Personally, I  hate them with a passion and would like to
>> extend an offer to use my tractor to bury the beast:-).
>
> I'm sure I can get some others to help. Can I smash the gbics? Those were my
> favorite. :-)
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
>

I'd be more than happy to take someone up on the offer but I'd need a
good deal on more current FC array. Since this is my home environment
I am limited by my insignificant pay and the wife factor (who does
indulge me from time to time). Without a corporate IT budget I make do
with everything from free to what I can afford used.

To be honest I'd rather be using an IBM DS4K series array.

Current stress test is creating 700 (50% of array capacity) 1GB files
from /dev/urandom and then I will scrub.

If all goes well it's back to u8 and tuning it.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] [ZIL device brainstorm] intel x25-M G2 has ram cache?

2010-05-24 Thread Fred Liu
Yes. I mentioned this in my thread. And I also contacted Chris. ;-)

-Original Message-
From: zfs-discuss-boun...@opensolaris.org 
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of J.P. King
Sent: Monday, May 24, 2010 18:41
To: Andrew Gabriel
Cc: ZFS Discussions
Subject: Re: [zfs-discuss] [ZIL device brainstorm] intel x25-M G2 has ram cache?


> What you probably want is a motherboard which has a small area of main
> memory protected by battery, and a ramdisk driver which knows how to use it.
> Then you'd get the 1,000,000 IOPS. No idea if anyone makes such a thing.
> 
> You are correct that ZFS gets an enormous benefit from even tiny amounts of
> NV ZIL. Trouble is that no other operating systems or filesystems work this
> well with such relatively tiny amounts of NV storage, so such a hardware
> solution is very ZFS-specific.

No comment on how good or otherwise it is, but I just came across this:

http://www.ddrdrive.com/

Which appears to be looking to provide something for OpenSolaris...
(And some other minority interest OS called Windows).

Julian
--
Julian King
Computer Officer, University of Cambridge, Unix Support
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] [ZIL device brainstorm] intel x25-M G2 has ram cache?

2010-05-24 Thread Fred Liu
Yeah. If it also had the ability to back up the data to the BIOS/EPROM on the
motherboard, that would be the ultimate solution...

From: Andrew Gabriel [mailto:andrew.gabr...@oracle.com]
Sent: Monday, May 24, 2010 18:37
To: Erik Trimble
Cc: Fred Liu; ZFS Discussions
Subject: Re: [zfs-discuss] [ZIL device brainstorm] intel x25-M G2 has ram cache?

Erik Trimble wrote:
Frankly, I'm really surprised that there's no solution, given that the *amount* 
of NVRAM needed for ZIL (or similar usage) is really quite small. a dozen GB is 
more than sufficient, and really, most systems do fine with just a couple of GB 
(3-4 or so).  Producing a small, DRAM-based device in a 3.5" HD form-factor 
with built-in battery shouldn't be hard, and I'm kinda flabberghasted nobody is 
doing it.  Well, at least in the sub-$1000 category.  I mean, it's 2 SODIMMs, a 
AAA-NiCad battery, a PCI-E->DDR2 memory controller, a PCI-E to SATA6Gbps 
controller, and that's it.

It's a bit of a wonky design. The DRAM could do something of the order 
1,000,000 IOPS, and is then throttled back to a tiny fraction of that by the 
SATA bottleneck. Disk interfaces like SATA/SAS really weren't designed for this 
type of use.

What you probably want is a motherboard which has a small area of main memory 
protected by battery, and a ramdisk driver which knows how to use it. Then 
you'd get the 1,000,000 IOPS. No idea if anyone makes such a thing.

You are correct that ZFS gets an enormous benefit from even tiny amounts of NV 
ZIL. Trouble is that no other operating systems or filesystems work this well 
with such relatively tiny amounts of NV storage, so such a hardware solution is 
very ZFS-specific.
--
Andrew Gabriel | Solaris Systems Architect
Email: andrew.gabr...@oracle.com
Mobile: +44 7720 598213
Oracle Pre-Sales
Guillemont Park | Minley Road | Camberley | GU17 9QG | United Kingdom

ORACLE Corporation UK Ltd is a company incorporated in England & Wales | 
Company Reg. No. 1782505 | Reg. office: Oracle Parkway, Thames Valley Park, 
Reading RG6 1RA
Oracle is committed to developing practices and products that help protect the environment
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] [ZIL device brainstorm] intel x25-M G2 has ram cache?

2010-05-24 Thread J.P. King



What you probably want is a motherboard which has a small area of main
memory protected by battery, and a ramdisk driver which knows how to use it.
Then you'd get the 1,000,000 IOPS. No idea if anyone makes such a thing.

You are correct that ZFS gets an enormous benefit from even tiny amounts of
NV ZIL. Trouble is that no other operating systems or filesystems work this
well with such relatively tiny amounts of NV storage, so such a hardware
solution is very ZFS-specific.


No comment on how good or otherwise it is, but I just came across this:

http://www.ddrdrive.com/

Which appears to be looking to provide something for OpenSolaris...
(And some other minority interest OS called Windows).

Julian
--
Julian King
Computer Officer, University of Cambridge, Unix Support
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] [ZIL device brainstorm] intel x25-M G2 has ram cache?

2010-05-24 Thread Andrew Gabriel




Erik Trimble wrote:

  
Frankly, I'm really surprised that there's no solution, given that the
*amount* of NVRAM needed for ZIL (or similar usage) is really quite
small. a dozen GB is more than sufficient, and really, most systems do
fine with just a couple of GB (3-4 or so).  Producing a small,
DRAM-based device in a 3.5" HD form-factor with built-in battery
shouldn't be hard, and I'm kinda flabberghasted nobody is doing it. 
Well, at least in the sub-$1000 category.  I mean, it's 2 SODIMMs, a
AAA-NiCad battery, a PCI-E->DDR2 memory controller, a PCI-E to
SATA6Gbps controller, and that's it.  


It's a bit of a wonky design. The DRAM could do something of the order
1,000,000 IOPS, and is then throttled back to a tiny fraction of that
by the SATA bottleneck. Disk interfaces like SATA/SAS really weren't
designed for this type of use.

What you probably want is a motherboard which has a small area of main
memory protected by battery, and a ramdisk driver which knows how to
use it.
Then you'd get the 1,000,000 IOPS. No idea if anyone makes such a thing.

You are correct that ZFS gets an enormous benefit from even tiny
amounts of NV ZIL. Trouble is that no other operating systems or
filesystems work this well with such relatively tiny amounts of NV
storage, so such a hardware solution is very ZFS-specific.

-- 

Andrew Gabriel |
Solaris Systems Architect
Email: andrew.gabr...@oracle.com
Mobile: +44 7720 598213
Oracle Pre-Sales
Guillemont Park | Minley Road | Camberley | GU17 9QG | United Kingdom

ORACLE Corporation UK Ltd is a
company incorporated in England & Wales | Company Reg. No. 1782505
| Reg. office: Oracle Parkway, Thames Valley Park, Reading RG6 1RA


Oracle is committed to developing practices and products that
help protect the environment




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] [ZIL device brainstorm] intel x25-M G2 has ram cache?

2010-05-24 Thread Fred Liu


From: Erik Trimble [mailto:erik.trim...@oracle.com]
Sent: Monday, May 24, 2010 16:28
To: Fred Liu
Cc: ZFS Discussions
Subject: Re: [zfs-discuss] [ZIL device brainstorm] intel x25-M G2 has ram cache?

On 5/23/2010 11:30 PM, Fred Liu wrote:
Hi,

I have hit the synchronous NFS writing wall just like many people do.
There also have lots of discussion about the solutions here.
I want to post all of my exploring fighting done recently to discuss and share:

1): using normal SATA SSDs (intel/ocz) as the ZIL device. Intel just EOLed the
 50nm SSD product line, and the spec of the x25-M G2 (34nm) 160G is getting
 more decent, so I used this one. I get a good performance boost compared with
no slog.
 I also manually unplugged the power several times, and the outcome seems good
-- I got
 no data corruption. But there is always a potential risk -- does it have a
built-in ram cache?
We got no clear answer from intel.

yes, both the X25-M (both G1 and G2) plus the X25-E have a DRAM buffer on the 
controller, and neither has a supercapacitor (or other battery) to back it up, 
so there is the potential for data loss (but /not/ data corruption) in a 
power-loss scenario.

Sadly, we're pretty much at the point where no current retail-available SSD has 
battery backup for its on-controller DRAM cache (and, they /all/ use DRAM 
caches).


[Fred]: Yes. It is correct. I totally agree.




2): using a PCIe SSD (i.e. fusionio’s ioDrive). Since the universal standard
for PCIe SSDs is still on the way, the corresponding driver support
under solaris is still an open question. And also the cost is very high.

Yes, you do have to find those with Solaris drivers. Fortunately, pretty much 
all the manufacturers recognize the big market that ZFS/Solaris has for these 
devices, so most already have Solaris drivers, and the rest are almost 
universally working to produce them.


[Fred]: Yes. Hope we can see the very first standard soon just like USB…


3): using PCIE-DRAM-with(out)-SSD (i.e. marvell’s write acceleration module
(WAM), ddrdrive X1, curtisssd’s
   HyperCache). It has the highest spec but the same driver and high-cost issues.

Yup.



4): using SATA--DRAM-with(out)-SSD(i.e acard’s 9010, curtisssd’s HyperHD). No 
driver issue and price is in the middle.
   I also tried acard 9010, the result is good enough.

Frankly, these are about the best available solution, without breaking the 
bank. The bad news is that they're not quite ready-for-prime-time yet in terms 
of support and packaging.

[Fred]: Agree.

So it seems we have to wait awhile to get a really ideal ZIL device solution. 
If intel/ocz can confirm no ram cache, that will be good news.

Thanks.

Fred
Nope, ALL SSDs use DRAM caches for their controllers. So far, I'm only aware 
that the Zeus stuff (plus a couple of industrial/military-only products) have 
battery backup for their controllers.  Idiots - it's like $0.50 in parts, and a 
one-time $10,000 in engineer design/qa time to put the damned thing on, and 
they could charge a 25% premium and we'd pay it.


Frankly, I'm really surprised that there's no solution, given that the *amount* 
of NVRAM needed for ZIL (or similar usage) is really quite small. a dozen GB is 
more than sufficient, and really, most systems do fine with just a couple of GB 
(3-4 or so).  Producing a small, DRAM-based device in a 3.5" HD form-factor 
with built-in battery shouldn't be hard, and I'm kinda flabberghasted nobody is 
doing it.  Well, at least in the sub-$1000 category.  I mean, it's 2 SODIMMs, a 
AAA-NiCad battery, a PCI-E->DDR2 memory controller, a PCI-E to SATA6Gbps 
controller, and that's it.


[Fred]:  Agree. I am struggling so hard to procure that kind of device, but it 
is so hard to find for an individual.


--

Erik Trimble

Java System Support

Mailstop:  usca22-123

Phone:  x17195

Santa Clara, CA
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs replace multiple drives

2010-05-24 Thread Ragnar Sundblad

On 24 maj 2010, at 10.26, Brandon High wrote:

> On Mon, May 24, 2010 at 1:02 AM, Ragnar Sundblad  wrote:
>> Is that really true if you use the "zpool replace" command with both
>> the old and the new drive online?
> 
> Yes.

(Don't you mean "no" then? :-)

> zpool replace [-f] pool old_device [new_device]
> 
> Replaces old_device with new_device. This is  equivalent
> to attaching new_device, waiting for it to resilver, and
> then detaching old_device.
> 
> 
>> operation, and you should be able to replace all disks in a vdev at
>> once, shouldn't you?
> 
> Yes, if you have that many connections available.

Yes, otherwise it would be tricky of course.

/ragge

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] [ZIL device brainstorm] intel x25-M G2 has ram cache?

2010-05-24 Thread Erik Trimble

On 5/23/2010 11:30 PM, Fred Liu wrote:


Hi,

I have hit the synchronous NFS writing wall just like many people do.

There also have lots of discussion about the solutions here.

I want to post all of my exploring fighting done recently to discuss 
and share:


1): using the normal SATA-SSDs(intel/ocz) as ZIL device. For intel 
just EOLed


 50nm SSDs product line and the spec of the x25-M G2(34nm) 160G is 
getting


 more decent, I used this one. I get good performance boost 
compared with no slog.


 I also manually unplugged the power several times, and the outcome
seems good -- I got


 no data corruption. But there is always potential risk -- does it 
have built-in ram cache?


We get no clear answer from intel.

yes, both the X25-M (both G1 and G2) plus the X25-E have a DRAM buffer 
on the controller, and neither has a supercapacitor (or other battery) 
to back it up, so there is the potential for data loss (but /not/ data 
corruption) in a power-loss scenario.


Sadly, we're pretty much at the point where no current retail-available 
SSD has battery backup for its on-controller DRAM cache (and, they 
/all/ use DRAM caches).


2): using a PCIe SSD (i.e. fusionio's ioDrive). Since the universal
standard for PCIe SSDs is still on the way, the corresponding driver support
under solaris is still an open question. And also the cost is very high.

Yes, you do have to find those with Solaris drivers. Fortunately, pretty 
much all the manufacturers recognize the big market that ZFS/Solaris has 
for these devices, so most already have Solaris drivers, and the rest 
are almost universally working to produce them.


3): using PCIE-DRAM-with(out)-SSD(i.e. marvell's write acceleration 
module (WAM), ddrdrive X1, curtisssd's


   HyperCache). It has the highest spec but the same driver and high-cost
issues.




Yup.

4): using SATA--DRAM-with(out)-SSD(i.e acard's 9010, curtisssd's 
HyperHD). No driver issue and price is in the middle.


   I also tried acard 9010, the result is good enough.



Frankly, these are about the best available solution, without breaking 
the bank. The bad news is that they're not quite ready-for-prime-time yet 
in terms of support and packaging.


So it seems we have to wait awhile to get a really ideal ZIL device 
solution. If intel/ocz can confirm no ram cache, that will be good news.


Thanks.

Fred

Nope, ALL SSDs use DRAM caches for their controllers. So far, I'm only 
aware that the Zeus stuff (plus a couple of industrial/military-only 
products) have battery backup for their controllers.  Idiots - it's like 
$0.50 in parts, and a one-time $10,000 in engineer design/qa time to put 
the damned thing on, and they could charge a 25% premium and we'd pay it.



Frankly, I'm really surprised that there's no solution, given that the 
*amount* of NVRAM needed for ZIL (or similar usage) is really quite 
small. a dozen GB is more than sufficient, and really, most systems do 
fine with just a couple of GB (3-4 or so).  Producing a small, 
DRAM-based device in a 3.5" HD form-factor with built-in battery 
shouldn't be hard, and I'm kinda flabberghasted nobody is doing it.  
Well, at least in the sub-$1000 category.  I mean, it's 2 SODIMMs, a 
AAA-NiCad battery, a PCI-E->DDR2 memory controller, a PCI-E to SATA6Gbps 
controller, and that's it.




--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs replace multiple drives

2010-05-24 Thread Brandon High
On Mon, May 24, 2010 at 1:02 AM, Ragnar Sundblad  wrote:
> Is that really true if you use the "zpool replace" command with both
> the old and the new drive online?

Yes.

 zpool replace [-f] pool old_device [new_device]

 Replaces old_device with new_device. This is  equivalent
 to attaching new_device, waiting for it to resilver, and
 then detaching old_device.


> operation, and you should be able to replace all disks in a vdev at
> once, shouldn't you?

Yes, if you have that many connections available.

-B

-- 
Brandon High : bh...@freaks.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs replace multiple drives

2010-05-24 Thread Ragnar Sundblad

On 24 maj 2010, at 02.44, Erik Trimble wrote:

> On 5/23/2010 5:00 PM, Andreas Iannou wrote:
>> Is it safe or possible to do a zpool replace for multiple drives at once? I 
>> think I have one of the troublesome WD Green drives as replacing it has 
>> taken 39hrs and only resilvered 58Gb. I have another two I'd like to replace,
>> but I'm wondering whether I should do that now, while the other is being
>> resilvered, or wait (about a month) till this one has finished.
>>  
>> Cheers,
>> Andre
> 
> You can replace drives up to the number of redundant drives in your vdev, 
> without losing data.
> 
> Thus:
> 
> 2-way mirror:  1 drive
> 3-way mirror:  2 drives
> N-way mirror:  N-1 drives
> RaidZ1:        1 drive
> RaidZ2:        2 drives
> RaidZ3:        3 drives
> 
> 
> If you have more than 1 vdev in a pool, then the above rules apply to EACH 
> vdev.
> 
> 
> Of course, if you replace the maximum number above, you have no protection 
> against failure for the period it takes to rebuild/resilver the vdev.

Is that really true if you use the "zpool replace" command with both
the old and the new drive online?

Since you then have both the original disk(s) and the new replacment
disk(s) online, you should actually have more redundancy during the
operation, and you should be able to replace all disks in a vdev at
once, shouldn't you?

If you are replacing by yanking out an old drive and putting a new one
in its place, your rules above should apply.
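(With enough free ports, the concurrent replaces would look roughly like this
-- device names are placeholders:

    # zpool replace tank c1t0d0 c3t0d0
    # zpool replace tank c1t1d0 c3t1d0
    # zpool replace tank c1t2d0 c3t2d0
    # zpool status tank    <- old disks stay attached while the new ones resilver
)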

/ragge

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss