Re: [zfs-discuss] SSDs adequate ZIL devices?

2010-06-16 Thread Arve Paalsrud
Not to forget the Deneva Reliability disks from OCZ that just got
released. See
http://www.oczenterprise.com/details/ocz-deneva-reliability-2-5-emlc-ssd.html

The Deneva Reliability family features built-in supercapacitor (SF-1500
models) that acts as a temporary power backup in the event of sudden power
loss, and enables the drive to complete its task ensuring no data loss.

-Arve

 -Original Message-
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Christopher George
 Sent: 16. juni 2010 00:47
 To: zfs-discuss@opensolaris.org
 Subject: Re: [zfs-discuss] SSDs adequate ZIL devices?
 
  So why buy SSD for ZIL at all?
 
 For the record, not all SSDs ignore cache flushes.  There are at
 least
 two SSDs sold today that guarantee synchronous write semantics; the
 Sun/Oracle LogZilla and the DDRdrive X1.  Also, I believe it is more
 accurate to describe the root cause as not power protecting on-board
 volatile caches.  As the X25-E does implement the ATA FLUSH
 CACHE command, but does not have the required power protection to
 avoid transaction (data) loss.
 
 Best regards,
 
 Christopher George
 Founder/CTO
 www.ddrdrive.com
 --
 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss



Re: [zfs-discuss] SSDs adequate ZIL devices?

2010-06-16 Thread Arne Jansen
Christopher George wrote:
 So why buy SSD for ZIL at all?
 
 For the record, not all SSDs ignore cache flushes.  There are at least 
 two SSDs sold today that guarantee synchronous write semantics; the 
 Sun/Oracle LogZilla and the DDRdrive X1.  Also, I believe it is more 

LogZilla? Are these those STEC-thingies? For the price of those I can
buy a battery backed-up RAID-controller and a few conventional drives.
For ZIL this will probably do better at a lower price than STEC.

The DDRdrive I wouldn't call a flash drive but rather a NVRAM-Card.
NVRAM-cards are the proper way to go for ZIL. Someone should build
one for  $600, PCIe x1 would be sufficient. Xilinx has some nice
Spartans :)

 accurate to describe the root cause as not power protecting on-board 
 volatile caches.  As the X25-E does implement the ATA FLUSH 
 CACHE command, but does not have the required power protection to 
 avoid transaction (data) loss.

You could say the same about hard drives. They also just need a proper
protection for their volatile cache...

--Arne

 
 Best regards,
 
 Christopher George
 Founder/CTO
 www.ddrdrive.com



Re: [zfs-discuss] SSDs adequate ZIL devices?

2010-06-16 Thread Arne Jansen
Arve Paalsrud wrote:
 Not to forget the Deneva Reliability disks from OCZ that just got
 released. See
 http://www.oczenterprise.com/details/ocz-deneva-reliability-2-5-emlc-ssd.html
 
 The Deneva Reliability family features built-in supercapacitor (SF-1500
 models) that acts as a temporary power backup in the event of sudden power
 loss, and enables the drive to complete its task ensuring no data loss.
 

This one looks really interesting. I couldn't find a price, though, or any
detail about how many write cycles they can withstand.

--Arne


Re: [zfs-discuss] SSDs adequate ZIL devices?

2010-06-16 Thread David Magda
On Wed, June 16, 2010 03:03, Arne Jansen wrote:
 Christopher George wrote:

 For the record, not all SSDs ignore cache flushes.  There are at least
 two SSDs sold today that guarantee synchronous write semantics; the
 Sun/Oracle LogZilla and the DDRdrive X1.  Also, I believe it is more

 LogZilla? Are these those STEC-thingies? For the price of those I can
 buy a battery backed-up RAID-controller and a few conventional drives.
 For ZIL this will probably do better at a lower price than STEC.

I'm not sure you'd get the same latency and IOps with disk that you can
with a good SSD:

http://blogs.sun.com/brendan/entry/slog_screenshots

You're also talking about using more power (and cooling), and more moving
parts, which can affect reliability numbers. TANSTAAFL.

Towards the bottom of that post, Brendan Gregg configures eight Logzillas
(which I'm sure cost as much as a small car) and gets 114,000 synchronous
write ops/sec over NFS, 85% of which complete in under 1.21 ms. I'm not
sure how many spindles you'd need to purchase to get numbers like that in
a more traditional configuration.
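
As a rough way to frame the spindle question, the sketch below estimates the best-case synchronous write rate of a single disk with no write cache in front of it. It assumes (this is not from the linked post) that each synchronous write waits half a rotation on average and that seek time is zero; the function names are hypothetical.

```python
# Rough sketch: best-case synchronous write rate of one uncached disk.
# Assumption: every write waits, on average, half a rotation, and seek
# time is ignored, so real disks will do worse than this.

def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency in ms: half of one full rotation."""
    ms_per_rotation = 60_000.0 / rpm
    return ms_per_rotation / 2.0


def uncached_sync_iops(rpm: float) -> float:
    """Upper bound on one-at-a-time synchronous writes per second."""
    return 1000.0 / avg_rotational_latency_ms(rpm)


latency = avg_rotational_latency_ms(15_000)  # 2.0 ms
iops = uncached_sync_iops(15_000)            # 500.0
print(f"15k rpm: {latency:.1f} ms average latency, ~{iops:.0f} sync writes/s")
```

Under these assumptions a bare 15k rpm disk averages 2 ms per write, which is already above the 1.21 ms figure, so any disk-based setup would need a protected write cache in front of the spindles to get close.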

Whether it's worth the cash is another matter entirely.




Re: [zfs-discuss] SSDs adequate ZIL devices?

2010-06-16 Thread Kyle McDonald
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

I've very infrequently seen the RAMSAN devices mentioned here. Probably
due to price.

However, a long time ago I think I remember someone suggesting a
build-it-yourself RAMSAN.

What is the downside of one or two OS boxes with a whole lot of RAM
(and/or SSDs) exporting either RAMdisks or zVOLs out over iSCSI, FCoE,
or direct FC (can OS do that?)

If the RAM and/or SSDs (or even HDs) were large enough, this box might
be able to serve several other ZFS servers. A dedicated network, or
direct connections if there are enough ports, should keep the network
from being a bottleneck.

A sub $100 UPS (or 2) could protect the whole thing.

I'm sure I'm missing something, but I'm not seeing it at the moment.
Anyone else have any ideas?

 -Kyle
-BEGIN PGP SIGNATURE-
Version: GnuPG v2.0.14 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJMGN7DAAoJEEADRM+bKN5w35EIAKX5T96Ls4wNQUMEtHKp1qpM
cu3TlS+h+2vRGMYq0ZMnudiEvGlvxOldifSUHkHWWVMqOsPZplMcBJMoDXOQgChU
i4NPSMTnjPT3zRxLeOm6ZCrfHv4/rYr4RNYjN2DUcaXHrfGdMXg0aYFAoJxObnwx
zMNB8xLqqlXDIkSo3i9ONZAbvVbHehs8V3az63j/P+AyyQcyhu96xR3wjJZpfDnI
N7kE3id9o8WNufw35KyQy3w/bOAvhh8dXsuZm81rpaq6VQ1wS5AnRVQ48mhbYua9
kZNy8eLrobOBR2YCZZFoLrXVQWYfSVMV/pL0fYUf2J12P7EETk6LHKnr3Hy7W2E=
=XDQw
-END PGP SIGNATURE-


Re: [zfs-discuss] SSDs adequate ZIL devices?

2010-06-16 Thread David Magda
On Wed, June 16, 2010 10:44, Arne Jansen wrote:
 David Magda wrote:

 I'm not sure you'd get the same latency and IOps with disk that you can
 with a good SSD:

 http://blogs.sun.com/brendan/entry/slog_screenshots
[...]
 Please keep in mind I'm talking about a usage as ZIL, not as L2ARC or main
 pool. Because ZIL issues nearly sequential writes, due to the
 NVRAM-protection
 of the RAID-controller the disk can leave the write cache enabled. This
 means
 the disk can write essentially with full speed, meaning 150MB/s for a 15k
 drive.
 114000 4k writes/s are 456MB/s, so 3 spindles should do.

Yes, I understood it as suck, and that link is for ZIL. For L2ARC SSD
numbers see:

http://blogs.sun.com/brendan/entry/l2arc_screenshots




Re: [zfs-discuss] SSDs adequate ZIL devices?

2010-06-16 Thread David Magda
On Wed, June 16, 2010 11:02, David Magda wrote:
[...]
 Yes, I understood it as suck, and that link is for ZIL. For L2ARC SSD
 numbers see:

s/suck/such/

:)



Re: [zfs-discuss] SSDs adequate ZIL devices?

2010-06-16 Thread Arne Jansen
David Magda wrote:
 On Wed, June 16, 2010 11:02, David Magda wrote:
 [...]
 Yes, I understood it as suck, and that link is for ZIL. For L2ARC SSD
 numbers see:
 
 s/suck/such/

ah, I tried to make sense of 'suck' in the sense of 'just writing
sequentially' or something like that ;)

 
 :)



Re: [zfs-discuss] SSDs adequate ZIL devices?

2010-06-16 Thread Arne Jansen
David Magda wrote:
 On Wed, June 16, 2010 10:44, Arne Jansen wrote:
 David Magda wrote:

 I'm not sure you'd get the same latency and IOps with disk that you can
 with a good SSD:

 http://blogs.sun.com/brendan/entry/slog_screenshots
 [...]
 Please keep in mind I'm talking about a usage as ZIL, not as L2ARC or main
 pool. Because ZIL issues nearly sequential writes, due to the
 NVRAM-protection
 of the RAID-controller the disk can leave the write cache enabled. This
 means
 the disk can write essentially with full speed, meaning 150MB/s for a 15k
 drive.
 114000 4k writes/s are 456MB/s, so 3 spindles should do.
 
 Yes, I understood it as suck, and that link is for ZIL. For L2ARC SSD
 numbers see:
 
 http://blogs.sun.com/brendan/entry/l2arc_screenshots
 

oops, sorry, I should have at least scrolled down a bit on your link... Nevertheless
I don't find it improbable to reach numbers like that with a proper RAID-setup.
Of course it will take more space and power. Maybe someone has done some testing
on this.


Re: [zfs-discuss] SSDs adequate ZIL devices?

2010-06-16 Thread Bob Friesenhahn

On Wed, 16 Jun 2010, Arne Jansen wrote:


> Please keep in mind I'm talking about a usage as ZIL, not as L2ARC or main
> pool. Because ZIL issues nearly sequential writes, due to the NVRAM-protection
> of the RAID-controller the disk can leave the write cache enabled. This means
> the disk can write essentially with full speed, meaning 150MB/s for a 15k drive.
> 114000 4k writes/s are 456MB/s, so 3 spindles should do.


Huh?  What does the battery backed memory of a RAID-controller have to 
do with the unprotected memory of a hard drive?  This does not 
compute.  The flushes that the RAID-controller acks need to be 
ultimately delivered to the disk or else there WILL be data loss. 
The RAID controller should not purge its own record until the disk 
reports that it has flushed its cache.  Once the RAID controller's 
cache is full, then it should start stalling writes.
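
The behaviour described above can be sketched as a toy model. This is only an illustration of the acknowledge, hold and stall logic, not any real controller's firmware; the class and method names are hypothetical.

```python
from collections import deque


class BatteryBackedCache:
    """Toy model of a RAID controller write cache (hypothetical names)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        # Writes already acked to the host but not yet confirmed on disk.
        self.pending = deque()

    def write(self, block: str) -> bool:
        """Ack immediately if there is room; False means the host must stall."""
        if len(self.pending) >= self.capacity:
            return False
        self.pending.append(block)
        return True

    def disk_flush_confirmed(self, count: int) -> None:
        """Purge records only after the disk reports them flushed."""
        for _ in range(min(count, len(self.pending))):
            self.pending.popleft()


cache = BatteryBackedCache(capacity=2)
results = [cache.write("a"), cache.write("b"), cache.write("c")]
cache.disk_flush_confirmed(1)   # the disk confirms the oldest write
after_flush = cache.write("c")  # room again, so this one is acked
print(results, after_flush)     # [True, True, False] True
```

The point of the model is that an ack to the host never removes a record from the cache; only a confirmed flush from the disk does.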


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/


Re: [zfs-discuss] SSDs adequate ZIL devices?

2010-06-16 Thread Carson Gaspar

Arne Jansen wrote:
> David Magda wrote:
>> On Wed, June 16, 2010 10:44, Arne Jansen wrote:
>>> David Magda wrote:
>>>> I'm not sure you'd get the same latency and IOps with disk that you can
>>>> with a good SSD:
>>>>
>>>> http://blogs.sun.com/brendan/entry/slog_screenshots
>>> [...]
>>> Please keep in mind I'm talking about a usage as ZIL, not as L2ARC or main
>>> pool. Because ZIL issues nearly sequential writes, due to the NVRAM-protection
>>> of the RAID-controller the disk can leave the write cache enabled. This means
>>> the disk can write essentially with full speed, meaning 150MB/s for a 15k
>>> drive.
>>> 114000 4k writes/s are 456MB/s, so 3 spindles should do.
>>
>> Yes, I understood it as suck, and that link is for ZIL. For L2ARC SSD
>> numbers see:
>>
>> http://blogs.sun.com/brendan/entry/l2arc_screenshots
>
> oops, sorry, I should have at least scrolled down a bit on your link... Nevertheless
> I don't find it improbable to reach numbers like that with a proper RAID-setup.
> Of course it will take more space and power. Maybe someone has done some testing
> on this.


You don't need a fast disk. It just needs to be at least as large as the 
write cache on your RAID controller, and that needs to be large enough 
to handle your SLOG needs.


For example, you can get an Areca RAID controller with 4 GB of cache for 
about USD$1k. Hook any 4GB disk to it, and you have a _very_ fast 4GB 
SLOG device with battery back up.
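
To get a feel for how long such a cache can absorb a sustained burst before it fills, here is a rough sketch. The 456 MB/s arrival rate and 150 MB/s drain rate are borrowed from the figures quoted above purely as illustrative assumptions, and the function name is hypothetical.

```python
def seconds_until_full(cache_mb: float, incoming_mb_s: float,
                       drain_mb_s: float) -> float:
    """Time for a write cache to fill when writes arrive faster than they drain."""
    net_mb_s = incoming_mb_s - drain_mb_s
    if net_mb_s <= 0:
        return float("inf")  # the disk keeps up, so the cache never fills
    return cache_mb / net_mb_s


# 4 GB of cache, writes arriving at 456 MB/s, one disk draining at 150 MB/s.
burst_seconds = seconds_until_full(4 * 1024, 456, 150)
print(f"cache absorbs the burst for about {burst_seconds:.0f} s")
```

With these assumed rates the cache lasts roughly 13 seconds at full load. If the average write rate stays below what the disk can drain, the cache never fills at all.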


Of course this is less attractive now that other, less astronomically 
expensive options are becoming available.


I'm not sure how that compares in performance to an Acard ANS-9010, 
which you can populate with 16GB of RAM + flash backup for about the 
same price.


--
Carson


Re: [zfs-discuss] SSDs adequate ZIL devices?

2010-06-16 Thread Arne Jansen

Bob Friesenhahn wrote:
> On Wed, 16 Jun 2010, Arne Jansen wrote:
>> Please keep in mind I'm talking about a usage as ZIL, not as L2ARC or main
>> pool. Because ZIL issues nearly sequential writes, due to the NVRAM-protection
>> of the RAID-controller the disk can leave the write cache enabled. This means
>> the disk can write essentially with full speed, meaning 150MB/s for a
>> 15k drive.
>> 114000 4k writes/s are 456MB/s, so 3 spindles should do.
>
> Huh?  What does the battery backed memory of a RAID-controller have to
> do with the unprotected memory of a hard drive?  This does not compute.


You're right, I took a wrong turn there. Of course the RAID-controller
disables the write cache of the disks. But because the controller ACKs
each write immediately (as long as it has buffer left), the requests can
be queued in the disk. This enables the disk to write continuously.
I double checked before posting: I can nearly saturate a 15k disk if I
make full use of the 32 queue slots, giving 137 MB/s or 34k IOPS. Times
3 nearly matches the above mentioned 114k IOPS :)
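
The conversion between throughput and IOPS used here can be checked with a few lines. This sketch assumes 4k writes and decimal units (1 MB = 1000 kB), which is what the round numbers in this thread imply; the function names are hypothetical.

```python
def iops_from_throughput(mb_per_s: float, io_size_kb: float = 4.0) -> float:
    """I/O operations per second for a given throughput (decimal units)."""
    return mb_per_s * 1000.0 / io_size_kb


def throughput_from_iops(iops: float, io_size_kb: float = 4.0) -> float:
    """Throughput in MB/s for a given IOPS rate (decimal units)."""
    return iops * io_size_kb / 1000.0


per_disk_iops = iops_from_throughput(137)   # 34250.0, i.e. about 34k
target_mb_s = throughput_from_iops(114_000)  # 456.0
print(per_disk_iops, target_mb_s)
```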

Thanks,
Arne

> The flushes that the RAID-controller acks need to be ultimately
> delivered to the disk or else there WILL be data loss. The RAID
> controller should not purge its own record until the disk reports that
> it has flushed its cache.  Once the RAID controller's cache is full,
> then it should start stalling writes.
>
> Bob




Re: [zfs-discuss] SSDs adequate ZIL devices?

2010-06-16 Thread David Magda
On Wed, June 16, 2010 15:15, Arne Jansen wrote:

 I double checked before posting: I can nearly saturate a 15k disk if I
 make full use of the 32 queue slots giving 137 MB/s or 34k IOPS. Times
 3 nearly matches the above mentioned 114k IOPS :)

34K*3 = 102K. 12K isn't anything to sneeze at :)

So you'll need six disks to do what one SSD does: three spindles, and two
(mirrored) disks on each spindle for redundancy (drives are riskier than
SSDs).
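
The disk count can be worked out explicitly. The sketch below rounds the stripe count up so the 12K shortfall is covered, then multiplies by the mirror width. The 34k per-disk figure is the one quoted above; the function name and the two-way mirror default are assumptions for illustration.

```python
import math


def disks_needed(target_iops: int, per_disk_iops: int,
                 mirror_ways: int = 2) -> tuple[int, int]:
    """Return (stripes, total_disks) needed to reach target_iops."""
    stripes = math.ceil(target_iops / per_disk_iops)
    return stripes, stripes * mirror_ways


stripes, total = disks_needed(114_000, 34_000)
print(stripes, total)  # 4 8
```

Three stripes reach only 102K, so covering the full 114K takes a fourth stripe, or eight disks once each stripe is mirrored.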




Re: [zfs-discuss] SSDs adequate ZIL devices?

2010-06-16 Thread Arne Jansen

David Magda wrote:
> On Wed, June 16, 2010 15:15, Arne Jansen wrote:
>> I double checked before posting: I can nearly saturate a 15k disk if I
>> make full use of the 32 queue slots giving 137 MB/s or 34k IOPS. Times
>> 3 nearly matches the above mentioned 114k IOPS :)
>
> 34K*3 = 102K. 12K isn't anything to sneeze at :)
>
> So you'll need six disks to do what one SSD does: three spindles, and two
> (mirrored) disks on each spindle for redundancy (drives are riskier than
> SSDs).



ok, 4 spindles, we already have a raid controller available :) But personally
I trust drives more than SSDs.
Are the 114k with mirrored or striped logzillas? In any case there are two of
them, so I'd double that raid-controller setup also, being still cheaper than
the STEC devices.




Re: [zfs-discuss] SSDs adequate ZIL devices?

2010-06-16 Thread Nicolas Williams
On Wed, Jun 16, 2010 at 04:44:07PM +0200, Arne Jansen wrote:
 Please keep in mind I'm talking about a usage as ZIL, not as L2ARC or main
 pool. Because ZIL issues nearly sequential writes, due to the NVRAM-protection
 of the RAID-controller the disk can leave the write cache enabled. This means
 the disk can write essentially with full speed, meaning 150MB/s for a 15k 
 drive.
 114000 4k writes/s are 456MB/s, so 3 spindles should do.

You'd still have to flush those caches at the end of each transaction,
which would tend to come every few seconds, so you'd need to factor that
in.  You can definitely do with disk what you can do with SSDs, but not
necessarily with the same SWAP (space, wattage and price), and you'd
have a more complex system no matter what.

Nico
-- 


Re: [zfs-discuss] SSDs adequate ZIL devices?

2010-06-15 Thread Bob Friesenhahn

On Tue, 15 Jun 2010, Arne Jansen wrote:

> In case of a power failure I will likely lose about as many writes
> as I do with SSDs, a few milliseconds.


I agree with your concerns, but the data loss may span as much as 30 
seconds rather than just a few milliseconds.


Using an SSD as the ZIL allows zfs to turn a synchronous write into a 
normal batched async write which is scheduled for the next TXG.  Zfs 
intentionally postpones writes.


Without the SSD, zfs needs to write to an intent log in the main pool 
(consuming precious IOPS) or write directly to the main pool 
(consuming precious response latency).  Battery-backed RAM in the 
adaptor card or storage array can do almost as well as the SSD as long 
as the amount of data does not overrun the limited write cache.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/


Re: [zfs-discuss] SSDs adequate ZIL devices?

2010-06-15 Thread Christopher George
 So why buy SSD for ZIL at all?

For the record, not all SSDs ignore cache flushes.  There are at least
two SSDs sold today that guarantee synchronous write semantics: the
Sun/Oracle LogZilla and the DDRdrive X1.  Also, I believe it is more
accurate to describe the root cause as not power protecting on-board
volatile caches.  The X25-E does implement the ATA FLUSH CACHE
command, but does not have the required power protection to
avoid transaction (data) loss.

Best regards,

Christopher George
Founder/CTO
www.ddrdrive.com
-- 
This message posted from opensolaris.org


Re: [zfs-discuss] SSDs adequate ZIL devices?

2010-06-15 Thread Przemyslaw Ceglowski



On 15/06/2010 23:46, Christopher George cgeo...@ddrdrive.com wrote:

 So why buy SSD for ZIL at all?
 
 For the record, not all SSDs ignore cache flushes.  There are at least
 two SSDs sold today that guarantee synchronous write semantics; the
 Sun/Oracle LogZilla and the DDRdrive X1.  Also, I believe it is more
 accurate to describe the root cause as not power protecting on-board
 volatile caches.  As the X25-E does implement the ATA FLUSH
 CACHE command, but does not have the required power protection to
 avoid transaction (data) loss.
 
 Best regards,
 
 Christopher George
 Founder/CTO
 www.ddrdrive.com

Often forgotten (most probably due to the price) are the latest Pliant SSDs.

--
Przem



Re: [zfs-discuss] SSDs adequate ZIL devices?

2010-06-15 Thread Arne Jansen

Bob Friesenhahn wrote:
> On Tue, 15 Jun 2010, Arne Jansen wrote:
>> In case of a power failure I will likely lose about as many writes as
>> I do with SSDs, a few milliseconds.
>
> I agree with your concerns, but the data loss may span as much as 30
> seconds rather than just a few milliseconds.


Wait, I'm talking about using an SSD for ZIL vs. using a dedicated hard drive
for ZIL which is configured to ignore cache flushes. Are you saying I can also
lose 30 seconds if I use a badly behaving SSD?



> Using an SSD as the ZIL allows zfs to turn a synchronous write into a
> normal batched async write which is scheduled for the next TXG.  Zfs
> intentionally postpones writes.
>
> Without the SSD, zfs needs to write to an intent log in the main pool
> (consuming precious IOPS) or write directly to the main pool (consuming
> precious response latency).  Battery-backed RAM in the adaptor card or
> storage array can do almost as well as the SSD as long as the amount of
> data does not overrun the limited write cache.
>
> Bob

