Re: [zfs-discuss] New SSD options

2010-05-24 Thread Miles Nordin
 d == Don  d...@blacksun.org writes:
 hk == Haudy Kazemi kaze0...@umn.edu writes:

 d You could literally split a sata cable and add in some
 d capacitors for just the cost of the caps themselves.

no, this is no good.  Energy only flows in and out of a capacitor when
the voltage across it changes; in this respect capacitors are different
from batteries.  It's normal to use (non-super) capacitors as you
describe as filters next to things drawing power in a high-frequency,
noisy way, but to use them for energy storage across several seconds
you need a switching supply to drain the energy out of them.  The
step-down and voltage-pump kinds of switchers are non-isolated and
might do fine, and are cheaper than full-fledged DC-DC converters that
are isolated (meaning the input and output can float w.r.t. each
other).
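
To put rough numbers on that, here is a back-of-the-envelope sketch in
Python; every figure in it (capacitance, converter cutoff voltage,
efficiency, load power) is an illustrative assumption rather than a
measured value for any particular part or drive:

def runtime_seconds(cap_farads, v_start, v_min, load_watts, efficiency=0.85):
    # Seconds of runtime while the cap discharges from v_start down to
    # v_min, the lowest input voltage the switching converter accepts.
    usable_joules = 0.5 * cap_farads * (v_start**2 - v_min**2)
    return usable_joules * efficiency / load_watts

# Example: a 1 F cap charged from the 12 V rail, a buck converter that
# works down to a 6 V input, feeding an SSD drawing ~2.4 W on its 5 V rail.
print(runtime_seconds(1.0, 12.0, 6.0, 2.4))   # roughly 19 seconds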

you can charge from 12V and supply 5V if that's cheaper.  :)

hope it works.

hk okay, we've waited 5 seconds for additional data to arrive to
hk be written.  None has arrived in the last 5 seconds, so we're
hk going to write what we already have to better ensure data
hk integrity,

yeah, I am worried about corner cases like this.  ex: input power to
the SSD becomes scratchy or sags, but power to the host and controller
remain fine.  Writes arrive continuously.  The SSD sees nothing wrong
with its power and continues to accept and acknowledge writes.
Meanwhile you burn through your stored power hiding the sagging supply
until you can't, then the SSD loses power suddenly and drops a bunch
of writes on the floor.  That is why I drew that complicated state
diagram in which the pod disables and holds-down the SATA connection
once it's running on reserve power.  Probably y'all don't give a fuck
about such corners though, nor do many of the manufacturers selling
this stuff, so, whatever.




Re: [zfs-discuss] New SSD options

2010-05-24 Thread Ray Van Dolson
This thread has grown giant, so apologies for screwing up threading
with an out of place reply. :)

So, as far as SF-1500 based SSD's, the only ones currently in existence
are the Vertex 2 LE and Vertex 2 EX, correct (I understand the Vertex 2
Pro was never mass produced)?

Both of these are based on MLC and not SLC -- why isn't that an issue
for longevity?

Any other SF-1500 options out there?

We continue to use UPS-backed Intel X-25E's for ZIL.

Ray


Re: [zfs-discuss] New SSD options

2010-05-24 Thread Ray Van Dolson
On Mon, May 24, 2010 at 11:30:20AM -0700, Ray Van Dolson wrote:
 This thread has grown giant, so apologies for screwing up threading
 with an out of place reply. :)
 
 So, as far as SF-1500 based SSD's, the only ones currently in existence
 are the Vertex 2 LE and Vertex 2 EX, correct (I understand the Vertex 2
 Pro was never mass produced)?
 
 Both of these are based on MLC and not SLC -- why isn't that an issue
 for longevity?
 
 Any other SF-1500 options out there?
 
 We continue to use UPS-backed Intel X-25E's for ZIL.

From earlier in the thread, it sounds like none of the SF-1500 based
drives even have a supercap, so it doesn't seem that they'd necessarily
be a better choice than the SLC-based X-25E at this point unless you
need more write IOPS...

Ray


Re: [zfs-discuss] New SSD options

2010-05-24 Thread Thomas Burgess



 From earlier in the thread, it sounds like none of the SF-1500 based
 drives even have a supercap, so it doesn't seem that they'd necessarily
 be a better choice than the SLC-based X-25E at this point unless you
 need more write IOPS...

 Ray


I think the upcoming OCZ Vertex 2 Pro will have a supercap.

I just bought an OCZ Vertex LE; it doesn't have a supercap, but it DOES have
some awesome specs otherwise.


Re: [zfs-discuss] New SSD options

2010-05-22 Thread Ragnar Sundblad

On 22 maj 2010, at 07.40, Don wrote:

 The SATA power connector supplies 3.3, 5 and 12v. A complete
 solution will have all three. Most drives use just the 5v, so you can
 probably ignore 3.3v and 12v.
 I'm not interested in building something that's going to work for every 
 possible drive config- just my config :) Both the Intel X25-e and the OCZ 
 only uses the 5V rail.
 
 You'll need to use a step up DC-DC converter and be able to supply ~
 100mA at 5v.
 It's actually easier/cheaper to use a LiPoly battery  charger and get a
 few minutes of power than to use an ultracap for a few seconds of
 power. Most ultracaps are ~ 2.5v and LiPoly is 3.7v, so you'll need a
 step up converter in either case.
 Ultracapacitors are available in voltage ratings beyond 12volts so there is 
 no reason to use a boost converter with them. That eliminates high frequency 
 switching transients right next to our SSD which is always helpful.
 
 In this case- we have lots of room. We have a 3.5 x 1 drive bay, but a 2.5 
 x 1/4 hard drive. There is ample room for several of the 6.3V ELNA 1F 
 capacitors (and our SATA power rail is a 5V regulated rail so they should 
 suffice)- either in series or parallel (Depending on voltage or runtime 
 requirements).
 http://www.elna.co.jp/en/capacitor/double_layer/catalog/pdf/dk_e.pdf 
 
 You could 2 caps in series for better voltage tolerance or in parallel for 
 longer runtimes. Either way you probably don't need a charge controller, a 
 boost or buck converter, or in fact any IC's at all. It's just a small board 
 with some caps on it.

I know they have a certain internal resistance, but I am not familiar
with the characteristics; is it high enough that you don't need to
limit the inrush current, and low enough that you don't need a
voltage booster on the output?

 Cost for a 5v only system should be $30 - $35 in one-off
 prototype-ready components with a 1100mAH battery (using prices from
 Sparkfun.com),
 You could literally split a sata cable and add in some capacitors for just 
 the cost of the caps themselves. The issue there is whether the caps would 
 present too large a current drain on initial charge up- If they do then you 
 need to add in charge controllers and you've got the same problems as with a 
 LiPo battery- although without the shorter service life.
 
 At the end of the day the real problem is whether we believe the drives 
 themselves will actually use the quiet period on the now dead bus to write 
 out their caches. This is something we should ask the manufacturers, and test 
 for ourselves.

Indeed!

/ragge



Re: [zfs-discuss] New SSD options

2010-05-22 Thread taemun
Basic electronics, go!

The linked capacitor from Elna (
http://www.elna.co.jp/en/capacitor/double_layer/catalog/pdf/dk_e.pdf) has an
internal resistance of 30 ohms.

Intel rate their 32GB X25-E at 2.4W active (we aren't interested in idle
power usage; if it's idle, we don't need the capacitor in the first place) on
the +5V rail; that's 0.48A (P=VI).

V=IR: the supply is 5V and the current through the load is 480mA, hence R=10.4 ohms.
So the effective resistance of the X25-E under load is 10.4 ohms.

Now if you have a capacitor discharge circuit with the charged Elna
DK-6R3D105T - the largest and most suitable part in that datasheet - you have
40.4 ohms around the loop (cap plus load), with +5V across it. The maximum
current you can pull from that is I=V/R = 124mA, around a quarter of what the
X25-E wants in order to write.

The setup won't work.

I'd suggest something more along the lines of:
http://www.cap-xx.com/products/products.htm
Which have an ESR around 3 orders of magnitude lower.
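
The same arithmetic as a short script, for anyone who wants to plug in
other parts. The 2.4 W and 30 ohm figures come from the posts above; the
low-ESR figure of 0.03 ohms is only an assumed round number for
comparison, not taken from a CAP-XX datasheet:

V_RAIL = 5.0                  # volts on the SSD's supply rail
P_ACTIVE = 2.4                # watts, X25-E active power on +5V
I_LOAD = P_ACTIVE / V_RAIL    # 0.48 A
R_LOAD = V_RAIL / I_LOAD      # ~10.4 ohms, effective load resistance

def max_current_from_cap(esr_ohms):
    # Peak current a charged cap can push through the load, limited by
    # the series combination of its ESR and the load resistance.
    return V_RAIL / (esr_ohms + R_LOAD)

print(max_current_from_cap(30.0))   # ~0.124 A -- far below the 0.48 A needed
print(max_current_from_cap(0.03))   # ~0.48 A  -- a low-ESR part can keep up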

t

On 22 May 2010 18:58, Ragnar Sundblad ra...@csc.kth.se wrote:


 On 22 maj 2010, at 07.40, Don wrote:

  The SATA power connector supplies 3.3, 5 and 12v. A complete
  solution will have all three. Most drives use just the 5v, so you can
  probably ignore 3.3v and 12v.
  I'm not interested in building something that's going to work for every
 possible drive config- just my config :) Both the Intel X25-e and the OCZ
 only uses the 5V rail.
 
  You'll need to use a step up DC-DC converter and be able to supply ~
  100mA at 5v.
  It's actually easier/cheaper to use a LiPoly battery  charger and get a
  few minutes of power than to use an ultracap for a few seconds of
  power. Most ultracaps are ~ 2.5v and LiPoly is 3.7v, so you'll need a
  step up converter in either case.
  Ultracapacitors are available in voltage ratings beyond 12volts so there
 is no reason to use a boost converter with them. That eliminates high
 frequency switching transients right next to our SSD which is always
 helpful.
 
  In this case- we have lots of room. We have a 3.5 x 1 drive bay, but a
 2.5 x 1/4 hard drive. There is ample room for several of the 6.3V ELNA 1F
 capacitors (and our SATA power rail is a 5V regulated rail so they should
 suffice)- either in series or parallel (Depending on voltage or runtime
 requirements).
  http://www.elna.co.jp/en/capacitor/double_layer/catalog/pdf/dk_e.pdf
 
  You could 2 caps in series for better voltage tolerance or in parallel
 for longer runtimes. Either way you probably don't need a charge controller,
 a boost or buck converter, or in fact any IC's at all. It's just a small
 board with some caps on it.

 I know they have a certain internal resistance, but I am not familiar
 with the characteristics; is it high enough so you don't need to
 limit the inrush current, and is it low enough so that you don't need
 a voltage booster for output?

  Cost for a 5v only system should be $30 - $35 in one-off
  prototype-ready components with a 1100mAH battery (using prices from
  Sparkfun.com),
  You could literally split a sata cable and add in some capacitors for
 just the cost of the caps themselves. The issue there is whether the caps
 would present too large a current drain on initial charge up- If they do
 then you need to add in charge controllers and you've got the same problems
 as with a LiPo battery- although without the shorter service life.
 
  At the end of the day the real problem is whether we believe the drives
 themselves will actually use the quiet period on the now dead bus to write
 out their caches. This is something we should ask the manufacturers, and
 test for ourselves.

 Indeed!

 /ragge




Re: [zfs-discuss] New SSD options

2010-05-22 Thread Bob Friesenhahn

On Fri, 21 May 2010, Don wrote:

You could literally split a sata cable and add in some capacitors 
for just the cost of the caps themselves. The issue there is whether 
the caps would present too large a current drain on initial charge 
up- If they do then you need to add in charge controllers and you've 
got the same problems as with a LiPo battery- although without the 
shorter service life.


Electricity does run both directions down a wire, and the capacitor 
would look like a short circuit to the supply when it is first turned 
on.  You would need some circuitry which delays applying power to the 
drive until the capacitor is sufficiently charged, and some circuitry 
which shuts off the flow of energy back into the power supply when the 
power supply shuts off (this could be a silicon diode if you don't mind 
the 0.7 V drop).
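
Rough numbers for both concerns, using assumed values (a 1 F cap on the
5 V rail charged through roughly 30 ohms of series resistance, as with
the Elna part; a low-ESR cap would need a deliberate inrush limiter
instead):

V_RAIL = 5.0       # volts
CAP_F = 1.0        # farads (assumed)
R_SERIES = 30.0    # ohms of series resistance during charge-up (assumed)

inrush_amps = V_RAIL / R_SERIES         # worst case at the instant of power-on
charge_time_s = 5 * R_SERIES * CAP_F    # ~5 RC time constants to reach ~99%

# Power dissipated in a series silicon diode that blocks back-feed into
# the supply, at the X25-E's ~0.48 A active draw (the drive then sees
# about 4.3 V instead of 5 V):
diode_watts = 0.7 * 0.48

print(inrush_amps, charge_time_s, diode_watts)   # ~0.17 A, 150 s, ~0.34 W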


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/


Re: [zfs-discuss] New SSD options

2010-05-22 Thread Haudy Kazemi

Bob Friesenhahn wrote:

On Fri, 21 May 2010, Don wrote:

You could literally split a sata cable and add in some capacitors for 
just the cost of the caps themselves. The issue there is whether the 
caps would present too large a current drain on initial charge up- If 
they do then you need to add in charge controllers and you've got the 
same problems as with a LiPo battery- although without the shorter 
service life.


Electricity does run both directions down a wire and the capacitor 
would look like a short circuit to the supply when it is first turned 
on.  You would need some circuitry which delays applying power to the 
drive before the capacitor is sufficiently charged, and some circuitry 
which shuts off the flow of energy back into the power supply when the 
power supply shuts off (could be a silicon diode if you don't mind the 
0.7 V drop).


Bob


You can also use an appropriately wired field effect transistor (FET) / 
MOSFET of sufficient current carrying capacity as a one-way valve 
(diode) that has minimal voltage drop.

More:
http://electronicdesign.com/article/power/fet-supplies-low-voltage-reverse-polarity-protecti.aspx
http://www.electro-tech-online.com/general-electronics-chat/32118-using-mosfet-diode-replacement.html


In regard to how long you need to continue supplying power...that 
comes down to how long the SSD waits before flushing its cache to 
flash.  If you can identify the maximum write cache flush interval, and 
size the battery or capacitor to exceed that maximum interval, you 
should be okay.  The maximum write cache flush interval is determined by 
a timer that says something like "okay, we've waited 5 seconds for 
additional data to arrive to be written.  None has arrived in the last 5 
seconds, so we're going to write what we already have to better ensure 
data integrity, even though it is suboptimal from an absolute performance 
perspective."  In the conventional terms of filling city buses...the bus 
leaves when it is full of people, or when 15 minutes have passed since the 
last bus left.
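
A quick sizing sketch for that rule of thumb, treating the drive as a
constant-current load so that C >= I * t / dV, where dV is how far the
5 V rail is allowed to sag before the drive browns out; the load,
interval, and droop figures below are assumptions for illustration only:

def min_capacitance_farads(load_amps, hold_time_s, allowed_droop_v):
    # Capacitance needed to hold the rail up for hold_time_s while the
    # drive draws load_amps and the voltage sags by allowed_droop_v.
    return load_amps * hold_time_s / allowed_droop_v

# 0.48 A load, an assumed 5 second flush interval, 0.5 V of allowed
# droop (5.0 V down to 4.5 V):
print(min_capacitance_farads(0.48, 5.0, 0.5))   # 4.8 F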


Does anyone know if there is a way to directly or indirectly measure the 
write caching flush interval?  I know cache sizes can be found via 
performance testing, but what about write intervals?



Re: [zfs-discuss] New SSD options

2010-05-21 Thread thomas
On the PCIe side, I noticed there's a new card coming from LSI that claims 
150,000 4k random writes. Unfortunately this might end up being an OEM-only 
card.

I also notice on the ddrdrive site that they now have an opensolaris driver and 
are offering it in a beta program.


Re: [zfs-discuss] New SSD options

2010-05-21 Thread Ross Walker

On May 20, 2010, at 7:17 PM, Ragnar Sundblad ra...@csc.kth.se wrote:



On 21 maj 2010, at 00.53, Ross Walker wrote:


On May 20, 2010, at 6:25 PM, Travis Tabbal tra...@tabbal.net wrote:


use a slog at all if it's not durable?  You should
disable the ZIL
instead.



This is basically where I was going. There only seems to be one  
SSD that is considered working, the Zeus IOPS. Even if I had the  
money, I can't buy it. As my application is a home server, not a  
datacenter, things like NFS breaking if I don't reboot the clients  
is a non-issue. As long as the on-disk data is consistent so I  
don't have to worry about the entire pool going belly-up, I'm  
happy enough. I might lose 30 seconds of data, worst case, as a  
result of running without ZIL. Considering that I can't buy a  
proper ZIL at a cost I can afford, and an improper ZIL is not  
worth much, I don't see a reason to bother with ZIL at all. I'll  
just get a cheap large SSD for L2ARC, disable ZIL, and call it a  
day.


For my use, I'd want a device in the $200 range to even consider  
an slog device. As nothing even remotely close to that price range  
exists that will work properly at all, let alone with decent  
performance, I see no point in ZIL for my application. The  
performance hit is just too severe to continue using it without an  
slog, and there's no slog device I can afford that works properly,  
even if I ignore performance.


Just buy a caching RAID controller and run it in JBOD mode and have  
the ZIL integrated with the pool.


A 512MB-1024MB card with battery backup should do the trick. It  
might not have the capacity of an SSD, but in my experience it  
works well in the 1TB data moderately loaded range.


Have more data/activity then try more cards and more pools,  
otherwise pony up the  for a capacitor backed SSD.


It - again - depends on what problem you are trying to solve.

If the RAID controller goes bad on you so that you loose the
data in the write cache, your file system could be in pretty bad
shape. Most RAID controllers can't be mirrored. That would hardly
make a good replacement for a mirrored ZIL.

As far as I know, there is no single silver bullet to this issue.


That is true, and there are finite budgets as well, and as with all things in
life one must make a trade-off somewhere.


If you have 2 mirrored SSDs that don't support cache flush and your
power goes out, your file system will be in the same bad shape.
The difference is that in the first place you paid a lot less to have your
data hosed.


-Ross


Re: [zfs-discuss] New SSD options

2010-05-21 Thread Brandon High
On Thu, May 20, 2010 at 8:46 PM, Don d...@blacksun.org wrote:
 I'm kind of flabbergasted that no one has simply stuck a capacitor on a more 
 reasonable drive. I guess the market just isn't big enough- but I find that 
 hard to believe.

I just spoke with a co-worker about doing something about it.

He says he can design a small in-line UPS that will deliver 20-30
seconds of 3.3V, 5V, and 12V to the SATA power connector for about $50
in parts. It would be even less if only one voltage was needed. That
should be enough for most any SSD to finish any pending writes.

Any design that we come up with will be made publicly available under
a Creative Commons or other similar license.

-B

-- 
Brandon High : bh...@freaks.com


Re: [zfs-discuss] New SSD options

2010-05-21 Thread Don
 I just spoke with a co-worker about doing something about it.
 
 He says he can design a small in-line UPS that will deliver 20-30
 seconds of 3.3V, 5V, and 12V to the SATA power connector for about $50
 in parts. It would be even less if only one voltage was needed. That
 should be enough for most any SSD to finish any pending writes.
Oh I wasn't kidding when I said I was going to have to try this with my home 
server. I actually do some circuit board design and this would be an amusing 
project. All you probably need is 5v- I'll look into it.


Re: [zfs-discuss] New SSD options

2010-05-21 Thread Ian Collins

On 05/22/10 12:31 PM, Don wrote:

I just spoke with a co-worker about doing something about it.

He says he can design a small in-line UPS that will deliver 20-30
seconds of 3.3V, 5V, and 12V to the SATA power connector for about $50
in parts. It would be even less if only one voltage was needed. That
should be enough for most any SSD to finish any pending writes.
 

Oh I wasn't kidding when I said I was going to have to try this with my home 
server. I actually do some circuit board design and this would be an amusing 
project. All you probably need is 5v- I'll look into it.
   

Two supercaps should do the trick.  Drive connectors only have 5 and 12v.

--
Ian.



Re: [zfs-discuss] New SSD options

2010-05-21 Thread Brandon High
On Fri, May 21, 2010 at 5:31 PM, Don d...@blacksun.org wrote:
 Oh I wasn't kidding when I said I was going to have to try this with my home 
 server. I actually do some circuit board design and this would be an amusing 
 project. All you probably need is 5v- I'll look into it.

The SATA power connector supplies 3.3, 5 and 12v. A complete
solution will have all three. Most drives use just the 5v, so you can
probably ignore 3.3v and 12v.

You'll need to use a step-up DC-DC converter and be able to supply ~
100mA at 5v. (I can't find any specific numbers on power consumption.
Intel claims 75mW - 150mW for the X25-M. USB is rated at 500mA at 5v,
and all drives that I've seen can run in an un-powered USB case.) It's
actually easier/cheaper to use a LiPoly battery & charger and get a
few minutes of power than to use an ultracap for a few seconds of
power. Most ultracaps are ~ 2.5v and LiPoly is 3.7v, so you'll need a
step-up converter in either case.
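
A quick check of that trade-off, with assumed numbers: a 1100 mAh LiPoly
cell at a nominal 3.7 V, an assumed 85% efficient boost converter, and a
drive drawing ~100 mA at 5 V (0.5 W) while it finishes its writes:

def runtime_minutes(capacity_mah, cell_volts, load_watts, converter_eff=0.85):
    # Minutes the battery can carry the load through the boost converter.
    energy_wh = capacity_mah / 1000.0 * cell_volts
    return energy_wh * converter_eff / load_watts * 60.0

print(runtime_minutes(1100, 3.7, 0.5))   # ~415 minutes -- hours of margin,
                                         # which is why a smaller battery
                                         # would do and cut the cost further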

If you're supplying more than one voltage, you should use a
microcontroller to shut off all the charge pumps at once when the
battery / ultracap runs low. If you're only supplying 5V, it doesn't
matter.

Cost for a 5v only system should be $30 - $35 in one-off
prototype-ready components with a 1100mAH battery (using prices from
Sparkfun.com), plus the cost for an enclosure, etc. A larger buy, a
custom PCB, and a smaller battery would probably reduce the cost
20-50%.

-B

-- 
Brandon High : bh...@freaks.com


Re: [zfs-discuss] New SSD options

2010-05-21 Thread Don
 The SATA power connector supplies 3.3, 5 and 12v. A complete
 solution will have all three. Most drives use just the 5v, so you can
 probably ignore 3.3v and 12v.
I'm not interested in building something that's going to work for every 
possible drive config- just my config :) Both the Intel X25-E and the OCZ only 
use the 5V rail.

 You'll need to use a step up DC-DC converter and be able to supply ~
 100mA at 5v.
 It's actually easier/cheaper to use a LiPoly battery  charger and get a
 few minutes of power than to use an ultracap for a few seconds of
 power. Most ultracaps are ~ 2.5v and LiPoly is 3.7v, so you'll need a
 step up converter in either case.
Ultracapacitors are available in voltage ratings beyond 12 volts, so there is no 
reason to use a boost converter with them. That eliminates high-frequency 
switching transients right next to our SSD, which is always helpful.

In this case- we have lots of room. We have a 3.5" x 1" drive bay, but a 2.5" x 
1/4" hard drive. There is ample room for several of the 6.3V ELNA 1F capacitors 
(and our SATA power rail is a 5V regulated rail, so they should suffice)- either 
in series or parallel (depending on voltage or runtime requirements).
http://www.elna.co.jp/en/capacitor/double_layer/catalog/pdf/dk_e.pdf

You could put 2 caps in series for better voltage tolerance or in parallel for 
longer runtimes. Either way you probably don't need a charge controller, a 
boost or buck converter, or in fact any ICs at all. It's just a small board 
with some caps on it.
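
The series-versus-parallel trade-off for two identical supercaps,
sketched with the Elna part's nominal figures (1 F, 6.3 V each); the
balancing-resistor note at the end is a general caveat, not something
from the datasheet:

C_EACH, V_EACH = 1.0, 6.3    # per-cap capacitance (F) and voltage rating (V)

# Two in series: voltage ratings add, capacitance halves.
series_c, series_v = C_EACH / 2, V_EACH * 2        # 0.5 F rated to 12.6 V

# Two in parallel: capacitance adds, voltage rating stays the same.
parallel_c, parallel_v = C_EACH * 2, V_EACH        # 2.0 F rated to 6.3 V

print(series_c, series_v, parallel_c, parallel_v)
# Caveat: a series string usually wants balancing resistors so neither
# cap drifts above its individual voltage rating.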

 Cost for a 5v only system should be $30 - $35 in one-off
 prototype-ready components with a 1100mAH battery (using prices from
 Sparkfun.com),
You could literally split a SATA cable and add in some capacitors for just the 
cost of the caps themselves. The issue there is whether the caps would present 
too large a current drain on initial charge-up. If they do, then you need to add 
in charge controllers and you've got the same problems as with a LiPo battery- 
although without the shorter service life.

At the end of the day the real problem is whether we believe the drives 
themselves will actually use the quiet period on the now dead bus to write out 
their caches. This is something we should ask the manufacturers, and test for 
ourselves.


Re: [zfs-discuss] New SSD options

2010-05-20 Thread Ragnar Sundblad

On 20 maj 2010, at 00.20, Don wrote:

 You can lose all writes from the last committed transaction (i.e., the
 one before the currently open transaction).
 
 And I don't think that bothers me. As long as the array itself doesn't go 
 belly up- then a few seconds of lost transactions are largely irrelevant- all 
 of the QA virtual machines are going to have to be rolled back to their 
 initial states anyway.

Ok - then you are in the dream situation, and your solution could be
free of charge, a one-liner command, and perform better than any
SSD on the market:

Disable the ZIL. You will lose up to 30 seconds of the most recently
written data, and if you use it as an NFS server your clients may
get confused after a crash since the server is not in the state
it should be in.
You could also turn down the ZFS transaction timeout to lose
less than 30 seconds if you want.
Your pool will always be in a consistent shape on disk (if you
have hardware that behaves).

Remember to NEVER use this pool for anything that actually wants
better data persistence; this is a pool tuned specifically
for a very special case.

In very recent opensolaris there is a zpool property for this;
earlier you had to set a kernel flag when mounting the pool
(and have it unset when mounting other pools, if you want
them to have the ZIL enabled).

/ragge



Re: [zfs-discuss] New SSD options

2010-05-20 Thread Travis Tabbal
 On May 19, 2010, at 2:29 PM, Don wrote:
 
 The data risk is a few moments of data loss. However,
 if the order of the
 uberblock updates is not preserved (which is why the
 caches are flushed)
 then recovery from a reboot may require manual
 intervention.  The amount
 of manual intervention could be significant for
 builds prior to b128.


This risk is mostly mitigated by UPS backup and auto-shutdown when the UPS 
detects power loss, correct? Outside of pulling the plug, that should solve 
power-related problems. Kernel panics should only be caused by hardware issues, 
which might corrupt the disk data anyway. Obviously software can and does fail, 
but the biggest problem I hear about with ZIL devices is their behavior in a 
sudden power loss situation. It seems to me that UPS backup, along with starting 
a shutdown cycle before complete power failure, should prevent most issues.

It seems like that should help with issues like the X25-E not honoring cache 
flush as well; the UPS would give it time to finish the writes. Barring a 
firmware bug in the drive itself, it should be about the same as a supercap 
anyway.


Re: [zfs-discuss] New SSD options

2010-05-20 Thread Ragnar Sundblad

On 20 maj 2010, at 20.35, David Magda wrote:

 On Thu, May 20, 2010 14:12, Travis Tabbal wrote:
 On May 19, 2010, at 2:29 PM, Don wrote:
 
 The data risk is a few moments of data loss. However,
 if the order of the
 uberblock updates is not preserved (which is why the
 caches are flushed)
 then recovery from a reboot may require manual
 intervention.  The amount
 of manual intervention could be significant for
 builds prior to b128.
 
 
 This risk is mostly mitigated by UPS backup and auto-shutdown when the UPS
 detects power loss, correct?
 
 Unless you have a contractor working in the server room that bumps into
 the UPS and causes a power glitch which causes a whole bunch of equipment
 to cycle.
 
 Happened at $WORK (in another office) just two weeks ago.

Or a zillion other failure modes with that setup, from problems
with the UPS, to the auto-shutdown communication and signaling system,
to a problem with the computer system, the electrical distribution, or
anything else.

Building complex solutions to solve critical issues is IMHO seldom a
very good idea. If you care about data integrity, buy stuff
that does what it is supposed to do, and keep everything simple.
Redundancy is often good, but keep the switchover mechanisms as
simple and as few as possible. Choose mechanisms that can and will
be tested regularly - and don't use systems that are almost never
used and/or tested.
Complex systems tend to fail, especially after some time when things
have changed a bit, and can even cause more outages themselves. They
are hard to test, maintain and understand, and they are often costly
to buy too. KISS, you know.

In the Intel X25 case - bug them until they release new firmware - they
have sold you a defective product that they still haven't fixed.
If they don't fix it and you need it, get another drive.

 It all depends on your level of paranoia.

Either that, or you may have some kind of protocol, policy,
contract, SLA or similar that you have to follow.

(In any case it is often really hard to even guess how much a
certain change gives or takes in availability numbers.)

Just my 5 öre.

/ragge



Re: [zfs-discuss] New SSD options

2010-05-20 Thread Miles Nordin
 d == Don  d...@blacksun.org writes:

 d Since it ignores Cache Flush command and it doesn't have any
 d persistant buffer storage, disabling the write cache is the
 d best you can do.  This actually brings up another question I
 d had: What is the risk, beyond a few seconds of lost writes, if
 d I lose power, there is no capacitor and the cache is not
 d disabled?

why use a slog at all if it's not durable?  You should disable the ZIL
instead.  Compared to a slog that ignores cache flush, disabling the
ZIL will provide the same guarantees to the application w.r.t. preserved
write ordering, and the same problems with NFS server reboots,
replicated databases, and mail servers.  It'll be faster than the
fake slog.  There'll be less risk of losing the pool because the slog
went bad and then you accidentally exported the pool while trying to
fix things.

The only case where you are ahead with the fake slog is the host
going down because of a kernel panic rather than power loss.

I don't know, though, what to do about these reports of devices that
almost respect cache flushes but seem to lose exactly one transaction.
AFAICT this should be a works/doesntwork situation, not a continuum.




Re: [zfs-discuss] New SSD options

2010-05-20 Thread Bill Sommerfeld

On 05/20/10 12:26, Miles Nordin wrote:

I don't know, though, what to do about these reports of devices that
almost respect cache flushes but seem to lose exactly one transaction.
AFAICT this should be a works/doesntwork situation, not a continuum.


But there's so much brokenness out there.  I've seen similar tail-drop 
behavior before -- the last write or two before a hardware reset go into 
the bit bucket, but the ones before that are durable.


So, IMHO, a cheap consumer SSD used as a ZIL may still be worth it (for 
some use cases) to narrow the window of data loss from ~30 seconds to a 
sub-second value.
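
A quick sense of scale for that window, using an assumed sustained rate
of synchronous writes (the rate is illustrative, not a measurement of
any workload):

def data_at_risk_mb(write_rate_mb_s, window_s):
    # Acknowledged-but-unflushed data that can evaporate on power loss.
    return write_rate_mb_s * window_s

print(data_at_risk_mb(50, 30))    # ~1500 MB with the ZIL disabled (~30 s window)
print(data_at_risk_mb(50, 0.5))   # ~25 MB with a cheap slog that drops its tail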


- Bill


Re: [zfs-discuss] New SSD options

2010-05-20 Thread Travis Tabbal
 use a slog at all if it's not durable?  You should
 disable the ZIL
 instead. 


This is basically where I was going. There only seems to be one SSD that is 
considered working, the Zeus IOPS. Even if I had the money, I can't buy it. 
As my application is a home server, not a datacenter, things like NFS breaking 
if I don't reboot the clients is a non-issue. As long as the on-disk data is 
consistent so I don't have to worry about the entire pool going belly-up, I'm 
happy enough. I might lose 30 seconds of data, worst case, as a result of 
running without ZIL. Considering that I can't buy a proper ZIL at a cost I can 
afford, and an improper ZIL is not worth much, I don't see a reason to bother 
with ZIL at all. I'll just get a cheap large SSD for L2ARC, disable ZIL, and 
call it a day. 

For my use, I'd want a device in the $200 range to even consider an slog 
device. As nothing even remotely close to that price range exists that will 
work properly at all, let alone with decent performance, I see no point in ZIL 
for my application. The performance hit is just too severe to continue using it 
without an slog, and there's no slog device I can afford that works properly, 
even if I ignore performance.


Re: [zfs-discuss] New SSD options

2010-05-20 Thread Ross Walker

On May 20, 2010, at 6:25 PM, Travis Tabbal tra...@tabbal.net wrote:


use a slog at all if it's not durable?  You should
disable the ZIL
instead.



This is basically where I was going. There only seems to be one SSD  
that is considered working, the Zeus IOPS. Even if I had the  
money, I can't buy it. As my application is a home server, not a  
datacenter, things like NFS breaking if I don't reboot the clients  
is a non-issue. As long as the on-disk data is consistent so I don't  
have to worry about the entire pool going belly-up, I'm happy  
enough. I might lose 30 seconds of data, worst case, as a result of  
running without ZIL. Considering that I can't buy a proper ZIL at a  
cost I can afford, and an improper ZIL is not worth much, I don't  
see a reason to bother with ZIL at all. I'll just get a cheap large  
SSD for L2ARC, disable ZIL, and call it a day.


For my use, I'd want a device in the $200 range to even consider an  
slog device. As nothing even remotely close to that price range  
exists that will work properly at all, let alone with decent  
performance, I see no point in ZIL for my application. The  
performance hit is just too severe to continue using it without an  
slog, and there's no slog device I can afford that works properly,  
even if I ignore performance.


Just buy a caching RAID controller, run it in JBOD mode, and have
the ZIL integrated with the pool.


A 512MB-1024MB card with battery backup should do the trick. It might
not have the capacity of an SSD, but in my experience it works well in
the ~1TB, moderately loaded range.


Have more data/activity? Then try more cards and more pools; otherwise
pony up the money for a capacitor-backed SSD.


-Ross



Re: [zfs-discuss] New SSD options

2010-05-20 Thread Ragnar Sundblad

On 21 maj 2010, at 00.53, Ross Walker wrote:

 On May 20, 2010, at 6:25 PM, Travis Tabbal tra...@tabbal.net wrote:
 
 use a slog at all if it's not durable?  You should
 disable the ZIL
 instead.
 
 
 This is basically where I was going. There only seems to be one SSD that is 
 considered working, the Zeus IOPS. Even if I had the money, I can't buy 
 it. As my application is a home server, not a datacenter, things like NFS 
 breaking if I don't reboot the clients is a non-issue. As long as the 
 on-disk data is consistent so I don't have to worry about the entire pool 
 going belly-up, I'm happy enough. I might lose 30 seconds of data, worst 
 case, as a result of running without ZIL. Considering that I can't buy a 
 proper ZIL at a cost I can afford, and an improper ZIL is not worth much, I 
 don't see a reason to bother with ZIL at all. I'll just get a cheap large 
 SSD for L2ARC, disable ZIL, and call it a day.
 
 For my use, I'd want a device in the $200 range to even consider an slog 
 device. As nothing even remotely close to that price range exists that will 
 work properly at all, let alone with decent performance, I see no point in 
 ZIL for my application. The performance hit is just too severe to continue 
 using it without an slog, and there's no slog device I can afford that works 
 properly, even if I ignore performance.
 
 Just buy a caching RAID controller and run it in JBOD mode and have the ZIL 
 integrated with the pool.
 
 A 512MB-1024MB card with battery backup should do the trick. It might not 
 have the capacity of an SSD, but in my experience it works well in the 1TB 
 data moderately loaded range.
 
 Have more data/activity then try more cards and more pools, otherwise pony up 
 the  for a capacitor backed SSD.

It - again - depends on what problem you are trying to solve.

If the RAID controller goes bad on you so that you lose the
data in the write cache, your file system could be in pretty bad
shape. Most RAID controllers can't be mirrored. That would hardly
make a good replacement for a mirrored ZIL.

As far as I know, there is no single silver bullet to this issue.

/ragge



Re: [zfs-discuss] New SSD options

2010-05-20 Thread Richard Elling
On May 20, 2010, at 1:12 PM, Bill Sommerfeld wrote:

 On 05/20/10 12:26, Miles Nordin wrote:
 I don't know, though, what to do about these reports of devices that
 almost respect cache flushes but seem to lose exactly one transaction.
 AFAICT this should be a works/doesntwork situation, not a continuum.
 
 But there's so much brokenness out there.  I've seen similar tail drop 
 behavior before -- last write or two before a hardware reset goes into the 
 bit bucket, but ones before that are durable.
 
 So, IMHO, a cheap consumer ssd used as a zil may still be worth it (for some 
 use cases) to narrow the window of data loss from ~30 seconds to a sub-second 
 value.

+1
 -- richard

-- 
ZFS and NexentaStor training, Rotterdam, July 13-15, 2010
http://nexenta-rotterdam.eventbrite.com/








Re: [zfs-discuss] New SSD options

2010-05-20 Thread Don
 So, IMHO, a cheap consumer ssd used as a zil may still be worth it (for
 some use cases) to narrow the window of data loss from ~30 seconds to a
 sub-second value.
There are lots of reasons to enable the ZIL now- I can throw four very 
inexpensive SSD's in there now in a pair of mirrors, and then when a better 
drive comes along I can replace each half of the mirror without bringing 
anything down. My slots are already allocated and it would be nice to save a 
few extra seconds of writes- just in case. It's not a great solution- but 
nothing is. I don't have access to a ZEUS- and even if I did- I wouldn't pay 
that kind of money for what amounts to a Vertex 2 Pro but with SLC flash.

I'm kind of flabbergasted that no one has simply stuck a capacitor on a more 
reasonable drive. I guess the market just isn't big enough- but I find that 
hard to believe.

Right now it seems like the options are all or nothing. There's just no %^$#^ 
middle ground.


Re: [zfs-discuss] New SSD options

2010-05-19 Thread thomas
40k IOPS sounds like "best case, you'll never see it in the real world" 
marketing to me. There are a few benchmarks if you Google, and they all seem to 
indicate the performance is probably +/- 10% of an Intel X25-E. I would 
personally trust Intel over one of these drives.

Is it even possible to buy a Zeus IOPS anywhere? I haven't been able to find 
one. I get the impression they mostly sell to other vendors like Sun? I'd be 
curious what the price on a 9GB Zeus IOPS is these days.


Re: [zfs-discuss] New SSD options

2010-05-19 Thread sensille
Don wrote:
 
 With that in mind- Is anyone using the new OCZ Vertex 2 SSD's as a ZIL?
 
 They're claiming 50k IOPS (4k Write- Aligned), 2 million hour MTBF, TRIM 
 support, etc. That's more write IOPS than the ZEUS (40k IOPS, $) but at 
 half the price of an Intel X25-E (3.3k IOPS, $400).
 
 Needless to say I'd love to know if anyone has evaluated these drives to see 
 if they make sense as a ZIL- for example- do they honor cache flush requests? 
 Are those sustained IOPS numbers?

In my understanding nearly the only relevant number is the number
of cache flushes a drive can handle per second, as this determines
my single-thread performance.
Does anyone have an idea what numbers I can expect from an Intel X25-E or
an OCZ Vertex 2?
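
A sketch of why flushes per second bound single-threaded synchronous
writes: each sync write has to reach stable storage before the next one
is issued, so throughput is roughly the reciprocal of the per-operation
latency. The latency figures below are illustrative assumptions, not
measurements of either drive:

def single_thread_sync_iops(flush_latency_ms, write_latency_ms=0.05):
    # One flush per sync write; writes are issued strictly back to back.
    per_op_s = (flush_latency_ms + write_latency_ms) / 1000.0
    return 1.0 / per_op_s

print(single_thread_sync_iops(0.25))   # ~3300 ops/s if a flush costs 0.25 ms
print(single_thread_sync_iops(5.0))    # ~200 ops/s if each flush costs 5 ms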

-Arne


Re: [zfs-discuss] New SSD options

2010-05-19 Thread Brandon High
On Tue, May 18, 2010 at 4:28 PM, Don d...@blacksun.org wrote:
 With that in mind- Is anyone using the new OCZ Vertex 2 SSD's as a ZIL?

The current SandForce drives out there don't have an ultra-capacitor on
them, so they could lose data if the system crashed. Enterprise-class
drives based on the chipset that do have an ultra-cap are supposed to be
released any day now.

 Needless to say I'd love to know if anyone has evaluated these drives to see 
 if they make sense as a ZIL- for example- do they honor cache flush requests? 
 Are those sustained IOPS numbers?

I don't think they do, the chipset was designed to use an ultra-cap to
avoid having to honor flushes. Then again, the X25-E has the same
problem.

-B

-- 
Brandon High : bh...@freaks.com


Re: [zfs-discuss] New SSD options

2010-05-19 Thread Ragnar Sundblad

On 2010-05-19 08.32, sensille wrote:

Don wrote:


With that in mind- Is anyone using the new OCZ Vertex 2 SSD's as a ZIL?

They're claiming 50k IOPS (4k Write- Aligned), 2 million hour MTBF, TRIM 
support, etc. That's more write IOPS than the ZEUS (40k IOPS, $) but at 
half the price of an Intel X25-E (3.3k IOPS, $400).

Needless to say I'd love to know if anyone has evaluated these drives to see if 
they make sense as a ZIL- for example- do they honor cache flush requests? Are 
those sustained IOPS numbers?


In my understanding nearly the only relevant number is the number
of cache flushes a drive can handle per second, as this determines
my single thread performance.
Has anyone an idea what numbers I can expect from an Intel X25-E or
an OCZ Vertex 2?


I don't know about the OCZ Vertex 2, but the Intel X25-E
roughly halves its IOPS number when you disable its
write cache (IIRC, it was in the range of 1300-1600
writes/s or so).
Since it ignores the Cache Flush command and it doesn't
have any persistent buffer storage, disabling the write
cache is the best you can do.
Note that there were reports of the Intel X25-E losing
a write even though you had the write cache disabled!
Since they still haven't fixed this, after more than a
year on the market, I believe it rather qualifies for
the "hardly usable toy" class. I am very disappointed;
I had hopes for a new class of cheap but usable flash
drives. Maybe some day...

/ragge


Re: [zfs-discuss] New SSD options

2010-05-19 Thread Don
Well- 40k IOPS is the current claim from ZEUS- and they're the benchmark. They 
used to be 17k IOPS. How real any of these numbers from any manufacturer are is 
anyone's guess.

Given the Intel's refusal to honor a cache flush, and their performance 
problems with the cache disabled- I don't trust them any more than anyone else 
right now.

As for the Vertex drives- if they are within +-10% of the Intel they're still 
doing it for half of what the Intel drive costs- so it's an option- not a great 
option- but still an option.


Re: [zfs-discuss] New SSD options

2010-05-19 Thread Yuri Vorobyev



As for the Vertex drives- if they are within +-10% of the Intel they're still 
doing it for half of what the Intel drive costs- so it's an option- not a great 
option- but still an option.

Yes, but Intel is SLC. Much more endurance.




Re: [zfs-discuss] New SSD options

2010-05-19 Thread David Magda
On Wed, May 19, 2010 02:09, thomas wrote:

 Is it even possible to buy a zeus iops anywhere? I haven't been able to
 find one. I get the impression they mostly sell to other vendors like sun?
 I'd be curious what the price is on a 9GB zeus iops is these days?

Correct, their Zeus products are only available to OEMs.




Re: [zfs-discuss] New SSD options

2010-05-19 Thread Don
Well, the larger size of the Vertex, coupled with its smaller claimed write 
amplification, should result in sufficient service life for my needs. Its 
claimed MTBF also matches the Intel X25-E's.


Re: [zfs-discuss] New SSD options

2010-05-19 Thread Don
Since it ignores Cache Flush command and it doesn't have any persistant buffer 
storage, disabling the write cache is the best you can do.

This actually brings up another question I had: What is the risk, beyond a few 
seconds of lost writes, if I lose power, there is no capacitor and the cache is 
not disabled?

My ZFS system is shared storage for a large VMWare based QA farm. If I lose 
power then a few seconds of writes are the least of my concerns. All of the QA 
tests will need to be restarted and all of the file systems will need to be 
checked. A few seconds of writes won't make any difference unless it has the 
potential to affect the integrity of the pool itself.

Considering the performance trade-off, I'd happily give up a few seconds worth 
of writes for significantly improved IOPS.


Re: [zfs-discuss] New SSD options

2010-05-19 Thread Richard Elling
On May 19, 2010, at 2:29 PM, Don wrote:

 Since it ignores Cache Flush command and it doesn't have any persistant 
 buffer storage, disabling the write cache is the best you can do.
 
 This actually brings up another question I had: What is the risk, beyond a 
 few seconds of lost writes, if I lose power, there is no capacitor and the 
 cache is not disabled?

The data risk is a few moments of data loss. However, if the order of the
uberblock updates is not preserved (which is why the caches are flushed)
then recovery from a reboot may require manual intervention.  The amount
of manual intervention could be significant for builds prior to b128.

 My ZFS system is shared storage for a large VMWare based QA farm. If I lose 
 power then a few seconds of writes are the least of my concerns. All of the 
 QA tests will need to be restarted and all of the file systems will need to 
 be checked. A few seconds of writes won't make any difference unless it has 
 the potential to affect the integrity of the pool itself.
 
 Considering the performance trade-off, I'd happily give up a few seconds 
 worth of writes for significantly improved IOPS.

Space, dependability, performance: pick two :-)
 -- richard

-- 
Richard Elling
rich...@nexenta.com   +1-760-896-4422
ZFS and NexentaStor training, Rotterdam, July 13-15, 2010
http://nexenta-rotterdam.eventbrite.com/






Re: [zfs-discuss] New SSD options

2010-05-19 Thread Nicolas Williams
On Wed, May 19, 2010 at 02:29:24PM -0700, Don wrote:
 Since it ignores Cache Flush command and it doesn't have any
 persistant buffer storage, disabling the write cache is the best you
 can do.
 
 This actually brings up another question I had: What is the risk,
 beyond a few seconds of lost writes, if I lose power, there is no
 capacitor and the cache is not disabled?

You can lose all writes from the last committed transaction (i.e., the
one before the currently open transaction).  (You also lose writes from
the currently open transaction, but that's unavoidable in any system.)

Nowadays the system will let you know at boot time that the last
transaction was not committed properly and you'll have a chance to go
back to the previous transaction.

For me, getting much-better-than-disk performance out of an SSD with
cache disabled is enough to make that SSD worthwhile, provided the price
is right of course.

Nico
-- 


Re: [zfs-discuss] New SSD options

2010-05-19 Thread Don
You can lose all writes from the last committed transaction (i.e., the
one before the currently open transaction).

And I don't think that bothers me. As long as the array itself doesn't go belly 
up- then a few seconds of lost transactions are largely irrelevant- all of the 
QA virtual machines are going to have to be rolled back to their initial states 
anyway.


Re: [zfs-discuss] New SSD options

2010-05-19 Thread Don
You can lose all writes from the last committed transaction (i.e., the
one before the currently open transaction).

I'll pick one- performance :)

Honestly- I wish I had a better grasp on the real world performance of these 
drives. 50k IOPS is nice- and considering the incredible likelihood of data 
duplication in my environment- the SandForce controller seems like a win. That 
said- does anyone have a good set of real world performance numbers for these 
drives that you can link to?


[zfs-discuss] New SSD options

2010-05-18 Thread Don
I'm looking for alternative SSD options to the Intel X25-E and the ZEUS IOPS.

The ZEUS IOPS would probably cost as much as my entire current disk system (80 
15k SAS drives)- and that's just silly.

The Intel is much less expensive, and while fast- pales in comparison to the 
ZEUS.

I've allocated 4 disk slots in my array for ZIL SSD's and I'm trying to find 
the best performance for my dollar.

With that in mind- Is anyone using the new OCZ Vertex 2 SSD's as a ZIL?

http://www.ocztechnology.com/products/solid-state-drives/2-5--sata-ii/performance-enterprise-solid-state-drives/ocz-vertex-2-sata-ii-2-5--ssd.html

They're claiming 50k IOPS (4k aligned writes), a 2 million hour MTBF, TRIM 
support, etc. That's more write IOPS than the ZEUS (40k IOPS, $) but at 
half the price of an Intel X25-E (3.3k IOPS, $400).

Needless to say I'd love to know if anyone has evaluated these drives to see if 
they make sense as a ZIL- for example- do they honor cache flush requests? Are 
those sustained IOPS numbers?