Let me add one observation to this thread. If you check slide 69 of this
presentation:
https://cwiki.apache.org/confluence/download/attachments/24193445/keynote-hic-2011-web.pdf
the graph shows that not writing to disk (net only) does not actually improve
write latency much, unless your disk write buffer is turned off. Unless there
has been some important performance improvement I missed, it doesn't look like
a faster device for the transaction log would be able to improve latency much
at this point. Does that sound right?
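
One way to sanity-check this on a given device is to time small synchronous
appends directly. Below is a minimal sketch in Python, not ZooKeeper's actual
logging code: the function names, the 1 KiB record size, and the 200-write
count are all assumptions picked for illustration. It appends a record, calls
fsync, and records how long each round trip took, so that rare stalls show up
in the p99 and max figures rather than being hidden by an average.

```python
import os
import tempfile
import time


def measure_fsync_latencies(directory, writes=200, payload_size=1024):
    """Append small records to a temp file in `directory`, calling fsync
    after each one, and return the per-write latencies in seconds.

    Hypothetical helper: record size and count are illustrative only.
    """
    payload = b"x" * payload_size
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(writes):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # block until the OS reports the data as flushed
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.remove(path)
    return latencies


def summarize(latencies):
    """Return median, 99th percentile, and max of a list of latencies."""
    ordered = sorted(latencies)
    p99_index = min(len(ordered) - 1, int(len(ordered) * 0.99))
    return {
        "median": ordered[len(ordered) // 2],
        "p99": ordered[p99_index],
        "max": ordered[-1],
    }


if __name__ == "__main__":
    # Point this at a directory on the device under test; the system temp
    # directory is used here only so the sketch runs anywhere.
    stats = summarize(measure_fsync_latencies(tempfile.gettempdir()))
    for name, seconds in stats.items():
        print(f"{name}: {seconds * 1000:.3f} ms")
```

A short run like this will not catch a stall that only shows up every few
hours, so it would need to run for much longer (and on a mostly full drive)
to say anything about the write cliff discussed below.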
-Flavio
On Oct 4, 2012, at 6:05 PM, Ben Bangert wrote:
> On Oct 3, 2012, at 6:13 PM, Patrick Hunt wrote:
>
>> My experience with SSDs and ZK has been discouraging. SSDs have some
>> really terrible corner cases for latency. I've seen them take 40+
>> seconds (that's not a mistake - seconds) for fsync to complete. When
>> this happened (every few hours), all of the sessions would time out.
>>
>> See this article:
>> http://storagemojo.com/2012/06/07/the-ssd-write-cliff-in-real-life/
>
> It's worth noting that these tests are all on Enterprise SSD products, which
> have actually been lagging behind some of the advances the SSD controller
> folks have been making. I've hit the same corner case on my own desktop SSD
> in the past, with a huge write cliff, but it has gone away with some of the
> later, heavily over-provisioned SSDs I've bought, such as this OWC 6G one
> I'm using.
>
> Of course, these Enterprise folks are the same ones who prefer to scale
> vertically rather than horizontally using cheaper commodity hardware. The
> most useful factors to look at when choosing an SSD are the write
> amplification factor
> (http://www.anandtech.com/show/5719/ocz-vertex-4-review-256gb-512gb) and how
> it handles the case when the drive runs out of free space (and thus has to
> garbage collect, resulting in the write cliff). An over-provisioned drive can
> avoid the write cliff because a chunk of the drive is reserved in advance to
> prevent it from ever getting completely full. See results here:
> Over-provisioned SSD:
> http://macperformanceguide.com/SSD-RealWorld-BeforeAfter-OWC.html
>
> Non-overprovisioned SSD:
> http://macperformanceguide.com/SSD-RealWorld-BeforeAfter-CrucialRealSSD.html
>
> If you look through, there are some very worrying write cliffs that are
> apparent in SSDs that aren't over-provisioned, and those drives easily fail
> to perform as well as a RAID of platter drives.
>
> The other thing about the storagemojo article worth thinking about is whether
> you're actually going to buy a 12+ disk array for a faster ZK log... or are
> actually comparing a single platter disk vs. a single SSD.
>
> Cheers,
> Ben