On Tue, Mar 04, 2014 at 04:15:29PM -0600, Nick wrote:

> - The Intel drive writes 3-4 times slower with cache flushing enabled. I
> understand and appreciate the comments about not disabling cache flushing
> unless you have an intimate understanding of the hardware. So for me that
> is no. I would like to contact Intel and ask them about this, as I would
> think they could just NOOP the cache flush request but apparently they are
> not doing so. I'm afraid that I am too small of a player to probably be able
> to talk to anyone there, but I can try. I was under the impression that
> this drive is one of the most popular new datacenter drives so I was hoping
> that someone who has purchased a large quantity has some more info.

We use the DC-S3700 in our all-SSD pools, not the 3500. The 3700 provides good delivered performance for us, well into the tens of thousands of delivered 8k write IOPS. We don't disable cache flushing.

It's important as well to understand what your workload is doing. The fact that you are seeing small transactions written to the SSDs suggests that you're doing a single-threaded test (we don't use iozone so maybe that's obvious from the arguments you gave it). If you run more threads, ZFS will have an opportunity to aggregate larger I/Os and provide dramatically greater throughput. I don't know whether that makes sense for your application, though; some programs care only about single-threaded latency so if you cannot run other processes as well your total application can end up bound by single-thread latency.

In our case, we expect that multiple customers will be doing a high rate of small-block I/O, especially reads (since a slog and a disk pool usually provide plenty of write performance). Those conditions can be approximated by testing with many threads doing I/O.

I'd suggest testing with your actual workload instead of synthetic benchmarks unless either (a) synthetic benchmarks *are* your workload (it takes all kinds), or (b) you have analysed your workload enough to be highly confident that the synthetic benchmark models it extremely well. Topics like this are covered very well on Brendan Gregg's blog. See especially http://www.brendangregg.com/activebenchmarking.html.

> - Not surprising that sync disabled makes things really fast as it
> basically makes it so you don't need to actually write 4K blocks to disk
> but can group them together and write them more sequentially.

Note that you get a similar benefit if you have lots of threads all doing 4K writes. It's also been my experience that most applications don't actually open files O_DSYNC and proceed to do tiny writes to them.
I've observed MySQL, for instance, doing lots of 87-byte writes to its log file, but it opens it without O_DSYNC and instead issues fsync(3c) calls when required for transactional consistency. That pattern allows the OS to aggregate small I/O and also to do writes before they have to be committed if the device is otherwise idle. This pattern usually results in much better apparent performance than writing O_DSYNC. Again, whether this is relevant depends on your application.

> - The slowness of the Seagate 600 Pro SSD really surprised me. This is one
> of the fastest drives on the market, and you can see that with the cache
> flushing disabled it outpaces the Intel SSD. It is supposed to be an
> "Enterprise" drive with power protection, so I thought it could ignore the
> flush commands, but that is clearly not the case. When doing 4K sync
> writes, it seems to be somewhere around 50 times slower than the Intel
> drive and it is actually slower than a mechanical hard drive, which really
> shocked me.

It's an interesting result, for sure. Ask Seagate. FWIW, I tested their Pulsar.2 SAS drive for slog use and found it pretty good, very similar in fact to the STEC Mach16 SLC that we use today. I don't think that's the same device, though, and we never seriously considered either for primary storage.

If you want to get further into this, you can look at using DTrace to break down the slowness by attributes of the SCSI packets being sent by the HBA. That might yield some interesting data; i.e., is it a particular size, a particular LBA range, a particular SCSI command, etc. If it's always SYNCHRONIZE CACHE that's slow, does it matter what commands were issued prior to that?

> These benchmarks are obviously all synthetic, but I wanted to get those
> done first so that I could better understand the drive's performance before
> moving on to more real-world, and therefore more complicated, benchmarks.

Yep.
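
For reference, here is a minimal Python sketch of the two write patterns contrasted above: opening a file O_DSYNC and doing tiny writes, versus opening it normally and calling fsync only at commit points. The function names, the batch size, and the file paths are made up for illustration; the 87-byte record size just echoes the MySQL observation. It is not how MySQL itself is implemented, and it says nothing about what any particular drive will do with the resulting flushes.

```python
import os

# 87-byte records, echoing the MySQL log writes mentioned above.
RECORD = b"x" * 87


def write_dsync(path, count):
    """Pattern 1: open with O_DSYNC so each write() must be committed
    to stable storage before it returns. Returns the number of
    synchronous commits requested (one per write)."""
    # O_DSYNC is not defined on every platform; falling back to 0
    # (no sync flag) is an assumption made so the sketch still runs.
    dsync = getattr(os, "O_DSYNC", 0)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | dsync, 0o600)
    try:
        for _ in range(count):
            os.write(fd, RECORD)
    finally:
        os.close(fd)
    return count


def write_fsync_batched(path, count, batch):
    """Pattern 2: ordinary buffered writes, with fsync() issued only at
    (hypothetical) commit points every `batch` records and at the end.
    Returns the number of fsync() calls issued."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    commits = 0
    try:
        for i in range(1, count + 1):
            os.write(fd, RECORD)
            if i % batch == 0 or i == count:
                os.fsync(fd)  # the OS may flush many small writes at once
                commits += 1
    finally:
        os.close(fd)
    return commits
```

Both functions leave the same bytes on disk; the difference is how many times the device is asked to commit. Wrapping each call in time.perf_counter() on the pool under test is one way to see how much that matters for a given drive, with the usual caveat that this is yet another synthetic benchmark.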
