Re: IBM blade server abysmal disk write performances

2013-01-25 Thread Karim Fodil-Lemelin
Hi, Quick follow up on this. As I mentioned in a previous email we have moved to SATA drives and the SAS drives have been shelved for now. The current project will be using those so further tests on SAS have been postponed to an undefined date. Thanks, Karim. PS: I'll keep the SAS tests

Re: IBM blade server abysmal disk write performances

2013-01-21 Thread Wojciech Puchar
Interesting. Is there a way to tell, other than coming up with some way to actually test it, whether a particular drive waits until [...] My crappy laptop hard drive behaves the same whether I turn the write cache on, off, or leave the default. It seems to be always on.

Re: IBM blade server abysmal disk write performances

2013-01-21 Thread Wojciech Puchar
With SATA vs SAS, the gap is much narrower. The TCQ command set (still used by SAS) is still better than the NCQ command set, but the [...] In what respect exactly is TCQ better than SATA NCQ?

Re: IBM blade server abysmal disk write performances

2013-01-21 Thread Wojciech Puchar
I've had my share of sudden UPS failures over the years. Probably more [...] Everything can fail. That's why serious sysadmins do proper backups, no matter what safety features are used in their servers.

Re: IBM blade server abysmal disk write performances

2013-01-19 Thread Wojciech Puchar
Turning the write cache off eliminates the risk of having the write cache on. This sentence sounds like saying that not having a car eliminates the risks of driving.

Re: IBM blade server abysmal disk write performances

2013-01-19 Thread Stefan Esser
On 19.01.2013 00:32, Karim Fodil-Lemelin wrote: * Although no one has reported problems with the 2 gig * version of the DCAS drive, the assumption is that it * has the same problems as the 4 gig version. Therefore * this

Re: IBM blade server abysmal disk write performances

2013-01-19 Thread Wojciech Puchar
I remember those drives from some 20 years ago. Before that time, SCSI and IDE drives were independently developed and SCSI drives offered way [...] Yes. 20 years ago it was true, even in 1995, when I had a SCSI controller in my 486 and it was great compared to ATA. Today SATA and SAS are mostly

Re: IBM blade server abysmal disk write performances

2013-01-19 Thread Dieter BSD
Stefan writes: I seem to remember, that drives of that time required the write cache to be enabled to get any speed-up from tagged commands. This was no risk with SCSI drives, since the cache did not make the drives lie about command completion (i.e. the status for the write was only returned

Re: IBM blade server abysmal disk write performances

2013-01-19 Thread Wojciech Puchar
to be enabled to get any speed-up from tagged commands. This was no risk with SCSI drives, since the cache did not make the drives lie [...] I see no correlation between interface type and the possibility of lying about command completion.

Re: IBM blade server abysmal disk write performances

2013-01-19 Thread Scott Long
On Jan 19, 2013, at 4:33 PM, Wojciech Puchar woj...@wojtek.tensor.gdynia.pl wrote: to be enabled to get any speed-up from tagged commands. This was no risk with SCSI drives, since the cache did not make the drives lie [...] I see no correlation between interface type and possibility of lying

Re: IBM blade server abysmal disk write performances

2013-01-19 Thread Don Lewis
On 18 Jan, Wojciech Puchar wrote: If computer have UPS then write caching is fine. even if FreeBSD crash, disk would write data I've had my share of sudden UPS failures over the years. Probably more than half have been during an automatic battery self test. UPS goes on battery, and then

Re: IBM blade server abysmal disk write performances

2013-01-19 Thread Don Lewis
On 19 Jan, Stefan Esser wrote: I seem to remember, that drives of that time required the write cache to be enabled to get any speed-up from tagged commands. This was no risk with SCSI drives, since the cache did not make the drives lie about command completion (i.e. the status for the write

Re: IBM blade server abysmal disk write performances

2013-01-18 Thread Mark Felder
On Thu, 17 Jan 2013 16:12:17 -0600, Karim Fodil-Lemelin fodillemlinka...@gmail.com wrote: SAS controllers may connect to SATA devices, either directly connected using native SATA protocol or through SAS expanders using SATA Tunneled Protocol (STP). The system is currently put in place

Re: IBM blade server abysmal disk write performances

2013-01-18 Thread Scott Long
From: Dieter BSD dieter...@gmail.com To: freebsd-hackers@freebsd.org Cc: mja...@freebsd.org; gi...@freebsd.org; sco...@freebsd.org Sent: Thursday, January 17, 2013 9:03 PM Subject: Re: IBM blade server abysmal disk write performances I am thinking that something

Re: IBM blade server abysmal disk write performances

2013-01-18 Thread Wojciech Puchar
The default value, -1, instructs the driver to leave the SATA drives at their configuration default. Often times this means that the MPT BIOS will turn off the write cache on every system boot sequence. IT DOES THIS FOR A GOOD REASON! An enabled write cache is counter to data reliability.

Re: IBM blade server abysmal disk write performances

2013-01-18 Thread Scott Long
...@freebsd.org mja...@freebsd.org Sent: Friday, January 18, 2013 11:10 AM Subject: Re: IBM blade server abysmal disk write performances The default value, -1, instructs the driver to leave the SATA drives at their configuration default. Often times this means that the MPT BIOS

Re: IBM blade server abysmal disk write performances

2013-01-18 Thread Wojciech Puchar
disk would write data I suspect that I'm encountering situations right now at netflix where this advice is not true.  I have drives that are seeing intermittent errors, then being forced into reset after a timeout, and then coming back up with filesystem problems.  It's only a suspicion at

Re: IBM blade server abysmal disk write performances

2013-01-18 Thread Dieter BSD
Wojciech writes: If computer have UPS then write caching is fine. even if FreeBSD crash, disk would write data That is incorrect. A UPS reduces the risk, but does not eliminate it. It is impossible to completely eliminate the risk of having the write cache on. If you care about your data you

Re: IBM blade server abysmal disk write performances

2013-01-18 Thread Ian Lepore
On Fri, 2013-01-18 at 20:37 +0100, Wojciech Puchar wrote: disk would write data I suspect that I'm encountering situations right now at netflix where this advice is not true. I have drives that are seeing intermittent errors, then being forced into reset after a timeout, and then

Re: IBM blade server abysmal disk write performances

2013-01-18 Thread Wojciech Puchar
That is incorrect. A UPS reduces the risk, but does not eliminate it. [...] Nothing eliminates all risks. [...] But for most applications, you must have the write cache off, and you need queuing (e.g. TCQ or NCQ) for performance. If you have queuing, there is no need to turn the write cache on. [...] Did you

Re: IBM blade server abysmal disk write performances

2013-01-18 Thread Scott Long
On Jan 18, 2013, at 1:12 PM, Dieter BSD dieter...@gmail.com wrote: It is inexcusable that FreeBSD defaults to leaving the write cache on for SATA and PATA drives. This was completely driven by the need to satisfy idiotic benchmarkers, tech writers, and system administrators. It was a huge deal

Re: IBM blade server abysmal disk write performances

2013-01-18 Thread Wojciech Puchar
and anyone who enabled SATA WC or complained about I/O slowness would be forced into Siberian salt mines for the remainder of their lives. [...] So reserve a place for me there.

Re: IBM blade server abysmal disk write performances

2013-01-18 Thread Ian Lepore
On Fri, 2013-01-18 at 22:18 +0100, Wojciech Puchar wrote: and anyone who enabled SATA WC or complained about I/O slowness would be forced into Siberian salt mines for the remainder of their lives. so reserve a place for me there. Yeah, me too. I prefer to go for all-out performance with

Re: IBM blade server abysmal disk write performances

2013-01-18 Thread Peter Jeremy
On 2013-Jan-18 12:12:11 -0800, Dieter BSD dieter...@gmail.com wrote: adding hw.ata.wc=0 to /boot/loader.conf. The bigger problem is that FreeBSD does not support queuing on all controllers that support it. Not something that admins can fix, and inexcusable for an OS that claims to care about

Re: IBM blade server abysmal disk write performances

2013-01-18 Thread Karim Fodil-Lemelin
On 18/01/2013 10:16 AM, Mark Felder wrote: On Thu, 17 Jan 2013 16:12:17 -0600, Karim Fodil-Lemelin fodillemlinka...@gmail.com wrote: SAS controllers may connect to SATA devices, either directly connected using native SATA protocol or through SAS expanders using SATA Tunneled Protocol (STP).

Re: IBM blade server abysmal disk write performances

2013-01-18 Thread Matthew Jacob
This is all turning into a bikeshed discussion. As far as I can tell, the basic original question was why a *SAS* (not a SATA) drive was not performing as well as expected based upon experiences with Linux. I still don't know whether reads or writes were being used for dd. This morning, I ran

Re: IBM blade server abysmal disk write performances

2013-01-18 Thread Matthew Jacob
mpt0: LSILogic SAS/SATA Adapter port 0x1000-0x10ff mem 0x9991-0x99913fff,0x9990-0x9990 irq 28 at device 0.0 on pci11 mpt0: MPI Version=1.5.20.0 mpt0: Capabilities: ( RAID-0 RAID-1E RAID-1 ) mpt0: 0 Active Volumes (2 Max) mpt0: 0 Hidden Drive Members (14 Max) Ah. Historically IBM

Re: IBM blade server abysmal disk write performances

2013-01-18 Thread Dieter BSD
Scott writes: If I had my way, the WC would be off, everyone would be using SAS, and anyone who enabled SATA WC or complained about I/O slowness would be forced into Siberian salt mines for the remainder of their lives. Actually, If you are running SAS, having SATA WC on or off wouldn't

Re: IBM blade server abysmal disk write performances

2013-01-18 Thread Karim Fodil-Lemelin
On 18/01/2013 5:42 PM, Matthew Jacob wrote: This is all turning into a bikeshed discussion. As far as I can tell, the basic original question was why a *SAS* (not a SATA) drive was not performing as well as expected based upon experiences with Linux. I still don't know whether reads or writes

Re: IBM blade server abysmal disk write performances

2013-01-18 Thread Dieter BSD
Matthew writes: There is also no information in the original email as to which direction the I/O was being sent. In one of the followups, Karim reported: # dd if=/dev/zero of=foo count=10 bs=1024000 10+0 records in 10+0 records out 10240000 bytes transferred in 19.615134 secs (522046

Re: IBM blade server abysmal disk write performances

2013-01-18 Thread Adrian Chadd
On 18 January 2013 19:11, Dieter BSD dieter...@gmail.com wrote: Matthew writes: There is also no information in the original email as to which direction the I/O was being sent. In one of the followups, Karim reported: # dd if=/dev/zero of=foo count=10 bs=1024000 10+0 records in 10+0

Re: IBM blade server abysmal disk write performances

2013-01-17 Thread Wojciech Puchar
Note that the driver says Command Queueing enabled without specifying which. If the driver is trying to use SATA's NCQ but the drive only speaks SCSI's TCQ, that could explain it. Or if the TCQ isn't working for some other reason. [...] Even without TCQ, NCQ, and write cache, the write speed is really

Re: IBM blade server abysmal disk write performances

2013-01-17 Thread Karim Fodil-Lemelin
On 16/01/2013 2:48 AM, Dieter BSD wrote: Karim writes: It is quite obvious that something is awfully slow on SAS drives, whatever it is and regardless of OS comparison. We swapped the SAS drives for SATA and we're seeing much higher speeds. Basically on par with what we were expecting (roughly

Re: IBM blade server abysmal disk write performances

2013-01-17 Thread Dieter BSD
I am thinking that something fancy in that SAS drive is not being handled correctly by the FreeBSD driver. I think so too, and I think the something fancy is tagged command queuing. The driver prints da0: Command Queueing enabled and yet your SAS drive is only getting 1 write per rev, and

Re: IBM blade server abysmal disk write performances

2013-01-17 Thread Adrian Chadd
When you run gstat, how many ops/sec are you seeing? Adrian On 17 January 2013 20:03, Dieter BSD dieter...@gmail.com wrote: I am thinking that something fancy in that SAS drive is not being handled correctly by the FreeBSD driver. I think so too, and I think the something fancy is tagged

Re: IBM blade server abysmal disk write performances

2013-01-17 Thread Matthew Jacob
On 1/17/2013 8:03 PM, Dieter BSD wrote: I think it is time to ask the driver wizards why TCQ isn't working, so I'm cc-ing the authors listed on the mpt man page. It is the MPT firmware that implements SATL, but there are probably tweaks that the FreeBSD driver doesn't do that the Linux

Re: IBM blade server abysmal disk write performances

2013-01-15 Thread Matthew D. Fuller
On Tue, Jan 15, 2013 at 09:12:14AM -0500 I heard the voice of Karim Fodil-Lemelin, and lo! it spake thus: da0: IBM-ESXS HUC106030CSS60 D3A6 Fixed Direct Access SCSI-6 device That's a 10k RPM drive. FreeBSD 9.1: 10000+0 records in 10000+0 records out 5120000 bytes transferred in

Re: IBM blade server abysmal disk write performances

2013-01-15 Thread Matthew D. Fuller
Dur... 10k ops in 2 seconds is 300k per second. RPM I mean...

Re: IBM blade server abysmal disk write performances

2013-01-15 Thread Mark Felder
On Tue, 15 Jan 2013 08:12:14 -0600, Karim Fodil-Lemelin fodillemlinka...@gmail.com wrote: Hi, I'm struggling getting FreeBSD 9.1 properly work on an IBM blade server (HS22). Here is a dd output from Linux CentOS vs FreeBSD 9.1. GNU dd is heavily buffered unless you tell it not to be.
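The buffering point can be illustrated with a small sketch. It is not the dd invocation used in the thread; it is a hypothetical helper, `timed_write`, that writes the same blocks once through the OS cache and once with `os.fsync()` after every block. It assumes a regular file on a local filesystem, and note that `fsync` only asks the OS to push data to the device; whether the drive's own write cache is flushed depends on the OS, filesystem, and drive.

```python
import os
import tempfile
import time


def timed_write(path, block_size, count, sync_each):
    """Write `count` zero-filled blocks to `path` and return bytes/second.

    With sync_each=True, os.fsync() is called after every block, asking the
    OS to push the data to the device before the next write is issued.
    """
    block = bytes(block_size)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    start = time.perf_counter()
    try:
        for _ in range(count):
            os.write(fd, block)
            if sync_each:
                os.fsync(fd)
    finally:
        os.close(fd)
    elapsed = time.perf_counter() - start
    return block_size * count / elapsed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "ddtest.bin")  # hypothetical scratch file
        for sync_each in (False, True):
            rate = timed_write(target, 512, 200, sync_each)
            print(f"sync_each={sync_each}: {rate:,.0f} bytes/s")
```

On most systems the buffered run reports a far higher rate than the synced run, which is why a buffered dd figure and an unbuffered one are not directly comparable.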

Re: IBM blade server abysmal disk write performances

2013-01-15 Thread Wojciech Puchar
10000+0 records out 5120000 bytes transferred in 60.024997 secs (85298 bytes/sec) 10000 ops in 60 seconds is practically the definition of a 10k drive. [...] Nonsense.

Re: IBM blade server abysmal disk write performances

2013-01-15 Thread Tim Kientzle
On Jan 15, 2013, at 6:12 AM, Karim Fodil-Lemelin wrote: Hi, I'm struggling getting FreeBSD 9.1 properly work on an IBM blade server (HS22). Here is a dd output from Linux CentOS vs FreeBSD 9.1. CentOS: 10+0 records in 10+0 records out 5120 bytes (51 MB) copied, 1.97883

Re: IBM blade server abysmal disk write performances

2013-01-15 Thread Wojciech Puchar
10000+0 records in 10000+0 records out 5120000 bytes transferred in 60.024997 secs (85298 bytes/sec) What exactly was the 'dd' command you used? In particular, what block size did you specify? [...] 5120000/10000=512, the default. If it takes one revolution for one write, it means that write caching
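The one-write-per-revolution reasoning can be checked with a little arithmetic. The sketch below assumes a 10,000 RPM spindle (the figure given earlier in the thread for this drive) and dd's default 512-byte block size; the function name is illustrative only.

```python
def one_write_per_rev_throughput(rpm, block_size):
    """Return (writes per second, bytes per second) when each write
    has to wait for one full platter revolution."""
    revs_per_sec = rpm / 60.0
    return revs_per_sec, revs_per_sec * block_size


ops, rate = one_write_per_rev_throughput(rpm=10_000, block_size=512)
print(f"{ops:.1f} writes/s, {rate:.0f} bytes/s")
# prints "166.7 writes/s, 85333 bytes/s"
```

That is within about 0.1% of the 85298 bytes/sec that dd reported, which is what one would expect if every 512-byte write waits for a full revolution.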

Re: IBM blade server abysmal disk write performances

2013-01-15 Thread Karim Fodil-Lemelin
On 15/01/2013 3:03 PM, Dieter BSD wrote: Disabling the disk's write cache is *required* for data integrity. One op per rev means write caching is disabled and no queueing. But dmesg claims Command Queueing enabled, so you should be getting more than one op per rev, and writes should be fast. Is

Re: IBM blade server abysmal disk write performances

2013-01-15 Thread Adrian Chadd
Hi, You're only doing one IO at the end. That's just plain silly. There's all kinds of overhead that could show up, that would be amortized over doing many IOs. You should also realise that the raw disk IO on Linux is by default buffered, so you're hitting the buffer cache. The results aren't

Re: IBM blade server abysmal disk write performances

2013-01-15 Thread Wojciech Puchar
# dd if=/dev/zero of=foo count=1 bs=1024 1+0 records in 1+0 records out 1024 bytes transferred in 19.579077 secs (523007 bytes/sec) you write to file not device, so it will be clustered anyway by FreeBSD. 128kB by default, more if you put options MAXPHYS=... in kernel config and

Re: IBM blade server abysmal disk write performances

2013-01-15 Thread Matthew D. Fuller
On Tue, Jan 15, 2013 at 12:03:33PM -0800 I heard the voice of Dieter BSD, and lo! it spake thus: But dmesg claims Command Queueing enabled, so you should be getting more than one op per rev, and writes should be fast. Queueing would only help if your load threw multiple ops at the drive

Re: IBM blade server abysmal disk write performances

2013-01-15 Thread Karim Fodil-Lemelin
On 15/01/2013 3:55 PM, Adrian Chadd wrote: You're only doing one IO at the end. That's just plain silly. There's all kinds of overhead that could show up, that would be amortized over doing many IOs. You should also realise that the raw disk IO on Linux is by default buffered, so you're hitting

Re: IBM blade server abysmal disk write performances

2013-01-15 Thread Karim Fodil-Lemelin
On 15/01/2013 4:54 PM, Wojciech Puchar wrote: # dd if=/dev/zero of=foo count=1 bs=1024 1+0 records in 1+0 records out 1024 bytes transferred in 19.579077 secs (523007 bytes/sec) you write to file not device, so it will be clustered anyway by FreeBSD. 128kB by default, more if you put

Re: IBM blade server abysmal disk write performances

2013-01-15 Thread Dieter BSD
Karim writes: dd to the raw drive and no compression/encryption or some other features, just a naive boot off a live 9.1 CD then dd (see below). The following results have been gathered on the FreeBSD 9.1 system: # dd if=/dev/zero of=toto count=100 100+0 records in 100+0 records out 51200

Re: IBM blade server abysmal disk write performances

2013-01-15 Thread Dieter BSD
I wrote: The kernel must be doing write-behind even to a raw disk, otherwise waiting for write(2) to return before issuing the next write would slow it down as Matthew suggests. And a minute after hitting send, I remembered that FreeBSD does not provide the traditional raw disk devices, e.g.

Re: IBM blade server abysmal disk write performances

2013-01-15 Thread Dieter BSD
25.9 MB/s Even Linux is pretty slow. Transfer rates:
  outside: 102400 kbytes in 0.685483 sec = 149384 kbytes/sec
  middle:  102400 kbytes in 0.747424 sec = 137004 kbytes/sec
  inside:  102400 kbytes in 1.051036 sec =  97428 kbytes/sec
That's more
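As a sanity check on the quoted transfer rates, the sketch below recomputes them from the sizes and timings in the message. The variable and function names are illustrative, and the zone labels are taken from the message as written.

```python
# (zone, kbytes transferred, seconds) as quoted in the message above
samples = [
    ("outside", 102400, 0.685483),
    ("middle", 102400, 0.747424),
    ("inside", 102400, 1.051036),
]


def kbytes_per_sec(kbytes, seconds):
    """Sustained transfer rate for one measurement."""
    return kbytes / seconds


for zone, kbytes, seconds in samples:
    rate = kbytes_per_sec(kbytes, seconds)
    print(f"{zone:8s}{rate:10.0f} kbytes/sec")
```

The recomputed values round to the quoted 149384, 137004, and 97428 kbytes/sec, all several times higher than the 25.9 MB/s figure the message calls slow.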

Re: IBM blade server abysmal disk write performances

2013-01-15 Thread Karim Fodil-Lemelin
On 15/01/2013 4:54 PM, Wojciech Puchar wrote: # dd if=/dev/zero of=foo count=1 bs=1024 1+0 records in 1+0 records out 1024 bytes transferred in 19.579077 secs (523007 bytes/sec) you write to file not device, so it will be clustered anyway by FreeBSD. 128kB by default, more if you put

Re: IBM blade server abysmal disk write performances

2013-01-15 Thread Ian Lepore
On Tue, 2013-01-15 at 15:28 -0500, Karim Fodil-Lemelin wrote: On 15/01/2013 3:03 PM, Dieter BSD wrote: Disabling the disk's write cache is *required* for data integrity. One op per rev means write caching is disabled and no queueing. But dmesg claims Command Queueing enabled, so you should

Re: IBM blade server abysmal disk write performances

2013-01-15 Thread Wojciech Puchar
The kernel must be doing write-behind even to a raw disk, otherwise waiting for write(2) to return before issuing the next write would slow it down as Matthew suggests. And a minute after hitting send, I remembered that FreeBSD does not provide the traditional raw disk devices, e.g. /dev/rda0

Re: IBM blade server abysmal disk write performances

2013-01-15 Thread Wojciech Puchar
Transfer rates:
  outside: 102400 kbytes in 0.685483 sec = 149384 kbytes/sec
  middle:  102400 kbytes in 0.747424 sec = 137004 kbytes/sec
  inside:  102400 kbytes in 1.051036 sec =  97428 kbytes/sec
this is right. Yet we get only a tiny fraction of those

Re: IBM blade server abysmal disk write performances

2013-01-15 Thread Dieter BSD
Karim writes: It is quite obvious that something is awfully slow on SAS drives, whatever it is and regardless of OS comparison. We swapped the SAS drives for SATA and we're seeing much higher speeds. Basically on par with what we were expecting (roughly 300 to 400 times faster than what we