Re: O_DIRECT to md raid 6 is slow

2012-08-21 Thread Stan Hoeppner
On 8/21/2012 9:51 AM, Miquel van Smoorenburg wrote: > On 08/20/2012 01:34 AM, Stan Hoeppner wrote: >> I'm glad you jumped in David. You made a critical statement of fact >> below which clears some things up. If you had stated it early on, >> before Miquel stole the thread and moved it to LKML

Re: O_DIRECT to md raid 6 is slow

2012-08-21 Thread Miquel van Smoorenburg
On 08/20/2012 01:34 AM, Stan Hoeppner wrote: I'm glad you jumped in David. You made a critical statement of fact below which clears some things up. If you had stated it early on, before Miquel stole the thread and moved it to LKML proper, it would have short circuited a lot of this discussion.

Re: O_DIRECT to md raid 6 is slow

2012-08-21 Thread Stan Hoeppner
On 8/21/2012 9:51 AM, Miquel van Smoorenburg wrote: On 08/20/2012 01:34 AM, Stan Hoeppner wrote: I'm glad you jumped in David. You made a critical statement of fact below which clears some things up. If you had stated it early on, before Miquel stole the thread and moved it to LKML proper,

Re: O_DIRECT to md raid 6 is slow

2012-08-20 Thread David Brown
On 20/08/2012 02:01, NeilBrown wrote: On Sun, 19 Aug 2012 18:34:28 -0500 Stan Hoeppner wrote: Since we are trying to set the record straight md/RAID6 must read all devices in a RMW cycle. md/RAID6 must read all data devices (i.e. not parity devices) which it is not going to write to,
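
For readers following the read-count argument, here is a rough sketch of the arithmetic, based only on what this 2012 thread states: md/RAID6 reads the data chunks it is not going to write, while the RAID5 small-write path described elsewhere in the thread reads the old data plus the old parity. The function names are hypothetical and none of this is md's actual code.

```python
def raid6_reconstruct_reads(data_disks: int, chunks_written: int) -> int:
    """Chunks read per stripe when P and Q are recomputed from all data.

    Assumption taken from the thread: every data chunk that is NOT being
    overwritten has to be read; the parity chunks are not read.
    """
    if not 0 <= chunks_written <= data_disks:
        raise ValueError("chunks_written must be between 0 and data_disks")
    return data_disks - chunks_written


def raid5_rmw_reads(chunks_written: int) -> int:
    """Chunks read per stripe for a RAID5 read-modify-write.

    The old contents of each chunk being overwritten, plus the old parity.
    """
    if chunks_written < 1:
        raise ValueError("a write must touch at least one chunk")
    return chunks_written + 1


# Hypothetical 6-disk RAID6 (4 data + 2 parity), one chunk rewritten:
print(raid6_reconstruct_reads(data_disks=4, chunks_written=1))  # 3
# A full-stripe write needs no reads at all:
print(raid6_reconstruct_reads(data_disks=4, chunks_written=4))  # 0
# RAID5 small write of one chunk:
print(raid5_rmw_reads(1))                                       # 2
```

The point of the comparison is that a full-stripe write avoids reads entirely, which is why the rest of the thread focuses on whether md sees whole stripes under O_DIRECT.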

Re: O_DIRECT to md raid 6 is slow

2012-08-20 Thread David Brown
On 20/08/2012 02:01, NeilBrown wrote: On Sun, 19 Aug 2012 18:34:28 -0500 Stan Hoeppner s...@hardwarefreak.com wrote: Since we are trying to set the record straight md/RAID6 must read all devices in a RMW cycle. md/RAID6 must read all data devices (i.e. not parity devices) which it is

Re: O_DIRECT to md raid 6 is slow

2012-08-19 Thread NeilBrown
On Sun, 19 Aug 2012 18:34:28 -0500 Stan Hoeppner wrote: > On 8/19/2012 9:01 AM, David Brown wrote: > > I'm sort of jumping in to this thread, so my apologies if I repeat > > things other people have said already. > > I'm glad you jumped in David. You made a critical statement of fact > below

Re: O_DIRECT to md raid 6 is slow

2012-08-19 Thread Stan Hoeppner
On 8/19/2012 9:01 AM, David Brown wrote: > I'm sort of jumping in to this thread, so my apologies if I repeat > things other people have said already. I'm glad you jumped in David. You made a critical statement of fact below which clears some things up. If you had stated it early on, before

Re: O_DIRECT to md raid 6 is slow

2012-08-19 Thread Stan Hoeppner
On 8/19/2012 9:01 AM, David Brown wrote: I'm sort of jumping in to this thread, so my apologies if I repeat things other people have said already. I'm glad you jumped in David. You made a critical statement of fact below which clears some things up. If you had stated it early on, before

Re: O_DIRECT to md raid 6 is slow

2012-08-19 Thread NeilBrown
On Sun, 19 Aug 2012 18:34:28 -0500 Stan Hoeppner s...@hardwarefreak.com wrote: On 8/19/2012 9:01 AM, David Brown wrote: I'm sort of jumping in to this thread, so my apologies if I repeat things other people have said already. I'm glad you jumped in David. You made a critical statement of

Re: O_DIRECT to md raid 6 is slow

2012-08-17 Thread Miquel van Smoorenburg
On 08/17/2012 09:31 AM, Stan Hoeppner wrote: On 8/16/2012 4:50 PM, Miquel van Smoorenburg wrote: I did a simple test: * created a 1G partition on 3 separate disks * created a md raid5 array with 512K chunksize: mdadm -C /dev/md0 -l 5 -c $((1024*512)) -n 3 /dev/sdb1 /dev/sdc1 /dev/sdd1 * ran

Re: O_DIRECT to md raid 6 is slow

2012-08-17 Thread Stan Hoeppner
On 8/16/2012 4:50 PM, Miquel van Smoorenburg wrote: > On 16-08-12 1:05 PM, Stan Hoeppner wrote: >> On 8/15/2012 6:07 PM, Miquel van Smoorenburg wrote: >>> Ehrm no. If you modify, say, a 4K block on a RAID5 array, you just have >>> to read that 4K block, and the corresponding 4K block on the >>>

Re: O_DIRECT to md raid 6 is slow

2012-08-17 Thread Stan Hoeppner
On 8/16/2012 4:50 PM, Miquel van Smoorenburg wrote: On 16-08-12 1:05 PM, Stan Hoeppner wrote: On 8/15/2012 6:07 PM, Miquel van Smoorenburg wrote: Ehrm no. If you modify, say, a 4K block on a RAID5 array, you just have to read that 4K block, and the corresponding 4K block on the parity drive,

Re: O_DIRECT to md raid 6 is slow

2012-08-16 Thread Miquel van Smoorenburg
On 16-08-12 1:05 PM, Stan Hoeppner wrote: On 8/15/2012 6:07 PM, Miquel van Smoorenburg wrote: Ehrm no. If you modify, say, a 4K block on a RAID5 array, you just have to read that 4K block, and the corresponding 4K block on the parity drive, recalculate parity, and write back 4K of data and 4K
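
The small-write path described in this message can be illustrated with a minimal sketch, assuming plain single-parity XOR as used by RAID5. The helper names and block contents are made up for illustration; this is not how md implements it internally.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equally sized blocks together (parity of a full stripe)."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        if len(block) != len(out):
            raise ValueError("blocks must be the same size")
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)


def rmw_parity(old_data: bytes, new_data: bytes, old_parity: bytes) -> bytes:
    """New parity after overwriting one data block.

    Only the old data block and the old parity block are needed:
    new_parity = old_parity XOR old_data XOR new_data.
    """
    return xor_blocks(old_parity, old_data, new_data)


BLOCK = 4096                       # the 4K block from the example above
d0 = bytes([0x11]) * BLOCK         # data block on disk 0 (hypothetical contents)
d1 = bytes([0x22]) * BLOCK         # data block on disk 1, never read in the RMW
parity = xor_blocks(d0, d1)        # parity as it currently sits on disk

new_d0 = bytes([0x5A]) * BLOCK     # the 4K being written
new_parity = rmw_parity(d0, new_d0, parity)

# Same result as recomputing parity from every data block in the stripe:
print(new_parity == xor_blocks(new_d0, d1))  # True
```

Two reads (old data, old parity) and two writes (new data, new parity) are enough; the untouched data block on the other disk is never read.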

Re: O_DIRECT to md raid 6 is slow

2012-08-16 Thread Stan Hoeppner
On 8/15/2012 6:07 PM, Miquel van Smoorenburg wrote: > In article you write: >> It's time to blow away the array and start over. You're already >> misaligned, and a 512KB chunk is insanely unsuitable for parity RAID, >> but for a handful of niche all streaming workloads with little/no >> rewrite,

Re: O_DIRECT to md raid 6 is slow

2012-08-16 Thread Roman Mamedov
On Wed, 15 Aug 2012 18:50:44 -0500 Stan Hoeppner wrote: > TTBOMK there are two, and only two, COW filesystems in existence: ZFS and > BTRFS. There is also NILFS2: http://www.nilfs.org/en/ And in general, any https://en.wikipedia.org/wiki/Log-structured_file_system is COW by design, but afaik

Re: O_DIRECT to md raid 6 is slow

2012-08-16 Thread Roman Mamedov
On Wed, 15 Aug 2012 18:50:44 -0500 Stan Hoeppner s...@hardwarefreak.com wrote: TTBOMK there are two, and only two, COW filesystems in existence: ZFS and BTRFS. There is also NILFS2: http://www.nilfs.org/en/ And in general, any https://en.wikipedia.org/wiki/Log-structured_file_system is COW

Re: O_DIRECT to md raid 6 is slow

2012-08-16 Thread Stan Hoeppner
On 8/15/2012 6:07 PM, Miquel van Smoorenburg wrote: In article xs4all.502c1c01.1040...@hardwarefreak.com you write: It's time to blow away the array and start over. You're already misaligned, and a 512KB chunk is insanely unsuitable for parity RAID, but for a handful of niche all streaming

Re: O_DIRECT to md raid 6 is slow

2012-08-15 Thread Andy Lutomirski
On Wed, Aug 15, 2012 at 4:50 PM, Stan Hoeppner wrote: > On 8/15/2012 5:10 PM, Andy Lutomirski wrote: >> On Wed, Aug 15, 2012 at 3:00 PM, Stan Hoeppner >> wrote: >>> On 8/15/2012 12:57 PM, Andy Lutomirski wrote: On Wed, Aug 15, 2012 at 4:50 AM, John Robinson wrote: > On 15/08/2012

Re: O_DIRECT to md raid 6 is slow

2012-08-15 Thread Stan Hoeppner
On 8/15/2012 5:10 PM, Andy Lutomirski wrote: > On Wed, Aug 15, 2012 at 3:00 PM, Stan Hoeppner wrote: >> On 8/15/2012 12:57 PM, Andy Lutomirski wrote: >>> On Wed, Aug 15, 2012 at 4:50 AM, John Robinson >>> wrote: On 15/08/2012 01:49, Andy Lutomirski wrote: > > If I do: > # dd

Re: O_DIRECT to md raid 6 is slow

2012-08-15 Thread Miquel van Smoorenburg
In article you write: >It's time to blow away the array and start over. You're already >misaligned, and a 512KB chunk is insanely unsuitable for parity RAID, >but for a handful of niche all streaming workloads with little/no >rewrite, such as video surveillance or DVR workloads. > >Yes, 512KB is

Re: O_DIRECT to md raid 6 is slow

2012-08-15 Thread Andy Lutomirski
On Wed, Aug 15, 2012 at 3:00 PM, Stan Hoeppner wrote: > On 8/15/2012 12:57 PM, Andy Lutomirski wrote: >> On Wed, Aug 15, 2012 at 4:50 AM, John Robinson >> wrote: >>> On 15/08/2012 01:49, Andy Lutomirski wrote: If I do: # dd if=/dev/zero of=/dev/md0p1 bs=8M >>> >>> [...] >>>

Re: O_DIRECT to md raid 6 is slow

2012-08-15 Thread Stan Hoeppner
On 8/15/2012 12:57 PM, Andy Lutomirski wrote: > On Wed, Aug 15, 2012 at 4:50 AM, John Robinson > wrote: >> On 15/08/2012 01:49, Andy Lutomirski wrote: >>> >>> If I do: >>> # dd if=/dev/zero of=/dev/md0p1 bs=8M >> >> [...] >> >>> It looks like md isn't recognizing that I'm writing whole stripes

Re: O_DIRECT to md raid 6 is slow

2012-08-15 Thread Andy Lutomirski
On Wed, Aug 15, 2012 at 4:50 AM, John Robinson wrote: > On 15/08/2012 01:49, Andy Lutomirski wrote: >> >> If I do: >> # dd if=/dev/zero of=/dev/md0p1 bs=8M > > [...] > >> It looks like md isn't recognizing that I'm writing whole stripes when >> I'm in O_DIRECT mode. > > > I see your md device is

Re: O_DIRECT to md raid 6 is slow

2012-08-15 Thread John Robinson
On 15/08/2012 01:49, Andy Lutomirski wrote: If I do: # dd if=/dev/zero of=/dev/md0p1 bs=8M [...] It looks like md isn't recognizing that I'm writing whole stripes when I'm in O_DIRECT mode. I see your md device is partitioned. Is the partition itself stripe-aligned? Cheers, John. -- To
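
John's alignment question comes down to a modulo check. The sketch below is only an illustration: the disk count, chunk size and start sectors are hypothetical, since the array geometry is not shown in this snippet, and the function names are invented here.

```python
SECTOR_BYTES = 512  # partition start offsets are commonly given in 512-byte sectors


def full_stripe_bytes(chunk_bytes: int, total_disks: int, parity_disks: int) -> int:
    """Bytes of data in one full stripe: chunk size times number of data disks."""
    data_disks = total_disks - parity_disks
    if data_disks < 1:
        raise ValueError("array needs at least one data disk")
    return chunk_bytes * data_disks


def is_stripe_aligned(start_sector: int, stripe_bytes: int) -> bool:
    """True if the partition begins exactly on a full-stripe boundary."""
    return (start_sector * SECTOR_BYTES) % stripe_bytes == 0


# Hypothetical layout: 6-disk RAID6 (2 parity disks) with a 512 KiB chunk.
stripe = full_stripe_bytes(512 * 1024, total_disks=6, parity_disks=2)
print(stripe)                           # 2097152 (2 MiB)
print(is_stripe_aligned(2048, stripe))  # False: a 1 MiB offset splits every stripe
print(is_stripe_aligned(4096, stripe))  # True: a 2 MiB offset lands on a boundary
```

If the partition start is not a multiple of the full stripe, even an 8M write that looks stripe-sized to the application straddles stripe boundaries on the array. On a running system the start sector can usually be read from sysfs (for example /sys/block/md0/md0p1/start), though that path is an assumption about the poster's setup.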

Re: O_DIRECT to md raid 6 is slow

2012-08-15 Thread Andy Lutomirski
On Wed, Aug 15, 2012 at 4:50 AM, John Robinson john.robin...@anonymous.org.uk wrote: On 15/08/2012 01:49, Andy Lutomirski wrote: If I do: # dd if=/dev/zero of=/dev/md0p1 bs=8M [...] It looks like md isn't recognizing that I'm writing whole stripes when I'm in O_DIRECT mode. I see your

Re: O_DIRECT to md raid 6 is slow

2012-08-15 Thread Stan Hoeppner
On 8/15/2012 12:57 PM, Andy Lutomirski wrote: On Wed, Aug 15, 2012 at 4:50 AM, John Robinson john.robin...@anonymous.org.uk wrote: On 15/08/2012 01:49, Andy Lutomirski wrote: If I do: # dd if=/dev/zero of=/dev/md0p1 bs=8M [...] It looks like md isn't recognizing that I'm writing whole

Re: O_DIRECT to md raid 6 is slow

2012-08-15 Thread Andy Lutomirski
On Wed, Aug 15, 2012 at 3:00 PM, Stan Hoeppner s...@hardwarefreak.com wrote: On 8/15/2012 12:57 PM, Andy Lutomirski wrote: On Wed, Aug 15, 2012 at 4:50 AM, John Robinson john.robin...@anonymous.org.uk wrote: On 15/08/2012 01:49, Andy Lutomirski wrote: If I do: # dd if=/dev/zero

Re: O_DIRECT to md raid 6 is slow

2012-08-15 Thread Miquel van Smoorenburg
In article xs4all.502c1c01.1040...@hardwarefreak.com you write: It's time to blow away the array and start over. You're already misaligned, and a 512KB chunk is insanely unsuitable for parity RAID, but for a handful of niche all streaming workloads with little/no rewrite, such as video

Re: O_DIRECT to md raid 6 is slow

2012-08-15 Thread Stan Hoeppner
On 8/15/2012 5:10 PM, Andy Lutomirski wrote: On Wed, Aug 15, 2012 at 3:00 PM, Stan Hoeppner s...@hardwarefreak.com wrote: On 8/15/2012 12:57 PM, Andy Lutomirski wrote: On Wed, Aug 15, 2012 at 4:50 AM, John Robinson john.robin...@anonymous.org.uk wrote: On 15/08/2012 01:49, Andy Lutomirski

Re: O_DIRECT to md raid 6 is slow

2012-08-15 Thread Andy Lutomirski
On Wed, Aug 15, 2012 at 4:50 PM, Stan Hoeppner s...@hardwarefreak.com wrote: On 8/15/2012 5:10 PM, Andy Lutomirski wrote: On Wed, Aug 15, 2012 at 3:00 PM, Stan Hoeppner s...@hardwarefreak.com wrote: On 8/15/2012 12:57 PM, Andy Lutomirski wrote: On Wed, Aug 15, 2012 at 4:50 AM, John Robinson

Re: Re: O_DIRECT to md raid 6 is slow

2012-08-14 Thread kedacomkernel
On 2012-08-15 09:12 Andy Lutomirski Wrote: >Ubuntu's 3.2.0-27-generic. I can test on a newer kernel tomorrow. I guess maybe miss the blk_plug function. Can you add this patch and retest. Move unplugging for direct I/O from around ->direct_IO() down to do_blockdev_direct_IO(). This implicitly

Re: O_DIRECT to md raid 6 is slow

2012-08-14 Thread Andy Lutomirski
Ubuntu's 3.2.0-27-generic. I can test on a newer kernel tomorrow. --Andy On Tue, Aug 14, 2012 at 6:07 PM, kedacomkernel wrote: > On 2012-08-15 08:49 Andy Lutomirski Wrote: >>If I do: >># dd if=/dev/zero of=/dev/md0p1 bs=8M >>then iostat -m 5 says: >> >>avg-cpu: %user %nice %system %iowait

Re: O_DIRECT to md raid 6 is slow

2012-08-14 Thread kedacomkernel
On 2012-08-15 08:49 Andy Lutomirski Wrote: >If I do: ># dd if=/dev/zero of=/dev/md0p1 bs=8M >then iostat -m 5 says: > >avg-cpu: %user %nice %system %iowait %steal %idle > 0.00 0.00 26.88 35.27 0.00 37.85 > >Device: tps MB_read/s MB_wrtn/s MB_read

Re: O_DIRECT to md raid 6 is slow

2012-08-14 Thread kedacomkernel
On 2012-08-15 08:49 Andy Lutomirski l...@amacapital.net Wrote: If I do: # dd if=/dev/zero of=/dev/md0p1 bs=8M then iostat -m 5 says: avg-cpu: %user %nice %system %iowait %steal %idle 0.00 0.00 26.88 35.27 0.00 37.85 Device: tps MB_read/s MB_wrtn/s

Re: O_DIRECT to md raid 6 is slow

2012-08-14 Thread Andy Lutomirski
Ubuntu's 3.2.0-27-generic. I can test on a newer kernel tomorrow. --Andy On Tue, Aug 14, 2012 at 6:07 PM, kedacomkernel kedacomker...@gmail.com wrote: On 2012-08-15 08:49 Andy Lutomirski l...@amacapital.net Wrote: If I do: # dd if=/dev/zero of=/dev/md0p1 bs=8M then iostat -m 5 says: avg-cpu:

Re: Re: O_DIRECT to md raid 6 is slow

2012-08-14 Thread kedacomkernel
On 2012-08-15 09:12 Andy Lutomirski l...@amacapital.net Wrote: Ubuntu's 3.2.0-27-generic. I can test on a newer kernel tomorrow. I guess maybe miss the blk_plug function. Can you add this patch and retest. Move unplugging for direct I/O from around ->direct_IO() down to do_blockdev_direct_IO().