Hi,
This problem occurs on FreeBSD. Here is a patch that tries to fix the issue.
The cause is that, under high memory pressure, ZFS wrongly evicts cached metadata before data during rewrite.

People who experience the same issue might be interested.

Attachment: zfs-rewrite.diff
Description: Binary data


James Pan


On August 19, 2014, at 8:35 PM, James Pan <[email protected]> wrote:

Hi Etienne,

Thank you for pointing out the known issue.

However, I found that the poor performance happens not only for unaligned writes but also for aligned writes (i.e., write block size == record size).
For example, I created a ZFS file system with a 64k record size and reran iozone with a 64k block size. Here is the result:
write throughput: 1.2GB/s
rewrite throughput: 572MB/s

rewrite is still much worse than write.
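
To put the gap in proportion, here is a minimal sketch that computes rewrite throughput as a fraction of write throughput. The KB/s figures are copied from the two iozone runs quoted in this thread; the function name and the run labels are only illustrative.

```python
# Throughput figures in KB/s, copied from the iozone output in this thread.
# Each entry is (write, rewrite). The labels are illustrative only.
RUNS = {
    "64 GB file on pool/share2 (recordsize 64K)": (1288669, 571936),
    "32 GB file on pool/share": (1135955, 184680),
}


def rewrite_ratio(write_kbps, rewrite_kbps):
    """Return rewrite throughput as a fraction of write throughput."""
    return rewrite_kbps / write_kbps


for label, (write_kbps, rewrite_kbps) in RUNS.items():
    ratio = rewrite_ratio(write_kbps, rewrite_kbps)
    print(f"{label}: rewrite is {ratio:.0%} of write")
```

In both runs rewrite stays below half of the write throughput, even when the iozone record length matches the dataset recordsize.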



spa# zfs get all pool/share2 | grep record                     
pool/share2  recordsize            64K                    local
spa# !102
/var/iozone -r 64k -s 64g -f /tank/pool/share2/iozone -i 0
Iozone: Performance Test of File I/O
        Version $Revision: 3.397 $
Compiled for 64 bit mode.
Build: freebsd 

Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
            Al Slater, Scott Rhine, Mike Wisner, Ken Goss
            Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
            Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
            Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
            Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
            Fabrice Bacchella, Zhenghua Xue, Qin Li, Darren Sawyer.
            Ben England.

Run began: Wed Aug 20 12:22:58 2014

Record Size 64 KB
File size set to 67108864 KB
Command line used: /var/iozone -r 64k -s 64g -f /tank/pool/share2/iozone -i 0
Output is in Kbytes/sec
Time Resolution = 0.000001 seconds.
Processor cache size set to 1024 Kbytes.
Processor cache line size set to 32 bytes.
File stride size set to 17 * record size.
                                                            random  random    bkwd   record   stride                                   
              KB  reclen   write rewrite    read    reread    read   write    read  rewrite     read   fwrite frewrite   fread  freread
        67108864      64 1288669  571936                                                                                            

iozone test complete.




 
Best Regards,



James Jiaming Pan


On Tuesday, August 19, 2014 4:04 AM, Etienne Dechamps <[email protected]> wrote:


Hi James,

I'm afraid this is a known issue. See the following for details:

https://github.com/zfsonlinux/zfs/issues/361 (this is for ZFS On Linux,
but the issue is common to all ZFS platforms)
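
The reply in this thread describes the known issue as affecting unaligned writes. As a rough illustration of what "unaligned" means here, the sketch below lists the records that a write covers only in part. It is a simplified model, not ZFS code: the function name is hypothetical, and it ignores file size, caching, and compression. The assumption is that a partially covered record that is not already cached has to be read before it can be rewritten, while a fully covered record does not.

```python
def partially_covered_records(offset, length, recordsize):
    """Return the indices of records that a write touches only in part.

    Only the first and last record of a contiguous write can be partial;
    every record in between is covered completely.
    """
    if length <= 0:
        return []
    end = offset + length                # one byte past the last byte written
    first = offset // recordsize         # index of the first record touched
    last = (end - 1) // recordsize       # index of the last record touched
    partial = set()
    if offset % recordsize != 0:
        partial.add(first)               # write starts inside a record
    if end % recordsize != 0:
        partial.add(last)                # write ends inside a record
    return sorted(partial)


RECORDSIZE = 64 * 1024

# Aligned 64k write: covers record 0 completely.
print(partially_covered_records(0, RECORDSIZE, RECORDSIZE))
# 64k write shifted by 4k: straddles records 0 and 1, both only in part.
print(partially_covered_records(4096, RECORDSIZE, RECORDSIZE))
```

Under this model the aligned iozone run reported in this thread (64k writes on a 64k recordsize) has no partially covered records, which is why its slow rewrite is not explained by alignment alone.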

On 14/08/2014 10:57, James Pan wrote:
>
>
>  Hi,
> Sorry to bother you, but I can’t find a ZFS user mailing list, so I am posting this email here.
>
> I recently ran an iozone test on a RAID-0 pool with 9 disks, and the rewrite performance is horrible.
>
> Write throughput is as expected, about 1.1GB/s, but rewrite (writing to an existing file)
> reaches only 184MB/s, which is far below write.
>
> zpool iostat shows that there are a lot of read operations during rewrite, while during write there are almost none.
> The read operations may have badly hurt the rewrite performance.
>
> I am not sure what it is reading during rewrite. My first guess is metadata, the old indirect blocks for example,
> but shouldn’t these blocks be cached in the ARC?
>
> To confirm this I printed a message in arc_evict(), and I didn’t see metadata bufs being evicted.
> Does this mean the metadata is all in cache?
>
> Any ideas or advice on how to improve it? Thanks in advance for your answer!
>
> BTW, my OS is FreeBSD 8.2.
>
>
>
> =========== iozone result ==============
> spa# /var/iozone -i 0 -r 64k -s 32G -f /tank/pool/share/iozone
>  Iozone: Performance Test of File I/O
>          Version $Revision: 3.397 $
>    Compiled for 64 bit mode.
>    Build: freebsd
>
>  Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
>                Al Slater, Scott Rhine, Mike Wisner, Ken Goss
>                Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
>                Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
>                Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
>                Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
>                Fabrice Bacchella, Zhenghua Xue, Qin Li, Darren Sawyer.
>                Ben England.
>
>  Run began: Fri Aug 15 08:28:26 2014
>
>  Record Size 64 KB
>  File size set to 33554432 KB
>  Command line used: /var/iozone -i 0 -r 64k -s 32G -f /tank/pool/share/iozone
>  Output is in Kbytes/sec
>  Time Resolution = 0.000001 seconds.
>  Processor cache size set to 1024 Kbytes.
>  Processor cache line size set to 32 bytes.
>  File stride size set to 17 * record size.
>                                                              random  random    bkwd  record  stride
>                KB  reclen  write rewrite    read    reread    read  write    read  rewrite    read  fwrite frewrite  fread  freread
>          33554432      64 1135955  184680
>
> iozone test complete.
>
>
> ============ zpool iostat result ========
> spa# zpool iostat pool 5
>                capacity     operations    bandwidth
> pool        alloc   free   read  write   read  write
> ----------  -----  -----  -----  -----  -----  -----
> pool        17.8G  8.14T    187    456  23.2M  56.2M
> pool        20.6G  8.14T      0  9.22K      0  1.14G
> pool        26.4G  8.13T      0  10.3K      0  1.27G
> pool        30.3G  8.13T      0  10.8K    204  1.33G
> pool        33.9G  8.12T  1.45K    898   185M   106M
> pool        35.7G  8.12T  1.33K  1.58K   169M   198M
> pool        34.6G  8.12T  1.48K  1.34K   188M   168M
> pool        33.7G  8.12T  1.36K  1.50K   173M   187M
>
>
> Best Regards,
>
>
>
>
> James Jiaming Pan

> _______________________________________________
> developer mailing list
> [email protected]
> http://lists.open-zfs.org/mailman/listinfo/developer
>


--
Etienne Dechamps



