Re: [zfs-discuss] Sequential reading/writing from large stripe faster on SVM than ZFS?

2007-10-29 Thread Robert Milkowski
Hello Roch,

Wednesday, October 24, 2007, 3:49:45 PM, you wrote:

RP I would suspect the checksum part of this (I do believe it's being
RP actively worked on) :

RP 6533726 single-threaded checksum & raidz2 parity
RP calculations limit write bandwidth on thumper


I guess it's single-threaded per pool - that's why the performance was
much better once I created multiple pools.
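
If I wanted to rule the checksum cost in or out completely, I suppose a
quick (untested) check would be to repeat the dd against a scratch dataset
with checksums disabled and compare - the dataset name below is made up:

# zfs create test/nocksum
# zfs set checksum=off test/nocksum
# dd if=/dev/zero of=/test/nocksum/q1 bs=128k &
# zpool iostat test 1

If the single-pool write rate jumps with checksum=off, that would point
straight at 6533726.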

Thanks for the info.


-- 
Best regards,
 Robert                            mailto:[EMAIL PROTECTED]
   http://milek.blogspot.com



Re: [zfs-discuss] Sequential reading/writing from large stripe faster on SVM than ZFS?

2007-10-24 Thread Roch - PAE

I would suspect the checksum part of this (I do believe it's being
actively worked on) :

6533726 single-threaded checksum & raidz2 parity calculations limit
write bandwidth on thumper

-r

Robert Milkowski writes:
  [Robert's original message of 2007-10-18 quoted in full; see the
  thread-starting post below.]

Re: [zfs-discuss] Sequential reading/writing from large stripe faster on SVM than ZFS?

2007-10-19 Thread Robert Milkowski
Hello Mario,

Friday, October 19, 2007, 5:37:07 PM, you wrote:

 The question is - why can't I get that kind of performance with a single ZFS
 pool (striping across all the disks)? Concurrency problem or something else?

MG Remember that ZFS is checksumming everything on reads and writes.

I know - but if I divide that pool into 6 smaller ones (see my 2nd
post) then the performance is much better. So it's not that the system
can't cope because of the checksums; it's rather that I can't get
that kind of performance from a single pool.

I know it's a dd test but anyway...
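
One thing worth watching next time the single-pool dd runs (just a rough
idea, I haven't captured this) is the per-CPU breakdown:

# dd if=/dev/zero of=/test/q1 bs=128k &
# mpstat 1

If one CPU sits pegged in sys while the others stay mostly idle, that
would fit a single-threaded stage in the write path rather than the raw
cost of checksumming.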

-- 
Best regards,
 Robert Milkowski   mailto:[EMAIL PROTECTED]
   http://milek.blogspot.com



Re: [zfs-discuss] Sequential reading/writing from large stripe faster on SVM than ZFS?

2007-10-19 Thread Mario Goebbels
 The question is - why can't I get that kind of performance with a single ZFS
 pool (striping across all the disks)? Concurrency problem or something else?

Remember that ZFS is checksumming everything on reads and writes.

-mg





[zfs-discuss] Sequential reading/writing from large stripe faster on SVM than ZFS?

2007-10-18 Thread Robert Milkowski
Hi,

snv_74, x4500, 48x 500GB, 16GB RAM, 2x dual core

# zpool create test c0t0d0 c0t1d0 c0t2d0 c0t3d0 c0t4d0 c0t5d0 c0t6d0 c0t7d0 
c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0 c1t7d0 c4t0d0 c4t1d0 c4t2d0 
c4t3d0 c4t4d0 c4t5d0 c4t6d0 c4t7d0 c5t1d0 c5t2d0 c5t3d0 c5t5d0 c5t6d0 c5t7d0 
c6t0d0 c6t1d0 c6t2d0 c6t3d0 c6t4d0 c6t5d0 c6t6d0 c6t7d0 c7t0d0 c7t1d0 c7t2d0 
c7t3d0 c7t4d0 c7t5d0 c7t6d0 c7t7d0
[46x 500GB]
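
(For reference, the same 46-disk list could be generated with a small loop
instead of being typed out - a sketch which assumes the two disks skipped
above, c5t0d0 and c5t4d0, are the boot disks:)

# disks=""
# for c in 0 1 4 5 6 7; do
>   for t in 0 1 2 3 4 5 6 7; do
>     d=c${c}t${t}d0
>     [ "$d" = c5t0d0 ] || [ "$d" = c5t4d0 ] || disks="$disks $d"
>   done
> done
# zpool create test $disks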

# ls -lh /test/q1
-rw-r--r--   1 root root 82G Oct 18 09:43 /test/q1

# dd if=/test/q1 of=/dev/null bs=16384k &
# zpool iostat 1
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
test         213G  20.6T    645    120  80.1M  14.7M
test 213G  20.6T  9.26K  0  1.16G  0
test 213G  20.6T  9.66K  0  1.21G  0
test 213G  20.6T  9.41K  0  1.18G  0
test 213G  20.6T  9.41K  0  1.18G  0
test 213G  20.6T  7.45K  0   953M  0
test 213G  20.6T  7.59K  0   971M  0
test 213G  20.6T  7.41K  0   948M  0
test 213G  20.6T  8.25K  0  1.03G  0
test 213G  20.6T  9.17K  0  1.15G  0
test 213G  20.6T  9.54K  0  1.19G  0
test 213G  20.6T  9.89K  0  1.24G  0
test 213G  20.6T  9.41K  0  1.18G  0
test 213G  20.6T  9.31K  0  1.16G  0
test 213G  20.6T  9.80K  0  1.22G  0
test 213G  20.6T  8.72K  0  1.09G  0
test 213G  20.6T  7.86K  0  1006M  0
test 213G  20.6T  7.21K  0   923M  0
test 213G  20.6T  7.62K  0   975M  0
test 213G  20.6T  8.68K  0  1.08G  0
test 213G  20.6T  9.81K  0  1.23G  0
test 213G  20.6T  9.57K  0  1.20G  0

So it's around 1GB/s.
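
(zpool iostat only gives the per-second view; for a single end-to-end
number one could also time a fixed amount of data - a sketch, not run
here, and the count is just an example. With bs=16384k, count=2048 reads
32GB, and 32GB divided by the elapsed time ptime reports gives the
average rate:)

# ptime dd if=/test/q1 of=/dev/null bs=16384k count=2048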

# dd if=/dev/zero of=/test/q10 bs=128k &
# zpool iostat 1
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
test         223G  20.6T    656    170  81.5M  20.8M
test 223G  20.6T  0  8.10K  0  1021M
test 223G  20.6T  0  7.94K  0  1001M
test 216G  20.6T  0  6.53K  0   812M
test 216G  20.6T  0  7.19K  0   906M
test 216G  20.6T  0  6.78K  0   854M
test 216G  20.6T  0  7.88K  0   993M
test 216G  20.6T  0  10.3K  0  1.27G
test 222G  20.6T  0  8.61K  0  1.04G
test 222G  20.6T  0  7.30K  0   919M
test 222G  20.6T  0  8.16K  0  1.00G
test 222G  20.6T  0  8.82K  0  1.09G
test 225G  20.6T  0  4.19K  0   511M
test 225G  20.6T  0  10.2K  0  1.26G
test 225G  20.6T  0  9.15K  0  1.13G
test 225G  20.6T  0  8.46K  0  1.04G
test 225G  20.6T  0  8.48K  0  1.04G
test 225G  20.6T  0  10.9K  0  1.33G
test 231G  20.6T  0  3  0  3.96K
test 231G  20.6T  0  0  0  0
test 231G  20.6T  0  0  0  0
test 231G  20.6T  0  9.02K  0  1.11G
test 231G  20.6T  0  12.2K  0  1.50G
test 231G  20.6T  0  9.14K  0  1.13G
test 231G  20.6T  0  10.3K  0  1.27G
test 231G  20.6T  0  9.08K  0  1.10G
test 237G  20.6T  0  0  0  0
test 237G  20.6T  0  0  0  0
test 237G  20.6T  0  6.03K  0   760M
test 237G  20.6T  0  9.18K  0  1.13G
test 237G  20.6T  0  8.40K  0  1.03G
test 237G  20.6T  0  8.45K  0  1.04G
test 237G  20.6T  0  11.1K  0  1.36G

Well, writing could be faster than reading here... I guess the gaps are due
to bug 6415647.
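
(The gaps line up with transaction group syncs; an untested DTrace sketch
like this, run while the dd is going, would show how long each spa_sync
takes and how the stalls are distributed:)

# dtrace -n 'fbt::spa_sync:entry { self->ts = timestamp }
    fbt::spa_sync:return /self->ts/ {
        @["spa_sync time (ms)"] = quantize((timestamp - self->ts) / 1000000);
        self->ts = 0; }'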


# zpool destroy test

# metainit d100 1 46 c0t0d0s0 c0t1d0s0 c0t2d0s0 c0t3d0s0 c0t4d0s0 c0t5d0s0 
c0t6d0s0 c0t7d0s0 c1t0d0s0 c1t1d0s0 c1t2d0s0 c1t3d0s0 c1t4d0s0 c1t5d0s0 
c1t6d0s0 c1t7d0s0 c4t0d0s0 c4t1d0s0 c4t2d0s0 c4t3d0s0 c4t4d0s0 c4t5d0s0 
c4t6d0s0 c4t7d0s0 c5t1d0s0 c5t2d0s0 c5t3d0s0 c5t5d0s0 c5t6d0s0 c5t7d0s0 
c6t0d0s0 c6t1d0s0 c6t2d0s0 c6t3d0s0 c6t4d0s0 c6t5d0s0 c6t6d0s0 c6t7d0s0 
c7t0d0s0 c7t1d0s0 c7t2d0s0 c7t3d0s0 c7t4d0s0 c7t5d0s0 c7t6d0s0 c7t7d0s0 -i 128k
d100: Concat/Stripe is setup
[46x 500GB]
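
(For the read side of the comparison I would read the raw metadevice
directly, something like the following, so no filesystem sits in the way -
a sketch, not necessarily exactly what was run here:)

# dd if=/dev/md/rdsk/d100 of=/dev/null bs=16384k &
# iostat -xnzCM 1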

And the results are not so good - a maximum of about 1GB/s when reading... hmm...

maxphys is 56K - I thought it was increased some time ago on x86!
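
(For reference, roughly how it can be checked and raised - treat this as a
sketch from memory: the mdb write takes effect immediately, the /etc/system
entries need a reboot, and SVM has its own md_maxphys cap on top.)

# echo 'maxphys/D' | mdb -k
# echo 'maxphys/W 0t1048576' | mdb -kw

In /etc/system:
set maxphys=1048576
set md:md_maxphys=1048576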

Still no performance increase.

# metainit d101 -r c0t0d0s0 c1t0d0s0 c4t0d0s0 c6t0d0s0 c7t0d0s0 -i 128k
# metainit d102 -r c0t1d0s0 c1t1d0s0 c5t1d0s0 c6t1d0s0 c7t1d0s0 -i 128k
# metainit d103 -r c0t2d0s0 c1t2d0s0 c5t2d0s0 c6t2d0s0 c7t2d0s0 -i 128k
# metainit d104 -r c0t4d0s0 c1t4d0s0 c4t4d0s0 c6t4d0s0 c7t4d0s0 -i 128k
# metainit d105 -r c0t3d0s0 c1t3d0s0 c4t3d0s0 c5t3d0s0 c6t3d0s0 

Re: [zfs-discuss] Sequential reading/writing from large stripe faster on SVM than ZFS?

2007-10-18 Thread Robert Milkowski
# zpool create t1 c0t0d0 c0t1d0 c0t2d0 c0t3d0 c0t4d0 c0t5d0 c0t6d0 c0t7d0
# zpool create t2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0 c1t7d0
# zpool create t3 c4t0d0 c4t1d0 c4t2d0 c4t3d0 c4t4d0 c4t5d0 c4t6d0 c4t7d0
# zpool create t4 c5t1d0 c5t2d0 c5t3d0 c5t5d0 c5t6d0 c5t7d0
# zpool create t5 c6t0d0 c6t1d0 c6t2d0 c6t3d0 c6t4d0 c6t5d0 c6t6d0 c6t7d0
# zpool create t6 c7t0d0 c7t1d0 c7t2d0 c7t3d0 c7t4d0 c7t5d0 c7t6d0 c7t7d0

# zfs set atime=off t1
# zfs set atime=off t2
# zfs set atime=off t3
# zfs set atime=off t4
# zfs set atime=off t5
# zfs set atime=off t6

# dd if=/dev/zero of=/t1/q1 bs=512k &
[1] 903
# dd if=/dev/zero of=/t2/q1 bs=512k &
[2] 908
# dd if=/dev/zero of=/t3/q1 bs=512k &
[3] 909
# dd if=/dev/zero of=/t4/q1 bs=512k &
[4] 910
# dd if=/dev/zero of=/t5/q1 bs=512k &
[5] 911
# dd if=/dev/zero of=/t6/q1 bs=512k &
[6] 912
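
(Equivalent, written as a loop:)

# for p in t1 t2 t3 t4 t5 t6; do
>   dd if=/dev/zero of=/$p/q1 bs=512k &
> done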



# zpool iostat 1
[...]
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
t1  20.1G  3.61T  0  3.19K  0   405M
t2  12.9G  3.61T  0  2.38K  0   302M
t3  8.51G  3.62T  0  2.79K  63.4K   357M
t4  5.19G  2.71T  0  1.39K  63.4K   170M
t5  1.96G  3.62T  0  2.65K  0   336M
t6  1.29G  3.62T  0  1.05K  63.4K   127M
----------  -----  -----  -----  -----  -----  -----
t1  20.1G  3.61T  0  3.77K  0   483M
t2  12.9G  3.61T  0  3.49K  0   446M
t3  8.51G  3.62T  0  2.36K  63.3K   295M
t4  5.19G  2.71T  0  2.84K  0   359M
t5  2.29G  3.62T  0 97  62.7K   494K
t6  1.29G  3.62T  0  4.03K  0   510M
----------  -----  -----  -----  -----  -----  -----

# iostat -xnzCM 1 | egrep "device| c[0-7]$"
[...]
                    extended device statistics
    r/s    w/s   Mr/s   Mw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0 5277.8    0.0  659.7  0.6 120.2    0.1   22.8   1 646 c0
    0.0 5625.7    0.0  703.2  0.1 116.7    0.0   20.7   0 691 c1
    0.0 4806.7    0.0  599.4  0.0  83.9    0.0   17.4   0 582 c4
    0.0 2457.4    0.0  307.2  3.3 134.9    1.3   54.9   2 600 c5
    0.0 3882.8    0.0  485.3  0.4 157.1    0.1   40.5   0 751 c7



So right now I'm getting up to 2.7GB/s.
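
(That figure is just the sum of the Mw/s column in the iostat snapshot
above: 659.7 + 703.2 + 599.4 + 307.2 + 485.3 = 2754.8 MB/s, roughly
2.7GB/s - and the c6 controller doesn't even appear in that sample.)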

It's still jumpy (I provided only peak outputs) but it's much better than one
large pool - let's try again:

# zpool create test c0t0d0 c0t1d0 c0t2d0 c0t3d0 c0t4d0 c0t5d0 c0t6d0 c0t7d0 
c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0 c1t7d0 c4t0d0 c4t1d0 c4t2d0 
c4t3d0 c4t4d0 c4t5d0 c4t6d0 c4t7d0 c5t1d0 c5t2d0 c5t3d0 c5t5d0 c5t6d0 c5t7d0 
c6t0d0 c6t1d0 c6t2d0 c6t3d0 c6t4d0 c6t5d0 c6t6d0 c6t7d0 c7t0d0 c7t1d0 c7t2d0 
c7t3d0 c7t4d0 c7t5d0 c7t6d0 c7t7d0

# zfs set atime=off test

# dd if=/dev/zero of=/test/q1 bs=512k &
# dd if=/dev/zero of=/test/q2 bs=512k &
# dd if=/dev/zero of=/test/q3 bs=512k &
# dd if=/dev/zero of=/test/q4 bs=512k &
# dd if=/dev/zero of=/test/q5 bs=512k &
# dd if=/dev/zero of=/test/q6 bs=512k &


# iostat -xnzCM 1 | egrep "device| c[0-7]$"
[...]
                    extended device statistics
    r/s    w/s   Mr/s   Mw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0 1891.9    0.0  233.0 11.7 13.5    6.2    7.1   3 374 c0
    0.0 1944.9    0.0  239.5 10.9 14.0    5.6    7.2   3 350 c1
    7.0 1897.9    0.1  233.0 11.3 13.3    5.9    7.0   3 339 c4
   13.0 1455.9    0.2  178.5 13.2  6.1    9.0    4.2   3 226 c5
    0.0 1921.9    0.0  236.0  8.1 10.7    4.2    5.5   2 322 c6
    0.0 1919.9    0.0  236.0  7.8 10.5    4.1    5.5   2 321 c7

So it's about 1.3GB/s - about half of what I get with more pools.
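
(Again summing the Mw/s column above: 233.0 + 239.5 + 233.0 + 178.5 +
236.0 + 236.0 = 1356 MB/s, i.e. about 1.3GB/s across all six controllers.)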


Looks like a scalability problem with a single pool.
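
The next step would probably be a kernel profile during the single-pool
run to see where the time goes - something along these lines (untested):

# lockstat -kIW -D 20 sleep 10

and then checking whether a single function or sync thread dominates while
the other CPUs sit idle.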
 
 