Re: [ceph-users] odd performance graph
Hi,

On 02.12.2013 04:27, James Harper wrote:
>>> The low points are all ~35 MB/s and the high points are all ~60 MB/s. This is very reproducible. It occurred to me that just stopping the OSDs selectively would allow me to see if there was a change when one was ejected, but at no time was there a change to the graph...
>>
>> Did you configure the pool with 3 copies and try to run the benchmark test with only one OSD? Can you reproduce the values for each OSD?
>
> I'll have to do that after hours. I'm not seeing this across all VMs, though, so I think it's a bit hit and miss (the graph is constant for that one VM). What I did do was shut down each OSD selectively, and there was no change to the graph.
>
> One thing I hadn't considered is that this VM is running on a physical host which has a different network setup - using LACP across 2 ports. I suspect that the combination of connections and the way LACP works means that sometimes the data goes across one network port (although I don't understand why I'm only getting ~30 MB/s per port in that case). I'm going to recable things at some point soon, so I'll revisit it after that.
>
> James

I also noticed a graph like this once when I benchmarked a w2k8 guest on Ceph with RBD. To me it looked like throughput is lower where the space on the drive is actually used, and reads are very fast where the space read by RBD is unused. I don't know how RBD works internally, but I think RBD returns zeros without a real OSD disk read when the block/sector of the RBD disk is unallocated. That would explain the graph you see.

You can try adding a second RBD image, leaving it unformatted and unused, and benchmark that disk; then make a filesystem on it, write some data, and benchmark again...

--
Kind regards,

Florian Wiessner
Smart Weblications GmbH
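A rough sketch of the experiment Florian describes, run from a Linux host with the kernel RBD client so Windows and HDTach are out of the picture (image name, pool and sizes are illustrative, not from this thread):

    # Create a fresh 10 GB image in the default rbd pool (--size is in MB)
    rbd create --size 10240 perftest

    # Map it with the kernel client; it appears as e.g. /dev/rbd1
    rbd map perftest

    # 1) Read benchmark while the image is still completely unwritten
    fio --name=unwritten --filename=/dev/rbd1 --rw=read --bs=4M \
        --ioengine=libaio --direct=1 --runtime=60

    # 2) Fill the image (writing zeros still allocates the backing objects),
    #    then read again; this run should reflect real reads from the OSDs
    dd if=/dev/zero of=/dev/rbd1 bs=4M oflag=direct
    fio --name=written --filename=/dev/rbd1 --rw=read --bs=4M \
        --ioengine=libaio --direct=1 --runtime=60

If the "unwritten" run is much faster than the "written" run, that supports the thin-provisioning explanation.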
Re: [ceph-users] odd performance graph
> I don't know how RBD works internally, but I think RBD returns zeros without a real OSD disk read when the block/sector of the RBD disk is unallocated. That would explain the graph you see. You can try adding a second RBD image, leaving it unformatted and unused, and benchmark that disk; then make a filesystem on it, write some data, and benchmark again...

When performance testing RBDs I generally write in the whole area before doing any testing to avoid this problem. It would be interesting to have confirmation that this is a real concern with Ceph. I know it is in other thin-provisioned storage, for example VMware. Perhaps someone more expert can comment.

Also, is there any way to shortcut the write-in process? Writing in TBs of RBD image can really extend the length of our performance test cycle. It would be great if there were some shortcut to cause Ceph to treat the whole RBD as having already been written, or to fetch data from disk on all reads regardless of whether that area had been written, just for testing purposes.
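I'm not aware of a flag that makes Ceph treat an unwritten image as already allocated, so the write-in itself has to happen; one thing that may shorten it is running several sequential writers in parallel, each covering its own slice of the device. A sketch for a hypothetical 1 TiB image mapped as /dev/rbd0 (device name and sizes are assumptions):

    # Four parallel sequential writers, each filling its own 256 GiB slice
    fio --name=prefill --filename=/dev/rbd0 --rw=write --bs=4M \
        --ioengine=libaio --direct=1 --iodepth=16 \
        --numjobs=4 --size=256g --offset_increment=256g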
Re: [ceph-users] odd performance graph
On 12/02/2013 05:06 PM, Gruher, Joseph R wrote:
>> I don't know how RBD works internally, but I think RBD returns zeros without a real OSD disk read when the block/sector of the RBD disk is unallocated. That would explain the graph you see. You can try adding a second RBD image, leaving it unformatted and unused, and benchmark that disk; then make a filesystem on it, write some data, and benchmark again...
>
> When performance testing RBDs I generally write in the whole area before doing any testing to avoid this problem. It would be interesting to have confirmation that this is a real concern with Ceph. I know it is in other thin-provisioned storage, for example VMware. Perhaps someone more expert can comment.
>
> Also, is there any way to shortcut the write-in process? Writing in TBs of RBD image can really extend the length of our performance test cycle. It would be great if there were some shortcut to cause Ceph to treat the whole RBD as having already been written, or to fetch data from disk on all reads regardless of whether that area had been written, just for testing purposes.

For our internal testing, we always write data out in its entirety before doing reads as well. Not doing so will show inaccurate results, as you've noticed.

Mark
Re: [ceph-users] odd performance graph
> I also noticed a graph like this once when I benchmarked a w2k8 guest on Ceph with RBD. To me it looked like throughput is lower where the space on the drive is actually used, and reads are very fast where the space read by RBD is unused. I don't know how RBD works internally, but I think RBD returns zeros without a real OSD disk read when the block/sector of the RBD disk is unallocated. That would explain the graph you see. You can try adding a second RBD image, leaving it unformatted and unused, and benchmark that disk; then make a filesystem on it, write some data, and benchmark again...

That makes a lot of sense, although I wonder why the performance is that different in the 'used' vs 'unused' areas of the disk...

James
Re: [ceph-users] odd performance graph
Hi,

> The low points are all ~35 MB/s and the high points are all ~60 MB/s. This is very reproducible. It occurred to me that just stopping the OSDs selectively would allow me to see if there was a change when one was ejected, but at no time was there a change to the graph...

Did you configure the pool with 3 copies and try to run the benchmark test with only one OSD? Can you reproduce the values for each OSD?

Also, while doing the benchmarks, check the native I/O performance on the Linux side with e.g. iostat (disk) or iperf (network). Additionally, you can use other benchmark tools like bonnie, fio or Ceph's own benchmarks on Linux, to get values that are not filtered through the abstract storage layer of a Windows virtual machine running HDTach.

regards
Danny
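A sketch of what those native-side checks could look like (device names, peer address and pool name are assumptions):

    # Per-disk utilisation and latency on an OSD host while the guest benchmark runs
    iostat -x 5

    # Raw network throughput to another host (run "iperf -s" on the peer first)
    iperf -c 192.168.0.2 -t 30

    # Sequential read from a Linux client against a mapped RBD device
    fio --name=seqread --filename=/dev/rbd0 --rw=read --bs=4M \
        --ioengine=libaio --direct=1 --iodepth=16 --runtime=60

    # Ceph's own benchmark against the pool backing the images
    rados bench -p rbd 60 write --no-cleanup
    rados bench -p rbd 60 seq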
Re: [ceph-users] odd performance graph
> Hi,
>
>> The low points are all ~35 MB/s and the high points are all ~60 MB/s. This is very reproducible. It occurred to me that just stopping the OSDs selectively would allow me to see if there was a change when one was ejected, but at no time was there a change to the graph...
>
> Did you configure the pool with 3 copies and try to run the benchmark test with only one OSD? Can you reproduce the values for each OSD?

I'll have to do that after hours. I'm not seeing this across all VMs, though, so I think it's a bit hit and miss (the graph is constant for that one VM). What I did do was shut down each OSD selectively, and there was no change to the graph.

One thing I hadn't considered is that this VM is running on a physical host which has a different network setup - using LACP across 2 ports. I suspect that the combination of connections and the way LACP works means that sometimes the data goes across one network port (although I don't understand why I'm only getting ~30 MB/s per port in that case). I'm going to recable things at some point soon, so I'll revisit it after that.

James
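Whether LACP is collapsing the Ceph traffic onto one link can be checked without recabling, by looking at the bond's transmit hash policy and the per-slave counters; a minimal sketch, assuming the Linux bonding driver with a bond named bond0 and slaves eth2/eth3 (all names are assumptions):

    # Bonding mode, LACP partner state and the current transmit hash policy
    cat /proc/net/bonding/bond0

    # Per-slave byte counters; if only one slave's counters grow while the
    # benchmark runs, every connection is hashing onto the same port
    ip -s link show eth2
    ip -s link show eth3

    # Note: the default xmit_hash_policy (layer2) keys on MAC addresses, so all
    # traffic between two hosts rides one slave; layer3+4 hashes per TCP flow
    # and usually spreads the client's many OSD connections across both ports.
    # It is set in the bonding options, e.g. xmit_hash_policy=layer3+4.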
[ceph-users] odd performance graph
I ran HDTach on one of my VMs and got a graph that looks like this:

___--

The low points are all ~35 MB/s and the high points are all ~60 MB/s. This is very reproducible.

HDTach samples reads across the whole disk, so would I be right in thinking that the variation is due to PGs being on different OSDs, and that there is a difference in performance because of a difference between my OSDs?

Is there a way for me to identify which OSDs are letting me down here? Presently I have 3:

#1 - xfs with isize=256
#2 - xfs with isize=2048
#3 - btrfs

I have my suspicions about which one is dragging the chain, but how could I confirm it?

Thanks

James
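A couple of ways one might try to narrow this down (the object name below is illustrative; the real prefix comes from "rbd info <image>"):

    # Ask each OSD to run its built-in write benchmark (1 GB of 4 MB writes by
    # default) and compare the throughput each one reports
    ceph tell osd.0 bench
    ceph tell osd.1 bench
    ceph tell osd.2 bench

    # Find out which PG and OSDs back a particular RADOS object of the image
    ceph osd map rbd rb.0.1234.000000000000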
Re: [ceph-users] odd performance graph
> I ran HDTach on one of my VMs and got a graph that looks like this:
>
> ___--
>
> The low points are all ~35 MB/s and the high points are all ~60 MB/s. This is very reproducible.
>
> HDTach samples reads across the whole disk, so would I be right in thinking that the variation is due to PGs being on different OSDs, and that there is a difference in performance because of a difference between my OSDs?
>
> Is there a way for me to identify which OSDs are letting me down here? Presently I have 3:
>
> #1 - xfs with isize=256
> #2 - xfs with isize=2048
> #3 - btrfs
>
> I have my suspicions about which one is dragging the chain, but how could I confirm it?

It occurred to me that just stopping the OSDs selectively would allow me to see if there was a change when one was ejected, but at no time was there a change to the graph...

James
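When stopping an OSD purely for a test like this, it helps to set the noout flag first so the cluster doesn't start rebalancing data onto the remaining OSDs mid-benchmark; a minimal sketch (the service command syntax varies between sysvinit, upstart and systemd deployments):

    # Prevent the stopped OSD from being marked out and data from rebalancing
    ceph osd set noout

    # Stop one OSD (sysvinit syntax shown), re-run the benchmark, then restore it
    service ceph stop osd.2
    service ceph start osd.2

    # Clear the flag afterwards
    ceph osd unset noout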