On Wed, Jan 17, 2018 at 8:19 PM, Ben Turner <btur...@redhat.com> wrote:
> Hi all. I am seeing the strangest thing, I have no explanation, and I was
> hoping I could get your input here. I have a RHEV / RHS setup (non-RHHI,
> just traditional) where we have two hypervisors with the exact same specs,
> mounting the same Gluster volume, and one sees half the performance inside
> the VM that the other does. For these tests we live migrated the VM from
> one HV to the other, testing back and forth. The perf we are seeing is:
>
> RHEV3 (Dell r620, 8 cores, 483 GB RAM): read - 210 MB/sec, write - 150 MB/sec
> RHEV5 (Dell r620, 8 cores, 483 GB RAM): read - 592 MB/sec, write - 280 MB/sec
>
> On identical HW, mounting the same volume, live migrating the VM back and
> forth, the perf inside the VM is about half the perf when run on the other
> HV! To eliminate the RHEV / VM layer we ran similar tests directly on the
> mount and saw almost the same throughput difference.
>
> After that we ran iperf tests; both systems got the same 9+ Gb/sec, and in
> fact the speeds were almost identical over iperf. After that the customer
> swapped cables / ports on the physical systems / router; again, same
> behavior. We compared and contrasted configs, NW stats, and driver / FW
> versions. Again, all the same. We both validated identical configs on
> BJeans, and I later went through sosreport for anything I missed. I am
> completely stumped. Does anyone have any ideas on where else to look?
> RHEV3 is using less memory and has fewer VMs running as well! While I
> think we have eliminated a RHEV / virt issue, I left the RHHI guys in CC
> in case they had any ideas.
>
> I am happy to provide any info / open a bug / whatever y'all think; any
> guidance on next steps would be appreciated. Note: this is a production
> cluster with 100s of VMs, so while we can test and move VMs around we
> can't just stop them all. One last observation: these HVs have a bunch of
> RAM and I think the VMs are heavily leveraging page cache on the HV.
RHV uses direct IO, so I don't see why they'd use a lot of page cache. I
would assume opening a case makes sense. I didn't really understand whether
it's a Gluster or an RHV issue, though. You should be able to run fio on the
hosts to eliminate those layers. I personally like
https://github.com/pcuzner/fio-tools

Y.

> Thanks in advance all!
>
> -b
>
> ---
> Note: This list is intended for discussions relating to Hyperconvergence
> and Red Hat Storage products, customers and/or support.
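As a starting point, a minimal fio job along the lines suggested above might
look like the sketch below (the mount directory, file size, and runtime are
assumptions; adjust them to the environment). With direct=1, fio bypasses the
host page cache, mirroring RHV's direct-IO behavior, so running the identical
job file on both hypervisors against the same Gluster mount should show
whether the gap exists below the VM layer:

```ini
; Hedged sketch of a fio job for comparing the two hypervisors.
; The directory path is hypothetical -- point it at the actual Gluster mount.
[global]
ioengine=libaio
direct=1           ; bypass the host page cache (matches RHV's direct IO)
bs=1M              ; large sequential blocks, like the throughput tests above
size=4g
runtime=60
time_based
directory=/mnt/glustervol

[seq-read]
rw=read

[seq-write]
stonewall          ; wait for the read job to finish before starting writes
rw=write
```

Saving this as, say, compare.fio and running `fio compare.fio` on each host
gives directly comparable read and write bandwidth numbers; if the 2x gap
reproduces here, the RHEV/virt layers are off the hook.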
_______________________________________________
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel