Hi guys,

As an FYI, from discussion on gluster-devel IRC yesterday, the RDMA code
still isn't in a good enough state for production usage with 3.4.0. :(
There are still outstanding bugs with it, and I'm working to make the Gluster
Test Framework able to work with RDMA so we can help shake out more of them:

  http://www.gluster.org/community/documentation/index.php/Using_the_Gluster_Test_Framework

Hopefully RDMA will be ready for 3.4.1, but don't hold me to that at this
stage. :)

Regards and best wishes,

Justin Clift


On 09/07/2013, at 8:36 PM, Ryan Aydelott wrote:

> Matthew,
>
> Personally, I have experienced this same problem (even with the mount being
> something.rdma). Running 3.4beta4, if I mounted a volume via RDMA that also
> had TCP configured as a transport option (which obviously you do, based on
> the mounts you gave below), and there was ANY issue with RDMA not working,
> the mount would silently fall back to TCP. This problem is described here:
>
>   https://bugzilla.redhat.com/show_bug.cgi?id=982757
>
> The way to test for this behavior is to create a new volume specifying ONLY
> RDMA as the transport. If you mount this and your RDMA is broken for
> whatever reason, it will simply fail to mount.
>
> Assuming this test fails, I would then tail the logs for the volume to get
> a hint of what's going on. In my case the rdma_cm kernel module was not
> loaded, which started to matter as of 3.4beta2 IIRC, as the RDMA transport
> was completely rewritten due to poor performance in prior releases. The
> clue in my volume log file was "no such file or directory" preceded by
> rdma_cm.
>
> Hope that helps!
>
>
> -ryan
>
>
> On Jul 9, 2013, at 2:03 PM, Matthew Nicholson <[email protected]> wrote:
>
>> Hey guys,
>>
>> So, we're testing Gluster RDMA storage, and are having some issues. Things
>> are working... just not as we expected. There isn't a whole lot in the way
>> of docs that I've found for Gluster RDMA, aside from basically "install
>> gluster-rdma", create a volume with transport=rdma, and mount w/
>> transport=rdma...
>>
>> I've done that... and the IB fabric is known to be good... however, a
>> volume created with transport=rdma,tcp and mounted w/ transport=rdma still
>> seems to go over TCP?
>>
>> A little more info about the setup:
>>
>> We've got 10 storage nodes/bricks, each of which has a single 1GbE NIC and
>> an FDR IB port. Likewise for the test clients. Now, the 1GbE NIC is for
>> management only, and we have all of the systems on this fabric configured
>> with IPoIB, so there is eth0 and ib0 on each node.
>>
>> All storage nodes are peered using the ib0 interface, i.e.:
>>
>>   gluster peer probe storage_node01-ib
>>   etc.
>>
>> That's all well and good.
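
Presumably the remaining peer probes just repeat the same pattern; a minimal
sketch (run once from the first node, extrapolating the storage_nodeNN-ib
naming above, which is only a placeholder) would be something like:

  # hypothetical expansion of the "etc." above; probe the other nine nodes
  # over their IPoIB hostnames from storage_node01
  for i in `seq -w 2 10` ; do gluster peer probe storage_node${i}-ib ; done
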
>>
>> Volume was created:
>>
>> gluster volume create holyscratch transport rdma,tcp holyscratch01-ib:/holyscratch01/brick
>> for i in `seq -w 2 10` ; do gluster volume add-brick holyscratch holyscratch${i}-ib:/holyscratch${i}/brick; done
>>
>> yielding:
>>
>> Volume Name: holyscratch
>> Type: Distribute
>> Volume ID: 788e74dc-6ae2-4aa5-8252-2f30262f0141
>> Status: Started
>> Number of Bricks: 10
>> Transport-type: tcp,rdma
>> Bricks:
>> Brick1: holyscratch01-ib:/holyscratch01/brick
>> Brick2: holyscratch02-ib:/holyscratch02/brick
>> Brick3: holyscratch03-ib:/holyscratch03/brick
>> Brick4: holyscratch04-ib:/holyscratch04/brick
>> Brick5: holyscratch05-ib:/holyscratch05/brick
>> Brick6: holyscratch06-ib:/holyscratch06/brick
>> Brick7: holyscratch07-ib:/holyscratch07/brick
>> Brick8: holyscratch08-ib:/holyscratch08/brick
>> Brick9: holyscratch09-ib:/holyscratch09/brick
>> Brick10: holyscratch10-ib:/holyscratch10/brick
>> Options Reconfigured:
>> nfs.disable: on
>>
>> For testing, we wanted to see how RDMA stacked up vs TCP using IPoIB, so we
>> mounted this like:
>>
>> [root@holy2a01202 holyscratch.tcp]# df -h | grep holyscratch
>> holyscratch:/holyscratch
>>                        273T  4.1T  269T   2% /n/holyscratch.tcp
>> holyscratch:/holyscratch.rdma
>>                        273T  4.1T  269T   2% /n/holyscratch.rdma
>>
>> So: 2 mounts, same volume, different transports. fstab looks like:
>>
>> holyscratch:/holyscratch /n/holyscratch.tcp  glusterfs transport=tcp,fetch-attempts=10,gid-timeout=2,acl,_netdev 0 0
>> holyscratch:/holyscratch /n/holyscratch.rdma glusterfs transport=rdma,fetch-attempts=10,gid-timeout=2,acl,_netdev 0 0
>>
>> where holyscratch is an RRDNS entry for all the IPoIB interfaces, used for
>> fetching the volfile (something that, it seems, just like peering, MUST be
>> TCP?).
>>
>> But, again, when running just dumb, dumb, dumb tests (160 threads of dd
>> over 8 nodes, with each thread writing 64GB, so a 10TB throughput test),
>> I'm seeing all the traffic on the IPoIB interface for both the RDMA and TCP
>> transports... when I really shouldn't be seeing ANY TCP traffic on the
>> IPoIB interface, aside from volfile fetches/management, when using RDMA as
>> the transport... right? As a result, from early tests (the bigger 10TB ones
>> are running now), the TCP and RDMA speeds were basically the same... when I
>> would expect the RDMA one to be at least slightly faster.
>>
>> Oh, and this is all 3.4beta4, on both the clients and the storage nodes.
>>
>> So, I guess my questions are:
>>
>> Is this expected/normal?
>> Is peering/volfile fetching always TCP based?
>> How should one peer nodes in an RDMA setup?
>> Should this be tried with only RDMA as a transport on the volume?
>> Are there more detailed docs for RDMA Gluster coming with the 3.4 release?
>>
>>
>> --
>> Matthew Nicholson
>> Research Computing Specialist
>> Harvard FAS Research Computing
>> [email protected]

--
Open Source and Standards @ Red Hat
twitter.com/realjustinclift

_______________________________________________
Gluster-users mailing list
[email protected]
http://supercolony.gluster.org/mailman/listinfo/gluster-users
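
For anyone landing on this thread later: the RDMA-only sanity check Ryan
describes above (and that Matthew's last question asks about) would look
roughly like the sketch below. The volume name, brick path, and mount point
are placeholders, not anything from the actual setup; only the
holyscratch01-ib hostname comes from the thread.

  # create and start a throwaway volume with RDMA as the ONLY transport
  gluster volume create rdmatest transport rdma holyscratch01-ib:/holyscratch01/rdmatest
  gluster volume start rdmatest

  # with no TCP fallback possible, this mount should fail outright if RDMA is broken,
  # instead of silently dropping back to TCP
  mkdir -p /mnt/rdmatest
  mount -t glusterfs -o transport=rdma holyscratch01-ib:/rdmatest /mnt/rdmatest

  # on failure, check the client volume log (named after the mount point) for
  # errors such as the rdma_cm one Ryan mentions
  tail -n 50 /var/log/glusterfs/mnt-rdmatest.log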
