Kyle,

Yes, I think we need some form of fail-over capability with multi-port NICs in orangefs for HA. As the number of I/O servers grows, the odds of some kind of hardware failure increase. Network errors in this brave new world should be expected and tolerated as much as possible. This might be a good step in that direction.

-Randy

On 1/31/12 10:27 AM, "Kyle Schochenmaier" <[email protected]> wrote:

>Hi Vlad, All -
>
>A couple comments..
>You can probably just hardcode to port 2 to force things onto port 2,
>feel free to test it out, just be sure to rebuild and push out all of
>the client and server binaries so they all play nicely with each other.
>
>Also, I don't think we can implement port bonding at this level, it
>would require quite a bit of work and synchronization which could put
>in enough overhead to make it not perform significantly faster.. I
>guess the only way to tell would be to try it but I'm going to suspect
>that it might not be beneficial.
>
>Now,
>The other thing you mentioned had to do with port fail-over I
>believe..this is why I'm bringing in the dev list here. Currently I
>believe the standard practice across all interconnects using bmi is to
>have a hard fail whenever a particular port configuration fails to
>come up initially.
>
>But I know we're going to be making a push into HA with orangefs soon
>so I am wondering what people's thoughts are here?
>Is this something that would need to be implemented anyways, does it
>fit the HA scheme that is being examined for orangefs?
>Thoughts?
>
>
>Kyle Schochenmaier
>
>
>
>On Tue, Jan 31, 2012 at 2:25 AM, vlad <[email protected]> wrote:
>> Dear Kyle,
>>
>>
>>> I don't think we ever got around to testing this with multiple ports
>>> active on each HCA when we wrote it, so I believe I hard coded it to
>>> just default to the first port... iirc we tried to bring up the 2nd
>>> port at one point and found that there were some memory exhaustion
>>> issues when using more than one port AND the default HCA
>>> buffers/MTUs/etc on the cards that this was primarily tested on so it
>>> went back to 1.
>>>
>>> I wouldn't recommend changing this via a hard code for obvious
>>> reasons, but at the same time it probably wouldn't take more than
>>> 20-30 lines of code to fix this up to take more than one port. I'll
>>> try to take a look at it.
>>
>>
>> Thanks, I never thought about bonding the 2 infiniband ports.
>>
>> It is absolutely sufficient for me to swap port 1 for port 2. I never
>> had the intention of using both ports simultaneously.
>> Could this be achieved by changing IBV_PORT to "2" instead of "1" ?
>>
>> For the new code wishlist (if I may ask for it ..):
>>
>> It would be nice to be able to define the infiniband port in the config
>> file and to have a defined fallback to the other infiniband port, if the
>> 1st one does not work.
>>
>> (example connection should go over ib0,ib1://host:port/service)
>>
>> But that is not very urgent to have.
>>
>> Thanks for all the help,
>>
>> Greetings,
>>
>> Vlad
>
>_______________________________________________
>Pvfs2-developers mailing list
>[email protected]
>http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
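
For reference, the fallback behaviour Vlad asks for (try the configured port first, then fall back to the other one instead of hard failing) could be sketched roughly as below. This is only an illustration under stated assumptions, not the BMI code: select_port_with_fallback, port_probe_fn and probe_from_mask are hypothetical names, and the probe callback stands in for whatever check the real code would use to decide that a port came up.

```c
#include <stddef.h>

/* Hypothetical probe callback: returns nonzero if the given port is usable.
 * In real code this would wrap whatever check decides a port "came up". */
typedef int (*port_probe_fn)(int port, void *ctx);

/*
 * Try each candidate port in the configured order and return the first
 * one that probes as usable. Returns -1 if none is usable, so the caller
 * can still choose to hard fail, as is done today.
 */
int select_port_with_fallback(const int *candidates, size_t n,
                              port_probe_fn probe, void *ctx)
{
    if (candidates == NULL || probe == NULL)
        return -1;

    for (size_t i = 0; i < n; i++) {
        if (probe(candidates[i], ctx))
            return candidates[i];
    }
    return -1;
}

/* Illustration-only probe: treats ctx as a bitmask of ports that are up
 * (bit N set means port N is usable). */
int probe_from_mask(int port, void *ctx)
{
    const unsigned *mask = ctx;
    return port >= 0 && port < 32 && ((*mask >> port) & 1u);
}
```

With candidates {1, 2}, this picks port 1 when it is up and port 2 only when port 1 is not, which matches the "swap port 1 for port 2" case without bonding the two ports.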
