Actually, I have those exact cards and I'm not seeing your problem, but getting 
those cards to work was a major pain in the rear end.  I much prefer the 
Myricom cards, but for this HP server pair I got stuck using the HP cards due 
to a political issue.
 
Anyways, some of the things I found out about these cards might be of help to 
you.  We use SuSE here, but doing the same for RedHat shouldn't be much of a 
problem.  The biggest issue is that these cards get very hot and can overheat 
easily if they don't have a good amount of airflow.  Once they begin to 
overheat, packets disappear and things fall apart.  Since you are seeing 
stalls after a bit of a run, I would think that you might be having an 
overheating issue.
 
Also, the driver that comes with the Linux kernel doesn't work very well, so 
you need to get the HP driver and install it.  HOWEVER, you absolutely must use 
the driver version that matches the firmware version.  If they are different, 
things don't work and you can't even run the diagnostic tool.  Here I'm running 
firmware 4.0.516 and driver 4.0.516.
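
If it helps, here is a rough way to check for a driver/firmware mismatch 
without eyeballing it.  This is only a sketch: it assumes the driver reports 
both values through "ethtool -i" (the "version" and "firmware-version" 
fields) and that the two strings are meant to be identical, as they are on my 
boxes.  The function names and the interface name are made up for 
illustration.

```python
import subprocess


def parse_ethtool_info(text):
    """Turn `ethtool -i` output ("key: value" lines) into a dict."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as a PCI
        # bus address ("0000:0d:00.0") stay intact.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def versions_match(info):
    """True when driver and firmware version strings are both present and equal.

    Assumption: the card reports both in the same format (e.g. "4.0.516").
    """
    driver = info.get("version", "")
    firmware = info.get("firmware-version", "")
    return bool(driver) and driver == firmware


def check_interface(ifname):
    """Run `ethtool -i <ifname>` and compare the two version fields."""
    result = subprocess.run(
        ["ethtool", "-i", ifname],
        capture_output=True, text=True, check=True,
    )
    return versions_match(parse_ethtool_info(result.stdout))
```

Calling check_interface("eth2") (substitute your own interface name) would 
return False on a mismatch.  If your driver formats the firmware string 
differently, the comparison in versions_match() would need adjusting.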
 
When I was trying to get these working, I would set up long runs of netperf 
and iperf to see how hot I could get the cards, and then run the diagnostic 
tool, since it will tell you the temperature of the card.  I have found they 
start to freak out at about 85C.  After playing around with card position, 
they run under load at 66C and seem to work fine with 27C ambient air temp.
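
To line up a stall with a temperature reading, it can help to timestamp the 
moment the resync stops making progress.  A minimal sketch, assuming the 
/proc/drbd layout shown further down in this thread: it polls the "oos:" 
(out-of-sync) counters and reports any device whose counter is non-zero and 
has not moved since the last sample.  The function names, the 30 second 
interval, and numbering devices by their order in the file are my own choices.

```python
import re
import time


def read_oos(text):
    """Return the oos (out-of-sync) counters found in /proc/drbd text, in file order."""
    return [int(n) for n in re.findall(r"\boos:(\d+)", text)]


def is_stalled(previous, current):
    """Per device: True if data is still out of sync and the counter did not change."""
    return [p == c and c > 0 for p, c in zip(previous, current)]


def watch(path="/proc/drbd", interval=30):
    """Poll the file forever and print a timestamped line for each stalled device."""
    previous = None
    while True:
        with open(path) as f:
            current = read_oos(f.read())
        if previous is not None:
            for dev, stalled in enumerate(is_stalled(previous, current)):
                if stalled:
                    stamp = time.strftime("%H:%M:%S")
                    print(f"{stamp} device {dev}: no resync progress, oos={current[dev]}")
        previous = current
        time.sleep(interval)
```

Run watch() in one terminal while netperf or iperf runs in another, and check 
the card temperature when a line shows up.  Note that a device that is simply 
disconnected with out-of-sync data will also be reported, so this only means 
something while a resync is supposed to be running.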
 
All in all I'm not very impressed with these cards but I got stuck using them 
in one place.
 
Hope the information helps a bit,
Morey


________________________________

From: [email protected] 
[mailto:[email protected]] On Behalf Of Mike Lovell
Sent: Wednesday, November 25, 2009 11:45 AM
To: James Larcombe
Cc: [email protected]
Subject: Re: [DRBD-user] 8.3.5 Stalling on sync


hrm. i thought i had heard of someone using drbd over 10 gig with netxen cards. 
i went looking for a few minutes and didn't find anything though. my 
recommendation would be to try newer drivers either through compiling the 
drivers for your existing kernel or using a newer kernel. i don't have details 
on how to 
do that for your cards cause i have never used any 10 gig from hp or netxen. 
other than that, my only recommendation is new nics.

good luck

mike

James Larcombe wrote: 

        Hi Mike,

        

        The cards I'm using are HP NC522SFP Dual Port 10GbE Server Adapters 
with HP BLc 10Gb SR SFP+ Fiber Transceivers. I could try running these with 1Gb 
Fiber cables instead of 10Gb. 

        

        James

        

        From: Mike Lovell [mailto:[email protected]] 
        Sent: 25 November 2009 16:01
        To: James Larcombe
        Cc: [email protected]
        Subject: Re: [DRBD-user] 8.3.5 Stalling on sync

        

        nothing i tried tweaking in drbd.conf worked. the only thing that did 
was changing the 10gig interfaces. what cards are you using? i was using ones 
with an intel chip. the cards that i did get it to work with were from chelsio. 
in my previous thread on the list, someone mentioned that they had neterion 
cards working.
        
        mike
        
        James Larcombe wrote: 

        Hi Mike,

        

        Thanks for the quick response. Yes you are correct we are using 10gig 
fibre cards. I'm not sure we could change them though as the fibre modules used 
in them cost over £400 each.

        

        Is there anything I can tweak in the drbd.conf file to get these to 
work?

        

        James

        

        From: Mike Lovell [mailto:[email protected]] 
        Sent: 24 November 2009 17:49
        To: James Larcombe
        Cc: [email protected]
        Subject: Re: [DRBD-user] 8.3.5 Stalling on sync

        

        James Larcombe wrote: 

        Hi List,

        

        Please help. I have installed drbd 8.3.5 on openSUSE 11.1 (Kernel 
2.6.27.29-0.1). 

        

        I have run drbdadm create-md dbms-test on one node and create-md 
dbms-test2 on the other node. I then ran drbdadm up all on both nodes. I then 
ran drbdadm -- --overwrite-data-of-my-peer primary dbms-test on the first node 
and the same with dbms-test2 on the other node. They then run for a short while 
before stalling. I have tried older versions without success, and turning the 
sync rate down does not make any difference. Downing the resources and bringing 
them back up starts the sync again, but this then stalls quickly.

        

        I have attached /proc/drbd, /etc/drbd.conf and a section from 
/var/log/messages. Any pointers would be greatly appreciated.

        

        version: 8.3.5 (api:88/proto:86-91)

        GIT-hash: ded8cdf09b0efa1460e8ce7a72327c60ff2210fb build by 
r...@hp-tm-40, 2009-11-24 12:21:46

         0: cs:SyncSource ro:Secondary/Secondary ds:UpToDate/Inconsistent C 
r----

            ns:160896 nr:0 dw:0 dr:160896 al:0 bm:9 lo:1 pe:0 ua:0 ap:0 ep:1 
wo:b oos:926694296

                [>.] sync'ed:  0.1% (905040/905132)M      4972

                stalled

         1: cs:SyncTarget ro:Secondary/Secondary ds:Inconsistent/UpToDate C 
r----

            ns:0 nr:2173248 dw:2173248 dr:0 al:0 bm:132 lo:0 pe:29878 ua:0 ap:0 
ep:1 wo:b oos:777971256

                [>.] sync'ed:  0.3% (759736/761856)M

                Stalled

        
        what kind of network are you using between the two servers? this is 
almost the exact same behavior i had when i was trying to get drbd to work over 
10gig ethernet. turned out to be something in drbd didn't like something about 
the 10gig cards i had. i eventually had to change my network cards. what cards 
are you using? 1gig? 10gig? have you tried other cards? that is where i would 
look.
        
        mike

        


_______________________________________________
drbd-user mailing list
[email protected]
http://lists.linbit.com/mailman/listinfo/drbd-user
