Re: IPoIB issues

2010-03-10 Thread Moni Shoua
Eli Cohen wrote: I just posted a patch which might fix your problem. Please try it and let us know if it fixed anything. Hi Eli, although Josh already reported that the patch seems to fix the issue, I have a question. The "post_send failed" prints appeared while working in datagram mode. I don't

Re: IPoIB issues

2010-03-10 Thread Eli Cohen
On Wed, Mar 10, 2010 at 05:30:38PM +0200, Moni Shoua wrote: Hi Eli, although Josh already reported that the patch seems to fix the issue, I have a question. The "post_send failed" prints appeared while working in datagram mode. I don't know if Josh verified that but I don't expect that these

Re: IPoIB issues

2010-03-10 Thread Or Gerlitz
Eli Cohen wrote: The patch does not address these failures directly, but maybe as a side effect they would go away too. The patch seems to solve a case of possible livelock happening in a node which has both CM and datagram neighbors, e.g. where ipoib has called netif_stop etc. but there is now

Re: IPoIB issues

2010-03-10 Thread Eli Cohen
On Thu, Mar 11, 2010 at 09:47:31AM +0200, Or Gerlitz wrote: The patch does not address these failures directly, but maybe as a side effect they would go away too. The patch seems to solve a case of possible livelock happening in a node which has both CM and datagram neighbors, e.g. where ipoib

Re: IPoIB issues

2010-03-03 Thread Josh England
I've applied the patch and initial testing has not produced any transmit timeout errors. I'll be doing some heavier testing in the next couple days, but it looks good so far. Thanks for the quick turn-around! -JE On Wed, Mar 3, 2010 at 4:29 AM, Eli Cohen e...@dev.mellanox.co.il wrote: I just

IPoIB issues

2010-03-02 Thread Josh England
Hello, I've been running into several issues using IPoIB. The two primary uses are read-only NFS to the clients (over TCP) and access to an Ethernet-connected parallel filesystem (Panasas) through router nodes passing IPoIB <--> 10GbE. All nodes are running CentOS 5.3 and OFED 1.4.2, although a