[ofa-general] Re: NFSRDMA connectathon prelim. testing status,

2009-02-23 Thread Tom Tucker

Vu:

What memory registration model are you using?

Vu Pham wrote:

Hi Tom,

I have both the nfsrdma client and server on a 2.6.29-rc5 kernel with
nfs-utils-1.1.4. I'm using both InfiniHost III (ib_mthca) and ConnectX
(mlx4_ib) HCAs.

I have seen several problems during my testing at NFS Connectathon 2009:

1. When I used ConnectX (mlx4_ib) HCAs on both client and server, the
client could not mount. After talking to Tom Talpey and scanning the
code, I saw that the xprtrdma module uses ib_reg_phys_mr() and the
mlx4_ib verbs provider does not implement this verb.
If I have the client on mlx4_ib and the server on ib_mthca, I hit a
crash because of bad error handling in xprtrdma (see the attached file
mlx4_mount_problem.log).


Because of this problem, I used InfiniHost III (ib_mthca) for all of my
tests at Connectathon.


2. Testing the Linux nfsrdma client against both Linux and OpenSolaris
nfsrdma servers, I hit a process hang during Connectathon's lock test
(see the attached files sync_page_1.log and sync_page_2.log). I can
only reproduce it when I run Connectathon for more than 500 iterations
(-N 1000).

I can NOT reproduce the problem with the NFS client/server over IPoIB.

3. Testing the OpenSolaris nfsrdma client against the Linux nfsrdma
server, I hit a BUG_ON() right away (see the attached file
svcrdma_send.log).


thanks,
-vu



___
general mailing list
general@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[ofa-general] Re: NFSRDMA connectathon prelim. testing status,

2009-02-23 Thread Vu Pham

Tom,


Vu:

What memory registration model are you using?


It is 6 (when the connection/mount is established).











[ofa-general] Re: NFSRDMA connectathon prelim. testing status,

2009-02-23 Thread Tom Talpey
At 01:03 PM 2/23/2009, Vu Pham wrote:
Tom,

 Vu:

 What memory registration model are you using?

It is 6 (when the connection/mount is established).

i.e. all physical (get_dma_mr). The chunk lists are long because the
physical pages are discontiguous.

We'll try with ConnectX and FRMRs later today here at Connectathon.
This will reduce the chunk lists to roughly three entries (head, pages,
tail).
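
To make the chunk-list arithmetic concrete, here is a minimal Python
sketch of a simplified model, not the xprtrdma code itself. It assumes
that under all-physical registration every run of physically contiguous
pages needs its own chunk, while a single FRMR can cover the whole page
list (real FRMRs have a depth limit that is ignored here). The function
names and the page frame numbers are hypothetical, chosen only for
illustration.

```python
def physical_chunks(page_frames):
    """Count chunks when each physically contiguous run needs its own chunk.

    Simplified model of all-physical registration (an assumption, not
    taken from the kernel source).
    """
    if not page_frames:
        return 0
    chunks = 1
    for prev, cur in zip(page_frames, page_frames[1:]):
        if cur != prev + 1:  # a gap in physical memory starts a new chunk
            chunks += 1
    return chunks


def frmr_chunks(page_frames):
    """Assume one fast-register MR covers the entire page list."""
    return 1 if page_frames else 0


def chunk_list_length(page_frames, use_frmr=False, has_head=True, has_tail=True):
    """Total entries: optional head, the page chunks, optional tail."""
    pages = frmr_chunks(page_frames) if use_frmr else physical_chunks(page_frames)
    return int(has_head) + pages + int(has_tail)


# Hypothetical, mostly discontiguous page frame numbers for one buffer.
scattered = [100, 205, 206, 9, 731, 732, 733, 50]

print(chunk_list_length(scattered))                 # 7 (head + 5 runs + tail)
print(chunk_list_length(scattered, use_frmr=True))  # 3 (head, pages, tail)
```

In this model the all-physical list grows with the number of gaps
between pages, while the FRMR list stays at three entries regardless of
how scattered the pages are.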

With the two assertions disabled, we're again passing all general and
special tests from the OpenSolaris client, btw. :-)

Tom.











Re: [ofa-general] Re: NFSRDMA connectathon prelim. testing status,

2009-02-23 Thread Vu Pham

Tom,


What memory registration model are you using?


It is 6 (when the connection/mount is established).




Vu Pham wrote:



2. Testing the Linux nfsrdma client against both Linux and OpenSolaris
nfsrdma servers, I hit a process hang during Connectathon's lock test
(see the attached files sync_page_1.log and sync_page_2.log). I can
only reproduce it when I run Connectathon for more than 500 iterations
(-N 1000).

I can NOT reproduce the problem with the NFS client/server over IPoIB.

With mem_reg=4, I cannot reproduce this problem (running against both
OpenSolaris and Linux servers).
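
For readers following the numbers: the mem_reg values in this thread
select the xprtrdma memory registration strategy. The Python sketch
below mirrors the enum believed to be in
include/linux/sunrpc/xprtrdma.h for kernels of this era. Only
6 = all physical is confirmed in this thread, so the other names are an
assumption that should be checked against your own tree. The class and
helper names are hypothetical.

```python
from enum import IntEnum


class MemReg(IntEnum):
    """Assumed mirror of the kernel's rpcrdma_memreg enum (verify locally)."""
    BOUNCEBUFFERS = 0
    REGISTER = 1
    MEMWINDOWS = 2
    MEMWINDOWS_ASYNC = 3
    MTHCAFMR = 4      # FMR-based registration, the mem_reg=4 case
    FRMR = 5          # fast-register memory regions
    ALLPHYSICAL = 6   # "all physical (get_dma_mr)", the mem_reg=6 case


def describe_memreg(value):
    """Map a numeric strategy (int or text such as '6\\n') to its name."""
    try:
        return MemReg(int(value)).name
    except ValueError:
        # Covers both non-numeric text and numbers outside the enum.
        return "UNKNOWN"


print(describe_memreg("6\n"))  # ALLPHYSICAL
print(describe_memreg(4))      # MTHCAFMR
```

On kernels of this era the value is believed to be exposed as the
sunrpc sysctl rdma_memreg_strategy; that name is likewise an assumption
here, and the helper accepts the raw text read from such a file.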





3. Testing the OpenSolaris nfsrdma client against the Linux nfsrdma
server, I hit a BUG_ON() right away (see the attached file
svcrdma_send.log).


After disabling the two BUG_ON()s, we can run the tests multiple times
without any problem so far.


-vu