The question is, how did that page get unset? My understanding is that the get_user_pages() call in sdp_bz_setup() should have incremented the count for each page in the range. Since there is supposed to be only one thread doing bzcopy at a time (to preserve order), the sdp_bz_cleanup() ought to be able to call without checking. Simple netperf testing uses a single send thread so it should just work.
The problem that I am chasing now is that while a thread is sleeping waiting for credits, the socket is not locked. Another thread hops in and tries to do a zero copy. Since there is only one active context, and it is associated with the socket structure, it will be trampled by the second thread. Thread 2 blocks waiting for credit, thread 1 wakes up and decrements thread 2's page and bad things follow. I've got that fixed, but there are some cleanup issues that are not quite working. Thoughts? Thanks, JIm Jim Mott Mellanox Technologies Ltd. mail: [EMAIL PROTECTED] Phone: 512-294-5481 -----Original Message----- From: Ralph Campbell [mailto:[EMAIL PROTECTED] Sent: Friday, November 30, 2007 5:07 PM To: Jim Mott Cc: EWG Subject: [PATCH] IB/sdp Fix a kernel panic in put_page() that was passingNULL The new bzcopy_state() was trying to free unset bz->pages[i] entries. Signed-off-by: Dave Olson <[EMAIL PROTECTED]> diff --git a/drivers/infiniband/ulp/sdp/sdp_main.c b/drivers/infiniband/ulp/sdp/sdp_main.c index 809f7b8..35c4dd3 100644 --- a/drivers/infiniband/ulp/sdp/sdp_main.c +++ b/drivers/infiniband/ulp/sdp/sdp_main.c @@ -1212,7 +1212,8 @@ static inline struct bzcopy_state *sdp_bz_cleanup(struct bzcopy_state *bz) if (bz->pages) { for (i = bz->cur_page; i < bz->page_cnt; i++) - put_page(bz->pages[i]); + if (bz->pages[i]) + put_page(bz->pages[i]); kfree(bz->pages); } _______________________________________________ ewg mailing list [email protected] http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
