WinOF 2.1 Release Candidate #5 (RC5) is available @ http://www.openfabrics.org/downloads/WinOF/v2.1-RC5/
*** WinOF is now installed into %ProgramFiles%\WinOF on all OS variants & architectures!

Changes in RC5 from RC4
-----------------------

SVN Commits: 2390 ... 2450

Revision: 2450
Author: stansmith
Date: 4:07:14 PM, Friday, September 18, 2009
Message:
[DAPL2] wait for async processing thread to actually exit.

DAPL doesn't actually wait for the async processing thread to exit before allowing the library to close. It will wait up to 10 seconds, which under heavy load isn't enough time. Since the thread is created by an application level thread, it will continue to run as long as the application runs. But if the application closes the library, then all library data and code is invalid, which can result in the thread running something that's not library code and accessing freed memory.

With this change, I was able to run MPI ping-pong, 16 ranks on a single system (scm provider) without crashes 1300 times.
Signed-off-by: Sean Hefty <[email protected]>
----
Modified : /gen1/branches/WOF2-1/ulp/dapl2/dapl/openib_cma/device.c
Modified : /gen1/branches/WOF2-1/ulp/dapl2/dapl/openib_scm/device.c

Revision: 2449
Author: stansmith
Date: 4:04:33 PM, Friday, September 18, 2009
Message:
[DAPL2] add cleanup/release code for timer thread

dapl_set_timer() creates a thread to process timers for dat_ep_connect but provides no mechanism to destroy/exit during dapl library unload. Timers are initialized in library init code and should be released in the fini code. Add a dapl_timer_release call to the dapl_fini function to check state of timer thread and destroy before exiting.
Signed-off-by: Arlin Davis <[email protected]>
----
Modified : /gen1/branches/WOF2-1/ulp/dapl2/dapl/common/dapl_timer_util.c
Modified : /gen1/branches/WOF2-1/ulp/dapl2/dapl/common/dapl_timer_util.h
Modified : /gen1/branches/WOF2-1/ulp/dapl2/dapl/udapl/dapl_init.c

Revision: 2448
Author: stansmith
Date: 4:02:31 PM, Friday, September 18, 2009
Message:
[WinVerbs] fix crash accessing freed memory from async thread

If an application exits while asynchronous accept processing is queued, it's possible for the async processing to access the IbCmId after it has been freed. A similar problem to this was fixed that dealt with accessing the verbs QP handle.

A simpler, more generic solution to this problem is to handle application exit in the same manner as device removal, and lock the winverb provider lookup lists with exclusive access. Asynchronous operations that are in process will run to completion, and future operations will be blocked until the provider cleanup thread has completed. Once they run, they will fail to acquire a reference on the desired object, which should result in a graceful failure.

This avoids more complicated locking to use handles belonging to the lower level code. If a reference on an object can be acquired, the handle will be available for use until the reference is released.

To handle IB CM callbacks, additional state checking is required to avoid processing CM events when we're trying to destroy the endpoint.
Signed-off-by: Sean Hefty <[email protected]>
----
Modified : /gen1/branches/WOF2-1/core/winverbs/kernel/wv_ep.c
Modified : /gen1/branches/WOF2-1/core/winverbs/kernel/wv_provider.c
Modified : /gen1/branches/WOF2-1/core/winverbs/kernel/wv_qp.c

Revision: 2447
Author: stansmith
Date: 2:24:01 PM, Friday, September 18, 2009
Message:
[WinOF] detect possible ConnectX HCA driver load failure and suggest examination of system event log to ascertain if invalid firmware is an issue.
----
Modified : /gen1/branches/WOF2-1/WinOF/WIX/CustomActions.vbs
Modified : /gen1/trunk/WinOF/WIX/CustomActions.vbs

Revision: 2439
Author: stansmith
Date: 10:29:29 AM, Wednesday, September 16, 2009
Message:
[IBAL] use non-pageable memory to prevent possible problems on power down.

IBAL uses pageable memory to create PnP context. This can cause problems in power-down flows at times of system contention. We saw a similar case at a customer. There is no strong evidence that this was the cause, but with this patch IBAL is safer, at no cost.

WinOF 2.1 testing has demonstrated that with this patch, infrequent (1 out of 10) power-down BSODs have disappeared.

Found by Hobin Lee (Xsigo), signed off by Leo.
----
Modified : /gen1/branches/WOF2-1/core/al/kernel/al_pnp.c

Revision: 2428
Author: stansmith
Date: 4:07:05 PM, Tuesday, September 15, 2009
Message:
[WinOF] Streamline WinOF uninstall such that it plays nicely with MSFT PNP.
1) allow PNP to remove .inf referenced files and cleanup driver store.
2) shutdown ND & WSD prior to PNP device removal.
3) remove stale code which checks for OpenIB installs and forces a reboot
----
Modified : /gen1/branches/WOF2-1/WinOF/WIX/CustomActions.vbs
Modified : /gen1/branches/WOF2-1/WinOF/WIX/common/InstallExecuteSeq.inc

Revision: 2427
Author: stansmith
Date: 4:02:17 PM, Tuesday, September 15, 2009
Message:
[DAPL2+Winverbs] use private heaps for debug + local control.
----
Modified : /gen1/branches/WOF2-1/core/winverbs/user/wv_main.cpp
Modified : /gen1/branches/WOF2-1/core/winverbs/user/wv_memory.h
Modified : /gen1/branches/WOF2-1/ulp/dapl2/dapl/openib_scm/cm.c
Modified : /gen1/branches/WOF2-1/ulp/dapl2/dapl/udapl/windows/dapl_osd.c
Modified : /gen1/branches/WOF2-1/ulp/dapl2/dapl/udapl/windows/dapl_osd.h
Modified : /gen1/branches/WOF2-1/ulp/dapl2/dat/udat/windows/dat_osd.c
Modified : /gen1/branches/WOF2-1/ulp/dapl2/dat/udat/windows/dat_osd.h
Modified : /gen1/branches/WOF2-1/ulp/libibverbs/src/ibv_main.cpp
Modified : /gen1/branches/WOF2-1/ulp/libibverbs/src/ibverbs.h
Modified : /gen1/branches/WOF2-1/ulp/librdmacm/src/cma.h
Modified : /gen1/branches/WOF2-1/ulp/librdmacm/src/cma_main.cpp

Revision: 2426
Author: stansmith
Date: 2:23:13 PM, Wednesday, September 09, 2009
Message:
[WinOF] allow 64-bit installer, remove win64=no and default to Product/Platform specification.
----
Modified : /gen1/branches/WOF2-1/WinOF/WIX/common/OpenSM_service.inc

Revision: 2425
Author: stansmith
Date: 2:20:34 PM, Wednesday, September 09, 2009
Message:
[WinOF] Correct typo RemoveShorcutFolder --> ProgramMenuDir so ProgramMenu always gets deleted.
----
Modified : /gen1/branches/WOF2-1/WinOF/WIX/common/Docs.inc

Revision: 2424
Author: stansmith
Date: 2:18:00 PM, Wednesday, September 09, 2009
Message:
[WinOF] remove redundant Root spec as TARGETDIR implies '%WindowsVolume%\'
----
Modified : /gen1/branches/WOF2-1/WinOF/WIX/win7/ia64/wof.wxs
Modified : /gen1/branches/WOF2-1/WinOF/WIX/win7/x64/wof.wxs
Modified : /gen1/branches/WOF2-1/WinOF/WIX/win7/x86/wof.wxs
Modified : /gen1/branches/WOF2-1/WinOF/WIX/wlh/ia64/wof.wxs
Modified : /gen1/branches/WOF2-1/WinOF/WIX/wlh/x64/wof.wxs
Modified : /gen1/branches/WOF2-1/WinOF/WIX/wlh/x86/wof.wxs
Modified : /gen1/branches/WOF2-1/WinOF/WIX/wnet/ia64/wof.wxs
Modified : /gen1/branches/WOF2-1/WinOF/WIX/wnet/x64/wof.wxs
Modified : /gen1/branches/WOF2-1/WinOF/WIX/wnet/x86/wof.wxs
Modified : /gen1/branches/WOF2-1/WinOF/WIX/wxp/x86/wof.wxs

Revision: 2418
Author: stansmith
Date: 9:44:42 AM, Wednesday, September 02, 2009
Message:
[IBBUS,COMPLIB] Eliminate re-initialization of the stop lock.

Crash reported upon running "System Common Scenario" WHQL test with our stack. The crash: C4 (0xd7), which means Driver Verifier revealed a re-initializing of Remove Lock.
Signed-off by Leonid Keller [email protected]
----
Modified : /gen1/branches/WOF2-1/core/bus/kernel/bus_pnp.c
Modified : /gen1/branches/WOF2-1/core/complib/kernel/cl_pnp_po.c

Revision: 2402
Author: stansmith
Date: 3:00:33 PM, Tuesday, September 01, 2009
Message:
[WinOF] Shutdown NetworkDirect and Winsock direct before DIFxApp removes devices.

Makes sure no lingering device references are on the IB stack which would prevent components from being removed. Moved ND/WSD shutdown into separate CustomAction called before MsiProcessDevices.
----
Modified : /gen1/branches/WOF2-1/WinOF/WIX/CustomActions.vbs
Modified : /gen1/branches/WOF2-1/WinOF/WIX/common/InstallExecuteSeq.inc

Revision: 2401
Author: stansmith
Date: 2:56:34 PM, Tuesday, September 01, 2009
Message:
[DAPL2] udapl/scm: convert error code into dapl error code

Intel MPI checks the uDAPL error code when calling dat_psp_create() to see if the port number that it provides is in use or not. Convert winsock error codes to unix errno values.

This fixes the following error reported by Intel MPI: 'DAPL provider is not found and fallback device is not enabled'
Signed-off-by: Sean Hefty <[email protected]>
----
Modified : /gen1/branches/WOF2-1/ulp/dapl2/dapl/openib_scm/cm.c

Revision: 2400
Author: stansmith
Date: 2:53:35 PM, Tuesday, September 01, 2009
Message:
[WINMAD] winmad: allocate registration struct from NonPagedPool.

Apparently data structures that are accessed from within MAD callbacks must be allocated from NonPagedPool. Allocated the WM_REGISTRATION structure from non paged pool.
Signed-off-by: Sean Hefty <[email protected]>
----
Modified : /gen1/branches/WOF2-1/core/winmad/kernel/wm_reg.c

Revision: 2399
Author: stansmith
Date: 6:07:33 PM, Friday, August 28, 2009
Message:
[WinOF] Install Librdmacm.dll in a consistent place for all installs (%windir%). After 2.1, explore installing .dll into [SYSTEM] folder.
----
Modified : /gen1/branches/WOF2-1/WinOF/WIX/common/winverbs_OFED.inc

Revision: 2398
Author: stansmith
Date: 6:03:54 PM, Friday, August 28, 2009
Message:
[WinOF] Add WinOF to Command Window name to distinguish it from other Command Windows as Svr 2008 likes to add recently used commands to the start menu.
signed off by [email protected]
----
Modified : /gen1/branches/WOF2-1/WinOF/WIX/common/Docs.inc

Revision: 2397
Author: stansmith
Date: 6:01:14 PM, Friday, August 28, 2009
Message:
[WINVERBS] should have been part of Revision: 2391; DllMain is called multiple times for a given process.
Prevent double initialization of critical sections by only initializing it during process attach. This avoids corrupting the critical section while it may be in use.
Signed-off-by: Sean Hefty <[email protected]>
----
Modified : /gen1/branches/WOF2-1/ulp/librdmacm/src/cma_main.cpp

Revision: 2396
Author: stansmith
Date: 5:58:34 PM, Friday, August 28, 2009
Message:
[WINVERBS] winverbs: fix race in async connect handling.

If an application calls Connect or Accept, their IRP is queued to a work queue for asynchronous processing. However, if the application crashes or exits before the work queue can process the IRP, the cleanup code will call WvEpFree(). This destroys the IbCmId. When the work queue finally runs, it can access a freed IbCmId. This is bad.

A similar race exists with the QP and the asynchronous disconnect processing. The disconnect processing can access the hVerbsQp handle after it has been destroyed.

Additionally, in all three cases, the IRPs assume that the WV provider is able to process IRPs. Specifically, they require that the index tables maintained by the provider are still valid. References must be held on the WV provider until the IRPs finish their processing to ensure this.

Fix invalid accesses to the IbCmId and hVerbsQp handles by locking around their use after valid state checks. In the case of the QP, we add a guarded mutex for synchronization purposes and use it in the places where the PD mutex had been used.
Signed-off-by: Sean Hefty <[email protected]>
----
Modified : /gen1/branches/WOF2-1/core/winverbs/kernel/wv_ep.c
Modified : /gen1/branches/WOF2-1/core/winverbs/kernel/wv_ep.h
Modified : /gen1/branches/WOF2-1/core/winverbs/kernel/wv_qp.c
Modified : /gen1/branches/WOF2-1/core/winverbs/kernel/wv_qp.h

Revision: 2395
Author: stansmith
Date: 5:52:02 PM, Friday, August 28, 2009
Message:
[WINVERBS] To help match memory allocations with free, replace ExFreePool with ExFreePoolWithTag.
Signed-off-by: Sean Hefty <[email protected]>
----
Modified : /gen1/branches/WOF2-1/core/winmad/kernel/wm_driver.c
Modified : /gen1/branches/WOF2-1/core/winmad/kernel/wm_reg.c
Modified : /gen1/branches/WOF2-1/core/winverbs/kernel/wv_cq.c
Modified : /gen1/branches/WOF2-1/core/winverbs/kernel/wv_srq.c
Modified : /gen1/branches/WOF2-1/etc/kernel/work_queue.c

Revision: 2394
Author: stansmith
Date: 5:44:58 PM, Friday, August 28, 2009
Message:
[WINVERBS] Endpoints are not maintained in a list associated with a provider.

The list entry for an endpoint is used to track connection requests with listens. When an endpoint is unassociated from a listen, it is removed from the listen list. Trying to remove it from a list during provider cleanup results in a duplicate removal, can corrupt the listen list, and may access freed memory.
Signed-off-by: Sean Hefty <[email protected]>
----
Modified : /gen1/branches/WOF2-1/core/winverbs/kernel/wv_provider.c

Revision: 2393
Author: stansmith
Date: 5:42:29 PM, Friday, August 28, 2009
Message:
[WINVERBS] The winverbs PD structure contains both an event and a guarded mutex.

Both must be allocated as part of resident memory, or vague system corruptions may occur if their memory is paged out. The fix is to allocate the PD structure from NonPagedPool.
Signed-off-by: Sean Hefty <[email protected]>
----
Modified : /gen1/branches/WOF2-1/core/winverbs/kernel/wv_pd.c

Revision: 2392
Author: stansmith
Date: 5:39:47 PM, Friday, August 28, 2009
Message:
[WINVERBS] Fix a memory leak.

We need to free the port array, which is allocated separately from the device structure.
Signed-off-by: Sean Hefty <[email protected]>
----
Modified : /gen1/branches/WOF2-1/core/winverbs/kernel/wv_device.c

Revision: 2391
Author: stansmith
Date: 5:37:08 PM, Friday, August 28, 2009
Message:
[WINVERBS] DllMain is called multiple times for a given process. Prevent double initialization of critical sections by only initializing it during process attach.
This avoids corrupting the critical section while it may be in use.
Signed-off-by: Sean Hefty <[email protected]>
----
Modified : /gen1/branches/WOF2-1/ulp/dapl2/dapl/openib_cma/cm.c
Modified : /gen1/branches/WOF2-1/ulp/libibverbs/src/ibv_main.cpp

Revision: 2390
Author: stansmith
Date: 5:33:41 PM, Friday, August 28, 2009
Message:
[WinOF] All installs now install into 'Program Files' and not 'Program Files (x86)'. Cleanup references to \Program Files (x86)\WinOF.
----
Modified : /gen1/branches/WOF2-1/WinOF/WIX/HPC/cert-add.bat
Modified : /gen1/branches/WOF2-1/WinOF/WIX/README_release.txt
Modified : /gen1/branches/WOF2-1/WinOF/WIX/Release_notes.htm
Modified : /gen1/branches/WOF2-1/WinOF/WIX/dat.conf
Modified : /gen1/branches/WOF2-1/WinOF/WIX/ia64/Command Window.lnk
Modified : /gen1/branches/WOF2-1/WinOF/WIX/win7/ia64/wof.wxs
Modified : /gen1/branches/WOF2-1/WinOF/WIX/win7/x64/wof.wxs
Modified : /gen1/branches/WOF2-1/WinOF/WIX/wlh/ia64/wof.wxs
Modified : /gen1/branches/WOF2-1/WinOF/WIX/wlh/x64/wof.wxs
Modified : /gen1/branches/WOF2-1/WinOF/WIX/wnet/ia64/wof.wxs
Modified : /gen1/branches/WOF2-1/WinOF/WIX/wnet/x64/wof.wxs
Modified : /gen1/branches/WOF2-1/WinOF/WIX/x64/Command Window.lnk
Modified : /gen1/branches/WOF2-1/ulp/dapl2/test/dapltest/scripts/dt-cli.bat
Modified : /gen1/branches/WOF2-1/ulp/dapl2/test/dapltest/scripts/dt-svr.bat

_______________________________________________
ofw mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ofw
