> We have the following code commented out at umad_receiver_stop:
> 
>         /* XXX hangs current thread - suspect umad_recv() ignoring wakeup.
> 
>         cl_thread_destroy(&p_ur->tid);
> 
>         */

This definitely looks like it will hang.  cl_thread_destroy() does:

void
cl_thread_destroy( 
        IN      cl_thread_t* const      p_thread )
{
        CL_ASSERT( p_thread );

        if( !p_thread->osd.h_thread )
                return;

        /* Wait for the thread to exit. */
        WaitForSingleObject( p_thread->osd.h_thread, INFINITE );

so it blocks immediately until the receiver thread exits.  OpenSM calls 
umad_recv() with an infinite timeout as well, and nothing signals that thread 
to exit.  I don't see that Windows provides any way to signal a thread directly, 
or that the Windows umad provides a way for a user to wake up the thread short 
of sending itself a MAD.

The best fix I can think of is to expose a new call on Windows, 
umad_cancel_recv(), that umad_receiver_stop() can call before calling 
cl_thread_destroy().
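
To illustrate the ordering I have in mind, here is a rough, portable model of 
the cancel-then-destroy sequence.  It is only a sketch: it uses pthreads 
rather than the Windows umad code, and every model_* name is hypothetical 
(model_cancel_recv() stands in for the proposed umad_cancel_recv(), and 
pthread_join() plays the role of the wait inside cl_thread_destroy()).

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a umad port; not the real umad data structures. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  wake;
    bool            cancelled;  /* set by model_cancel_recv() */
    int             pending;    /* number of queued "MADs" */
} model_port_t;

/* Blocks with no timeout, like umad_recv() called with an infinite timeout.
 * Returns 0 when a message was consumed, -1 when the wait was cancelled. */
static int model_recv(model_port_t *p)
{
    int rc;

    pthread_mutex_lock(&p->lock);
    while (!p->cancelled && p->pending == 0)
        pthread_cond_wait(&p->wake, &p->lock);
    if (p->cancelled) {
        rc = -1;
    } else {
        p->pending--;
        rc = 0;
    }
    pthread_mutex_unlock(&p->lock);
    return rc;
}

/* Hypothetical analogue of the proposed umad_cancel_recv(): wake any waiter. */
static void model_cancel_recv(model_port_t *p)
{
    pthread_mutex_lock(&p->lock);
    p->cancelled = true;
    pthread_cond_broadcast(&p->wake);
    pthread_mutex_unlock(&p->lock);
}

/* Receiver thread: loops on the blocking receive until it is cancelled. */
static void *model_receiver(void *arg)
{
    model_port_t *p = arg;

    while (model_recv(p) == 0)
        ;  /* a real receiver would dispatch the MAD here */
    return NULL;
}

/* Stop sequence: cancel the receive first, then wait for the thread.
 * Without the cancel, the join below would never return. */
static int model_receiver_stop(model_port_t *p, pthread_t tid)
{
    model_cancel_recv(p);
    return pthread_join(tid, NULL);
}

/* Starts a receiver with nothing queued, then stops it.  Returns 0 on success. */
int demo_stop_sequence(void)
{
    model_port_t port;
    pthread_t tid;
    int rc;

    pthread_mutex_init(&port.lock, NULL);
    pthread_cond_init(&port.wake, NULL);
    port.cancelled = false;
    port.pending = 0;

    if (pthread_create(&tid, NULL, model_receiver, &port) != 0)
        return -1;
    rc = model_receiver_stop(&port, tid);

    pthread_cond_destroy(&port.wake);
    pthread_mutex_destroy(&port.lock);
    return rc;
}
```

The point of the model is only the ordering in model_receiver_stop(): the 
blocked receive has to be woken before anything waits on the thread.  How the 
real call would wake umad_recv() on Windows is left open here.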

- Sean