On 01/05/20(Fri) 12:13, Anton Lindqvist wrote:
> The order in which the pty master/slave is closed seems to be the
> trigger here. While not duping the master, it's closed before the slave.
> In the opposite scenario, the slave is closed before the master. While
> closing the slave, it ends up here expressed as a simplified backtrace:
> 
>   tsleep()
>   ttysleep()
>   ttywait()
>   ttywflush()
>   ttylclose()
>   ptsclose()
>   fdfree()
>   exit1()
> 
> In other words, it ends up doing a tsleep(INFSLP) causing the thread to
> hang. Note that this is not the case when the master is closed before
> the slave since `tp->t_oproc == NULL' causing ttywait() to bail early.

Why is the sleeper never awakened?  Does that mean a ttwakeup() is missing?

> NetBSD does a sleep with a timeout in ttywflush(). I've applied the same
> approach in the diff below which does fix the hang.

This seems like a racy workaround for a bug that we do not fully
understand.  If this is a proper solution I'd be happy to understand
why.  If we go with such a fix we should be using a value in "nsecs"
instead of ticks and INFSLP should be used instead of 0.  We should
refrain from introducing new usages of `hz' ;)
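
To illustrate the convention I mean, here is a small userland model,
not kernel code and not a proposed diff.  It only shows how a legacy
tick count would map to a nanosecond timeout, with the old "0 means
sleep forever" spelled out as INFSLP.  The MODEL_* macros and
model_ticks_to_nsecs() are hypothetical names made up for this sketch;
in the kernel one would pass a nanosecond value directly and not
compute it from `hz' at all.

```c
#include <assert.h>
#include <stdint.h>

/* Userland stand-ins for this sketch; the kernel provides INFSLP itself. */
#define MODEL_INFSLP		UINT64_MAX
#define MODEL_NSEC_PER_SEC	1000000000ULL

/*
 * Hypothetical helper: translate a legacy timeout expressed in ticks
 * into a timeout expressed in nanoseconds.  A tick count of 0 used to
 * mean "no timeout", which the nanosecond interface spells as INFSLP.
 */
static uint64_t
model_ticks_to_nsecs(int timo, int hz)
{
	assert(hz > 0 && timo >= 0);

	if (timo == 0)
		return MODEL_INFSLP;	/* sleep until explicitly woken up */

	/* One tick lasts (1 second / hz) nanoseconds. */
	return (uint64_t)timo * (MODEL_NSEC_PER_SEC / (uint64_t)hz);
}
```

With hz = 100, a timeout of 100 ticks becomes one second worth of
nanoseconds, while a timeout of 0 becomes MODEL_INFSLP and never
expires on its own.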
