Re: checkX problems
Thomas Dickey wrote:
> On Thu, 26 Nov 2009, Lothar Brendel wrote:
> [...]
>> Hence, to make Cygwin/X+xterm run out of the box (using the start menu shortcut), you have to install the CJK fonts. One more noob-question,
>
> otoh, (discarding run-out-of-the-box, since that doesn't give a good solution),

What's wrong with it running out-of-the-box? For somebody new to Cygwin it's a positive experience when the XWin Server shortcut actually opens an xterm.

>> Which font-package does provide the CJK fonts? I tried several ones but up to now in vain.

BTW: Following Ken's info ("There are three packages: font-isas-misc, font-jis-misc, and font-daewoo-misc."), I found out to need font-daewoo-misc *and* font-isas-misc to make xterm happy.

Ciao
Lothar

--
Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple
Problem reports: http://cygwin.com/problems.html
Documentation: http://x.cygwin.com/docs/
FAQ: http://x.cygwin.com/docs/faq/
Re: checkX problems
On Fri, 27 Nov 2009, Lothar Brendel wrote:

> Thomas Dickey wrote:
>> On Thu, 26 Nov 2009, Lothar Brendel wrote:
>> [...]
>>> Hence, to make Cygwin/X+xterm run out of the box (using the start menu shortcut), you have to install the CJK fonts. One more noob-question,
>>
>> otoh, (discarding run-out-of-the-box, since that doesn't give a good solution),
>
> What's wrong with it running out-of-the-box? For somebody new to Cygwin it's a positive experience when the XWin Server shortcut actually opens an xterm.

out-of-the-box apparently doesn't include the small fix to make it work with Cygwin's configuration. (It wouldn't be in upstream since that would potentially break other packages which can add localized menus, and there doesn't appear to be a programmatic way for xterm to decide if it should apply the workaround).

--
Thomas E. Dickey
http://invisible-island.net
ftp://invisible-island.net
Re: checkX problems
On Thu, 26 Nov 2009, Lothar Brendel wrote:

> Charles Wilson wrote:
>> Lothar Brendel wrote:
>>> Unfortunately the situation with ``startxwin.bat'' is worse now:
>>> * ``checkX -t 12'' still doesn't wait (?!?)
>>
>> I can't reproduce this.
>
> Stupid me, sorry. When updating to pull in libustr1, run2 was accidentally reverted to 0.3.0-1.
>
>>> * After again inserting a sleep between checkXing and starting the xterm, the latter is marginally successful: The process is shown as running but no xterm is showing up :-(
>>
>> That's an xterm/XWin issue.
>
> Errh, yes. Hence, to make Cygwin/X+xterm run out of the box (using the start menu shortcut), you have to install the CJK fonts. One more noob-question,

otoh, (discarding run-out-of-the-box, since that doesn't give a good solution), see the comment here about menuLocale:

http://invisible-island.net/xterm/xterm.log.html#xterm_224

> sorry: Which font-package does provide the CJK fonts? I tried several ones but up to now in vain.
>
> Ciao
> Lothar

--
Thomas E. Dickey
http://invisible-island.net
ftp://invisible-island.net
Re: checkX problems
On 11/26/2009 2:30 AM, Lothar Brendel wrote:
> Errh, yes. Hence, to make Cygwin/X+xterm run out of the box (using the start menu shortcut), you have to install the CJK fonts. One more noob-question, sorry: Which font-package does provide the CJK fonts? I tried several ones but up to now in vain.

There are three packages: font-isas-misc, font-jis-misc, and font-daewoo-misc.
Re: checkX problems
Charles Wilson wrote:
> Lothar Brendel wrote:
>> It should list, but it doesn't:
>>
>> $ grep -A9 '@ run2' setup-2.ini ^^^
>
> This was the clue. As it happens, the union mount stuff had an override for setup.hint, but not the entire directory. So, the tarballs themselves magically showed up in the release-2 area when I installed them in the release/ area, but release-2 retained the old setup.hint.
>
> Fixed.

ACK. libustr1 was pulled in now.

Unfortunately the situation with ``startxwin.bat'' is worse now:

* ``checkX -t 12'' still doesn't wait (?!?)
* After again inserting a sleep between checkXing and starting the xterm, the latter is marginally successful: The process is shown as running but no xterm is showing up :-(

I really would investigate this further, but I only get diagnostic output from ``checkX'' (--verbose or --debug) when running it from within an xterm, and that's obviously pointless. Thus, how to obtain output from ``checkX'' in Windows' Command Prompt, how to get it in Cygwin's bash window?

Asks
Lothar
Re: checkX problems
Lothar Brendel wrote:
> Unfortunately the situation with ``startxwin.bat'' is worse now:
>
> * ``checkX -t 12'' still doesn't wait (?!?)

I can't reproduce this.

> * After again inserting a sleep between checkXing and starting the xterm, the latter is marginally successful: The process is shown as running but no xterm is showing up :-(

That's an xterm/XWin issue.

> I really would investigate this further, but I only get diagnostic output from ``checkX'' (--verbose or --debug) when running it from within an xterm, and that's obviously pointless. Thus, how to obtain output from ``checkX'' in Windows' Command Prompt, how to get it in Cygwin's bash window?

checkX --notty --debug -t 12

--
Chuck
Re: checkX problems
On 11/25/2009 8:18 AM, Charles Wilson wrote:
> Lothar Brendel wrote:
>> * After again inserting a sleep between checkXing and starting the xterm, the latter is marginally successful: The process is shown as running but no xterm is showing up :-(
>
> That's an xterm/XWin issue.

And it's been discussed in several recent threads. A summary of workarounds can be found in

http://cygwin.com/ml/cygwin-xfree/2009-11/msg00174.html

Ken
Re: checkX problems
Charles Wilson wrote:
> Lothar Brendel wrote:
>> Unfortunately the situation with ``startxwin.bat'' is worse now:
>>
>> * ``checkX -t 12'' still doesn't wait (?!?)
>
> I can't reproduce this.

Stupid me, sorry. When updating to pull in libustr1, run2 was accidentally reverted to 0.3.0-1.

>> * After again inserting a sleep between checkXing and starting the xterm, the latter is marginally successful: The process is shown as running but no xterm is showing up :-(
>
> That's an xterm/XWin issue.

Errh, yes. Hence, to make Cygwin/X+xterm run out of the box (using the start menu shortcut), you have to install the CJK fonts.

One more noob-question, sorry: Which font-package does provide the CJK fonts? I tried several ones but up to now in vain.

Ciao
Lothar
Re: checkX problems
Charles Wilson wrote:
> I've integrated Lothar's patch into run2/checkX (along with some other internal changes), and published a test release. Please try run-0.3.1-1 and let me know if it fixes your problems with checkX.

checkX fails due to a missing cygustr-1.dll. That's contained in which package?

Asks
Lothar
Re: checkX problems
Lothar Brendel wrote:
> checkX fails due to a missing cygustr-1.dll. That's contained in which package?

From http://cygwin.com/packages/ and typing in 'cygustr-1.dll', I get:

libustr1

This *should* have been installed by setup automatically, as the run2 package now lists libustr1 as a dependency.

--
Chuck
Re: checkX problems
Charles Wilson wrote:
> Lothar Brendel wrote:
>> checkX fails due to a missing cygustr-1.dll. That's contained in which package?
>
> From http://cygwin.com/packages/ and typing in 'cygustr-1.dll', I get:

Great, thanx for that one.

> This *should* have been installed by setup automatically, as the run2 package now lists libustr1 as a dependency.

It should list, but it doesn't:

$ grep -A9 '@ run2' setup-2.ini
@ run2
sdesc: An enhanced version of the 'run' application launcher
ldesc: Launches console applications without an console. Uses an xml configuration file to control environment settings and target command line options. Optionally, checks for a running X server and launches one of two alternate targets based on X server status. Also provides the checkX utility.
category: Utils
requires: cygwin libxml2 libiconv2 zlib0
version: 0.3.1-1

Ciao
Lothar
Re: checkX problems
Lothar Brendel wrote:
> It should list, but it doesn't:
>
> $ grep -A9 '@ run2' setup-2.ini ^^^

This was the clue. As it happens, the union mount stuff had an override for setup.hint, but not the entire directory. So, the tarballs themselves magically showed up in the release-2 area when I installed them in the release/ area, but release-2 retained the old setup.hint.

Fixed. Thanks for tracking it down.

--
Chuck
Re: checkX problems
On 20/11/2009 19:43, Ken Brown wrote:
> On 11/20/2009 12:14 PM, Charles Wilson wrote:
>> I've integrated Lothar's patch into run2/checkX (along with some other internal changes), and published a test release. Please try run-0.3.1-1 and let me know if it fixes your problems with checkX.
>
> But the new behavior of the timeout option (my problem #1) works fine, with one caveat: If I start the X server with startxwin.bat, it immediately exits and claims that a server is already running. This was also reported earlier today by Jim Reisert:
>
> http://cygwin.com/ml/cygwin-xfree/2009-11/msg00158.html

Not quite: It seems that report says that the internal multiwindow-mode window manager claims another window manager is running (although it's difficult to be absolutely sure as the exact error message isn't reported :-))

> Could it be that checkX is tricking XWin into thinking that a different X server is running? (I have no idea how XWin decides whether an X server is running, so this may or may not be plausible.)
>
> Strangely, the problem doesn't occur if I use startxwin.sh instead of startxwin.bat.

--
Jon TURNEY
Volunteer Cygwin/X X Server maintainer
Re: checkX problems
I've integrated Lothar's patch into run2/checkX (along with some other internal changes), and published a test release. Please try run-0.3.1-1 and let me know if it fixes your problems with checkX.

--
Chuck
Re: checkX problems
On 11/20/2009 12:14 PM, Charles Wilson wrote:
> I've integrated Lothar's patch into run2/checkX (along with some other internal changes), and published a test release. Please try run-0.3.1-1 and let me know if it fixes your problems with checkX.

I still have the instability that I reported as problem #2 in

http://cygwin.com/ml/cygwin-xfree/2009-10/msg00143.html

but I wasn't expecting you to fix that. As I said later in the thread, I suspect BLODA.

But the new behavior of the timeout option (my problem #1) works fine, with one caveat: If I start the X server with startxwin.bat, it immediately exits and claims that a server is already running. This was also reported earlier today by Jim Reisert:

http://cygwin.com/ml/cygwin-xfree/2009-11/msg00158.html

Could it be that checkX is tricking XWin into thinking that a different X server is running? (I have no idea how XWin decides whether an X server is running, so this may or may not be plausible.)

Strangely, the problem doesn't occur if I use startxwin.sh instead of startxwin.bat.

Thanks to Lothar and you for the timeout patch.

Ken
Re: checkX problems
On Fri, Nov 20, 2009 at 02:43:11PM -0500, Ken Brown wrote:
> On 11/20/2009 12:14 PM, Charles Wilson wrote:
>> I've integrated Lothar's patch into run2/checkX (along with some other internal changes), and published a test release. Please try run-0.3.1-1 and let me know if it fixes your problems with checkX.
>
> I still have the instability that I reported as problem #2 in
>
> http://cygwin.com/ml/cygwin-xfree/2009-10/msg00143.html
>
> but I wasn't expecting you to fix that. As I said later in the thread, I suspect BLODA.
>
> But the new behavior of the timeout option (my problem #1) works fine, with one caveat: If I start the X server with startxwin.bat, it immediately exits and claims that a server is already running. This was also reported earlier today by Jim Reisert:
>
> http://cygwin.com/ml/cygwin-xfree/2009-11/msg00158.html
>
> Could it be that checkX is tricking XWin into thinking that a different X server is running? (I have no idea how XWin decides whether an X server is running, so this may or may not be plausible.)
>
> Strangely, the problem doesn't occur if I use startxwin.sh instead of startxwin.bat.

Has it something to do with the -wait option to checkX that is present in the startxwin.bat? I had to remove that option, since it broke the start of the X server. And I now note it is not in the bash version.

GJ
Re: checkX problems
On 11/20/2009 2:47 PM, Gertjan van Noord wrote:
> On Fri, Nov 20, 2009 at 02:43:11PM -0500, Ken Brown wrote:
>> On 11/20/2009 12:14 PM, Charles Wilson wrote:
>>> I've integrated Lothar's patch into run2/checkX (along with some other internal changes), and published a test release. Please try run-0.3.1-1 and let me know if it fixes your problems with checkX.
>>
>> I still have the instability that I reported as problem #2 in
>>
>> http://cygwin.com/ml/cygwin-xfree/2009-10/msg00143.html
>>
>> but I wasn't expecting you to fix that. As I said later in the thread, I suspect BLODA.
>>
>> But the new behavior of the timeout option (my problem #1) works fine, with one caveat: If I start the X server with startxwin.bat, it immediately exits and claims that a server is already running. This was also reported earlier today by Jim Reisert:
>>
>> http://cygwin.com/ml/cygwin-xfree/2009-11/msg00158.html
>>
>> Could it be that checkX is tricking XWin into thinking that a different X server is running? (I have no idea how XWin decides whether an X server is running, so this may or may not be plausible.)
>>
>> Strangely, the problem doesn't occur if I use startxwin.sh instead of startxwin.bat.
>
> has it something to do with the -wait option to checkX that is present in the startxwin.bat? I had to remove that option, since it broke the start of the X server. And I now note it is not in the bash version.

If you remove -wait but still start checkX with run, then you might as well not use checkX at all. (run will return immediately, leaving checkX running in the background.)

Ken
Re: checkX problems
Charles Wilson wrote:
[...]
> The call to XOpenDisplay can take up to 12 seconds. Suppose the main thread times out after say 5 seconds, and then just after that we have a *successful* return in the worker thread. The worker thread tries to get the mutex:
>
> +      (*(data->xclosedis))(dpy);
> +      pthread_mutex_lock (&mtx_xopenOK);
>
> But the main thread, if you follow the timed-out codepath, never releases the mutex.

Ok, this can be cured by

     if (pthread_cond_timedwait (&cv_xopenOK, &mtx_xopenOK, &then) == ETIMEDOUT) {
       xopenOK = XSERV_TIMEDOUT; /* it's okay, we have the mutex */
       xopenTrying = 0; /* allow open_display() to give up */
+      pthread_mutex_unlock(&mtx_xopenOK); /* allow for a last minute change */
     }/* else open_display() was successful */
     pthread_detach(id); /* leave open_display() on its own */

But the problem is not so much destroying a locked mutex, but rather locking a destroyed mutex, right? This happens in your race condition but also whenever ``delay'' is shorter than the time spent in a successful XOpenDisplay(). The failure doesn't really harm, but we can be less dirty by checking the result of pthread_mutex_unlock(), cf. the new patch.

[...]
> So, I'm just going to leave it, and take your patch as-is.

Maybe you consider the new one instead.

> Thanks!

My pleasure!

Ciao
Lothar

--- checkX.c-0.3.0 2009-06-15 02:29:07.0 +0200
+++ checkX.c 2009-11-15 12:32:24.0 +0100
@@ -32,6 +32,7 @@
 #endif
 
 #include <stdio.h>
+#include <errno.h>
 
 #if HAVE_SYS_TYPES_H
 # include <sys/types.h>
@@ -102,7 +103,8 @@
 
 static pthread_mutex_t mtx_xopenOK;
 static pthread_cond_t cv_xopenOK;
-static int xopenOK = XSERV_TIMEDOUT;
+static int xopenOK;
+static int xopenTrying;
 
 static const char* XLIBfmt = "cygX11-%d.dll";
 static const char* DefaultAppendPath = "/usr/X11R6/bin" SEP_CHAR "/usr/bin";
@@ -314,6 +316,9 @@
   timespec_t delta;
   timespec_t then;
 
+  xopenTrying = delay!=0.0; /* false actually means: try once */
+  xopenOK = XSERV_NOTFOUND; /* a pessimistic start out */
+
   computeTimespec(fabs(delay), &delta);
   debugMsg(1, "(%s) Using delay of %d secs, %ld nanosecs (%5.2f)",
            __func__, delta.tv_sec, delta.tv_nsec,
@@ -333,15 +338,15 @@
   if (delay != 0.0) {
     clock_gettime(CLOCK_REALTIME, &now);
     timerspec_add(&now, &delta, &then);
-    pthread_cond_timedwait (&cv_xopenOK, &mtx_xopenOK, &then);
-  }
-
-  pthread_mutex_unlock(&mtx_xopenOK);
-
-  if (delay != 0.0) {
-    pthread_detach(id);
+    if (pthread_cond_timedwait (&cv_xopenOK, &mtx_xopenOK, &then) == ETIMEDOUT) {
+      xopenOK = XSERV_TIMEDOUT; /* it's okay, we have the lock */
+      xopenTrying = 0; /* allow open_display() to give up */
+      pthread_mutex_unlock(&mtx_xopenOK); /* but also allow for a last minute change */
+    }/* else open_display() was successful */
+    pthread_detach(id); /* leave it on its own */
   } else {
-    pthread_join(id, (void*)&status);
+    pthread_mutex_unlock(&mtx_xopenOK); /* allow open_display() to set xopenOK */
+    pthread_join(id, (void*)&status); /* and wait for it */
   }
 
   pthread_mutex_destroy(&mtx_xopenOK);
@@ -357,19 +362,20 @@
 open_display(void* /* WorkerThreadData* */ v)
 {
   Display* dpy;
-  int rc = 0;
   WorkerThreadData* data = (WorkerThreadData*)v;
 
-  if( (dpy = (*(data->xopendis))(data->displayname)) == NULL ) {
-    rc = 1;
-  } else {
-    (*(data->xclosedis))(dpy);
-    rc = 0;
-  }
-  pthread_mutex_lock (&mtx_xopenOK);
-  xopenOK = rc;
-  pthread_cond_signal(&cv_xopenOK);
-  pthread_mutex_unlock (&mtx_xopenOK);
+  do
+    if((dpy = (*(data->xopendis))(data->displayname))) {
+      (*(data->xclosedis))(dpy);
+      if (pthread_mutex_lock (&mtx_xopenOK)) /* the mutex may already be destroyed */
+        break;
+      else {
+        xopenOK = XSERV_FOUND;
+        pthread_cond_signal(&cv_xopenOK);
+        pthread_mutex_unlock (&mtx_xopenOK);
+      }
+    }
+  while (xopenTrying && xopenOK == XSERV_NOTFOUND);
 
   pthread_exit((void*)0);
 }
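
The patch above calls computeTimespec() and timerspec_add(), whose bodies are not part of the diff. As a rough illustration of what such helpers typically have to do (split a fractional delay into seconds and nanoseconds, then add it to "now" with a carry), here is a minimal sketch. The names delay_to_timespec and timespec_sum are hypothetical stand-ins, and nothing here is claimed to match the real checkX.c helpers.

```c
#include <assert.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Hypothetical stand-in for computeTimespec(): split a non-negative delay
 * given in (fractional) seconds into whole seconds plus nanoseconds.
 * The caller is expected to pass the magnitude, e.g. fabs(delay). */
void delay_to_timespec(double secs, struct timespec *out)
{
    assert(secs >= 0.0);
    out->tv_sec  = (time_t)secs;                       /* truncates toward zero */
    out->tv_nsec = (long)((secs - (double)out->tv_sec) * 1e9);
    if (out->tv_nsec >= NSEC_PER_SEC) {                /* defensive normalisation */
        out->tv_sec  += 1;
        out->tv_nsec -= NSEC_PER_SEC;
    }
}

/* Hypothetical stand-in for timerspec_add(): sum = a + b, carrying any
 * nanosecond overflow into the seconds field so tv_nsec stays < 1e9,
 * which pthread_cond_timedwait() requires of its deadline argument. */
void timespec_sum(const struct timespec *a, const struct timespec *b,
                  struct timespec *sum)
{
    sum->tv_sec  = a->tv_sec + b->tv_sec;
    sum->tv_nsec = a->tv_nsec + b->tv_nsec;
    if (sum->tv_nsec >= NSEC_PER_SEC) {
        sum->tv_sec  += 1;
        sum->tv_nsec -= NSEC_PER_SEC;
    }
}
```

The carry matters because an out-of-range tv_nsec makes pthread_cond_timedwait() fail with EINVAL rather than wait.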
Re: checkX problems
Lothar Brendel wrote:
[...]
> The failure doesn't really harm, but we can be less dirty by checking the result of pthread_mutex_unlock(), cf. the new patch.

Correction: I meant the result of pthread_mutex_lock() (in open_display()).

Ciao
Lothar
Re: checkX problems
Lothar Brendel wrote:
> Ok, this can be cured by
>
>      if (pthread_cond_timedwait (&cv_xopenOK, &mtx_xopenOK, &then) == ETIMEDOUT) {
>        xopenOK = XSERV_TIMEDOUT; /* it's okay, we have the mutex */
>        xopenTrying = 0; /* allow open_display() to give up */
> +      pthread_mutex_unlock(&mtx_xopenOK); /* allow for a last minute change */
>      }/* else open_display() was successful */
>      pthread_detach(id); /* leave open_display() on its own */

No, that won't do it -- because now that you've unlocked the mutex, you can't guarantee that the worker thread hasn't locked it again by the time you try to destroy the mutex. You can't move the pthread_mutex_destroy call to the worker thread -- because what if there never IS a successful return from the call to XOpenDisplay?

And finally -- a little bit later in the main thread you are going to USE the value of xopenOK to compute your return value -- but it's not an atomic operation so you don't know if the worker thread is going to change xopenOK's value in the middle of that operation.

AFAICT, the cure for all of these problems is worse than the disease -- and the only *total* fix is for the main thread to always join() the worker. Which is precisely what we want to avoid. There are a few *minor* tweaks that could improve things, but I'm willing to go ahead with the current as a test release (as soon as my ITP for libustr is approved; it's a new dependency for run2).

> But the problem is not so much destroying a locked mutex, but rather locking a destroyed mutex, right?

Well, frankly by the time that could happen I really don't care what the worker thread does, so long as it doesn't crash the whole process. We've already detached from it, and just want it to finish its call to XOpenDisplay() and terminate.

> This happens in your race condition but also whenever ``delay'' is shorter than the time spent in a successful XOpenDisplay(). The failure doesn't really harm, but we can be less dirty by checking the result of pthread_mutex_unlock(), cf. the new patch.

No, I don't think we should unlock the mutex in the main thread, at least until after we've computed the return value. And once we DO unlock the mutex, there just is NO WAY to guarantee that the worker won't (successfully) lock the mutex, but not also have unlocked it, by the time we try to call pthread_mutex_destroy -- except by waiting until the worker thread exits (e.g. pthread_join()). Which we don't want to do. (If you try to relock the mutex in the main thread, you're right back where we started...)

--
Chuck
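
For illustration only: one commonly used way around the "who destroys the mutex" problem described above, without joining the worker, is to put the mutex, condition variable and result into a heap-allocated, reference-counted object and let whichever thread drops the last reference destroy it. The sketch below is not what run2/checkX does. All names (struct shared, release, check_with_timeout, the SERV_* codes) are hypothetical, and a plain nanosleep() stands in for the blocking XOpenDisplay() call.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>
#include <time.h>

enum { SERV_FOUND = 0, SERV_NOTFOUND = 1, SERV_TIMEDOUT = 2 }; /* hypothetical codes */

struct shared {
    pthread_mutex_t mtx;
    pthread_cond_t  cv;
    int  done;      /* set by the worker, under mtx */
    int  result;
    int  refs;      /* owners still alive: main thread + worker */
    long work_ms;   /* demo only: how long the fake "open" takes */
};

/* Drop one reference; whoever drops the last one destroys and frees. */
static void release(struct shared *s)
{
    pthread_mutex_lock(&s->mtx);
    int last = (--s->refs == 0);
    pthread_mutex_unlock(&s->mtx);
    if (last) {                        /* no other thread can reach s any more */
        pthread_cond_destroy(&s->cv);
        pthread_mutex_destroy(&s->mtx);
        free(s);
    }
}

static void *worker(void *arg)
{
    struct shared *s = arg;
    struct timespec nap = { s->work_ms / 1000, (s->work_ms % 1000) * 1000000L };
    nanosleep(&nap, NULL);             /* stand-in for a blocking XOpenDisplay() */

    pthread_mutex_lock(&s->mtx);       /* always valid: we still hold a reference */
    s->result = SERV_FOUND;
    s->done = 1;
    pthread_cond_signal(&s->cv);
    pthread_mutex_unlock(&s->mtx);
    release(s);
    return NULL;
}

int check_with_timeout(long work_ms, long timeout_ms)
{
    assert(work_ms >= 0 && timeout_ms >= 0);
    struct shared *s = malloc(sizeof *s);
    if (s == NULL)
        return SERV_NOTFOUND;
    pthread_mutex_init(&s->mtx, NULL);
    pthread_cond_init(&s->cv, NULL);
    s->done = 0; s->result = SERV_NOTFOUND; s->refs = 2; s->work_ms = work_ms;

    struct timespec then;              /* absolute deadline for the wait */
    clock_gettime(CLOCK_REALTIME, &then);
    then.tv_sec  += timeout_ms / 1000;
    then.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (then.tv_nsec >= 1000000000L) { then.tv_sec += 1; then.tv_nsec -= 1000000000L; }

    pthread_t id;
    if (pthread_create(&id, NULL, worker, s) != 0) {
        s->refs = 1;                   /* no worker: we are the only owner */
        release(s);
        return SERV_NOTFOUND;
    }
    pthread_detach(id);                /* never joined */

    pthread_mutex_lock(&s->mtx);
    int rc = 0;
    while (!s->done && rc != ETIMEDOUT)
        rc = pthread_cond_timedwait(&s->cv, &s->mtx, &then);
    int result = s->done ? s->result : SERV_TIMEDOUT;  /* read under the lock */
    pthread_mutex_unlock(&s->mtx);
    release(s);
    return result;
}
```

The trade-off is the one raised in the message above: this avoids join() and avoids destroying a mutex another thread may still use, at the cost of heap allocation and extra bookkeeping, and a worker that never returns simply keeps its reference until the process exits.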
Re: checkX problems
Charles Wilson wrote:
[...]
> AFAICT, the cure for all of these problems is worse than the disease -- and the only *total* fix is for the main thread to always join() the worker. Which is precisely what we want to avoid.

ACK and thanx for the explanations.

Ciao
Lothar
Re: checkX problems
Lothar Brendel wrote:
> Charles Wilson already set up this kind of infrastructure, I just had to introduce one more communication variable, cf. the patch below (positively tested on my system).
>
> Yep, there are really two different purposes for setting a timeout [i) Just check whether an X server is available, but don't struggle with that too long. and ii) There *should* be an X server coming up, just be a little patient.], but now both can be achieved by choosing either a short or a long duration.

Looks pretty good. There's still one problematic case -- but it actually already existed; your change doesn't make it any worse than what was already there.

> +  do
> +    if((dpy = (*(data->xopendis))(data->displayname))) {

The call to XOpenDisplay can take up to 12 seconds. Suppose the main thread times out after say 5 seconds, and then just after that we have a *successful* return in the worker thread. The worker thread tries to get the mutex:

> +      (*(data->xclosedis))(dpy);
> +      pthread_mutex_lock (&mtx_xopenOK);

But the main thread, if you follow the timed-out codepath, never releases the mutex. Instead, it destroys it while still having it locked. Then, it evaluates xopenOK to compute the return value. The spec says:

    It shall be safe to destroy an initialized mutex that is unlocked.
    Attempting to destroy a locked mutex results in undefined behavior.

So, the child thread might be stuck waiting for a mutex that has already been destroyed. That could be a problem -- but a very very rare one, I think. It only happens if you time out on the worker -- and THEN, before the main app gets to exit(), the worker successfully returns from XOpenDisplay. (If the main thread calls exit(), that should kill the worker thread...so it never gets a chance to return successfully or otherwise).

> +      xopenOK = XSERV_FOUND;
> +      pthread_cond_signal(&cv_xopenOK);
> +      pthread_mutex_unlock (&mtx_xopenOK);
> +    }
> +  while (xopenTrying && xopenOK == XSERV_NOTFOUND);
>
>    pthread_exit((void*)0);
>  }

However, (a) it's working now, even if it is technically wrong to do it that way, and (b) it gets real complicated to figure out how to guarantee the mutex is unlocked, in both threads, before destroying it -- without forcing the calling thread to join() the worker, which is explicitly what we DON'T want to do in this case.

So, I'm just going to leave it, and take your patch as-is. Thanks!

--
Chuck
RE: checkX problems
From: cygwin-xfree-ow...@cygwin.com [mailto:cygwin-xfree-ow...@cygwin.com] On Behalf Of Lothar Brendel

> Could you please clarify an issue here? (Sorry, it seems I wronged ``run'' in the previous posts.)
>
> In a Windows command prompt (being somewhere on C:) I put the line
>
>     \cygwin\bin\run -p /usr/bin sleep -wait 5
>
> into a file ``dosleep.bat''. Executing that BAT-script (w/o any wrapper), it *does* sleep. Typing that very line directly at the prompt lets ``run'' return immediately, though.
>
> Can you confirm this behaviour?

I can confirm that without testing (so I'm probably chomping foot here...). The sleep is holding the console open after run quits. This comes under the "console programs must have a console" heading. It takes a bit to get used to, but you'll get it soon.

>> Looking forward to reading your patches to address any of these problems.
>
> It shouldn't be too hard to add an option to checkX to make it retry if ECONNREFUSED. This would have to manually track the elapsed time for each attempt, charging against the specified -t waittime. Another possibility would be an option ``-n'' to specify the number of retries.

GAH! No, that's just lame. Just spawn/fork a sleep-then-interrupt-daddy thread/process, set up a SIGINT handler that exits with an error, loop connection attempts until successful, check X, kill child, exit with success. That enforces both types of timeout.

HTH,
Mike
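
A rough sketch of the signal-driven timeout idea from the message above, with two deliberate substitutions that should be read as assumptions: alarm()/SIGALRM replaces the forked "sleep-then-interrupt" child sending SIGINT, and the handler only sets a flag instead of exiting the process, so the function can return a status. The probe callback is a hypothetical stand-in for an XOpenDisplay() attempt; none of these names come from checkX.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical probe: returns 0 when the server answered, non-zero otherwise. */
typedef int (*probe_fn)(void *ctx);

static volatile sig_atomic_t timed_out;

static void on_alarm(int signo)
{
    (void)signo;
    timed_out = 1;                     /* only async-signal-safe work here */
}

/* Keep probing until success or until timeout_secs have passed.
 * Returns 0 on success, 1 on timeout. */
int probe_with_alarm(probe_fn probe, void *ctx, unsigned timeout_secs)
{
    assert(probe != NULL && timeout_secs > 0);

    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;          /* no SA_RESTART: let sleeps be interrupted */
    sigemptyset(&sa.sa_mask);
    sigaction(SIGALRM, &sa, &old);

    timed_out = 0;
    alarm(timeout_secs);               /* the "sleep-then-interrupt" part */

    int rc = 1;
    while (!timed_out) {
        if (probe(ctx) == 0) {
            rc = 0;
            break;
        }
        struct timespec pause = { 0, 100 * 1000000L };   /* 100 ms between tries */
        nanosleep(&pause, NULL);       /* returns early if SIGALRM arrives */
    }

    alarm(0);                          /* "kill child": cancel a pending alarm */
    sigaction(SIGALRM, &old, NULL);    /* restore the previous handler */
    return rc;
}

/* Demo probes for trying the function out. */
int always_fail(void *ctx)
{
    (void)ctx;
    return 1;
}

int succeed_on_nth(void *ctx)
{
    int *remaining = ctx;              /* succeeds once the counter reaches zero */
    return --(*remaining) > 0;
}
```

Because the flag is only checked between attempts, a single probe that blocks for a long time is not cut short here; the variant in the message, where the handler exits the process, would interrupt that case as well.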
Re: checkX problems
On 30/10/2009 13:48, Ken Brown wrote:
> I'm having trouble with checkX. I haven't seen other people complain about this, so I assume it's something about my system, but I can't figure out what. There are two symptoms:
>
> 1. If I run checkX with a timeout, the timeout seems to be ignored. For example, with the X server *not* running:
>
> $ checkX -d 127.0.0.1:0.0 -t 100 --debug
> checkX.exe DEBUG: displayname : '127.0.0.1:0.0'
> checkX.exe DEBUG: opt_location: 0
> checkX.exe DEBUG: opt_loglevel: 7
> checkX.exe DEBUG: opt_nogui : 0
> checkX.exe DEBUG: opt_notty : 0
> checkX.exe DEBUG: opt_timeout : 100.00
> checkX.exe DEBUG: (adjust_path) path is : /usr/local/texlive/2009/bin/i386-cygwin:/usr/local/bin:/usr/bin:/c/Program Files/ThinkPad/Utilities:/c/WINDOWS/system32:/c/WINDOWS:/c/WINDOWS/System32/Wbem:/c/Program Files/Intel/Wireless/Bin/:/c/Program Files/IBM ThinkVantage/Client Security Solution:/c/Program Files/ThinkPad/ConnectUtilities:/c/Program Files/QuickTime/QTSystem/:/c/Program Files/Common Files/Lenovo:/usr/lib/lapack:/usr/X11R6/bin:/usr/bin
> checkX.exe DEBUG: (find_X11_lib) DLL is /usr/bin/cygX11-6.dll
> checkX.exe DEBUG: (dlopen_X11_lib) /usr/bin/cygX11-6.dll dlopen'ed successfully.
> checkX.exe DEBUG: (load_X11_symbols) symbol XOpenDisplay loaded ok
> checkX.exe DEBUG: (load_X11_symbols) symbol XCloseDisplay loaded ok
> checkX.exe DEBUG: (try_with_timeout) Using delay of 100 secs, 0 nanosecs (100.00)
> checkX.exe DEBUG: (try_with_timeout) xserver search was unsuccessful
> checkX.exe Info: could not open X display '127.0.0.1:0.0'
> checkX.exe DEBUG: returning with status 1
> checkX.exe Info: Exiting with status 1
>
> The problem is that it returns within a second, in spite of the timeout. Or am I misunderstanding what the timeout is supposed to do?

I think this is a misunderstanding here.

Looking at the source, the timeout is the maximum time checkX will wait for an XOpenDisplay() to complete. If that fails immediately (e.g. with ECONNREFUSED), checkX will stop immediately.

This is pretty reasonable. If there is nothing listening on the socket for the X server, that is not going to get better if we wait...

... except if the server happens to be starting up when we execute checkX.

So, this is not quite what startxwin.bat requires, as the server may still be in the process of starting up. Fortunately, the X server binds its socket pretty early in the startup, so this probably works pretty well, but in theory at least there is still a possible timing window in startxwin.bat.

So it would perhaps be useful if checkX retried the XOpenDisplay() periodically until the timeout was up (as xinit does)

> 2. If I start the X server by using the default startxwin.bat or startxwin.sh (both of which call checkX), the server is very unstable and crashes within a few minutes. This happens consistently, and it never happens if I comment out the line calling checkX.

I think I was able to reproduce this problem (it is not how I normally start the X server).

However, now I come back to look at this in detail, the problem no longer seems to exist. Are you still able to demonstrate it?

> I tried strace'ing checkX, but I don't know what to look for in the output. (I'll send it if it would be useful, but I don't want to spam the list otherwise.) I'm attaching cygcheck output.

--
Jon TURNEY
Volunteer Cygwin/X X Server maintainer
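
The "retry periodically until the timeout is up" suggestion above can be sketched as a simple deadline loop. This is only an illustration under assumptions: probe_fn is a hypothetical callback standing in for an XOpenDisplay()/XCloseDisplay() attempt, the names probe_until and succeed_on_nth are made up for the example, and this is not how checkX or xinit are actually written.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical probe: returns 0 when the server answered, non-zero otherwise
 * (for instance because the connection was refused). */
typedef int (*probe_fn)(void *ctx);

/* Seconds elapsed since *start, measured on the monotonic clock so that
 * wall-clock adjustments do not stretch or shrink the timeout. */
static double elapsed_secs(const struct timespec *start)
{
    struct timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);
    return (double)(now.tv_sec - start->tv_sec)
         + (double)(now.tv_nsec - start->tv_nsec) / 1e9;
}

/* Call probe() every interval_ms until it succeeds or timeout_secs have
 * elapsed. Always probes at least once. Returns 0 on success, 1 on timeout. */
int probe_until(probe_fn probe, void *ctx, double timeout_secs, long interval_ms)
{
    assert(probe != NULL && interval_ms >= 0);

    struct timespec start;
    clock_gettime(CLOCK_MONOTONIC, &start);

    for (;;) {
        if (probe(ctx) == 0)
            return 0;                              /* server is up */
        if (elapsed_secs(&start) >= timeout_secs)
            return 1;                              /* gave up */

        struct timespec pause = { interval_ms / 1000,
                                  (interval_ms % 1000) * 1000000L };
        while (nanosleep(&pause, &pause) == -1 && errno == EINTR)
            ;                                      /* finish the pause after a signal */
    }
}

/* Demo probe: fails until its counter has been decremented to zero. */
int succeed_on_nth(void *ctx)
{
    int *remaining = ctx;
    return --(*remaining) > 0;
}
```

With this shape a short timeout covers the "is a server there at all?" case and a long one covers "a server should be coming up, be patient", and an immediate refusal no longer ends the wait early.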
Re: checkX problems
On 12/11/2009 23:09, Lothar Brendel wrote:
> Jon TURNEY wrote:
> [...]
>> Fortunately, the X server binds its socket pretty early in the startup, so this probably works pretty well, but in theory at least there is still a possible timing window in startxwin.bat.
>
> Yep, and in my setup the X server *always* comes up too late.
>
>> So it would perhaps be useful if checkX retried the XOpenDisplay() periodically until the timeout was up (as xinit does).
>
> In principle, the script calling checkX could do that, because ``checkX'' has return status 1 if it couldn't connect. But as I already pointed out (http://www.mail-archive.com/cygwin-xfree@cygwin.com/msg19529.html), ``startxwin.bat'' uses ``run'' as a wrapper for ``checkX''.
> => Definitely no waiting and no passing on of the status of ``checkX'' to %errorlevel%, as ``run'' immediately goes into the background (unless called from an xterm, dunno why).

But why would you fix the timeout problem by doing some horrible, horrible iteration in the DOS batch script (horrible), which would probably have to busy-wait (also horrible) as there's no sleep command, when you can fix checkX, which has the bonus of fixing other uses of it?

> Since more people seem to have this problem (cf. also Olivia's post), I repeat my question (essentially already posed by Ken Brown: http://www.mail-archive.com/cygwin-xfree@cygwin.com/msg19402.html): Why use ``run'' at all? If we really need a wrapper (do we?), wouldn't ``sh'' be a better one?

I guess we use run for the reason run exists: to hide the console window, which otherwise would be shown?

Perhaps if you look at how startxwin.bat is started from the start menu shortcut (as 'run startxwin.bat'), you can see why this might be useful?

> To push this even further: Do we really need two *independent* scripts, ``startxwin.bat'' and ``startxwin.sh''? Why can't the former just delegate to the latter?

Indeed. They are useless to me for starting the server and a continual source of problems. I would be quite happy to just delete them completely. :-)

Looking forward to reading your patches to address any of these problems.
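Lothar's point that a calling script could retry because checkX returns status 1 on failure can be sketched roughly as follows. This is only an illustration in Python, not part of startxwin.bat or startxwin.sh; the function name `retry_until_success` and its parameters are made up for this sketch, and the demo uses a stand-in child process instead of a real checkX call.

```python
import subprocess
import sys
import time


def retry_until_success(cmd, attempts=5, delay=1.0, sleep=time.sleep):
    """Run cmd repeatedly until it exits with status 0 or attempts run out.

    Returns the exit status of the last run (0 on success).
    """
    status = 1
    for attempt in range(attempts):
        status = subprocess.run(cmd).returncode
        if status == 0:
            break
        if attempt < attempts - 1:
            sleep(delay)  # a real sleep between tries, not a busy-wait
    return status


if __name__ == "__main__":
    # Stand-in for a checkX call: a child process that always exits with 1.
    failing = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print(retry_until_success(failing, attempts=3, delay=0.1))
```

This only works if the caller actually sees the exit status, which is the objection raised above about wrapping checkX in ``run''.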
Re: checkX problems
Jon TURNEY wrote:
> But why would you fix the timeout problem by doing some horrible, horrible iteration in the DOS batch script (horrible), which would probably have to busy-wait (also horrible) as there's no sleep command, when you can fix checkX, which has the bonus of fixing other uses of it?

I actually wrote a version of this. It IS horrible -- but there's no need for a busy-wait; I used the ping trick:

    C:\windows\system32\ping -n 2 127.0.0.1 >nul

sleeps for (n-1)==1 seconds.

>> Since more people seem to have this problem (cf. also Olivia's post), I repeat my question (essentially already posed by Ken Brown: http://www.mail-archive.com/cygwin-xfree@cygwin.com/msg19402.html): Why use ``run'' at all? If we really need a wrapper (do we?), wouldn't ``sh'' be a better one?
>
> I guess we use run for the reason run exists: to hide the console window, which otherwise would be shown?

But checkX is compiled as a GUI program, so it really shouldn't need run to hide its (non-existent) console:

    $ objdump -p /usr/bin/checkX.exe | grep ^Subsystem
    Subsystem 0002(Windows GUI)

Now, in startxwin.bat, we actually use:

    %RUN% checkX -wait other args

run.exe is peculiar. The first argument is the target, and IF the VERY NEXT argument is -wait, run usurps that argument. That is, run will invoke:

    checkX other args

and checkX will never see -wait. So, what does run.exe do with -wait? It...waits. run.exe won't exit until after the inferior does. So, if checkX takes 12 seconds to come back, then run will take 12 seconds ... and the entire script is paused until checkX exits.

HOWEVER, since checkX is a GUI program, you SHOULD get the same result from both of these two commands:

    %RUN% checkX -wait other args
    checkX other args

> Perhaps if you look at how startxwin.bat is started from the start menu shortcut (as 'run startxwin.bat'), you can see why this might be useful?
>
>> To push this even further: Do we really need two *independent* scripts, ``startxwin.bat'' and ``startxwin.sh''? Why can't the former just delegate to the latter?
>
> Indeed. They are useless to me for starting the server and a continual source of problems. I would be quite happy to just delete them completely. :-)

Well, I think the OP is suggesting that one or the other go away, not both. <g>

However, what is /your/ preferred way of starting the X server from a shortcut, Jon? Launching xinit via run? Or do you always start the server from within an existing shell session?

> Looking forward to reading your patches to address any of these problems.

It shouldn't be too hard to add an option to checkX to make it retry on ECONNREFUSED. This would have to manually track the elapsed time for each attempt, charging it against the specified -t waittime.

Just look at run2-0.3.0/lib/checkX.c::try_with_timeout(). Some function signatures might need to be changed in order to pass opt.retry down to that level, but it'd be a nice short project for someone.

I'll try to get to this, but it'll be a few weeks, unless somebody sends me a patch sooner.

--
Chuck
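The retry-on-ECONNREFUSED idea, with elapsed time charged against a single wait budget, might look roughly like the sketch below. It is written in Python for brevity and is not the checkX code: `connect_with_retry`, `interval`, and the `flaky_connect` stand-in are hypothetical names, and `connect` stands in for whatever call opens the display.

```python
import time


def connect_with_retry(connect, wait_time, interval=0.25,
                       clock=time.monotonic, sleep=time.sleep):
    """Call connect() until it succeeds or wait_time seconds have elapsed.

    Only ConnectionRefusedError (ECONNREFUSED) triggers a retry; any other
    error propagates at once. The deadline is fixed up front, so the time
    spent in every attempt and every pause counts against wait_time.
    """
    deadline = clock() + wait_time
    while True:
        try:
            return connect()
        except ConnectionRefusedError:
            remaining = deadline - clock()
            if remaining <= 0:
                raise  # budget used up: report the last refusal
            sleep(min(interval, remaining))


if __name__ == "__main__":
    attempts = {"n": 0}

    def flaky_connect():
        # Stand-in for opening the display: refused twice, then succeeds.
        attempts["n"] += 1
        if attempts["n"] < 3:
            raise ConnectionRefusedError("server not listening yet")
        return "connected"

    print(connect_with_retry(flaky_connect, wait_time=5.0, interval=0.1))
```

Using an absolute deadline avoids summing per-attempt timings by hand. A single attempt that blocks for a long time can still overshoot the budget unless the connect call itself is given a timeout.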
Re: checkX problems
On 11/12/2009 4:31 PM, Jon TURNEY wrote:
> On 30/10/2009 13:48, Ken Brown wrote:
>> 2. If I start the X server by using the default startxwin.bat or startxwin.sh (both of which call checkX), the server is very unstable and crashes within a few minutes. This happens consistently, and it never happens if I comment out the line calling checkX.
>
> I think I was able to reproduce this problem (it is not how I normally start the X server). However, now that I come back to look at this in detail, the problem no longer seems to exist. Are you still able to demonstrate it?

Yes, it's still reproducible on one machine (the one for which I sent cygcheck output in my original post), but the problem doesn't occur on a second machine that I sometimes use. And I don't think anyone else has reported this problem. So I guess it must be BLODA or some other peculiarity of the one system.

Ken
Re: checkX problems
Charles Wilson wrote:
[...]
> run.exe is peculiar. The first argument is the target, and IF the VERY NEXT argument is -wait, run usurps that argument. That is, run will invoke: checkX other args and checkX will never see -wait. So, what does run.exe do with -wait? It...waits. run.exe won't exit until after the inferior does.

Could you please clarify an issue here? (Sorry, it seems I wronged ``run'' in my previous posts.) In a Windows command prompt (being somewhere on C:) I put the line

    \cygwin\bin\run -p /usr/bin sleep -wait 5

into a file ``dosleep.bat''. Executing that BAT script (w/o any wrapper), it *does* sleep. Typing that very line directly at the prompt lets ``run'' return immediately, though. Can you confirm this behaviour?

>> Looking forward to reading your patches to address any of these problems.
>
> It shouldn't be too hard to add an option to checkX to make it retry on ECONNREFUSED. This would have to manually track the elapsed time for each attempt, charging it against the specified -t waittime.

Another possibility would be an option ``-n'' to specify the number of retries.

> Just look at run2-0.3.0/lib/checkX.c::try_with_timeout(). Some function signatures might need to be changed in order to pass opt.retry down to that level, but it'd be a nice short project for someone. I'll try to get to this, but it'll be a few weeks, unless somebody sends me a patch sooner.

I'd volunteer for that. How/where do I upload?

Asks
Lothar
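The -wait handling that Charles describes for run.exe can be modelled in a few lines. This is a sketch of the behaviour as stated in the thread, not run's actual source; `split_run_args` is a made-up name, and run's own leading options (such as the -p in the dosleep.bat example) are assumed to have been stripped already.

```python
def split_run_args(argv):
    """Model the described rule: -wait counts only directly after the target.

    argv is [target, arg1, arg2, ...]. Returns (wait, command), where
    command is the argument list that would be handed to the target.
    """
    if not argv:
        raise ValueError("no target given")
    target, rest = argv[0], list(argv[1:])
    wait = bool(rest) and rest[0] == "-wait"
    if wait:
        rest = rest[1:]  # consumed by the wrapper; the target never sees it
    return wait, [target] + rest


if __name__ == "__main__":
    print(split_run_args(["checkX", "-wait", "other", "args"]))
    print(split_run_args(["sleep", "-wait", "5"]))
    print(split_run_args(["checkX", "-t", "12", "-wait"]))
```

Under this model a -wait placed anywhere other than directly after the target is passed through untouched, which is why the position of the flag matters in startxwin.bat.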