On 2018-03-28 01:03, David Holmes wrote:
Hi Magnus,

I appreciate what you are trying to do here but there are reasons this kind of cleanup keeps getting deferred :) - it's not trivial and deals with really old code that isn't always clear. You may have bitten off more than you want to chew here. (And I'm strapped for time to work through this with you - sorry.)

*sighs* Yeah, I'm starting to realize that. I thought it was more or less trivial, and I didn't want to add new duplicated code. Let's see if I can give reasonable answers to your questions here, but if that's not enough, I'll drop this.

On the positive side you can use the pthread functions on Solaris so no need to special case for that.
Ok!

However there are still a number of issues:

  52 #ifndef NSIG
  53   #define NSIG SIGRTMAX
  54 #endif

SIGRTMAX is Solaris only, but even on Solaris NSIG might be defined (it's conditional in sys/signal.h but I don't know when the conditions are enabled).

 56 static struct sigaction sact[NSIG];

SIGRTMAX is a macro not a constant so you can't declare a static array using it. You would need to malloc the array as done on Solaris.
Ah, I see. That explains the special code in Solaris.

 58 static __thread bool reentry = false; /* prevent reentry deadlock (per-thread) */

I didn't realize this was in the OSX version as we don't unconditionally use compiler-based thread locals. I would make this field and its use OSX specific
Ok.

148 #ifdef SOLARIS
 149     sact[sig].sa_flags = SA_NODEFER;
 150     if (sig != SIGILL && sig != SIGTRAP && sig != SIGPWR) {
 151       sact[sig].sa_flags |= SA_RESETHAND;
 152     }
 153 #else

I'd have to do some research to see why this is needed so best to keep it.

178 #ifdef SOLARIS
 179     if (is_sigset && sigblocked) {
 180       /* We won't honor the SIG_HOLD request to change the signal mask */
 181       oldhandler = SIG_HOLD;
 182     }
 183 #endif

Ditto.

More below ...

On 27/03/2018 7:42 PM, Magnus Ihse Bursie wrote:
When I was about to update jsig.c, I noticed that the four copies for aix, linux, macosx and solaris were basically the same, with only small differences. Some of the differences were not even well motivated, but were likely the result of this code duplication causing the code to diverge.

A better solution is to unify them all into a single unix version, with the few platform-specific changes handled on a per-platform basis.

I've made the following notable changes:

* I have removed the version check for Solaris. All other platforms seem to do fine without it, and in general, we don't mistrust other JDK libraries. An alternative is to add this version check to all other platforms instead. If you think this is the correct course of action, let me know and I'll fix it.

That's fine. I actually thought this had gone. Or it may be we have a bug to clean this up. Which reminds me - there may be a number of existing bugs that get handled by the changes you're making. But there may still be open bugs that need fixing.
I searched JBS for "jsig" and "libjsig". Then I screened all hits, and filtered out all bugs where this was just mentioned in passing. In the end, I found only this which has bearing to the current changes: https://bugs.openjdk.java.net/browse/JDK-8170639 "[Linux] jsig is limited to a maximum of 64 signals"

I also found a couple of bugs related to build (JDK-8164790, JDK-8146563) and a discussion about using __thread (JDK-8137018).

JDK-8170639 will be fixed by the propsed patch.


* Solaris used to have a dynamically allocated sact, instead of a statically allocated array as all other platforms have. It's not likely to be large, and the size is known at compile time, so there seems to be no good reason for this.

See above. If you didn't get a compile-time error here then it's likely NSIG was already defined.

> cat sig.c
#include <signal.h>

static int sigs[SIGRTMAX];

int main(int argc, char* argv[]) {
  sigs[0] = 1;
}

 > CC -c sig.c
"sig.c", line 3: Error: An integer constant expression is required within the array subscript operator.
1 Error(s) detected.
Yes, you are right, NSIG is defined when I compile on Solaris. I looked into the conditionals, but I could not either easily determine when they are true. But maybe that's not really necessary to do? Most of the system provided functions are guarded by feature tests, but we don't normally verify that we fulfill the proper conditions, but just use them, and if it works, we're happy.

(That said, we should probably be more prudent on how we use feature/standard defines...)


* Linux and macosx used a uint32_t/uint64_t instead of sigset_t for jvmsigs, as aix and solaris do. This is a less robust solution, and the added checks show that it has failed in the past. Now all platforms use sigset_t/sigismember().

Probably historical (macosx copied linux) and a good change to make. That said I have to wonder why we didn't make it when we put in the guards on the size.
Maybe the developer putting in the guards didn't know about sigismember()? And since it was not obvious how this was done on other platform, that probably seemed like a good idea given the pre-existing code on linux.

Also worth noting:

* Solaris is not using pthreads, but it's own thread library, which accounts for most of the #ifdef SOLARIS.

Not needed as above.

* In general, if an implementation was needed on one platform, but has no effect or is harmless on others, I've kept it on all platforms instead of sprinkling the code with #ifdefs.

That depends on your own knowledge/assumptions about the behaviour.
Of course. :-)
I'd want to check each such case to verify that is the case.

Sure. I'll list them here:
* booleans were handled differently. On macosx, stdbool.h was included, other platforms used  #define bool int. I researched stdbool.h, and it is part of the C99 requirement, so I replaced it with:

#if (__STDC_VERSION__ >= 199901L)
#include <stdbool.h>
#else
#define bool int
#define true 1
#define false 0
#endif

on all platforms. On macosx, the combinations of flags we're sending to clang makes it C99 compatible. (I have not looked into detail on why/how this is so. The only thing I know from memory is that on solstudio we explicitly forbid it from being C99.)

* The reentry code was only used on macosx. I'll return it to macosx specific with ifdefs.

* On AIX, the type "signal_t" was already used in signal.h, so they had to rename it. I renamed it across all platforms, to "signal_function_t".

* On solaris, save_signal_handler() needed an additional argument is_sigset. I'm passing this argument on all platform, even if it's not needed.

* Only solaris needed the sigblocked test in set_signal. I kept the #ifdef around the code that set the oldhandler to SIG_HOLD, but the actual test for setting sigblocked needed to be done earlier:

if (is_sigset) {
sigblocked = sigismember(&(sact[sig].sa_mask), sig);
}

While useless on other platforms, I didn't think it was worth clogging the code with more #ifdefs.


As above the use of __thread with "reentry" is not something to use unconditionally without closer examination.
Sure. I'll make them macosx specific again.

To facilitate code review, here is a specially crafted webrev that shows the differences compared to each of the individual, original per-OS versions of jsig.c:
http://cr.openjdk.java.net/~ihse/JDK-8200298-unify-libjsig/webrev.01

Appreciated, but still very hard to see when a feature from one OS is now being used by all.

I'd be less concerned if we had extensive testing here but I don't think we do, so any issues only turn up when something that's been working since Java 1.4 suddenly stops working. What testing was done?
I've run hotspot/jtreg/runtime/XCheckJniJsig, and the closed Oracle tests that test libjsig, on all platforms. (mach5 --job hs-tier5-rt,hs-tier1)

I have not looked in detail into these tests, so I cannot say how well they test libjsig. I just searched the tests for whatever used libjsig.

/Magnus

Thanks,
David
-----

Bug: https://bugs.openjdk.java.net/browse/JDK-8200298
WebRev: http://cr.openjdk.java.net/~ihse/JDK-8200298-unify-libjsig/webrev.03

/Magnus

Reply via email to