OK, I have a patched wss4j 2.1.8-asoldano-SNAPSHOT on the snapshot repository and I'm letting the CI server here run with it for a few days. Let's see if the failures pop up or not...


On 15/09/2016 11:20, Alessio Soldano wrote:
mmh... I need to build a patched wss4j snapshot and have it consumed by the remote machine, which reproduces the issue a bit more frequently (locally it's very rare). Will let you know :-)

On 15/09/2016 10:35, Colm O hEigeartaigh wrote:
Hi Alessio,

Yes, that makes sense to me. If you perform the fix locally, do the
intermittent failures go away?


On Wed, Sep 14, 2016 at 9:55 PM, Alessio Soldano <asold...@redhat.com> wrote:


I'm currently seeing an intermittent issue in the JBossWS-CXF testsuite
(stacktrace at https://paste.fedoraproject.org/428145/14738847/raw/ ),
with the EHCacheTokenStore creation failing because the CacheManager has
already been shut down. The testsuite includes multiple tests; almost all
of them create JAX-WS clients and most of them use the current thread bus
(a few of them create a new bus, set it as default thread bus, run and
eventually shut down the bus). What I suspect is some kind of concurrency
issue in the CacheManager lifecycle management.

I've looked a bit at the code and noticed that there's basically a 1-1
relationship between Bus instances and CacheManager instances. Given that I
have some tests that do not explicitly shut down the bus (or the client)
after execution, it can happen that a client is closed because the JDK
eventually finalizes ClientProxy, which in the end causes the
CacheCleanupListener to close the token store and hence to release/shut down
the cache manager (see the invocation flow at
https://paste.fedoraproject.org/428150/47388530/raw/ ). Unfortunately that
exact cache manager could still be in use to serve another client running
in the same bus. AFAICS, there's an attempt to avoid problems like this in
WSS4J's EHCacheManagerHolder (which deals with CXF requests for
creating/releasing cache managers), as it has a
ConcurrentHashMap<String, AtomicInteger> attribute to keep track of how many
consumers of a given cache manager there are and to avoid shutting down a
manager that is still in use. Looking at its getCacheManager and
releaseCacheManager methods I can see a possible concurrency flaw which
could be the root of my failure. The releaseCacheManager method could be
called with cacheManager X as parameter while a different thread is running
getCacheManager and is just before line 106 (that is, just before the
AtomicInteger is fetched from the map) with its local cacheManager variable
already resolved to X. That would later lead to an attempt to use an
already shut down cache manager. I would be tempted to suggest making those
two methods synchronized (the map could then probably be a plain hash map).
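
To make the suggestion concrete, here is a rough sketch of what I mean. It is not the WSS4J source: RefCountedHolder, Manager, getManager and releaseManager are hypothetical stand-ins for EHCacheManagerHolder, the Ehcache CacheManager and the two methods discussed above. The point is only that lookup plus count increment, and count decrement plus shutdown, each run under the same lock, so a release can no longer slip in between the lookup and the increment.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a reference-counted cache manager holder.
// Names and structure are assumptions for illustration, not WSS4J code.
final class RefCountedHolder {

    /** Minimal stand-in for a cache manager that can be shut down. */
    static final class Manager {
        private final String name;
        private boolean shutdown;

        private Manager(String name) {
            this.name = name;
        }

        public String getName() {
            return name;
        }

        public synchronized boolean isShutdown() {
            return shutdown;
        }

        private synchronized void shutdown() {
            shutdown = true;
        }
    }

    // Plain HashMaps are enough because every access happens while
    // holding the class lock taken by the synchronized methods below.
    private static final Map<String, Manager> MANAGERS = new HashMap<>();
    private static final Map<String, Integer> COUNTS = new HashMap<>();

    private RefCountedHolder() {
    }

    /** Looks up (or creates) the manager and bumps its count atomically. */
    static synchronized Manager getManager(String name) {
        Manager manager = MANAGERS.get(name);
        if (manager == null) {
            manager = new Manager(name);
            MANAGERS.put(name, manager);
        }
        COUNTS.merge(name, 1, Integer::sum);
        return manager;
    }

    /** Drops one reference; shuts the manager down when none are left. */
    static synchronized void releaseManager(Manager manager) {
        String name = manager.getName();
        // Ignore stale instances that are no longer the registered manager.
        if (MANAGERS.get(name) != manager) {
            return;
        }
        int remaining = COUNTS.get(name) - 1;
        if (remaining <= 0) {
            COUNTS.remove(name);
            MANAGERS.remove(name);
            manager.shutdown();
        } else {
            COUNTS.put(name, remaining);
        }
    }
}
```

With something like this, two clients sharing a bus would each call getManager, and the manager would only be shut down once both have called releaseManager. The cost is a single global lock around both operations, which seems acceptable given how rarely cache managers are created and released.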

WDYT? I might be missing something, so posting here before opening up a
jira. Any idea?



Alessio Soldano
Web Service Lead, JBoss
