https://osgeo-org.atlassian.net/browse/GEOS-7961

This bug happens under load as a result of a thread-safety problem in the 
implementation of BBOXTypeBinding.getProperties.  More specifically, the 
particle.getContent() value there will be corrupted/set to null by another 
thread executing that same code.  I'm not certain what the correct fix is yet, 
but I do have some good detail and can make it fail every time.  I'm hoping 
that somebody with more knowledge of the XSD classes might have some insight 
that would help reach a solution.

I encounter this bug very reliably by running a JMeter ThreadGroup that sends 
concurrent GetFeature requests that include a BBOX filter.  I've since 
figured out how to repro the bug in Eclipse and identified a fairly narrow 
scope of code that is the root of the problem.

To repro this problem in Eclipse, place a breakpoint in 
BBOXTypeBinding.getProperties at the following code:
        
particle.setContent(GML.getInstance().getSchema().resolveElementDeclaration(GML.Envelope.getLocalPart()));
particle.setMinOccurs(0);

Put the breakpoint on the particle.setMinOccurs call right after the above call 
to setContent.

Now execute the GetFeature request and hit the breakpoint.  Note that, as 
expected, the particle.getContent() value at this point is not null, as it was 
set by the preceding line of code.

Now while sitting on that breakpoint execute another GetFeature and stop there 
on the other thread.  Again the value of particle.getContent() is not null.  
But go to the original breakpoint in the first thread and see that 
particle.getContent() IS NOW NULL!

In other words, a GetFeature started on another thread has changed the 
particle.getContent() of the original thread.  NullPointerExceptions like the 
one shown in GEOS-7961, and others, are the result.

Digging deeper I see that each thread has a unique "particle" instance 
(particle on one thread refers to a different object than particle on another 
thread) - as expected since the object is obtained from the factory on entry to 
getProperties.  But in scenarios where this code works successfully (not forced 
to fail as above) it's at least interesting that the value of 
particle.getContent() is the very same object instance on both threads.

Ultimately the root of the problem has to do with the EObjectImpl.eContainer 
member - this is the object to which each particle refers.
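
To illustrate the effect described above, here is a minimal sketch in plain 
Java.  It is a simplified model, not the real org.eclipse.xsd or EMF code: 
SharedElement, ModelParticle and ContainmentDemo are hypothetical names, and 
the setter only assumes that the content object can have a single container at 
a time, so attaching it to a second particle detaches it from the first.  The 
two "threads" are played out sequentially so the outcome is deterministic.

```java
// Simplified stand-in for containment behavior; NOT the real XSD/EMF classes.
// All class names here are hypothetical.
final class SharedElement {
    // Plays the role of the single container reference (compare eContainer).
    ModelParticle container;
}

final class ModelParticle {
    private SharedElement content;

    SharedElement getContent() {
        return content;
    }

    // Assumed shape of a containment setter: release the old content, then
    // take ownership of the new content, removing it from its previous owner.
    void setContent(SharedElement newContent) {
        if (content != null) {
            content.container = null;
        }
        if (newContent != null) {
            if (newContent.container != null && newContent.container != this) {
                newContent.container.content = null; // previous owner loses it
            }
            newContent.container = this;
        }
        content = newContent;
    }
}

final class ContainmentDemo {
    public static void main(String[] args) {
        SharedElement envelope = new SharedElement(); // one shared schema element
        ModelParticle first = new ModelParticle();    // particle of request 1
        ModelParticle second = new ModelParticle();   // particle of request 2

        first.setContent(envelope);
        System.out.println(first.getContent() != null); // prints "true"

        second.setContent(envelope);                    // request 2 runs same code
        System.out.println(first.getContent() != null); // prints "false"
    }
}
```

Under that assumption, the particles being distinct objects does not help: 
the state that gets clobbered belongs to the shared content instance.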

While I don't know what the best fix is, I do know how to make this specific 
failure never happen.  I patched my code as follows by adding the thread 
synchronization you see below in GetFeature...
                synchronized(GetFeature.class) {
                    Encoder e = new Encoder(new FESConfiguration());
                    e.setOmitXMLDeclaration(true);
                    filter.append(e.encodeAsString(q.getFilter(), FES.Filter));
                }
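
A variation on the workaround above, as a sketch only: the same serialization 
can be done on a private lock object instead of GetFeature.class, so that no 
unrelated code can contend on the same monitor.  FilterEncodingGuard and 
encodeSerialized are hypothetical names, not part of GeoServer or GeoTools, 
and like the patch above this only protects callers that go through it.

```java
import java.util.function.Supplier;

// Hypothetical helper; not an existing GeoServer or GeoTools class.
final class FilterEncodingGuard {
    // Private lock: only this class can synchronize on it.
    private static final Object LOCK = new Object();

    private FilterEncodingGuard() {
    }

    // Runs the supplied encoding call while holding the lock, so at most one
    // thread at a time executes it through this helper.
    static String encodeSerialized(Supplier<String> encodeCall) {
        synchronized (LOCK) {
            return encodeCall.get();
        }
    }
}
```

The body of the synchronized block in the patch would then be passed in as 
the supplier, with the result appended to the filter buffer by the caller.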

A more targeted fix would of course be much better and might address potential 
failures other than GetFeature.  I'm not sure what the right fix is but it 
would appear to be centered on the behavior of the following, which is rather 
convoluted for my brain to fully digest when considering changes.
      if (content != null)
        msgs = ((InternalEObject)content).eInverseRemove(this,
            EOPPOSITE_FEATURE_BASE - XSDPackage.XSD_PARTICLE__CONTENT, null, msgs);
      if (newContent != null)
        msgs = ((InternalEObject)newContent).eInverseAdd(this,
            EOPPOSITE_FEATURE_BASE - XSDPackage.XSD_PARTICLE__CONTENT, null, msgs);

_______________________________________________
Geoserver-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/geoserver-devel
