> Hi again everyone,
>
> I've got a setup here like this:
>
> One server is running tomcat-4.0 (final), plus slide, plus our webapps.
> Another server (elsewhere in the same room) is running oracle.
>
> So, the webserver box has slide set up to use:
> a) FileContentStore (local)
> b) JDBCDescriptorsStore (talking to oracle on another machine)
> Also, we use slide's SlideRealm in tomcat-4.0 for authentication.
>
> Our webapps (all running in the same VM as slide, only one copy of slide
> is ever running at a time) do slide stuff directly, because they often
> do quite a bit of work and the overhead of using webdav for stuff inside
> the same VM is too high.
>
> One of the classes is used to set a property on a slide object. A couple
> of weeks ago, I fixed this to use slide's transactions (doing
> token.begin(), token.commit(), and sometimes token.rollback(), just like
> the webdav servlet does).
>
> At some point since (or possibly all the time since), we've been getting
> exceptions thrown occasionally (and dumped to std(out|err)). They all
> look more or less like this, and have no additional information around
> them in the logs.
>
> Enlist error(Transaction 86 in HttpProcessor[7070][6]) = -4
> slidestore.reference.JDBCDescriptorsStore@73a34b Branch:
> HttpProcessor[7070][6]-1002069841359-86-20 Flag: 2097152
> javax.transaction.xa.XAException
>     at org.apache.slide.common.AbstractSimpleService.start(AbstractSimpleService.java:415)
>     at slidestore.reference.JDBCDescriptorsStore.start(JDBCDescriptorsStore.java:515)
>     at org.apache.slide.transaction.SlideTransaction.enlistResource(SlideTransaction.java:464)
>     at org.apache.slide.store.AbstractStore.enlist(AbstractStore.java:1373)
>     at org.apache.slide.store.AbstractStore.storeRevisionDescriptor(AbstractStore.java:1094)
>     at org.apache.slide.store.StandardStore.storeRevisionDescriptor(StandardStore.java:606)
>     at org.apache.slide.content.ContentImpl.store(ContentImpl.java:943)
>
> (followed by the rest of a very long stack trace showing tomcat starting
> one of our servlets, and this servlet eventually calling content.store()
> here.)
>
>
> Then, in the last few days, this machine has been apparently hanging
> (well, the webserver). Further investigation shows that authentication
> is being attempted, but it never returns from the SlideRealm
> getPassword() function. This seems to generally happen just after a
> whole lot of these transaction errors have occurred, and so I think
> they're related (in some way). If I throw our load-tester at it, this
> generally starts happening within a couple of minutes of starting.
>
> If I remove the transaction stuff from our webapp, things seem to work
> ok (but I haven't tested this really thoroughly yet). Lots of warnings
> get printed because I'm doing stuff when NOT in a transaction, of
> course. However, this obviously isn't a good solution as it means that
> when things go wrong, slide will have a tendency to end up in an
> inconsistent state and break horribly. So that's not a long-term
> solution.
>
> Does anyone know what might be causing this? I've looked through the
> first couple of layers of code, but got lost in the transaction handling
> stuff in slide, which I don't really have any idea about. I know some of
> you guys know it backwards, so any ideas would be very much appreciated.

No idea yet, except that the transaction state is conflicting with the
resource manager state. If multiple threads are accessing the token, it
could be a race, although each thread should have its own transaction.

I'll file a bug to keep track of the issue (#3935).

Remy
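
For reference, the two numbers in the quoted log line can be decoded against the standard javax.transaction.xa constants. This is only a sketch, and it rests on an assumption: that the "= -4" is the XAException error code and that "Flag:" is the flags argument passed to the resource's start() call. The class name XaLogDecoder and its helper methods are made up for illustration and are not part of Slide.

```java
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Slide): names the numeric values
// that appear in the "Enlist error" log line.
public class XaLogDecoder {

    // Maps an XAException error code to its constant name (XAER_* codes only).
    static String errorName(int code) {
        switch (code) {
            case XAException.XAER_ASYNC:   return "XAER_ASYNC";
            case XAException.XAER_RMERR:   return "XAER_RMERR";
            case XAException.XAER_NOTA:    return "XAER_NOTA";
            case XAException.XAER_INVAL:   return "XAER_INVAL";
            case XAException.XAER_PROTO:   return "XAER_PROTO";
            case XAException.XAER_RMFAIL:  return "XAER_RMFAIL";
            case XAException.XAER_DUPID:   return "XAER_DUPID";
            case XAException.XAER_OUTSIDE: return "XAER_OUTSIDE";
            default:                       return "unknown(" + code + ")";
        }
    }

    // Names the flags that are valid for XAResource.start().
    static String flagNames(int flags) {
        if (flags == XAResource.TMNOFLAGS) {
            return "TMNOFLAGS";
        }
        List<String> names = new ArrayList<>();
        if ((flags & XAResource.TMJOIN) != 0) {
            names.add("TMJOIN");
        }
        if ((flags & XAResource.TMRESUME) != 0) {
            names.add("TMRESUME");
        }
        return names.isEmpty() ? "unknown(" + flags + ")" : String.join("|", names);
    }

    public static void main(String[] args) {
        // Values copied from the quoted log line.
        System.out.println("error -4 = " + errorName(-4));
        System.out.println("flag 2097152 = " + flagNames(2097152));
    }
}
```

Under that assumption the log reads as XAER_NOTA on a start() with TMJOIN, i.e. the store was asked to join a transaction branch it did not recognise. That would fit a mismatch between the transaction state and the resource manager state, but it does not by itself say which side is wrong.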
