That would be great if you could just try it out and report back on how much it changed your perceived throughput. Testing the correctness will be a bigger challenge...I'm not sure how yet, but I've created an issue to track this here: https://jira.duraspace.org/browse/FCREPO-1024
Thanks, Chris On Tue, Nov 8, 2011 at 1:49 PM, Jesper Damkjaer <j...@dbc.dk> wrote: > Hi. > > After looking at the code again I agree with you. It seems like it would > work to just remove the synchronized keyword, though I'm not sure how to > test that it actually works as expected. > > I can give it a try with my test data which I used to locate the > bottleneck in the first place, but It is a lot of objects that all > contains different PIDs. Of course, if you are interested I'll try it > out. > > -Jesper > > > > On Tue, 2011-11-08 at 09:58 -0500, Chris Wilper wrote: >> I find pull requests pretty convenient to work with as opposed to >> patch files. Thanks for taking a look. I'm not sure right off how to >> reduce contention here, but I know it must be possible. >> >> Hmm. Looking at the code again, I wonder if the getWriteLock and >> releaseWriteLock methods are already sufficient. I see that >> getIngestWriter already calls getWriteLock(pid) near the end. >> >> If getWriteLock succeeds, the lock will be held until the >> ingest/modify operation is complete (see >> org.fcrepo.server.management.DefaultManagement#finishModification, >> which is invoked by all API-M methods when they're done). >> >> On the other hand, if getWriteLock fails, another thread must already >> be holding the lock. The behavior here is to just fast-fail subsequent >> requests to ingest/modify the same object, rather than blocking them. >> In reality multiple threads ingesting/modifying the same object should >> be a pretty rare thing, so failing them right away seems reasonable. >> >> So, long story short, I'm starting to wonder whether just removing the >> synchronized keyword is the right move here. >> >> - Chris >> >> On Tue, Nov 8, 2011 at 4:49 AM, Jesper Damkjaer <j...@dbc.dk> wrote: >> > Hi Chris. >> > >> > Actually I just tried to ingest about 20.000 objects in 10 parallel >> > threads, thereby noticing that the number of threads did not seem to >> > improve the performance. I then started visualVM and noticed that the >> > threads seemed to be hanging for a while - and with a threaddump I could >> > see that the threads waited at the synchronization at getIngestWriter. >> > >> > I did not try to just remove it, since I was unsure whether I had >> > completely understood the behaviour - And I had actually missed the >> > point about more objects containing the same PID. >> > >> > I will try to look into whether it is possible to improve the method to >> > not synchronize on all ingests, but only on when new PIDs are generated >> > and if more than one thread are containing the same PID. Are there >> > anything else I should look out for? >> > >> > How do You prefer to get patches to examine? Should I just make a fork >> > of the code at github, try to make some changes and ask You for a >> > pull-request? >> > >> > -Jesper >> > >> > >> > On Mon, 2011-11-07 at 17:03 -0500, Chris Wilper wrote: >> >> Hi Jesper, >> >> >> >> I'm curious how you found out about bottlenecking at this point in the >> >> code. A synchronized keyword could certainly raise a red flag if you >> >> were just following the code path, but I'm really wondering if you did >> >> any tests that led to this as a hot spot. Did you try removing it? >> >> >> >> Yes, this was originally made synchronized in order to try to prevent >> >> multiple objects from being ingested at once with the same PID. It >> >> took a little digging to find this: >> >> >> >> https://github.com/fcrepo/fcrepo-before33/commit/d76078b51d903e18d1725aa37f5e4060f2e7c3c0 >> >> >> >> Anyway, it does seem too aggressive a lock, and I think it'd be great >> >> if we could improve things here. The real requirement is that it >> >> shouldn't be possible to ingest an object with the same PID from >> >> multiple threads simultaneously. But keep in mind that not all PIDs >> >> come from PID generation -- they can be provided in the FOXML to be >> >> ingested. So just synchronizing on pid generation is not enough. >> >> >> >> Your ideas on how to reduce contention here are most welcome. This is >> >> one of the older bits of the Fedora codebase, and fresh eyes would be >> >> good. The theme of the upcoming 3.6 release is performance and >> >> scalability (without major architectural changes), and I think this >> >> would fit right in. >> >> >> >> - Chris >> >> >> >> On Mon, Nov 7, 2011 at 4:29 PM, Jesper Damkjaer <j...@dbc.dk> wrote: >> >> > Hi. >> >> > >> >> > I have tried to ingest a number of documents in parallel, but they seem >> >> > to congest in getIngestWriter in DefaultDOManager. >> >> > When looking at the source code I can see that this method is >> >> > synchronized, but I fail to understand why. >> >> > As far as I can tell ( I admit I have not read through the source for >> >> > all the classes used in the code ) the only place where the >> >> > synchronization is needed is when a new PID is generated. But looking at >> >> > BasicPIDGenerator it seems like the interesting methods are already >> >> > synchronized here. >> >> > Since I would like to speed up the ingest, could You please point me in >> >> > which direction to look in order to remove the synchronization on >> >> > getIngestWriter? >> >> > If You can help me understand which parts to fix I will look into >> >> > develop a patch. >> >> > >> >> > -Jesper >> >> > >> >> > >> >> > >> >> > >> >> > >> >> > ------------------------------------------------------------------------------ >> >> > RSA(R) Conference 2012 >> >> > Save $700 by Nov 18 >> >> > Register now >> >> > http://p.sf.net/sfu/rsa-sfdev2dev1 >> >> > _______________________________________________ >> >> > Fedora-commons-developers mailing list >> >> > Fedora-commons-developers@lists.sourceforge.net >> >> > https://lists.sourceforge.net/lists/listinfo/fedora-commons-developers >> >> > >> >> >> >> >> >> >> > >> > >> > >> > >> >> >> > > > > ------------------------------------------------------------------------------ RSA(R) Conference 2012 Save $700 by Nov 18 Register now http://p.sf.net/sfu/rsa-sfdev2dev1 _______________________________________________ Fedora-commons-developers mailing list Fedora-commons-developers@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/fedora-commons-developers