That would be great if you could just try it out and report back on
how much it changed your perceived throughput. Testing the correctness
will be a bigger challenge...I'm not sure how yet, but I've created an
issue to track this here:
https://jira.duraspace.org/browse/FCREPO-1024

Thanks,
Chris

On Tue, Nov 8, 2011 at 1:49 PM, Jesper Damkjaer <j...@dbc.dk> wrote:
> Hi.
>
> After looking at the code again I agree with you. It seems like it would
> work to just remove the synchronized keyword, though I'm not sure how to
> test that it actually works as expected.
>
> I can give it a try with my test data which I used to locate the
> bottleneck in the first place, but It is a lot of objects that all
> contains different PIDs. Of course, if you are interested I'll try it
> out.
>
> -Jesper
>
>
>
> On Tue, 2011-11-08 at 09:58 -0500, Chris Wilper wrote:
>> I find pull requests pretty convenient to work with as opposed to
>> patch files. Thanks for taking a look. I'm not sure right off how to
>> reduce contention here, but I know it must be possible.
>>
>> Hmm. Looking at the code again, I wonder if the getWriteLock and
>> releaseWriteLock methods are already sufficient. I see that
>> getIngestWriter already calls getWriteLock(pid) near the end.
>>
>> If getWriteLock succeeds, the lock will be held until the
>> ingest/modify operation is complete (see
>> org.fcrepo.server.management.DefaultManagement#finishModification,
>> which is invoked by all API-M methods when they're done).
>>
>> On the other hand, if getWriteLock fails, another thread must already
>> be holding the lock. The behavior here is to just fast-fail subsequent
>> requests to ingest/modify the same object, rather than blocking them.
>> In reality multiple threads ingesting/modifying the same object should
>> be a pretty rare thing, so failing them right away seems reasonable.
>>
>> So, long story short, I'm starting to wonder whether just removing the
>> synchronized keyword is the right move here.
>>
>> - Chris
>>
>> On Tue, Nov 8, 2011 at 4:49 AM, Jesper Damkjaer <j...@dbc.dk> wrote:
>> > Hi Chris.
>> >
>> > Actually I just tried to ingest about 20.000 objects in 10 parallel
>> > threads, thereby noticing that the number of threads did not seem to
>> > improve the performance. I then started visualVM and noticed that the
>> > threads seemed to be hanging for a while - and with a threaddump I could
>> > see that the threads waited at the synchronization at getIngestWriter.
>> >
>> > I did not try to just remove it, since I was unsure whether I had
>> > completely understood the behaviour - And I had actually missed the
>> > point about more objects containing the same PID.
>> >
>> > I will try to look into whether it is possible to improve the method to
>> > not synchronize on all ingests, but only on when new PIDs are generated
>> > and if more than one thread are containing the same PID. Are there
>> > anything else I should look out for?
>> >
>> > How do You prefer to get patches to examine? Should I just make a fork
>> > of the code at github, try to make some changes and ask You for a
>> > pull-request?
>> >
>> > -Jesper
>> >
>> >
>> > On Mon, 2011-11-07 at 17:03 -0500, Chris Wilper wrote:
>> >> Hi Jesper,
>> >>
>> >> I'm curious how you found out about bottlenecking at this point in the
>> >> code. A synchronized keyword could certainly raise a red flag if you
>> >> were just following the code path, but I'm really wondering if you did
>> >> any tests that led to this as a hot spot. Did you try removing it?
>> >>
>> >> Yes, this was originally made synchronized in order to try to prevent
>> >> multiple objects from being ingested at once with the same PID. It
>> >> took a little digging to find this:
>> >>
>> >> https://github.com/fcrepo/fcrepo-before33/commit/d76078b51d903e18d1725aa37f5e4060f2e7c3c0
>> >>
>> >> Anyway, it does seem too aggressive a lock, and I think it'd be great
>> >> if we could improve things here. The real requirement is that it
>> >> shouldn't be possible to ingest an object with the same PID from
>> >> multiple threads simultaneously. But keep in mind that not all PIDs
>> >> come from PID generation -- they can be provided in the FOXML to be
>> >> ingested. So just synchronizing on pid generation is not enough.
>> >>
>> >> Your ideas on how to reduce contention here are most welcome. This is
>> >> one of the older bits of the Fedora codebase, and fresh eyes would be
>> >> good. The theme of the upcoming 3.6 release is performance and
>> >> scalability (without major architectural changes), and I think this
>> >> would fit right in.
>> >>
>> >> - Chris
>> >>
>> >> On Mon, Nov 7, 2011 at 4:29 PM, Jesper Damkjaer <j...@dbc.dk> wrote:
>> >> > Hi.
>> >> >
>> >> > I have tried to ingest a number of documents in parallel, but they seem
>> >> > to congest in getIngestWriter in DefaultDOManager.
>> >> > When looking at the source code I can see that this method is
>> >> > synchronized, but I fail to understand why.
>> >> > As far as I can tell ( I admit I have not read through the source for
>> >> > all the classes used in the code ) the only place where the
>> >> > synchronization is needed is when a new PID is generated. But looking at
>> >> > BasicPIDGenerator it seems like the interesting methods are already
>> >> > synchronized here.
>> >> > Since I would like to speed up the ingest, could You please point me in
>> >> > which direction to look in order to remove the synchronization on
>> >> > getIngestWriter?
>> >> > If You can help me understand which parts to fix I will look into
>> >> > develop a patch.
>> >> >
>> >> > -Jesper
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > ------------------------------------------------------------------------------
>> >> > RSA(R) Conference 2012
>> >> > Save $700 by Nov 18
>> >> > Register now
>> >> > http://p.sf.net/sfu/rsa-sfdev2dev1
>> >> > _______________________________________________
>> >> > Fedora-commons-developers mailing list
>> >> > Fedora-commons-developers@lists.sourceforge.net
>> >> > https://lists.sourceforge.net/lists/listinfo/fedora-commons-developers
>> >> >
>> >>
>> >>
>> >>
>> >
>> >
>> >
>> >
>>
>>
>>
>
>
>
>

------------------------------------------------------------------------------
RSA(R) Conference 2012
Save $700 by Nov 18
Register now
http://p.sf.net/sfu/rsa-sfdev2dev1
_______________________________________________
Fedora-commons-developers mailing list
Fedora-commons-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/fedora-commons-developers

Reply via email to