On 2012-04-23, at 16:56, Sven Van Caekenberghe wrote:

> Camillo,
> 
> On 23 Apr 2012, at 16:26, Camillo Bruni wrote:
> 
>> I figured a specific bottleneck of the current implementation of MC is the 
>> unique version name check.
>> 
>> On gemstone this takes 4 seconds! for the pharo inbox, definitely too long…
> 
> That seems like a problem with the SS3 and/or GemStone server. It should be 
> much faster:
> 
> [ MCHttpRepository new 
>       parseFileNamesFromStream: (ZnClient new 
>               beOneShot; 
>               get: 'http://mc.stfx.eu/ZincHTTPComponents') readStream ] 
> timeToRun   227
> 
> (this is for half that many versions, admittedly cached in RAM on the 
> mc.stfx.eu server)

well, 500ms is the range I would expect (as for all GemStone requests)
- your repo is much faster in that sense
- so will SmalltalkHub be (from what I saw)

I filed a bug report with GemStone to see if anything can be done.

>> The culprit can be found here
>> 
>> MCFileBasedRepository >> includesVersionNamed: aString
>>      ^ self allVersionNames includes: aString
>> 
>> assuming that the PharoInbox consists of quite some versions (currently 947) 
>> this is quite some overhead.
> 
> It would be possible to skip the check altogether, and fail when there is a 
> conflict.
> In a Distributed VCS you can never be sure anyway, right ?
> For me committing is always slow due to my large package cache.

as previously discussed yes! the current uniqueness checks are quite flawed 
(closed world assumptions / race conditions)

>> I would suggest the following improvements:
>> 
>> - add a specific includesVersionNamed: server-side service
>> - add a simpler allVersionNames service that returns a newline separated 
>> list of filenames (not an html doc!)
> 
> Could be a solution I guess.
> But checking whether a version exists could be as simple as a HEAD request, 
> no ?

actually true, will try

> Then we would need almost no extra API, just a minor semantic change (with a 
> fallback to a normal real GET).

true
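
A rough sketch of what that HEAD check with a fallback could look like. 
This is untested and rests on assumptions: MCHeadCheckingHttpRepository is a 
hypothetical subclass of MCHttpRepository (it does not exist in Monticello), 
version files are assumed to live at <repository url>/<version name>.mcz, and 
the server is assumed to answer HEAD with 200 or 404:

```smalltalk
"Hypothetical subclass of MCHttpRepository, for illustration only."
MCHeadCheckingHttpRepository >> includesVersionNamed: aString
	| client |
	client := ZnClient new.
	client beOneShot.
	"Assumption: the version file URL is the repository location plus name plus .mcz"
	client url: self locationWithTrailingSlash , aString , '.mcz'.
	"HEAD transfers only the status line and headers, not the file or the listing"
	client head.
	client response isSuccess ifTrue: [ ^ true ].
	client response code = 404 ifTrue: [ ^ false ].
	"Any other answer (e.g. HEAD not supported): fall back to the full listing GET"
	^ super includesVersionNamed: aString
```

Credentials and network errors are left out here; a real version would have 
to pass the repository user/password along and decide what a failed 
connection means for the check.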

> I don't know if the other format would really make much difference, but I 
> have to admit it would be more logical.

all in all the parsing part doesn't use that much time, but still too much work 
done for what it is :). 

>> By having specific MC repository implementations everything would be 100% 
>> backwards compatible since no services are removed nor is the default http 
>> repository implementation changed.
> 
> Probably doable.
> 
>> what do you think?
> 
> BTW, do all repositories (actual, cache, others) have to be checked all the 
> time ? 
> Can't we at least turn that into an option ?

if you simply do a HEAD / single request it doesn't really matter anymore since 
it's going to be as fast as one single request. I currently do all checks in 
parallel so it's only a matter of the slowest one :)

