I just tried a HEAD request on squeaksource (which works but is far 
from optimal...) against the massive Pharo repository (9000 files):

http://www.squeaksource.com/Pharo/Morphic-StephaneDucasse.918.mcz

list: ~6.5 sec
get:  ~6.5 sec
head: ~4.5 sec

Clearly faster! How about adding HEAD support as well?
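
For reference, a minimal sketch of such a HEAD-based existence check with Zinc. The variable names and the fallback to a normal GET on a 405/501 status are assumptions for illustration, not existing Monticello code:

```smalltalk
"Sketch only: check whether a version file exists via HEAD.
 The GET fallback on 405/501 is an assumption about servers without HEAD support."
| url client exists |
url := 'http://www.squeaksource.com/Pharo/Morphic-StephaneDucasse.918.mcz'.
client := ZnClient new beOneShot; url: url; yourself.
client head.
(#(405 501) includes: client response code)
	ifTrue: [
		"HEAD not supported: fall back to a normal GET"
		client := ZnClient new beOneShot; url: url; yourself.
		client get ].
exists := client isSuccess.
exists
```

Wrapping the whole thing in [ ... ] timeToRun gives numbers comparable to the ones above.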


best
cami

On 2012-04-23, at 18:32, Stéphane Ducasse wrote:

> Thanks dale.
> 
> Stef
> 
> On Apr 23, 2012, at 6:21 PM, Dale Henrichs wrote:
> 
>> Right now we are having disk io problems with the machine that SS3 is hosted 
>> on. 
>> 
>> In general, if the entire db fits in memory you get very fast 
>> response times. Once the db grows larger than memory, you definitely hit a 
>> big speed bump when you have to hit disk ... with that said, the response 
>> times on SS3 are not acceptable and we are taking measures to address the 
>> problem ...
>> 
>> The SS3 database is larger than the RAM we have allocated on the machine 
>> ... before adding RAM or tweaking anything else for speed, we are trying to 
>> address the disk io problem first, and while the engineers in our data center 
>> are monkeying with that problem, I am not touching anything else.
>> 
>> Dale
>> ----- Original Message -----
>> | From: "Camillo Bruni" <[email protected]>
>> | To: [email protected]
>> | Sent: Monday, April 23, 2012 8:06:10 AM
>> | Subject: Re: [Pharo-project] monticello repos speedup
>> | 
>> | 
>> | On 2012-04-23, at 16:56, Sven Van Caekenberghe wrote:
>> | 
>> | > Camillo,
>> | > 
>> | > On 23 Apr 2012, at 16:26, Camillo Bruni wrote:
>> | > 
>> | >> I figured out that a specific bottleneck of the current
>> | >> implementation of MC is the unique version name check.
>> | >> 
>> | >> On gemstone this takes 4 seconds! for the pharo inbox, definitely
>> | >> too long…
>> | > 
>> | > That seems like a problem with the SS3 and/or GemStone server. It
>> | > should be much faster:
>> | > 
>> | > [ MCHttpRepository new
>> | >  parseFileNamesFromStream: (ZnClient new
>> | >          beOneShot;
>> | >          get: 'http://mc.stfx.eu/ZincHTTPComponents') readStream ]
>> | >          timeToRun   227
>> | > 
>> | > (this is for half that many versions, admittedly cached in RAM on
>> | > the mc.stfx.eu server)
>> | 
>> | well 500ms would be the standard range I expected (as for all
>> | gemstone requests)
>> | - your repos is much faster in that sense
>> | - so will smalltalkhub (from what I saw)
>> | 
>> | I filed a bug report on gemstone, let's see if anything can be done.
>> | 
>> | >> The culprit can be found here
>> | >> 
>> | >> MCFileBasedRepository >> includesVersionNamed: aString
>> | >>         ^ self allVersionNames includes: aString
>> | >> 
>> | >> assuming that the PharoInbox consists of quite some versions
>> | >> (currently 947) this is quite some overhead.
>> | > 
>> | > It would be possible to skip the check altogether, and fail when
>> | > there is a conflict.
>> | > In a Distributed VCS you can never be sure anyway, right?
>> | > For me committing is always slow due to my large package cache.
>> | 
>> | as previously discussed yes! the current uniqueness checks are quite
>> | flawed (closed world assumptions / race conditions)
>> | 
>> | >> I would suggest the following improvements:
>> | >> 
>> | >> - add a specific includesVersionNamed: server-side service
>> | >> - add a simpler allVersionNames service that returns a newline
>> | >> separated list of filenames (not an html doc!)
>> | > 
>> | > Could be a solution I guess.
>> | > But checking whether a version exists could be as simple as a HEAD
>> | > request, no ?
>> | 
>> | actually true, will try
>> | 
>> | > Then we would need almost no extra API, just a minor semantic
>> | > change (with a fallback to a normal real GET).
>> | 
>> | true
>> | 
>> | > I don't know if the other format would really make much difference,
>> | > but I have to admit it would be more logical.
>> | 
>> | all in all the parsing part doesn't use that much time, but still too
>> | much work done for what it is :).
>> | 
>> | >> By having specific MC repository implementations everything would
>> | >> be 100% backwards compatible since no services are removed nor is
>> | >> the default http repository implementation changed.
>> | > 
>> | > Probably doable.
>> | > 
>> | >> what do you think?
>> | > 
>> | > BTW, do all repositories (actual, cache, others) have to be checked
>> | > all the time ?
>> | > Can't we at least turn that into an option ?
>> | 
>> | if you simply do a HEAD / single request it doesn't really matter
>> | anymore since it's going to be as fast as one single request. I
>> | currently do all checks in parallel so it's only a matter of the
>> | slowest one :)
>> | 
>> | 
>> | 
>> 
> 
> 

