I just checked doing a HEAD request on squeaksource (which works but is far from being optimal...) on the massive Pharo repos (9000 files)

http://www.squeaksource.com/Pharo/Morphic-StephaneDucasse.918.mcz

list: ~6.5 sec
get:  ~6.5 sec
head: ~4.5 sec

clearly faster! how about adding additional HEAD support?

best
cami

On 2012-04-23, at 18:32, Stéphane Ducasse wrote:

> Thanks dale.
>
> Stef
>
> On Apr 23, 2012, at 6:21 PM, Dale Henrichs wrote:
>
>> Right now we are having disk io problems with the machine that SS3 is hosted
>> on.
>>
>> In general if the entire db can fit in memory you will get very fast
>> response times once the db grows larger than memory you definitely hit a big
>> speed bump when you have to hit disk ... with that said, the response times
>> on SS3 are not acceptable and we are taking measures to address the problem
>> ...
>>
>> The ss3 data base is larger than the ram we have allocated on the machine
>> ... before adding RAM or tweaking other speedup issues we are endeavoring to
>> address the disk io problem first and while the engineers in our data center
>> are monkeying with that problem, I am not touching anything else.
>>
>> Dale
>>
>> ----- Original Message -----
>> | From: "Camillo Bruni" <[email protected]>
>> | To: [email protected]
>> | Sent: Monday, April 23, 2012 8:06:10 AM
>> | Subject: Re: [Pharo-project] monticello repos speedup
>> |
>> |
>> | On 2012-04-23, at 16:56, Sven Van Caekenberghe wrote:
>> |
>> | > Camillo,
>> | >
>> | > On 23 Apr 2012, at 16:26, Camillo Bruni wrote:
>> | >
>> | >> I figured a specific bottleneck of the current implementation of
>> | >> MC is the unique version name check.
>> | >>
>> | >> On gemstone this takes 4 seconds! for the pharo inbox, definitely
>> | >> too long…
>> | >
>> | > That seems like a problem with the SS3 and/or GemStone server.
>> | > It should be much faster:
>> | >
>> | > [ MCHttpRepository new
>> | >     parseFileNamesFromStream: (ZnClient new
>> | >         beOneShot;
>> | >         get: 'http://mc.stfx.eu/ZincHTTPComponents') readStream ]
>> | > timeToRun 227
>> | >
>> | > (this is for half that many versions, admittedly cached in RAM on the
>> | > mc.stfx.eu server)
>> |
>> | well 500ms would be the standard range I expected (as for all
>> | gemstone requests)
>> | - your repos is much faster in that sense
>> | - so will smalltalkhub (from what I saw)
>> |
>> | I filed a bug report on gemstone, see if there can be done anything.
>> |
>> | >> The culprit can be found here
>> | >>
>> | >> MCFileBasedRepository >> includesVersionNamed: aString
>> | >>     ^ self allVersionNames includes: aString
>> | >>
>> | >> assuming that the PharoInbox consists of quite some versions
>> | >> (currently 947) this is quite some overhead.
>> | >
>> | > It would be possible to skip the check altogether, and fail when
>> | > there is a conflict.
>> | > In a Distributed VCS you can never be sure anyway, right ?
>> | > For me committing is always slow due to my large package cache.
>> |
>> | as previously discussed yes! the current uniqueness checks are quite
>> | flawed (closed world assumptions / race conditions)
>> |
>> | >> I would suggest the following improvements:
>> | >>
>> | >> - add a specific includesVersionNamed: server-side service
>> | >> - add a simpler allVersionNames service that returns a newline
>> | >>   separated list of filenames (not an html doc!)
>> | >
>> | > Could be a solution I guess.
>> | > But checking whether a version exists could be as simple as a HEAD
>> | > request, no ?
>> |
>> | actually true, will try
>> |
>> | > Then we would need almost no extra API, just a minor semantic
>> | > change (with a fallback to a normal real GET).
>> |
>> | true
>> |
>> | > I don't know if the other format would really make much difference,
>> | > but I have to admit it would be more logical.
>> |
>> | all in all the parsing part doesn't use that much time, but still too
>> | much work done for what it is :).
>> |
>> | >> By having specific MC repository implementations everything would
>> | >> be 100% backwards compatible since no services are removed nor is
>> | >> the default http repository implementation changed.
>> | >
>> | > Probably doable.
>> | >
>> | >> what do you think?
>> | >
>> | > BTW, do all repositories (actual, cache, others) have to be checked
>> | > all the time ?
>> | > Can't we at least turn that into an option ?
>> |
>> | if you simply do a HEAD / single request it doesn't really matter
>> | anymore since it's going to be as fast as one single request. I
>> | currently do all checks in parallel so it's only a matter of the
>> | slowest one :)
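
For reference, the HEAD-based existence check discussed in the thread could look roughly like the workspace sketch below. This is only an illustration under assumptions: versionExists is a hypothetical helper block, not existing Monticello API, and the sketch assumes a server without HEAD support answers 405 (Method Not Allowed), in which case it falls back to a normal GET as Sven suggested.

```smalltalk
"Workspace sketch, not Monticello code. versionExists is a hypothetical helper."
| versionExists |
versionExists := [ :repositoryUrl :fileName |
	| client |
	client := ZnClient new.
	[ client url: repositoryUrl , '/' , fileName.
	  "HEAD transfers only the headers, so the .mcz body is never downloaded"
	  client head.
	  "Assumption: 405 means the server does not support HEAD, so retry with GET"
	  client response code = 405
		ifTrue: [ client get ].
	  client isSuccess ]
		ensure: [ client close ] ].

"Answers true if the server reports the file as present"
versionExists
	value: 'http://www.squeaksource.com/Pharo'
	value: 'Morphic-StephaneDucasse.918.mcz'
```

This keeps the client side to one request per repository instead of fetching and parsing the whole version listing, which is the point of the proposal; it does nothing about the race conditions mentioned earlier in the thread.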
