Re: [darcs-users] GSoC: network optimisation vs cache vs library?

Max Battcher Thu, 15 Apr 2010 11:33:49 -0700

Lele Gaifax wrote:

On Wed, 14 Apr 2010 20:18:21 -0400
Max Battcher <m...@worldmaker.net> wrote:

On 4/14/2010 19:23, Zooko Wilcox-O'Hearn wrote:

Our project web site was just down for about an hour and a half a
couple of hours ago. The reason turned out to be that there were
about a dozen darcs processes running trying to answer queries like
this:

darcs query contents --quiet --match "hash
20080103234853-92b7f-966e01e6a40dbe94209229f459988e9dea37013a.gz"
"docs/running.html"

This is the query that the trac-darcs plugin issues when you hit
this web page:

http://tahoe-lafs.org/trac/tahoe-lafs/changeset/1782/docs/running.html

All of which goes to show that Trac+darcs still isn't well optimized
for caching darcs queries or dealing gracefully with long-running
command invocations...


As I'm working on forthcoming version 0.8, where I already gained a
good improvement in some poorly-written queries, I'm interested in
that. As a minimum, trac+darcs could avoid spawning multiple identical
processes that in any case all but one are going to be discarded.

Well, my big suggestion remains to use a processing queue (even if it's
just Python's built-in queue module to begin with) and to tune the number
of darcs subprocesses you spawn to best fit the server (you could even
try to check CPU/memory utilization before spawning a subprocess). Even
without AJAX, if a subprocess takes, say, >50ms to clear from the queue,
you can just present a message along the lines of "The cache is being
populated, please try again in a few minutes." (Of course, a little AJAX
to watch the progress of the operation would go a long way here.)
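
As a rough sketch of that queue idea (not what trac+darcs actually
does): a fixed pool of worker threads pulls darcs invocations from
Python's queue module, identical requests share a single run, and a
caller that doesn't get an answer within about 50ms receives None so
the page can show the "try again" message. The names DarcsQueue,
request, and the runner parameter are hypothetical, made up for
illustration.

```python
import queue
import subprocess
import threading


class DarcsQueue:
    """Hypothetical helper: run darcs commands on a fixed number of workers,
    executing each distinct command at most once at a time."""

    def __init__(self, workers=2, runner=None):
        # runner is injectable so the sketch can be tried without darcs installed.
        self._runner = runner or self._run_darcs
        self._jobs = queue.Queue()
        self._lock = threading.Lock()
        self._results = {}  # command tuple -> cached output
        self._pending = {}  # command tuple -> Event set when the run ends
        for _ in range(workers):
            threading.Thread(target=self._work, daemon=True).start()

    @staticmethod
    def _run_darcs(args):
        # Assumes a darcs binary on PATH; raises if darcs exits non-zero.
        return subprocess.run(
            ["darcs", *args], capture_output=True, check=True
        ).stdout

    def _work(self):
        while True:
            args = self._jobs.get()
            try:
                output = self._runner(args)
            except Exception:
                output = None  # failures are not cached; a later request retries
            with self._lock:
                if output is not None:
                    self._results[args] = output
                self._pending.pop(args).set()
            self._jobs.task_done()

    def request(self, args, wait=0.05):
        """Return the output if it is ready within `wait` seconds, else None."""
        args = tuple(args)
        with self._lock:
            if args in self._results:
                return self._results[args]
            done = self._pending.get(args)
            if done is None:
                # First request for this command: enqueue exactly one job.
                done = self._pending[args] = threading.Event()
                self._jobs.put(args)
        if done.wait(wait):
            with self._lock:
                return self._results.get(args)
        return None  # caller shows "the cache is being populated"
```

A request handler would call something like
request(["show", "contents", "--match", "hash patch-hash", path]) and
render the "please try again" page whenever it gets None back. The
worker count is the knob to tune per server; checking CPU/memory
utilization before spawning is left out of this sketch.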

(Which ends up being quite possibly not a "real" historic
version at all, and which does quite a bit of work to be so easily
susceptible to crawlers/DDoS/accidental DDoS...)


I don't know what you mean by "real" historic version. In my
experience, the "trac+darcs" view always gave me the expected thing, I
mean, the one that corresponds both to my memory of the change and to
the practical sense of the changesets' neighbours.

The fact that darcs commutes so easily is a well known
pr^H^Hincredible feature we all more or less unconsciously love. We
discussed the matter, even David gave his opinion, wrt trac+darcs:
AFAICT, for a given single repository, not subject to "darcs optimize"
or other "back-history" reordering/change, the output of a "darcs
query content" or even "darcs diff" is the same.

AFAIK, also Alberto's darcsweb uses a similar approach to examine
historical contents.


So does darcsit.

Darcs' patch order is often a surprisingly good approximation for
repository history, but it will never be a 1:1 equivalent (unlike the
DAG-based DVCSes, which maintain much stricter histories). I'm
certainly not advocating that darcs become a DAG like git/hg. I'm just
questioning whether the different style of history deserves different
approaches to history. Darcs patch history is certainly not equivalent
to svn/git/hg revision history.

To my knowledge it's certainly possible for darcs to end up commuting a
new patch nearly anywhere in the repository's patch order, even on just
a pull/apply. Usually it's quite unlikely, but theoretically it is still
possible for the merger of long-divergent branches to result in unusual
repository orders that don't necessarily reflect the history of either
branch very well (other than obvious dependencies, of course)...

Even if Trac+Darcs revision numbers don't contort alongside patch order,
it's still possible for
``darcs show contents --match "hash patch-hash"``
to produce subtly different results after only a pull/apply due to
commutation. That is why I'm not certain that the state of a repository
at every given patch is always necessarily meaningful. If it's a close
enough approximation to reality that you see fit to use it, by all means
continue. I just can't help but wonder if there are more meaningful
choices to be made, that need less caching overall...


--
--Max Battcher--
http://worldmaker.net
_______________________________________________
darcs-users mailing list
darcs-users@darcs.net
http://lists.osuosl.org/mailman/listinfo/darcs-users
