On 4/14/2010 22:49, Isaac Dupree wrote:
On 04/14/10 20:18, Max Battcher wrote:
All of which goes to show that Trac+darcs still isn't well optimized for
caching darcs queries or dealing gracefully with long running
command invocations... I still say the Trac reliance on CVS/SVN-style
revision numbers means that Trac is absolutely not well-adapted for
serving darcs repositories. It may be "revision 1782" to Trac, but 'show
contents --match "hash 2008..."' is "commute this file to how it would
appear if only the patches preceding or equal to this one with a
timestamp from two years ago were applied" to darcs. (Which ends up
being quite possibly not a "real" historic version at all,
Well, suppose you have a public darcs repository for a project. (Such as
GHC HEAD.) If you look at the history of the real world (as opposed to
darcs' conception of history), this repo contained a series of states
over time. What infrastructure would we need, to be able to look at this
series usefully/efficiently years later? (I am reckoning that this
concept of history is useful enough that it's worth creating whether or
not darcs itself can support it. Does anyone agree/disagree?)
I've put an odd amount of thought into that over the years, and I've
also wondered how important it might be in reality... Different
developers will probably disagree on which bits are important, and I
think some of those philosophical differences are precisely the same
reasons why git and darcs (for instance) can co-exist because developers
may continue to prefer one approach over the other...
First of all, darcs does have one concept of real world history that
already is critical in many areas to darcs performance: the TAG. If
there is an important point in a repository's history, it should be
named and tagged. I can see a distinct case for specifically making sure
that any/all operations --to-tag/--from-tag are as performant as
possible. I could also see a case for some sort of (possibly opt-in)
auto-caching system for tag states (pristines).
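As a rough illustration of what an opt-in tag cache might look like, here is a minimal shell sketch. It is not an existing darcs feature: the function name cache_tag_states and the one-directory-per-tag layout are hypothetical, and it assumes that `darcs show tags` prints one tag name per line and that `darcs get --tag` can check out a tagged state.

```shell
#!/bin/sh
# Sketch only: keep one checkout per tag so that tagged states can be
# read from disk instead of being rebuilt by darcs on every request.
# cache_tag_states and the cache layout are hypothetical, not part of darcs.
cache_tag_states() {
    repo=$1
    cache=$2
    mkdir -p "$cache" || return 1
    # Assumption: `darcs show tags` prints one tag name per line.
    (cd "$repo" && darcs show tags) | while IFS= read -r tag; do
        [ -n "$tag" ] || continue
        # Hypothetical layout: one directory per tag, '/' replaced by '_'.
        dest="$cache/$(printf '%s' "$tag" | tr '/' '_')"
        # Skip tags that already have a cached checkout.
        [ -d "$dest" ] && continue
        # --tag is matched as a regular expression, so tag names with
        # regex metacharacters would need escaping in a real tool.
        # stdin is redirected so darcs cannot swallow the tag list.
        darcs get --tag="$tag" "$repo" "$dest" </dev/null ||
            echo "could not cache tag: $tag" >&2
    done
}
```

Something like `cache_tag_states /path/to/repo /path/to/tag-cache` could then be run after each new tag. Full checkouts are the simplest thing that could work here; a real system would more likely cache only the pristine tree.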
Beyond that, darcs itself doesn't have any knowledge of "real world
history"... It doesn't track which patch was pulled/pushed in, only when
the patch was originally written (according to the clock of the system
on which the patch was written). This makes sense to darcs due to the
"fluidity" of patch movement (thanks to cherry-picking) and potential
complexity. (Should darcs try to record the integration history of a
patch across every branch/repository that patch has ever seen/will ever
see? How do you merge "conflicting" integration histories? Who controls
it? How do you keep it secure?) Darcs admittedly takes the easiest
possible approach, which is: don't worry about it.
Is that the correct approach? Maybe. Assuming valid timestamps all
around and adequate tagging, darcs' commutation-based conception of
history is a close enough approximation to real history to help a
patient human find what they are looking for. (Certainly not a close
enough solution to make "every version" available via direct HTTP GET
requests to darcs commands, but on the order of a file system search for
a human performing a query, for example.)
Assuming that you do critically need/want more historic version
information cached/saved... Here's something of the possibility spectrum:
* The "pig-in-a-blanket" repository: store a darcs repository inside a
git (or svn, or whatever) repository. It sounds silly, but it's not all
that different from using some of the "patch queue" tools that
git/hg/svn users already use... you're just using darcs as a more
powerful patch queue and git (or whatever) as the fastest, dumb "store
the state of lots of files at each moment in time that I designate" file
store that you can find and trust. (Slightly less crazy variations might
be to make direct use of a distributed block store like S3, HDFS, or
even a document database...)
* Context-generating pre/posthook: before/after history manipulating
commands (apply, pull, record, amend-record, ...) something like:
darcs changes --context > archive.`date +%s`.context
That's the basics you would need to keep track of actual, real-world
historical states. Although, you'd probably want to compress the context
files together for more long term storage, or find some more capable
storage engine. From the generated context files you should be able to
recreate all of the actual historical states. (Unfortunately it may not
be as performant or capable as it should be, because context files need
a bit of love...)
* In-repo branching: There's a long thread on the subject, but the
basics are that the hashed-storage backend could easily store more than
one inventory/pristine state in the same repository. Theoretically you
could build a third-party tool to handle multiple "root pointers" and
then "hold onto" root pointers for historic versions so that those files
don't get garbage collected. (This is sort of an inversion of the
"pig-in-a-blanket" idea: use darcs' own current data storage backend
(hashed-storage), but encourage it/tune it to store more than darcs
alone does.)
* Propose a useful interaction pattern for darcs optionally to track
such things itself and help it get implemented. Certainly, the toughest
path, but it may be possible for someone to come up with a good plan of
attack that darcs could implement directly.
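The context-generating hook from the second option above could be fleshed out along these lines. This is a minimal sketch under stated assumptions, not a tested recipe: the function name snapshot_context and the separate archive directory are hypothetical, and gzipping each file individually is a simplification of "compress the context files together".

```shell
#!/bin/sh
# Sketch only: save the repository's current context under a timestamped
# name so that each real-world state can be identified later.
# snapshot_context and the archive directory are hypothetical names.
snapshot_context() {
    repo=$1
    archive=$2
    [ -n "$repo" ] && [ -n "$archive" ] || return 1
    mkdir -p "$archive" || return 1
    # Seconds since the epoch (GNU/BSD date); note the '+' before %s.
    # Two snapshots within the same second would overwrite each other.
    out="$archive/archive.$(date +%s).context"
    # Write to a temporary name first so a failed run leaves no partial file.
    if (cd "$repo" && darcs changes --context) > "$out.tmp"; then
        mv "$out.tmp" "$out" && gzip -f "$out" && printf '%s\n' "$out.gz"
    else
        rm -f "$out.tmp"
        return 1
    fi
}
```

Calling `snapshot_context /path/to/repo /path/to/archive` from a posthook on apply, pull, record and friends would build up the series of states. A saved file, once uncompressed, should be usable with `darcs get --context` to recreate that state, subject to the caveat above about context files needing a bit of love.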
That's how you might go about doing it... I personally don't see a need
for it. I think there are more interesting tools that solve similar
problems that could be improved first: better/stronger interactive file
annotation/blame tools; better/stronger darcs trackdown; tools that
maybe we don't even have names for today. I think it does come down to a
matter of different lifestyles for different DVCS tools: darcs' "bag" of
patches != git's DAG of file states.
In most of my development workflow, when I care about historical states,
I care about 1) tag file states, and 2) individual patch deltas...
historical integration states in between the two are much less common
for me to seek out. Both (1) and (2) are easy enough to get from
darcs... But that's just my approach and I appreciate that other
developers will disagree on this.
Hopefully some of the above is useful,
--
--Max Battcher--
http://worldmaker.net
_______________________________________________
darcs-users mailing list
darcs-users@darcs.net
http://lists.osuosl.org/mailman/listinfo/darcs-users