On Sat, Apr 22, 2006 at 04:33:56PM -0700, Jason Dagit wrote:
> On 4/22/06, Ian Lynagh <[EMAIL PROTECTED]> wrote:
> >
> > On Sat, Apr 22, 2006 at 10:15:07AM -0700, Jason Dagit wrote:
> > >
> > > I found some time tonight to start implementing a different format for
> > > hunks. Very simple but meant to allow skipping over data.
> > >
> > > hunk <filename> <line#> <#bytes of old content> <#bytes new content>
> > > <dump of old lines><dump of new lines>
> >
> > I don't know where in the list archives or bug trackers it is OTTOMH,
> > but what the new format should look like has been discussed before.
>
> Yes, I thought it had been as well. But for some reason finding those
> emails is not easy for me (I searched on gmane mostly). I couldn't
> find them after about an hour of searching so I figured I'd just go
> for it.

Putting "darcs different hunk format" in gmane's search form gives me
http://article.gmane.org/gmane.comp.version-control.darcs.devel/2205/
as the top hit. I don't know if that is the only, or even best, thread
on the subject, but it does answer some of the questions you ask and
some other details may have slipped my mind, so it's probably worth you
giving the thread a quick read (and possibly checking for more recent
threads, especially if it doesn't seem to come to a conclusion).

> > I think it looked something like
> >
> > hunk <file> <byte#> <line#> <#old bytes> <#old lines> <#new bytes> <#new
> > lines>
> > old:
> > <the old data>
> > new:
> > <the new data>
> >
> > (#lines is probably length . filter ('\n' ==))
>
> Why is it useful to store the byte# and #old lines/#new lines?

You need #old lines/#new lines to update line# when commuting.
You need byte# for more efficient patch application.

> > > 1) How many bytes do line endings add to the length of the old or new
> > > content? Is it okay to assume line endings are exactly one byte in
> > > patches? I know this will hold in unix-land, but what about win32?
> >
> > You should treat line endings compatibly with current darcs (which,
> > again OTTOMH, I think means exactly '\n').
>
> Based on Tommy's email that seems to be what everyone agrees on :) I
> was mostly asking because showHunk creates a Doc object and since you
> don't know at that time how many bytes will be used in the 'rendering'
> of the Doc you have to guess.

Hmm? If we're trying to get the 26 bytes the user entered into a patch,
it had better be 26 bytes when we write the patch or we're going to
corrupt their files. Or am I misunderstanding something?

> > > My new hunk reading code is
> > > essentially:
> > >
> > > lines (take n s)
> > >   where
> > >   n = length in bytes of either old or new
> > >   s = patch data as a Stringalike
> >
> > Why are you calling lines?
> > The point of the new format is that, for most
> > operations (all but pretty-printing?), it stays as a single string.
>
> Because I was afraid to change the Hunk data constructor which looks like:
>
> Hunk !Int [PackedString] [PackedString]
>
> So I figured that meant I had to break the lines myself before
> constructing the hunk object.
>
> I guess I could try not splitting it and see what happens.

You will need to change the internal representation too, yes.


Thanks
Ian

_______________________________________________
darcs-devel mailing list
[email protected]
http://www.abridgegame.org/cgi-bin/mailman/listinfo/darcs-devel
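
For reference, here is a rough sketch of how the header layout recalled
above could be written and read back without ever calling lines. It is
not darcs code and not the agreed format. The names HunkHeader, mkHunk,
renderHunk, parseHunk and lineDelta are made up for illustration. The
newline placed after each data block and the restriction to filenames
without spaces are assumptions of this sketch. Old and new data each
stay a single ByteString, and the reader skips over them by byte count.

```haskell
import Control.Monad (guard)
import qualified Data.ByteString.Char8 as B

-- Hypothetical record for the header fields discussed in the thread.
data HunkHeader = HunkHeader
  { hFile     :: FilePath
  , hByteOff  :: Int   -- byte# : offset of the hunk in the file
  , hLine     :: Int   -- line# : line number of the hunk
  , hOldBytes :: Int
  , hOldLines :: Int
  , hNewBytes :: Int
  , hNewLines :: Int
  } deriving (Eq, Show)

-- Old and new content are kept as single strings, not lists of lines.
data Hunk = Hunk HunkHeader B.ByteString B.ByteString
  deriving (Eq, Show)

-- #lines as suggested in the thread: the number of '\n' bytes.
countLines :: B.ByteString -> Int
countLines = B.count '\n'

-- Build a hunk, deriving the byte and line counts from the data.
mkHunk :: FilePath -> Int -> Int -> B.ByteString -> B.ByteString -> Hunk
mkHunk f off ln old new = Hunk hdr old new
  where
    hdr = HunkHeader f off ln (B.length old) (countLines old)
                              (B.length new) (countLines new)

-- How far this hunk shifts the line numbers of content below it.
lineDelta :: Hunk -> Int
lineDelta (Hunk h _ _) = hNewLines h - hOldLines h

-- Write the hunk. The data is emitted byte for byte; the "\n" after
-- each data block is a separator assumed by this sketch.
renderHunk :: Hunk -> B.ByteString
renderHunk (Hunk h old new) = B.concat
  [ B.unwords (B.pack "hunk" : B.pack (hFile h) : map (B.pack . show) nums)
  , B.pack "\nold:\n", old
  , B.pack "\nnew:\n", new
  , B.pack "\n" ]
  where
    nums = [ hByteOff h, hLine h, hOldBytes h, hOldLines h
           , hNewBytes h, hNewLines h ]

-- Read a non-negative decimal number that fills the whole token.
readNat :: B.ByteString -> Maybe Int
readNat s = case B.readInt s of
  Just (n, r) | B.null r, n >= 0 -> Just n
  _                              -> Nothing

-- Read one hunk and return it with the remaining input. The data
-- blocks are taken with splitAt on the byte counts from the header.
parseHunk :: B.ByteString -> Maybe (Hunk, B.ByteString)
parseHunk input =
  case B.words headerLine of
    [kw, file, b, l, ob, ol, nb, nl] | kw == B.pack "hunk" -> do
      [off, ln, oldB, oldL, newB, newL] <- mapM readNat [b, l, ob, ol, nb, nl]
      afterOld <- B.stripPrefix (B.pack "\nold:\n") rest0
      let (old, rest1) = B.splitAt oldB afterOld
      afterNew <- B.stripPrefix (B.pack "\nnew:\n") rest1
      let (new, rest2) = B.splitAt newB afterNew
      rest3 <- B.stripPrefix (B.pack "\n") rest2
      -- Reject input whose line counts disagree with the data.
      guard (countLines old == oldL && countLines new == newL)
      let hdr = HunkHeader (B.unpack file) off ln oldB oldL newB newL
      return (Hunk hdr old new, rest3)
    _ -> Nothing
  where
    (headerLine, rest0) = B.break (== '\n') input

main :: IO ()
main = do
  let h = mkHunk "foo.txt" 120 7 (B.pack "old line\n")
                                 (B.pack "new one\nnew two\n")
  B.putStr (renderHunk h)
  print (fmap fst (parseHunk (renderHunk h)) == Just h)
  print (lineDelta h)
```

Because the byte counts are written from the data itself, the number of
bytes written always matches the number of bytes in the hunk, and data
that does not end in '\n' still round-trips.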
