Hi,

Just a short message to introduce myself and give a shameless plug. I'm Zed A. Shaw and I'm the author of a little-known SCM called FastCST (http://www.zedshaw.com/projects/fastcst). While I doubt that Linus would ever adopt FastCST as his tool (and I probably wouldn't want him to, since it's not quite ready for prime time), I did find many of the discussions on the list so far very interesting.

Someone sent me Linus' message about wanting to do a diff on the whole source tree, and I just thought I'd mention that I already tried this in FastCST. FastCST uses a suffix array to construct a delta (not a diff), so I thought it might be possible to simply apply the delta algorithm to the entire source tree and get very small changesets. It worked on small source trees, but when it came to the Linux 2.6 tree it choked hard. Even with an efficient suffix array implementation, you're talking about performing a diff/delta on 225M of source. Adding to the problem, you have to track file locations within the massive blob. In the end, it also wasn't much more efficient from a size/space/time perspective, so I dropped it.

My current solution to Linus' problem is to use an inverted index to process all the sources and revisions on the fly as they are created. Using the inverted index, I'm able to VERY quickly find any chunk of source in files or revisions. This lets me track things like how functions move through the files, where chunks of code moved to, etc. In the end this turns out to be much more efficient (7 seconds on my computer to find all references to "sprintf" in the Linux 2.6 source), as I can use the super small deltas for distributing changes and give developers a means of tracking content changes across "the world" in a simple search format.

Anyway, just thought I'd throw in my experiences attempting what Linus is talking about. I actually agree with him that rename tracking isn't that great, but I've come to the conclusion that tracking renames is actually a specific case of a general search problem. Different strokes for different folks, I guess.

Other than that, I'm mostly interested in reading the messages and probably won't write anything unless people ask me directly for something.

Thanks!

Zed A. Shaw
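
For anyone who hasn't met the term, here is a minimal sketch of the inverted-index idea mentioned above. It is not FastCST's code: the tokenizer, the `build_index` and `find` names, and the tiny two-revision sample are all made up for illustration. It maps each identifier-like token to every (revision, path, line) where it appears, so a function that moves between files shows up as an ordinary search result rather than needing separate rename tracking.

```python
import re
from collections import defaultdict

# Hypothetical tokenizer: treat C-style identifiers as the indexed "chunks".
TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def build_index(docs):
    """Map each token to the set of (revision, path, line number) where it occurs."""
    index = defaultdict(set)
    for (rev, path), text in docs.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            for token in TOKEN.findall(line):
                index[token].add((rev, path, lineno))
    return index

def find(index, token):
    """Return every recorded location of a token, in sorted order."""
    return sorted(index.get(token, ()))

# Made-up sample: helper() lives in util.c in r1 and in math.c in r2.
docs = {
    ("r1", "util.c"): "int helper(int x) {\n    return x + 1;\n}\n",
    ("r2", "math.c"): "/* moved here */\nint helper(int x) {\n    return x + 1;\n}\n",
}

index = build_index(docs)
print(find(index, "helper"))
```

Searching for `helper` lists its location in both revisions, which is enough to see that it moved. A real index over a tree the size of the kernel would need on-disk storage and a smarter notion of "chunk" than single identifiers; this only shows the lookup structure.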