Re: Re: full kernel history, in patchset format

2005-04-16 Thread Christopher Li
On Sat, Apr 16, 2005 at 07:43:27PM +0200, Petr Baudis wrote:
> Dear diary, on Sat, Apr 16, 2005 at 07:04:31PM CEST, I got a letter
> where Linus Torvalds [EMAIL PROTECTED] told me that...
> > So I'd _almost_ suggest just starting from a clean slate after all.
> > Keeping the old history around, of course, but not necessarily putting it
> > into git now. It would just force everybody who is getting used to git in
> > the first place to work with a 3GB archive from day one, rather than
> > getting into it a bit more gradually.
> >
> > Comments?
>
> FWIW, it looks pretty reasonable to me. Perhaps we should have a
> separate GIT repository with the previous history though, and in the
> first new commit the parent could point to the last commit from the
> other repository.
>
> Just if it isn't too much work, though. :-)

I think we can make git use stackable repositories. When it fails
to find an object, it will try to read it from the parent repository.
That is useful for slicing the history.

I can have a local repository where all the new objects I create are
stored in my tree instead of the official one. Cleaning up the objects in
my local tree will be much easier because it only needs to work on a much
smaller repository. Once all my changes are merged into the official tree,
I simply empty my local repository.
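
A minimal sketch of the stacked lookup described above, in Python. The
class name StackedRepo, the one-file-per-object layout, and the parent
attribute are all assumptions made for illustration; none of this is
existing git code.

```python
import os
import tempfile


class StackedRepo(object):
    """Hypothetical object store that falls back to a parent store on a miss."""

    def __init__(self, path, parent=None):
        self.path = path
        self.parent = parent  # another StackedRepo, or None for the bottom layer
        if not os.path.isdir(path):
            os.makedirs(path)

    def write(self, sha, data):
        # New objects always land in this layer, never in the parent.
        with open(os.path.join(self.path, sha), "wb") as f:
            f.write(data)

    def read(self, sha):
        # Try the local layer first, then walk down the parent chain.
        repo = self
        while repo is not None:
            candidate = os.path.join(repo.path, sha)
            if os.path.exists(candidate):
                with open(candidate, "rb") as f:
                    return f.read()
            repo = repo.parent
        raise KeyError(sha)

    def empty(self):
        # Once everything is merged upstream, drop the local objects only.
        for name in os.listdir(self.path):
            os.remove(os.path.join(self.path, name))


# Made-up object ids, stored in throwaway temporary directories.
official = StackedRepo(tempfile.mkdtemp())
local = StackedRepo(tempfile.mkdtemp(), parent=official)
official.write("aaaa", b"old history")
local.write("bbbb", b"my new work")
data = local.read("aaaa")  # not stored locally, found through the parent
local.empty()              # the official store is untouched
```

Writes only ever touch the top layer, which is what makes emptying the
local repository safe once the changes are in the official tree.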

About the kernel git repository: I think it is much easier to just put
everything in one tree, so I don't have to remember to do something special
whenever I need to see pre-2.6.12 history. And the full repository needs to
be stored on a server somewhere anyway.

However, I totally agree that people should not have to deal with
unnecessary history when they start using the git tools. We should just make
the tools not download all the history by default, and only get it when the
user specifically asks for it.

Why 2.6.12-rc2? When the kernel grows to 2.6.15, a new user might not even need
pre-2.6.13 history most of the time. If we make it very easy for people to get
history when they need it, they will be less motivated to store unnecessary
history locally ("just in case I need it").

I think we should not advise using rsync to sync the whole git tree as
the way to get updates. We need to get used to having only a slice of the
history and getting more when we need it.
The server should provide some small metadata file, like the
rev-tool cache, so the SCM tools can download it to figure out which files
need to be downloaded to get to a certain revision, instead of downloading the
whole repository to figure out what is new.

We can even slice that metadata into smaller pieces based on major
release points.
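
To make the metadata idea concrete, here is a rough Python sketch. The
index format (commit -> parent plus the objects that commit introduced),
the function name objects_to_fetch, and all ids are invented for
illustration, and it assumes a linear history with a single parent per
commit, which real merges would break.

```python
def objects_to_fetch(index, want, have):
    """Return the object ids needed to reach `want`, given commits in `have`.

    `index` is a hypothetical metadata table: commit -> (parent, [objects]).
    """
    needed = []
    commit = want
    while commit is not None and commit not in have:
        if commit not in index:
            # Older than this slice of metadata; another slice would be
            # downloaded only if the user really wants to go further back.
            break
        parent, objects = index[commit]
        needed.extend(objects)
        commit = parent
    return needed


# Toy metadata; all ids are made up.
index = {
    "c1": (None, ["o1", "o2"]),
    "c2": ("c1", ["o3"]),
    "c3": ("c2", ["o4", "o5"]),
}

# A client that already has c1 only needs what c2 and c3 introduced.
missing = objects_to_fetch(index, "c3", have={"c1"})
```

The client downloads the small index, computes the list, and then asks
the server for just those objects.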

Chris
 
-
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Merge with git-pasky II.

2005-04-15 Thread Christopher Li
On Fri, Apr 15, 2005 at 12:43:47AM -0700, Junio C Hamano wrote:
> >>>>> "CL" == Christopher Li [EMAIL PROTECTED] writes:
>
> CL> Is that SHA1 for tree or the file object?
>
> I am talking about a single file here.

Then do you emit an entry for its parent directory?

E.g. /foo/bar gets created and foo doesn't exist. You have
to create foo first, but you don't have mode information for
foo yet. If it gives the top-level tree, the SCM can check it
out by tree and hopefully get the directory modes right.
Well, if they care about those little details.
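
A small Python sketch of the problem, assuming the tool is handed only a
per-file entry. The helper write_file and the fallback DEFAULT_DIR_MODE
are hypothetical; the point is that the directory mode has to be guessed.

```python
import os
import tempfile

DEFAULT_DIR_MODE = 0o755  # assumption: a per-file entry carries no directory mode


def write_file(root, relpath, data, mode):
    """Create `relpath` under `root`, making missing parent directories first."""
    target = os.path.join(root, relpath)
    parent = os.path.dirname(target)
    if not os.path.isdir(parent):
        # Only the file's mode is known; the directories get a guessed mode.
        os.makedirs(parent, DEFAULT_DIR_MODE)
    with open(target, "wb") as f:
        f.write(data)
    os.chmod(target, mode)
    return target


# foo does not exist yet, so it is created with the guessed mode.
root = tempfile.mkdtemp()
path = write_file(root, os.path.join("foo", "bar"), b"hello\n", 0o644)
```

Checking out by tree avoids the guess, because the tree object records
an entry for foo itself.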

Chris
 


Re: Re: Merge with git-pasky II.

2005-04-14 Thread Christopher Li
Is this something you want to see? The error printing could probably be cleaned up.


Chris

--- /dev/null   2003-01-30 05:24:37.0 -0500
+++ merge.py   2005-04-14 16:34:39.0 -0400
@@ -0,0 +1,82 @@
+#!/usr/bin/env python
+
+import re
+import sys
+import os
+from pprint import pprint
+
+def get_tree(commit):
+    # Read the commit object and pull out the tree it points to.
+    data = os.popen("cat-file commit %s" % commit).read()
+    return re.findall(r"(?m)^tree (\w+)", data)[0]
+
+# Indexes into a parsed diff-tree entry (negative ones count from the end).
+PREFIX = 0
+PATH = -1
+SHA = -2
+ORIGSHA = -3
+
+def get_difftree(old, new):
+    # Split the diff-tree output on NUL and map each path to its parsed entry.
+    lines = os.popen("diff-tree %s %s" % (old, new)).read().split("\x00")
+    patterns = (r"(\*)(\d+)->(\d+)\s(\w+)\s(\w+)->(\w+)\s(.*)",
+                r"([+-])(\d+)\s(\w+)\s(\w+)\s(.*)")
+    res = {}
+    for l in lines:
+        if not l: continue
+        for p in patterns:
+            m = re.findall(p, l)
+            if m:
+                m = m[0]
+                res[m[-1]] = m
+                break
+        else:
+            raise ValueError("difftree: unknown line %r" % l)
+    return res
+
+def analyze(diff1, diff2):
+    diff1only = [ diff1[k] for k in diff1 if k not in diff2 ]
+    diff2only = [ diff2[k] for k in diff2 if k not in diff1 ]
+    both = [ (diff1[k],diff2[k]) for k in diff2 if k in diff1 ]
+
+    action(diff1only)
+    action(diff2only)
+    action_two(both)
+
+def action(diffs):
+    for act in diffs:
+        if act[PREFIX] == '*':
+            print("modify %s %s" % (act[PATH], act[SHA]))
+        elif act[PREFIX] == '-':
+            print("remove %s %s" % (act[PATH], act[SHA]))
+        elif act[PREFIX] == '+':
+            print("add %s %s" % (act[PATH], act[SHA]))
+        else:
+            raise ValueError("unknown action %r" % act[PREFIX])
+
+def action_two(diffs):
+    for act1, act2 in diffs:
+        if len(act1) == len(act2):  # same kind of entry
+            if act1[PREFIX] == act2[PREFIX]:
+                if act1[SHA] == act2[SHA] or act1[PREFIX] == '-':
+                    action([act1])  # both sides agree, apply it once
+                    continue
+                if act1[PREFIX] == '*':
+                    print("do_merge %s %s %s %s" % (act1[PATH], act1[ORIGSHA],
+                                                    act1[SHA], act2[SHA]))
+                    continue
+        print("unable to handle %s" % act1[PATH])
+        print("one side wants %s" % act1[PREFIX])
+        print("the other side wants %s" % act2[PREFIX])
+
+
+args = sys.argv[1:]
+if len(args) != 3:
+    print("Usage: merge.py common rev1 rev2")
+    sys.exit(1)
+trees = [get_tree(a) for a in args]
+print("checkout-tree %s" % trees[0])
+diff1 = get_difftree(trees[0], trees[1])
+diff2 = get_difftree(trees[0], trees[2])
+analyze(diff1, diff2)
+
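
For anyone who wants to see the bucketing without a repository at hand,
here is a standalone sketch of the split that analyze() performs. The
helper classify and the sample entries are made up; the tuples follow
the five-field shape matched by the second pattern in merge.py.

```python
def classify(diff1, diff2):
    """Split two path -> entry maps into one-sided and two-sided changes."""
    only1 = sorted(k for k in diff1 if k not in diff2)
    only2 = sorted(k for k in diff2 if k not in diff1)
    both = sorted(k for k in diff1 if k in diff2)
    return only1, only2, both


# Made-up entries: (prefix, mode, type, sha, path).
diff1 = {"a.c": ("+", "100644", "blob", "1111", "a.c"),
         "b.c": ("-", "100644", "blob", "2222", "b.c")}
diff2 = {"b.c": ("-", "100644", "blob", "2222", "b.c"),
         "c.c": ("+", "100644", "blob", "3333", "c.c")}

only1, only2, both = classify(diff1, diff2)
```

Paths touched on only one side can be applied directly; only the paths
in the last bucket need the comparison done in action_two().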