Re: GIT license (Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1)

2005-04-12 Thread Chris Friesen
Petr Baudis wrote:
> Dear diary, on Tue, Apr 12, 2005 at 11:50:48AM CEST, I got a letter
>
> > Well, yes, but the last merge point search may not be so simple:
> >
> > A --1---26---7
> > B\   `-4-.  /
> > C `-3-5'
> >
> > Now, when at 7, your last merge point is not 1, but 2.
>
> ...and this is obviously wrong, sorry. You would lose 3 this way.

Wouldn't the delta between 2 and 5 include any contribution from 3?

Chris
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Re: Re: Re: Re: GIT license (Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1)

2005-04-12 Thread Petr Baudis
Dear diary, on Tue, Apr 12, 2005 at 11:50:48AM CEST, I got a letter
where Petr Baudis <[EMAIL PROTECTED]> told me that...
> Dear diary, on Tue, Apr 12, 2005 at 10:39:40AM CEST, I got a letter
> where Geert Uytterhoeven <[EMAIL PROTECTED]> told me that...
> > On Tue, 12 Apr 2005, Petr Baudis wrote:
> ..snip..
> > > Basically, when you look at merge(1) :
> > > 
> > > SYNOPSIS
> > >    merge [ options ] file1 file2 file3
> > > DESCRIPTION
> > >    merge incorporates all changes that lead from file2 to file3
> > >    into file1.
> > > 
> > > The only big problem is how to guess the best file2 when you give it
> > > file3 and file1.
> > 
> > That's either the point just before you started modifying the file, or your
> > last merge point. Sounds simple, but if your SCM system doesn't track merges,
> > you're SOL...
> 
> Well, yes, but the last merge point search may not be so simple:
> 
> A --1---26---7
> B\   `-4-.  /
> C `-3-5'
> 
> Now, when at 7, your last merge point is not 1, but 2.

...and this is obviously wrong, sorry. You would lose 3 this way.

-- 
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
98% of the time I am right. Why worry about the other 3%.


Re: Re: Re: Re: GIT license (Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1)

2005-04-12 Thread Petr Baudis
Dear diary, on Tue, Apr 12, 2005 at 10:39:40AM CEST, I got a letter
where Geert Uytterhoeven <[EMAIL PROTECTED]> told me that...
> On Tue, 12 Apr 2005, Petr Baudis wrote:
..snip..
> > Basically, when you look at merge(1) :
> > 
> > SYNOPSIS
> >    merge [ options ] file1 file2 file3
> > DESCRIPTION
> >    merge incorporates all changes that lead from file2 to file3
> >    into file1.
> > 
> > The only big problem is how to guess the best file2 when you give it
> > file3 and file1.
> 
> That's either the point just before you started modifying the file, or your
> last merge point. Sounds simple, but if your SCM system doesn't track merges,
> you're SOL...

Well, yes, but the last merge point search may not be so simple:

A --1---26---7
B\   `-4-.  /
C `-3-5'

Now, when at 7, your last merge point is not 1, but 2.
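As a rough illustration of the "last merge point" search, here is a minimal Python sketch. The parent map is only one reading of the diagram above (its exact layout did not survive archiving), and the names PARENTS, ancestors and merge_bases are made up for this example; it is not Linus' revtree code. It treats the last merge point of two heads as a common ancestor that is not itself an ancestor of another common ancestor.

```python
# Hypothetical parent map: one reading of the diagram, not taken from any real repository.
# 3 branches off 1, 4 branches off 2, 5 merges 3 and 4, 7 merges 6 and 5.
PARENTS = {
    "1": [],
    "2": ["1"],
    "3": ["1"],
    "4": ["2"],
    "5": ["3", "4"],
    "6": ["2"],
    "7": ["6", "5"],
}


def ancestors(commit, parents):
    """Return the set of commits reachable from `commit`, including itself."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def merge_bases(a, b, parents):
    """Return the common ancestors of a and b that no other common ancestor descends from."""
    common = ancestors(a, parents) & ancestors(b, parents)
    return {
        c
        for c in common
        if not any(c != d and c in ancestors(d, parents) for d in common)
    }


# The two heads being joined at 7 are 6 and 5.
print(merge_bases("6", "5", PARENTS))
```

Under this reading the search returns 2 rather than 1. Commit 3 is reachable from 5 but not from 2, so a delta taken from 2 to 5 would still carry whatever 3 changed.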

What I proposed on the git mailing list was simple merge tracking: a
merges/branch1/branch2 directory structure that stores merges from
branch2 into branch1. Then, when merging into, say, branch3, you
traverse all of them, and if any of the branch1/* commits is newer than
branch3/*, you update it.
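A minimal sketch of how such bookkeeping could behave, in Python. Everything here is an assumption made for illustration: an in-memory dictionary stands in for the merges/ directory tree, commits are identified by increasing sequence numbers so that "newer" is a plain comparison, and MergeLog, record and last_merge are hypothetical names, not part of git-pasky.

```python
from collections import defaultdict


class MergeLog:
    """In-memory stand-in for a merges/<target>/<source> directory tree."""

    def __init__(self):
        # merges[target][source] = newest source commit number merged into target
        self.merges = defaultdict(dict)

    def record(self, target, source, seq):
        """Note that `source`, up to commit number `seq`, was merged into `target`."""
        if seq > self.merges[target].get(source, -1):
            self.merges[target][source] = seq
        # Whatever `source` had already merged is now contained in `target` too,
        # so copy over any record that is newer than the one `target` holds.
        for other, other_seq in list(self.merges[source].items()):
            if other != target and other_seq > self.merges[target].get(other, -1):
                self.merges[target][other] = other_seq

    def last_merge(self, target, source):
        """Return the newest recorded merge of `source` into `target`, or None."""
        return self.merges[target].get(source)


log = MergeLog()
log.record("branch1", "branch2", 5)   # branch2 merged into branch1
log.record("branch3", "branch1", 9)   # branch1 merged into branch3
print(log.last_merge("branch3", "branch2"))
```

The second call also carries the branch2 record over to branch3, which is the traversal step described above. It only works if every merge goes through the tool that keeps the log.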

The disadvantage is that you now need to strictly use gitmerge.sh to do
the merges - Linus' revtree solution is nicer in the regard that it
works without any explicit bookkeeping, and tracks any merges properly
recorded with commit-file; it is more complex and more expensive,
though.

-- 
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
98% of the time I am right. Why worry about the other 3%.


Re: Re: Re: GIT license (Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1)

2005-04-12 Thread Geert Uytterhoeven
On Tue, 12 Apr 2005, Petr Baudis wrote:
> Dear diary, on Tue, Apr 12, 2005 at 03:20:18AM CEST, I got a letter
> where "Adam J. Richter" <[EMAIL PROTECTED]> told me that...
> > >Dear diary, on Mon, Apr 11, 2005 at 05:46:38PM CEST, I got a letter
> > >where "Adam J. Richter" <[EMAIL PROTECTED]> told me that...
> > >..snip..
> > >> Graydon Hoare.  (By the way, I would prefer that git just punt to
> > >> user level programs for diff and merge when all of the versions
> > >> involved are different or at least have a very thin interface
> > >> for extending the facility, because I would like to do some character
> > >> based merge stuff.)
> > >..snip..
> > 
> > >But this is what git already does. I agree it could do it even better,
> > >by checking environment variables for the appropriate tools (then you
> > >could use that to pass diff e.g. -p etc.).
> > 
> > This message from Linus seemed to imply that git was going to get
> > its own 3-way merge code:
> > 
> > | Then the bad news: the merge algorithm is going to suck. It's going to be
> > | just plain 3-way merge, the same RCS/CVS thing you've seen before. With no
> > | understanding of renames etc. I'll try to find the best parent to base the
> > | merge off of, although early testers may have to tell the piece of crud
> > | what the most recent common parent was.
> 
> Well, from what I can read it says "just plain 3-way merge, the same
> RCS/CVS thing you've seen before". :-)
> 
> Basically, when you look at merge(1) :
> 
> SYNOPSIS
>    merge [ options ] file1 file2 file3
> DESCRIPTION
>    merge incorporates all changes that lead from file2 to file3
>    into file1.
> 
> The only big problem is how to guess the best file2 when you give it
> file3 and file1.

That's either the point just before you started modifying the file, or your
last merge point. Sounds simple, but if your SCM system doesn't track merges,
you're SOL...

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [EMAIL PROTECTED]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: Re: Re: GIT license (Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1)

2005-04-11 Thread Petr Baudis
Dear diary, on Tue, Apr 12, 2005 at 03:20:18AM CEST, I got a letter
where "Adam J. Richter" <[EMAIL PROTECTED]> told me that...
> >Dear diary, on Mon, Apr 11, 2005 at 05:46:38PM CEST, I got a letter
> >where "Adam J. Richter" <[EMAIL PROTECTED]> told me that...
> >..snip..
> >> Graydon Hoare.  (By the way, I would prefer that git just punt to
> >> user level programs for diff and merge when all of the versions
> >> involved are different or at least have a very thin interface
> >> for extending the facility, because I would like to do some character
> >> based merge stuff.)
> >..snip..
> 
> >But this is what git already does. I agree it could do it even better,
> >by checking environment variables for the appropriate tools (then you
> >could use that to pass diff e.g. -p etc.).
> 
> This message from Linus seemed to imply that git was going to get
> its own 3-way merge code:
> 
> | Then the bad news: the merge algorithm is going to suck. It's going to be
> | just plain 3-way merge, the same RCS/CVS thing you've seen before. With no
> | understanding of renames etc. I'll try to find the best parent to base the
> | merge off of, although early testers may have to tell the piece of crud
> | what the most recent common parent was.

Well, from what I can read it says "just plain 3-way merge, the same
RCS/CVS thing you've seen before". :-)

Basically, when you look at merge(1) :

SYNOPSIS
   merge [ options ] file1 file2 file3
DESCRIPTION
   merge incorporates all changes that lead from file2 to file3
   into file1.

The only big problem is how to guess the best file2 when you give it
file3 and file1.
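
To make the roles of the three files concrete, here is a minimal line-based sketch in Python. It is not the RCS implementation: it relies on difflib from the standard library, it gives up on any overlapping or touching change instead of writing conflict markers, and the names merge3, changes and MergeConflict are invented for this example.

```python
import difflib


class MergeConflict(Exception):
    """Raised when both sides change the same (or adjacent) base lines differently."""


def changes(base, new):
    """Return the edits turning base into new as (start, end, replacement) over base."""
    matcher = difflib.SequenceMatcher(a=base, b=new, autojunk=False)
    return [
        (i1, i2, new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def merge3(file1, file2, file3):
    """Apply the changes leading from file2 to file3 onto file1 (lists of lines)."""
    # Both sets of edits are expressed against the common base, file2.
    hunks = changes(file2, file1) + changes(file2, file3)
    hunks.sort(key=lambda h: (h[0], h[1]))
    merged, pos, prev = [], 0, None
    for start, end, replacement in hunks:
        if prev is not None and start <= prev[1]:
            if (start, end, replacement) == prev:
                continue  # both sides made the identical change; keep one copy
            raise MergeConflict(f"overlapping changes near base line {start + 1}")
        merged.extend(file2[pos:start])  # unchanged base lines before this hunk
        merged.extend(replacement)
        pos, prev = end, (start, end, replacement)
    merged.extend(file2[pos:])  # unchanged tail of the base
    return merged


base = ["a", "b", "c", "d"]
mine = ["a", "B", "c", "d"]          # file1: local edit to the second line
theirs = ["a", "b", "c", "d", "e"]   # file3: a line appended upstream
print(merge3(mine, base, theirs))
```

The result depends entirely on which file2 is chosen: every difference between file2 and either side is treated as a deliberate change, so a base that is too old or too new produces spurious edits or conflicts.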

-- 
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
98% of the time I am right. Why worry about the other 3%.


Re: Re: GIT license (Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1)

2005-04-11 Thread Adam J. Richter
On Mon, 11 Apr 2005 20:45:38 +0200, Petr Baudis wrote:
>  Hello,

>  please do not trim the cc list so aggressively.

Sorry.  I read the list from a web site that does not show the
cc lists.  I'll try to cc more people from the relevant discussions
though.  On the other hand, I've dropped Linus from this message,
as it just points to something he previously said.

>Dear diary, on Mon, Apr 11, 2005 at 05:46:38PM CEST, I got a letter
>where "Adam J. Richter" <[EMAIL PROTECTED]> told me that...
>..snip..
>> Graydon Hoare.  (By the way, I would prefer that git just punt to
>> user level programs for diff and merge when all of the versions
>> involved are different or at least have a very thin interface
>> for extending the facility, because I would like to do some character
>> based merge stuff.)
>..snip..

>But this is what git already does. I agree it could do it even better,
>by checking environment variables for the appropriate tools (then you
>could use that to pass diff e.g. -p etc.).

This message from Linus seemed to imply that git was going to get
its own 3-way merge code:

| Then the bad news: the merge algorithm is going to suck. It's going to be
| just plain 3-way merge, the same RCS/CVS thing you've seen before. With no
| understanding of renames etc. I'll try to find the best parent to base the
| merge off of, although early testers may have to tell the piece of crud
| what the most recent common parent was.

( from http://marc.theaimsgroup.com/?l=linux-kernel&m=111320013100822&w=2 )


__ __
Adam J. Richter\ /
[EMAIL PROTECTED]  | g g d r a s i l


Re: Re: GIT license (Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1)

2005-04-11 Thread Petr Baudis
  Hello,

  please do not trim the cc list so aggressively.

Dear diary, on Mon, Apr 11, 2005 at 05:46:38PM CEST, I got a letter
where "Adam J. Richter" <[EMAIL PROTECTED]> told me that...
..snip..
> Graydon Hoare.  (By the way, I would prefer that git just punt to
> user level programs for diff and merge when all of the versions
> involved are different or at least have a very thin interface
> for extending the facility, because I would like to do some character
> based merge stuff.)
..snip..

But this is what git already does. I agree it could do it even better,
by checking environment variables for the appropriate tools (then you
could use that to pass diff e.g. -p etc.).
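
A minimal sketch of that idea in Python, assuming a variable named EXAMPLE_DIFF_CMD. The variable name, the "diff -u" default and both function names are hypothetical; nothing here is an interface git actually reads.

```python
import os
import shlex
import subprocess


def diff_command(env=os.environ):
    """Return the diff program and its options as an argument list.

    EXAMPLE_DIFF_CMD is a made-up variable name used only for this sketch.
    """
    return shlex.split(env.get("EXAMPLE_DIFF_CMD", "diff -u"))


def run_diff(old_path, new_path, env=os.environ):
    """Run the configured diff tool on two files and return its text output."""
    result = subprocess.run(
        diff_command(env) + [old_path, new_path],
        capture_output=True,
        text=True,
    )
    return result.stdout


# A user who wants function names in hunk headers could export
# EXAMPLE_DIFF_CMD="diff -u -p" before running the tool.
print(diff_command({"EXAMPLE_DIFF_CMD": "diff -u -p"}))
```

Splitting the value with shlex lets the same variable carry both the program and its options, which is what makes passing something like -p possible without touching the tool itself.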

-- 
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
98% of the time I am right. Why worry about the other 3%.


Re: GIT license (Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1)

2005-04-11 Thread Florian Weimer
* Petr Baudis:

>> Almost certainly, v3 will be incompatible with v2 because it adds
>> further restrictions.  This means that your proposal would result in
>> software which is not redistributable by third parties.
>
> Hmm, what would actually be the point in introducing further
> restrictions? Anyone who then wants to get around them will just
> distribute the software with the "any later version" provision under
> GPLv2, and GPLv3 will have no impact except for new software with "v3 or
> any later version" provision. What am I missing?

Software continues to evolve.  The copyright owners can relicense the
code base under v3, and use v3 for all subsequent changes to the
software.  The trouble with relicensing is that you have to contact
all copyright holders (or remove their code).  This tends to be
impossible in long-running projects without contractual agreements
between the developers.


Re: GIT license (Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1)

2005-04-11 Thread Adam J. Richter
On 2005-04-11, Linus Torvalds wrote:
>I'm inclined to go with GPLv2 just because it's the most common one [...]

You may want to use a file from GPL'ed monotone that
implements a substantial diff optimization described in the August
1989 paper by Sun Wu, Udi Manber and Gene Myers ("An O(NP) Sequence
Comparison Algorithm").  According to the file, that implementation
was a port of some Scheme code written by Aubrey Jaffer to C++ by
Graydon Hoare.  (By the way, I would prefer that git just punt to
user level programs for diff and merge when all of the versions
involved are different or at least have a very thin interface
for extending the facility, because I would like to do some character
based merge stuff.)

It looks to me like the anti-patent provisions of OSLv2.1
could be circumvented by an offender creating a separate company
to do patent litigation.  So, I think you'll find that the software
reuse benefits (both to GIT and to other software projects) of the
more widely used GPL outweigh the anti-patent benefits of OSLv2.1.

Although I like the idea of anti-patent provisions, such
as those in OSLv2.1, I think mutual compatibility of free
software is probably more consequential, even from a purely
political perspective.

Perhaps you might want to consider offering the code
under the distributor's choice of either license if you want
to offer the very minor benefits of slightly easier compliance
to those who do not litigate software patents, or, perhaps more
importantly, the ability of the software to be copied into
OSLv2.1 projects (if there are any).

__ __ 
Adam J. Richter\ /
[EMAIL PROTECTED]  | g g d r a s i l


Re: Re: GIT license (Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1)

2005-04-11 Thread Petr Baudis
Dear diary, on Mon, Apr 11, 2005 at 10:40:00AM CEST, I got a letter
where Florian Weimer <[EMAIL PROTECTED]> told me that...
> * Ingo Molnar:
> 
> > is there any fundamental problem with going with v2 right now, and then 
> > once v3 is out and assuming it looks ok, all newly copyrightable bits 
> > (new files, rewrites, substantial contributions, etc.) get a v3 
> > copyright? (and the collection itself would be v3 too) That method 
> > wouldn't make it fully v3 automatically once v3 is out, but with time
> > there would be enough v3 bits in it to make it essentially v3.
> 
> Almost certainly, v3 will be incompatible with v2 because it adds
> further restrictions.  This means that your proposal would result in
> software which is not redistributable by third parties.

Hmm, what would actually be the point in introducing further
restrictions? Anyone who then wants to get around them will just
distribute the software with the "any later version" provision under
GPLv2, and GPLv3 will have no impact except for new software with "v3 or
any later version" provision. What am I missing?

I've been doing a lot of LKML catching up, and I remember someone
suggesting using GPLv2 (for kernel, but should apply to git too), with a
provision to let someone trusted (Linus) decide when GPLv3 is out
whether you can use GPLv3 for the kernel too. Does it make sense? And is
it even legally doable without sending signed written documents to
Linus' tropical hacienda?

-- 
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
98% of the time I am right. Why worry about the other 3%.


Re: GIT license (Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1)

2005-04-11 Thread Florian Weimer
* Ingo Molnar:

> is there any fundamental problem with going with v2 right now, and then 
> once v3 is out and assuming it looks ok, all newly copyrightable bits 
> (new files, rewrites, substantial contributions, etc.) get a v3 
> copyright? (and the collection itself would be v3 too) That method 
> wouldn't make it fully v3 automatically once v3 is out, but with time
> there would be enough v3 bits in it to make it essentially v3.

Almost certainly, v3 will be incompatible with v2 because it adds
further restrictions.  This means that your proposal would result in
software which is not redistributable by third parties.


Re: GIT license (Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1)

2005-04-11 Thread Ingo Molnar

* Linus Torvalds <[EMAIL PROTECTED]> wrote:

> Btw, does anybody have strong opinions on the license? I didn't put in 
> a COPYING file exactly because I was torn between GPLv2 and OSL2.1.
> 
> I'm inclined to go with GPLv2 just because it's the most common one, 
> but I was wondering if anybody really had strong opinions. For 
> example, I'd really make it "v2 by default" like the kernel, since I'm 
> sure v3 will be fine, but regardless of how sure I am, I'm _not_ a 
> gambling man.

is there any fundamental problem with going with v2 right now, and then 
once v3 is out and assuming it looks ok, all newly copyrightable bits 
(new files, rewrites, substantial contributions, etc.) get a v3 
copyright? (and the collection itself would be v3 too) That method 
wouldn't make it fully v3 automatically once v3 is out, but with time
there would be enough v3 bits in it to make it essentially v3. This way
we wouldn't have to blanket trust v3 before having seen it, and wouldn't
be stuck at v2 either.

Ingo


Re: GIT license (Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1)

2005-04-11 Thread Florian Weimer
* Petr Baudis:

>> Almost certainly, v3 will be incompatible with v2 because it adds
>> further restrictions.  This means that your proposal would result in
>> software which is not redistributable by third parties.
>
> Hmm, what would be actually the point in introducing further
> restrictions? Anyone who then wants to get around them will just
> distribute the software with the "any later version" provision under
> GPLv2, and GPLv3 will have no impact except for new software with "v3 or
> any later version" provision. What am I missing?

Software continues to evolve.  The copyright owners can relicense the
code base under v3, and use v3 for all subsequent changes to the
software.  The trouble with relicensing is that you have to contact
all copyright holders (or remove their code).  This tends to be
impossible in long-running projects without contractual agreements
between the developers.


Re: Re: GIT license (Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1)

2005-04-11 Thread Petr Baudis
  Hello,

  please do not trim the cc list so aggressively.

Dear diary, on Mon, Apr 11, 2005 at 05:46:38PM CEST, I got a letter
where Adam J. Richter <[EMAIL PROTECTED]> told me that...
..snip..
> Graydon Hoare.  (By the way, I would prefer that git just punt to
> user level programs for diff and merge when all of the versions
> involved are different or at least have a very thin interface
> for extending the facility, because I would like to do some character
> based merge stuff.)
..snip..

But this is what git already does. I agree it could do it even better,
by checking environment variables for the appropriate tools (then you
could use that to pass diff e.g. -p etc.).

-- 
Petr Pasky Baudis
Stuff: http://pasky.or.cz/
98% of the time I am right. Why worry about the other 3%.


Re: Re: GIT license (Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1)

2005-04-11 Thread Adam J. Richter
On Mon, 11 Apr 2005 20:45:38 +0200, Petr Baudis wrote:
>  Hello,
>
>  please do not trim the cc list so aggressively.

Sorry.  I read the list from a web site that does not show the
cc lists.  I'll try to cc more people from the relevant discussions
though.  On the other hand, I've dropped Linus from this message,
as it just points to something he previously said.

> Dear diary, on Mon, Apr 11, 2005 at 05:46:38PM CEST, I got a letter
> where Adam J. Richter <[EMAIL PROTECTED]> told me that...
> ..snip..
> > Graydon Hoare.  (By the way, I would prefer that git just punt to
> > user level programs for diff and merge when all of the versions
> > involved are different or at least have a very thin interface
> > for extending the facility, because I would like to do some character
> > based merge stuff.)
> ..snip..
>
> But this is what git already does. I agree it could do it even better,
> by checking environment variables for the appropriate tools (then you
> could use that to pass diff e.g. -p etc.).

This message from Linus seemed to imply that git was going to get
its own 3-way merge code:

| Then the bad news: the merge algorithm is going to suck. It's going to be
| just plain 3-way merge, the same RCS/CVS thing you've seen before. With no
| understanding of renames etc. I'll try to find the best parent to base the
| merge off of, although early testers may have to tell the piece of crud
| what the most recent common parent was.

( from http://marc.theaimsgroup.com/?l=linux-kernel&m=111320013100822&w=2 )


__ __
Adam J. Richter\ /
[EMAIL PROTECTED]  | g g d r a s i l


Re: Re: Re: GIT license (Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1)

2005-04-11 Thread Petr Baudis
Dear diary, on Tue, Apr 12, 2005 at 03:20:18AM CEST, I got a letter
where Adam J. Richter <[EMAIL PROTECTED]> told me that...
> > Dear diary, on Mon, Apr 11, 2005 at 05:46:38PM CEST, I got a letter
> > where Adam J. Richter <[EMAIL PROTECTED]> told me that...
> > ..snip..
> > > Graydon Hoare.  (By the way, I would prefer that git just punt to
> > > user level programs for diff and merge when all of the versions
> > > involved are different or at least have a very thin interface
> > > for extending the facility, because I would like to do some character
> > > based merge stuff.)
> > ..snip..
> >
> > But this is what git already does. I agree it could do it even better,
> > by checking environment variables for the appropriate tools (then you
> > could use that to pass diff e.g. -p etc.).
>
> This message from Linus seemed to imply that git was going to get
> its own 3-way merge code:
>
> | Then the bad news: the merge algorithm is going to suck. It's going to be
> | just plain 3-way merge, the same RCS/CVS thing you've seen before. With no
> | understanding of renames etc. I'll try to find the best parent to base the
> | merge off of, although early testers may have to tell the piece of crud
> | what the most recent common parent was.

Well, from what I can read it says "just plain 3-way merge, the same
RCS/CVS thing you've seen before". :-)

Basically, when you look at merge(1) :

SYNOPSIS
   merge [ options ] file1 file2 file3
DESCRIPTION
   merge  incorporates  all  changes that lead from file2 to file3
into file1.

The only big problem is how to guess the best file2 when you give it
file3 and file1.
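
One way to picture the "best file2" search is as a walk over the commit graph: take every commit reachable from both sides and keep only those that are not themselves ancestors of another common commit. The following is a minimal sketch under stated assumptions, not git's actual code; the `parents` mapping, the commit names and both function names are hypothetical.

```python
from collections import deque

def ancestors(parents, start):
    """Return every commit reachable from start, including start itself."""
    seen = {start}
    queue = deque([start])
    while queue:
        commit = queue.popleft()
        for parent in parents.get(commit, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen

def best_common_ancestors(parents, ours, theirs):
    """Common ancestors that are not ancestors of another common ancestor."""
    common = ancestors(parents, ours) & ancestors(parents, theirs)
    redundant = set()
    for commit in common:
        redundant |= ancestors(parents, commit) - {commit}
    return common - redundant

# Hypothetical history: both lines fork at "base"; "theirs2" is a merge
# commit that recorded "ours1" as its second parent.
history = {
    "base": [],
    "ours1": ["base"],
    "ours2": ["ours1"],
    "theirs1": ["base"],
    "theirs2": ["theirs1", "ours1"],
}
merge_bases = best_common_ancestors(history, "ours2", "theirs2")
```

With the merge recorded, the search lands on "ours1"; if the second parent of "theirs2" is dropped from the mapping, it falls back to "base", which is the situation where the same changes would be merged again.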

-- 
Petr Pasky Baudis
Stuff: http://pasky.or.cz/
98% of the time I am right. Why worry about the other 3%.


Re: GIT license (Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1)

2005-04-10 Thread Nur Hussein
> Btw, does anybody have strong opinions on the license? I didn't put in a 
> COPYING file exactly because I was torn between GPLv2 and OSL2.1.

I think GPLv2 would create the least amount of objection in the
community, so I'd probably want to go with that.

Nur Hussein


Re: Re: Re: [ANNOUNCE] git-pasky-0.1

2005-04-10 Thread Linus Torvalds


On Mon, 11 Apr 2005, Petr Baudis wrote:
>
> Dear diary, on Sun, Apr 10, 2005 at 10:38:11PM CEST, I got a letter
> where Linus Torvalds <[EMAIL PROTECTED]> told me that...
> ..snip..
> > Can you pull my current repo, which has "diff-tree -R" that does what the 
> > name suggests, and which should be faster than the 0.48 sec you see..
> 
> Am I just missing something, or your diff-tree doesn't handle
> added/removed directories?

You're not missing anything, I did it that way on purpose. I thought it 
would be easier to do the expansion in the caller (who knows what it is 
they want to do with the end result).

But now that I look at merging, I realize that was actually the wrong
thing to do. A merge algorithm definitely wants to see the expanded tree,
since it will compare/join several of the diff-tree output things. 

So I'll either fix it or decide to just go with your version instead. I'm 
not overly proud. 
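
To illustrate what "seeing the expanded tree" could mean, here is a rough sketch of a recursive tree comparison that descends into directories present on only one side, so the caller gets one entry per file. The nested-dict tree model and the name `diff_tree` are assumptions made for the example; this is not the diff-tree source or git's tree object format.

```python
def diff_tree(old, new, prefix=""):
    """Yield (status, path) for every file that differs between two trees.

    A tree is modelled as a dict mapping a name to either a blob id (str)
    or a nested dict (a subdirectory). Illustrative model only.
    """
    for name in sorted(set(old) | set(new)):
        path = prefix + name
        a, b = old.get(name), new.get(name)
        if a == b:
            continue  # unchanged entry, nothing to report
        a_dir, b_dir = isinstance(a, dict), isinstance(b, dict)
        if a_dir or b_dir:
            # A file replaced by a directory (or the reverse) is reported
            # as a removal or addition of that file.
            if a is not None and not a_dir:
                yield ("-", path)
            if b is not None and not b_dir:
                yield ("+", path)
            # Expand directories, including added or removed ones, so the
            # caller sees individual files.
            yield from diff_tree(a if a_dir else {}, b if b_dir else {}, path + "/")
        elif a is None:
            yield ("+", path)
        elif b is None:
            yield ("-", path)
        else:
            yield ("*", path)

old_tree = {"CREDITS": "1874", "doc": {"a.txt": "aa"}}
new_tree = {"CREDITS": "e818", "lib": {"x.c": "11", "sub": {"y.c": "22"}}}
changes = list(diff_tree(old_tree, new_tree))
```

Here the removed "doc" directory and the added "lib" directory both show up as per-file entries, which is the form a merge step can join against.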

Linus


Re: Re: Re: [ANNOUNCE] git-pasky-0.1

2005-04-10 Thread Petr Baudis
Dear diary, on Sun, Apr 10, 2005 at 10:38:11PM CEST, I got a letter
where Linus Torvalds <[EMAIL PROTECTED]> told me that...
..snip..
> Can you pull my current repo, which has "diff-tree -R" that does what the 
> name suggests, and which should be faster than the 0.48 sec you see..

Am I just missing something, or your diff-tree doesn't handle
added/removed directories?

(Mine does! *hint* *hint* It also doesn't bother with dynamic
allocation, but someone might consider the static path buffer ugly.
Anyway, I hacked it with a plan to do a massive cleanup of the file
later.)

-- 
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
98% of the time I am right. Why worry about the other 3%.


Re: GIT license (Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1)

2005-04-10 Thread Petr Baudis
Dear diary, on Mon, Apr 11, 2005 at 02:20:52AM CEST, I got a letter
where Linus Torvalds <[EMAIL PROTECTED]> told me that...
> Btw, does anybody have strong opinions on the license? I didn't put in a 
> COPYING file exactly because I was torn between GPLv2 and OSL2.1.
> 
> I'm inclined to go with GPLv2 just because it's the most common one, but I 
> was wondering if anybody really had strong opinions. For example, I'd 
> really make it "v2 by default" like the kernel, since I'm sure v3 will be 
> fine, but regardless of how sure I am, I'm _not_ a gambling man.

Oh, I wanted to ask about this too. I'd mostly prefer GPLv2 (I have no
problem with the version restriction, I usually do it too), it's the one
I'm mostly familiar with and OSL appears to be incompatible with GPL (at
least FSF says so about OSL1.0), which might create various annoying
issues. I hate when licenses get in my way and prevent me from
including some useful code.

-- 
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
98% of the time I am right. Why worry about the other 3%.


GIT license (Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1)

2005-04-10 Thread Linus Torvalds


Btw, does anybody have strong opinions on the license? I didn't put in a 
COPYING file exactly because I was torn between GPLv2 and OSL2.1.

I'm inclined to go with GPLv2 just because it's the most common one, but I 
was wondering if anybody really had strong opinions. For example, I'd 
really make it "v2 by default" like the kernel, since I'm sure v3 will be 
fine, but regardless of how sure I am, I'm _not_ a gambling man.

Linus


Re: [ANNOUNCE] git-pasky-0.1

2005-04-10 Thread Randy.Dunlap
On Sun, 10 Apr 2005 16:23:11 -0700 Paul Jackson wrote:

| Petr wrote:
| > That reminds me, is there any
| > tool which will take .rej files and throw them into the file to create
| > rcsmerge-like conflicts?
| 
|   Check out 'wiggle'
| http://www.cse.unsw.edu.au/~neilb/source/wiggle/

or Chris Mason's 'rej' program:
ftp://ftp.suse.com/pub/people/mason/rej/


---
~Randy


Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1

2005-04-10 Thread Petr Baudis
Dear diary, on Mon, Apr 11, 2005 at 01:46:50AM CEST, I got a letter
where Linus Torvalds <[EMAIL PROTECTED]> told me that...
> 
> 
> On Mon, 11 Apr 2005, Petr Baudis wrote:
> > 
> > (BTW, it would be useful to have a tool which just blindly takes what
> > you give it on input and throws it to an object of given type; I will
> > need to construct arbitrary commits during the rebuild if I'm to keep
> > the correct dates.)
> 
> Hah. That's what "COMMITTER_NAME" "COMMITTER_EMAIL" and "COMMITTER_DATE" 
> are there for.
> 
> There's two things to commits: when (and by whom) it was committed to a
> tree, and when the changes were really done.
> 
> So set the COMMITTER_xxx things to the person/time you want to consider 
> the _original_ one, and let "commit-tree" author you as the creator of the 
> commit itself. The regular "ChangeLog" thing should only show the author 
> and original time, but it's nice to see who created the commit itself.

I already use those - look at my ChangeLog. (That's because for certain
reasons I'm working on git in a half-broken chrooted environment.)

When rebuilding the tree from scratch, I would like to do it
transparently - that is, so that no one could notice that I rebuilt it,
since it effectively still _is_ the original tree from the data
standpoint, just the history flow is actually correct this time.

> Btw, the "COMMITTER_" environment variables are very confusingly
> named. They actually go into the _author_ line in the commit object. I'm a
> total retard, and I really don't know why I called it "COMMITTER_xxx"  
> instead of "AUTHOR_xxx".

So, who will fix it in his tree first! ;-)

-- 
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
98% of the time I am right. Why worry about the other 3%.


Re: Re: Re: [ANNOUNCE] git-pasky-0.1

2005-04-10 Thread Petr Baudis
Dear diary, on Sun, Apr 10, 2005 at 11:39:02PM CEST, I got a letter
where Linus Torvalds <[EMAIL PROTECTED]> told me that...
> On Sun, 10 Apr 2005, Linus Torvalds wrote:
> > 
> > Can you pull my current repo, which has "diff-tree -R" that does what the 
> > name suggests, and which should be faster than the 0.48 sec you see..
> 
> Actually, I changed things around. Everybody hated the "<" ">" lines, so I 
> put a changed thing on a line of its own with a "*" instead.
> 
> So you'd now see lines like
> 
>   *100644->100644 1874e031abf6631ea51cf6177b82a1e662f6183e->e8181df8499f165cacc6a0d8783be7143013d410 CREDITS
> 
> which means that the CREDITS file has changed, and it shows you the mode
> -> mode transition (that didn't change in this case) and the sha1 -> sha1
> transition.
> 
> So now it's always just one line per change. Furthermore, the filename is 
> always field 3, if you use spaces as delimiters, regardless of whether 
> it's a +/-/* field.

That's great, just when I finally managed to properly fix the xargs
boundary case in gitdiff-do (without throwing away the NUL-termination).
You know how to please people! ;-)

(Not that I'd have *anything* against the change. The logic is simpler
and you'll be actually able to work with diff-tree a little sanely.)

BTW, it is quite handy to have the entry type in the listing (guessing
that from mode in the script just doesn't feel right and doing explicit
cat-file kills the performance). I would also really prefer the fields
separated by tabs. It looks nicer on the screen (aligned, e.g. modes and
type are varsized), and is also easier to parse (cut defaults to tabs as
delimiters, for example).

> So let's say you want to merge two trees (dst1 and dst2) from a common
> parent (src), what you would do is:
> 
>  - get the list of files to merge:
> 
>   diff-tree -R <dst1> <dst2> | tr '\0' '\n' > merge-files

...oh, I probably forgot to ask - why did you choose -R instead of -r?
It looks rather alien to me; if it starts by 'diff', my hand writes -r
without thinking.

>  - Which of those were changed by <src> -> <dstX>?
> 
>   diff-tree -R <src> <dst1> | tr '\0' '\n' | join -j 3 - merge-files > dst1-change
>   diff-tree -R <src> <dst2> | tr '\0' '\n' | join -j 3 - merge-files > dst2-change
> 
>  - Which of those are common to both? Let's see what the merge list is:
> 
>   join dst1-change dst2-change > merge-list
> 
> and hopefully you'd usually be working on a very small list of files by 
> then (everything else you'd just pick from one of the destination trees 
> directly - you've got the name, the sha-file, everything: no need to even 
> look at the data).

Ok, this looks reasonable. (Provided that I DWYM regarding the joins.)

> Does this sound sane? Pasky? Wanna try a "git merge" thing? Starting off
> with the user having to tell what the common parent tree is - we can try
> to do the "automatically find best common parent" crud later. THAT may be 
> expensive.

I will definitely try "git merge", but maybe not this night anymore
(it's already 1:32 here now).

-- 
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
98% of the time I am right. Why worry about the other 3%.


Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1

2005-04-10 Thread Linus Torvalds


On Mon, 11 Apr 2005, Petr Baudis wrote:
> 
> (BTW, it would be useful to have a tool which just blindly takes what
> you give it on input and throws it to an object of given type; I will
> need to construct arbitrary commits during the rebuild if I'm to keep
> the correct dates.)

Hah. That's what "COMMITTER_NAME" "COMMITTER_EMAIL" and "COMMITTER_DATE" 
are there for.

There's two things to commits: when (and by whom) it was committed to a
tree, and when the changes were really done.

So set the COMMITTER_xxx things to the person/time you want to consider 
the _original_ one, and let "commit-tree" author you as the creator of the 
commit itself. The regular "ChangeLog" thing should only show the author 
and original time, but it's nice to see who created the commit itself.

I did this very much on purpose: see how I always try to attribute 
authorship in BK to the person who actually wrote the code. At the same 
time, I think it's interesting from a tracking standpoint to also see 
when/where that change got introduced into a tree.

I _tried_ to get this right in the sparse tree conversion. I won't 
guarantee that it's all correct, but the top commit in the sparse tree 
looks like this:

tree 67607f05a66e36b2f038c77cfb61350d2110f7e8
parent 9c59995fef9b52386e5f7242f44720a7aca287d7
author Christopher Li <[EMAIL PROTECTED]> Sat Apr  2 09:30:09 PST 2005
committer Linus Torvalds <[EMAIL PROTECTED]> Thu Apr  7 20:06:31 2005

...

exactly because I tracked when I committed it to the sparse tree 
_separately_ from tracking when it was created.

So when I re-create the sparse-tree, I'll also end up re-writing the 
"committer" information. And that's proper. That's really saying "this 
sha1 object was created by Xxxx at time Xxxx".
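
As a rough illustration of the author/committer split, the header shown above can be assembled like this. It is only a sketch: `format_commit` is a hypothetical helper, the blank line before the message is an assumption about the part elided by "...", and the addresses are placeholders.

```python
def format_commit(tree, parents, author, committer, message):
    """Assemble commit text: header lines, a blank line, then the message.

    `author` says who made the change and when; `committer` says who put
    it into this tree and when. A merge simply lists more than one parent.
    """
    lines = ["tree " + tree]
    lines += ["parent " + parent for parent in parents]
    lines.append("author " + author)
    lines.append("committer " + committer)
    return "\n".join(lines) + "\n\n" + message

commit_text = format_commit(
    tree="67607f05a66e36b2f038c77cfb61350d2110f7e8",
    parents=["9c59995fef9b52386e5f7242f44720a7aca287d7"],
    # Placeholder addresses; the archive has the real ones redacted.
    author="Christopher Li <author@example.org> Sat Apr  2 09:30:09 PST 2005",
    committer="Linus Torvalds <committer@example.org> Thu Apr  7 20:06:31 2005",
    message="example log message\n",
)
```

Re-creating a tree changes only the committer line; the author line keeps pointing at whoever wrote the change.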

Btw, the "COMMITTER_" environment variables are very confusingly
named. They actually go into the _author_ line in the commit object. I'm a
total retard, and I really don't know why I called it "COMMITTER_xxx"  
instead of "AUTHOR_xxx".

Linus "retard" Torvalds


Re: [ANNOUNCE] git-pasky-0.1

2005-04-10 Thread Paul Jackson
Petr wrote:
> That reminds me, is there any
> tool which will take .rej files and throw them into the file to create
> rcsmerge-like conflicts?

  Check out 'wiggle'
http://www.cse.unsw.edu.au/~neilb/source/wiggle/

-- 
  I won't rest till it's the best ...
  Programmer, Linux Scalability
  Paul Jackson <[EMAIL PROTECTED]> 1.650.933.1373, 1.925.600.0401


Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1

2005-04-10 Thread Petr Baudis
Dear diary, on Mon, Apr 11, 2005 at 01:10:58AM CEST, I got a letter
where Linus Torvalds <[EMAIL PROTECTED]> told me that...
> 
> 
> On Mon, 11 Apr 2005, Petr Baudis wrote:
> > 
> > I currently already do a merge when you track someone's source - it will
> > throw away your previous HEAD record though
> 
> Not only that, it doesn't do what I consider a "merge". 
> 
> A real merge should have two or more parents. The "commit-tree" command
> already allows that: just add any arbitrary number of "-p x"  
> switches (well, I think I limited it to 16 parents, but that's just a
> totally random number, there's nothing in the file format or anything 
> else that limits it).
> 
> So while you've merged my "data", but you've not actually merged my
> revision history in your tree.

Well, that's exactly what I was (am) going to do. :-) That's also why I
said that I (virtually) throw the local commits away now. Instead, if
there were any local commits, I will do git merge:

commit-tree $(write-tree) -p $local_head -p $tracked_tree

Note that I will need to make this two-phase - first applying the
changes, then doing the commit; between those two phases, the user
should resolve potential conflicts and check if the merge went right.

I think I will name the first phase git merge and the second phase will
be just git commit, and I will store the merge information in
.dircache/. (BTW, I think the directory name is pretty awful; what about
.git/ ?)

> And the reason a real merge _has_ to show both parents properly is that 
> unless you do that, you can never merge sanely another time without 
> getting lots of clashes from the previous merge. So it's important that a 
> merge really shows both trees it got data from.
> 
> This is, btw, also the reason I haven't merged with your tree - I want to 
> get to the point where I really _can_ merge without throwing away the 
> information. In fact, at this point I'd rather not merge with your tree at 
> all, because I consider your tree to be "corrupt" thanks to lacking the 
> merge history.
> 
> So you've done the data merge, but not the history merge.
> 
> And because you didn't do the history merge, there's no way to
> automatically find out what point of my tree you merged _with_. See?
> 
> And since I have no way to see what point in time you merged with me, now
> I can't generate a nice 3-way diff against the last common ancestor of
> both of our trees.
> 
> So now I can't do a three-way merge with you based on any sane ancestor,
> unless I start guessing which ancestor of mine you merged with. Now, that
> "guess" is easy enough to do with a project like "git" which currently has
> just a few tens of commits and effectively only two parallel development
> trees, but the whole point is to get to a system where that isn't true..

Well, I've wanted to get the basic things working first before doing git
merge. (Especially since until recently, diff-tree was PITA to work
with, and before that it didn't even exist.) If you want, I can rebuild
my tree with doing the merging properly, after I have git merge working.

(BTW, it would be useful to have a tool which just blindly takes what
you give it on input and throws it to an object of given type; I will
need to construct arbitrary commits during the rebuild if I'm to keep
the correct dates.)

-- 
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
98% of the time I am right. Why worry about the other 3%.


Re: Re: Re: [ANNOUNCE] git-pasky-0.1

2005-04-10 Thread Linus Torvalds


On Mon, 11 Apr 2005, Petr Baudis wrote:
> 
> I currently already do a merge when you track someone's source - it will
> throw away your previous HEAD record though

Not only that, it doesn't do what I consider a "merge". 

A real merge should have two or more parents. The "commit-tree" command
already allows that: just add any arbitrary number of "-p x"  
switches (well, I think I limited it to 16 parents, but that's just a
totally random number, there's nothing in the file format or anything 
else that limits it).

So while you've merged my "data", but you've not actually merged my
revision history in your tree.

And the reason a real merge _has_ to show both parents properly is that 
unless you do that, you can never merge sanely another time without 
getting lots of clashes from the previous merge. So it's important that a 
merge really shows both trees it got data from.

This is, btw, also the reason I haven't merged with your tree - I want to 
get to the point where I really _can_ merge without throwing away the 
information. In fact, at this point I'd rather not merge with your tree at 
all, because I consider your tree to be "corrupt" thanks to lacking the 
merge history.

So you've done the data merge, but not the history merge.

And because you didn't do the history merge, there's no way to
automatically find out what point of my tree you merged _with_. See?

And since I have no way to see what point in time you merged with me, now
I can't generate a nice 3-way diff against the last common ancestor of
both of our trees.

So now I can't do a three-way merge with you based on any sane ancestor,
unless I start guessing which ancestor of mine you merged with. Now, that
"guess" is easy enough to do with a project like "git" which currently has
just a few tens of commits and effectively only two parallel development
trees, but the whole point is to get to a system where that isn't true..

Linus


Re: Re: Re: [ANNOUNCE] git-pasky-0.1

2005-04-10 Thread Petr Baudis
Dear diary, on Sun, Apr 10, 2005 at 10:38:11PM CEST, I got a letter
where Linus Torvalds <[EMAIL PROTECTED]> told me that...
> On Sun, 10 Apr 2005, Petr Baudis wrote:
> > 
> > It turns out to be the forks for doing all the cuts and such what is
> > bogging it down so awfully (doing diff-tree takes 0.48s ;-). I do about
> > 15 forks per change, I guess, and for some reason cut takes a lot of
> > time on its own.
> 
> Heh.
> 
> Can you pull my current repo, which has "diff-tree -R" that does what the 
> name suggests, and which should be faster than the 0.48 sec you see..

Funnily enough, now after some more cache teasing it's ~0.185. Your one
still ~0.17, though. :/ (That might be because of the format changes,
though, since you do less printing now.) (BTW, all those measurements
are done on my AMD K6 walking on 1600MHz, 512M RAM, about 200M available
for caches.)

Just out of interest, did you have a look at my diff-tree -r
implementation and decided that you don't like it, or you weren't aware
of it?

I will probably take most of your diff-tree change, but I'd prefer to do
the sha1->tree mapping directly in diff_tree().

> It may not matter a lot, since actually generating the diff from the file 
> contents is what is expensive, but remember my goal: I want the expense of 
> a diff-tree to be relative to the size of the diff, so that implies that 
> small diffs have to be basically instantaneous. So I care.

Me too, of course.

> So I just tried the 2.6.7->2.6.8 diff, and for me the new recursive
> "diff-tree" can generate the _list_ of files changed in zero time:
> 
>   real    0m0.079s
>   user    0m0.067s
>   sys     0m0.024s
> 
> but then _doing_ the diff is pretty expensive (in this case 3800+ files
> changed, so you have to unpack 7600+ objects - and even unpacking isn't
> the expensive part, the expense is literally in the diff operation
> itself).
> 
> Me, the stuff I automate is the small steps. Doing a single checkin. So
> that's the case I care about going fast, when a "diff-tree" will likely
> have maybe five files or something. That's why I want the small
> incremental cases to go fast - if it takes me a minute to generate a diff
> for a _release_, that's not a big deal. I make one release every other
> month, but I work with lots of small patches all the time.

I see.

> Anyway, with a fast diff-tree, you should be able to generate the list of 
> objects for a fast "merge". That's next. 
> 
> (And by "merge", I of course mean "suck". I'm talking about the old CVS
> three-way merge, and you have to specify the common parent explicitly and
> it won't handle any renames or any other crud. But it would get us to 
> something that might actually be useful for simple things. Which is why 
> "diff-tree" is important - it gives the information about what to tell 
> merge).

I currently already do a merge when you track someone's source - it will
throw away your previous HEAD record though, so if you committed some
local changes after the previous pull, you will get orphaned commits and
the changes will turn to uncommitted ones. I have some ideas regarding
how to do it properly (and do any arbitrary merging, for that matter), I
hope to get to it as soon as I catch up with you. :-)

BTW, the three-way merge comes from RCS. That reminds me, is there any
tool which will take .rej files and throw them into the file to create
rcsmerge-like conflicts? Perhaps it's fault of my bad tools, but I
prefer to work with the inline rejects much more to .rej files (except
to actually notice the rejects).

-- 
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
98% of the time I am right. Why worry about the other 3%.


Re: Re: [ANNOUNCE] git-pasky-0.1

2005-04-10 Thread Linus Torvalds


On Sun, 10 Apr 2005, Linus Torvalds wrote:
> 
> Can you pull my current repo, which has "diff-tree -R" that does what the 
> name suggests, and which should be faster than the 0.48 sec you see..

Actually, I changed things around. Everybody hated the "<" ">" lines, so I 
put a changed thing on a line of its own with a "*" instead.

So you'd now see lines like

*100644->100644 1874e031abf6631ea51cf6177b82a1e662f6183e->e8181df8499f165cacc6a0d8783be7143013d410 CREDITS

which means that the CREDITS file has changed, and it shows you the mode
-> mode transition (that didn't change in this case) and the sha1 -> sha1
transition.

So now it's always just one line per change. Furthermore, the filename is 
always field 3, if you use spaces as delimiters, regardless of whether 
it's a +/-/* field.
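
A script consuming this output might split a "*" record roughly as follows. This is a sketch that handles only the format shown above; `parse_modified` and `record_name` are hypothetical names, the "+" and "-" records are not parsed beyond their filename because their exact layout is not shown here, and the record is assumed to arrive without its terminator.

```python
def parse_modified(record):
    """Parse one '*' record of the form '*MODE->MODE SHA1->SHA1 NAME'."""
    if not record.startswith("*"):
        raise ValueError("only '*' (changed) records are handled in this sketch")
    # Split on the first two spaces only, so a name containing spaces survives.
    modes, shas, name = record[1:].split(" ", 2)
    old_mode, new_mode = modes.split("->")
    old_sha, new_sha = shas.split("->")
    return {
        "name": name,
        "old_mode": old_mode,
        "new_mode": new_mode,
        "old_sha": old_sha,
        "new_sha": new_sha,
    }

def record_name(record):
    """Return field 3 (the filename) of any record, whatever its prefix."""
    return record.split(" ", 2)[2]

example = ("*100644->100644 "
           "1874e031abf6631ea51cf6177b82a1e662f6183e->"
           "e8181df8499f165cacc6a0d8783be7143013d410 CREDITS")
fields = parse_modified(example)
```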

So let's say you want to merge two trees (dst1 and dst2) from a common
parent (src), what you would do is:

 - get the list of files to merge:

    diff-tree -R <dst1> <dst2> | tr '\0' '\n' > merge-files

 - Which of those were changed by <src> -> <dstX>?

    diff-tree -R <src> <dst1> | tr '\0' '\n' | join -j 3 - merge-files > dst1-change
    diff-tree -R <src> <dst2> | tr '\0' '\n' | join -j 3 - merge-files > dst2-change

 - Which of those are common to both? Let's see what the merge list is:

join dst1-change dst2-change > merge-list

and hopefully you'd usually be working on a very small list of files by 
then (everything else you'd just pick from one of the destination trees 
directly - you've got the name, the sha-file, everything: no need to even 
look at the data).
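
The selection logic behind those joins can be sketched without the shell tools: compare the object ids per path and only queue a file for a real merge when both sides changed it differently. This is a minimal illustration under assumptions (flat path-to-id dicts, the hypothetical name `plan_merge`), not the proposed "git merge" itself.

```python
def plan_merge(src, dst1, dst2):
    """Split paths into those that can be picked directly and those to merge.

    Each argument maps a path to an object id; a missing path means the
    file does not exist in that tree.
    """
    picked, needs_merge = {}, []
    for path in sorted(set(src) | set(dst1) | set(dst2)):
        base, a, b = src.get(path), dst1.get(path), dst2.get(path)
        if a == b:
            result = a          # both sides agree, or neither touched it
        elif a == base:
            result = b          # only dst2 changed it
        elif b == base:
            result = a          # only dst1 changed it
        else:
            needs_merge.append(path)  # changed differently on both sides
            continue
        if result is not None:  # None means the file was deleted
            picked[path] = result
    return picked, needs_merge

src = {"Makefile": "m0", "README": "r0", "cache.h": "c0", "old.c": "o0"}
dst1 = {"Makefile": "m1", "README": "r0", "cache.h": "c1", "old.c": "o0"}
dst2 = {"Makefile": "m0", "README": "r2", "cache.h": "c2"}
picked, needs_merge = plan_merge(src, dst1, dst2)
```

Only "cache.h" ends up needing its contents looked at; everything else is decided from the ids alone, which is why the tree comparison has to be cheap.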

Does this sound sane? Pasky? Wanna try a "git merge" thing? Starting off
with the user having to tell what the common parent tree is - we can try
to do the "automatically find best common parent" crud later. THAT may be 
expensive.

(Btw, this is why I think "diff-tree" is more important than actually
generating the real diff itself - the above uses diff-tree three times
just to cut down to the point where _hopefully_ you don't actually need to
generate very much diffs at all. So I want "diff-tree" to be really fast, 
even if it then can take a minute to actually generate a big diff between 
releases etc).

Linus


Re: Re: Re: [ANNOUNCE] git-pasky-0.1

2005-04-10 Thread Petr Baudis
Dear diary, on Sun, Apr 10, 2005 at 09:13:19PM CEST, I got a letter
where Willy Tarreau <[EMAIL PROTECTED]> told me that...
> On Sun, Apr 10, 2005 at 08:45:22PM +0200, Petr Baudis wrote:
>  
> > It turns out to be the forks for doing all the cuts and such what is
> > bogging it down so awfully (doing diff-tree takes 0.48s ;-). I do about
> > 15 forks per change, I guess, and for some reason cut takes a lot of
> > time on its own.
> > 
> > I've rewritten the cuts with the use of bash arrays and other smart
> > stuff. I somehow don't feel comfortable using this and prefer the
> > old-fashioned ways, but it would be plain unusable without this.
> 
> I've encountered the same problem in a config-generation script a while
> ago. Fortunately, bash provides enough ways to remove most of the forks,
> but the result is less portable.
> 
> I've downloaded your code, but it does not compile here because of the
> tv_nsec fields in struct stat (2.4, glibc 2.2), so I cannot use it to
> get the most up to date version to take a look at the script. Basically,

Ok, I decided to stop this nsec madness (since it broke show-diff
anyway at least on my ext3), and you get it only if you pass -DNSEC
to CFLAGS now. Hope this fixes things for you. :-)

BTW, I regularly update the public copy as accessible on the web.

> all the 'cut' and 'sed' can be removed, as well as the 'dirname'. You
> can also call mkdir only if the dirs don't exist. I really think you
> should end up with only one fork in the loop to call 'diff'.

You still need to extract the file by cat-file too. ;-) And rm the files
after it compares them (so that we don't fill /tmp with crap like
certain awful programs like to do). But I will conditionalize the mkdir
calls, thanks for the suggestion - I think that's the last bit to be
squeezed from this loop (I'll still check on the read proposal - I
considered it before and turned it down for some reason, can't remember
why anymore, though).

Thanks,

-- 
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
98% of the time I am right. Why worry about the other 3%.


Re: [ANNOUNCE] git-pasky-0.1

2005-04-10 Thread Paul Jackson
Good lord - you don't need to use arrays for this.

The old-fashioned ways have their ways.  Both the 'set'
command and the 'read' command can split args and assign
to distinct variable names.

Try something like the following:

  diff-tree -r $id1 $id2 |
    sed -e '/^</ { N; s/\n>/ / }' -e 's/./& /' |
    while read op mode1 sha1 name1 mode2 sha2 name2
    do
        ... various common stuff ...
        case "$op" in
        "+")
            ...
            ;;
        "-")
            ...
            ;;
        "<")
            test "$name1" = "$name2" || die "mismatched names"
            label1=$(mkbanner "$loc1" $id1 "$name1" $mode1 $sha1)
            label2=$(mkbanner "$loc2" $id2 "$name1" $mode2 $sha2)
            diff -L "$label1" -L "$label2" -u "$loc1" "$loc2"
            ;;
        esac
    done
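
For the 'set' variant mentioned above, a minimal sketch follows. The record is a hypothetical line in the shape the loop reads (op, mode, sha1, name); note that this approach breaks on filenames containing whitespace, so treat it as an illustration rather than a drop-in replacement.

```shell
#!/bin/sh
# Split a whitespace-separated record with 'set', without forking cut or awk.
# The record below is a made-up example, not captured diff-tree output.
line='< 100644 1874e031abf6631ea51cf6177b82a1e662f6183e CREDITS'

set -f              # disable globbing so characters like '*' are not expanded
set -- $line        # unquoted on purpose: word splitting fills $1, $2, ...
set +f

op=$1 mode=$2 sha=$3 name=$4
echo "$op $mode $name"
```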

-- 
  I won't rest till it's the best ...
  Programmer, Linux Scalability
  Paul Jackson <[EMAIL PROTECTED]> 1.650.933.1373, 1.925.600.0401


Re: Re: [ANNOUNCE] git-pasky-0.1

2005-04-10 Thread Linus Torvalds


On Sun, 10 Apr 2005, Petr Baudis wrote:
> 
> It turns out to be the forks for doing all the cuts and such what is
> bogging it down so awfully (doing diff-tree takes 0.48s ;-). I do about
> 15 forks per change, I guess, and for some reason cut takes a lot of
> time on its own.

Heh.

Can you pull my current repo, which has "diff-tree -R" that does what the 
name suggests, and which should be faster than the 0.48 sec you see..

It may not matter a lot, since actually generating the diff from the file 
contents is what is expensive, but remember my goal: I want the expense of 
a diff-tree to be relative to the size of the diff, so that implies that 
small diffs have to be basically instantaneous. So I care.

So I just tried the 2.6.7->2.6.8 diff, and for me the new recursive
"diff-tree" can generate the _list_ of files changed in zero time:

        real    0m0.079s
        user    0m0.067s
        sys     0m0.024s

but then _doing_ the diff is pretty expensive (in this case 3800+ files
changed, so you have to unpack 7600+ objects - and even unpacking isn't
the expensive part, the expense is literally in the diff operation
itself).

Me, the stuff I automate is the small steps. Doing a single checkin. So
that's the case I care about going fast, when a "diff-tree" will likely
have maybe five files or something. That's why I want the small
incremental cases to go fast - if it takes me a minute to generate a diff
for a _release_, that's not a big deal. I make one release every other
month, but I work with lots of small patches all the time.

Anyway, with a fast diff-tree, you should be able to generate the list of 
objects for a fast "merge". That's next. 

(And by "merge", I of course mean "suck". I'm talking about the old CVS
three-way merge, and you have to specify the common parent explicitly and
it won't handle any renames or any other crud. But it would get us to 
something that might actually be useful for simple things. Which is why 
"diff-tree" is important - it gives the information about what to tell 
merge).

Linus


Re: [ANNOUNCE] git-pasky-0.1

2005-04-10 Thread Sean
On Sun, April 10, 2005 12:55 pm, Linus Torvalds said:

> Larry was ok with the idea to make my export format actually be natively
> supported by BK (ie the same way you have "bk export -tpatch"), but
> Tridge wanted to instead get at the native data and be difficult about
> it. As a result, I can now not only not use BK any more, but we also don't
> have a nice export format from BK.
>
> Yeah, I'm a bit bitter about it.
>

Linus,

With all due respect, Larry could have dealt with this years ago and
removed the motivation for Tridge and others to pursue reverse
engineering.   Instead he chose to insult and question the motives of
everyone that wanted open-source access to the Linux history data.  The
blame for the current situation falls firmly on the choice to use a
closed-source SCM for Linux and the actions of the company that owned it.

Sean




Re: Re: [ANNOUNCE] git-pasky-0.1

2005-04-10 Thread Willy Tarreau
On Sun, Apr 10, 2005 at 08:45:22PM +0200, Petr Baudis wrote:
 
> It turns out to be the forks for doing all the cuts and such what is
> bogging it down so awfully (doing diff-tree takes 0.48s ;-). I do about
> 15 forks per change, I guess, and for some reason cut takes a lot of
> time on its own.
> 
> I've rewritten the cuts with the use of bash arrays and other smart
> stuff. I somehow don't feel comfortable using this and prefer the
> old-fashioned ways, but it would be plain unusable without this.

I've encountered the same problem in a config-generation script a while
ago. Fortunately, bash provides enough ways to remove most of the forks,
but the result is less portable.

I've downloaded your code, but it does not compile here because of the
tv_nsec fields in struct stat (2.4, glibc 2.2), so I cannot use it to
get the most up to date version to take a look at the script. Basically,
all the 'cut' and 'sed' can be removed, as well as the 'dirname'. You
can also call mkdir only if the dirs don't exist. I really think you
should end up with only one fork in the loop to call 'diff'.
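
A sketch of two of these fork-saving substitutions, using POSIX parameter expansion in place of 'dirname' and a directory test in front of 'mkdir'. The path and the scratch directory are made up for illustration; this is not taken from the git-pasky scripts.

```shell
#!/bin/sh
# Pure-shell replacements for per-file 'dirname' and unconditional 'mkdir'.
scratch=$(mktemp -d) || exit 1       # hypothetical output area
path=drivers/net/e100.c              # hypothetical file from a diff listing

case $path in
    */*) dir=${path%/*} ;;           # strip the last component, like dirname
    *)   dir=. ;;                    # no slash: file is in the top directory
esac
base=${path##*/}                     # last component, like basename

# Fork mkdir only when the directory is actually missing.
[ -d "$scratch/$dir" ] || mkdir -p "$scratch/$dir"
echo "$dir $base"
```

With many files per directory, most iterations skip the mkdir fork entirely.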

> Now I'm down to
> 
>   real    1m21.440s
>   user    0m32.374s
>   sys     0m42.200s
> 
> and I kinda doubt if it is possible to cut this much down. Almost no
> disk activity, I have almost everything cached by now, apparently.

It is very common to cut times by a factor of 10 or more when replacing
common unix tools by pure shell. Dynamic library initialization also
takes a lot of time nowadays, and probably you have localisation which
is big too. Sometimes, just wiping a few variables at the top of the
shell might remove some useless overhead.

> Anyway, you can git pull to get the optimized version.
> 
> Thanks for the help,

Willy



Re: Re: [ANNOUNCE] git-pasky-0.1

2005-04-10 Thread Petr Baudis
Dear diary, on Sun, Apr 10, 2005 at 07:45:12PM CEST, I got a letter
where Ingo Molnar <[EMAIL PROTECTED]> told me that...
> 
> * Willy Tarreau <[EMAIL PROTECTED]> wrote:
> 
> > > >   I will also need to do more testing on the linux kernel tree.
> > > > Committing patch-2.6.7 on 2.6.6 kernel and then diffing results in
> > > > 
> > > > $ time gitdiff.sh `parent-id` `tree-id` >p
> > > > real    5m37.434s
> > > > user    1m27.113s
> > > > sys     2m41.036s
> > > > 
> > > > which is pretty horrible, it seems to me. Any benchmarking help is of
> > > > course welcomed, as well as any other feedback.
> > > 
> > > it seems from the numbers that your system doesn't have enough RAM for 
> > > this and is getting IO-bound?
> > 
> > Not the only problem, without I/O, he will go down to 4m8s (u+s) which 
> > is still in the same order of magnitude.
> 
> probably not the only problem - but if we are lucky then his system was 
> just thrashing within the kernel repository and then most of the overhead 
> is the _unnecessary_ IO that happened due to that (which causes CPU 
> overhead just as much). The dominant system time suggests so, to a 
> certain degree. Maybe this is wishful thinking.

It turns out to be the forks for doing all the cuts and such what is
bogging it down so awfully (doing diff-tree takes 0.48s ;-). I do about
15 forks per change, I guess, and for some reason cut takes a lot of
time on its own.

I've rewritten the cuts with the use of bash arrays and other smart
stuff. I somehow don't feel comfortable using this and prefer the
old-fashioned ways, but it would be plain unusable without this.

Now I'm down to

        real    1m21.440s
        user    0m32.374s
        sys     0m42.200s

and I kinda doubt if it is possible to cut this much down. Almost no
disk activity, I have almost everything cached by now, apparently.

Anyway, you can git pull to get the optimized version.

Thanks for the help,

-- 
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
98% of the time I am right. Why worry about the other 3%.


Re: [ANNOUNCE] git-pasky-0.1

2005-04-10 Thread Ingo Molnar

* Willy Tarreau <[EMAIL PROTECTED]> wrote:

> > >   I will also need to do more testing on the linux kernel tree.
> > > Committing patch-2.6.7 on 2.6.6 kernel and then diffing results in
> > > 
> > >   $ time gitdiff.sh `parent-id` `tree-id` >p
> > >   real    5m37.434s
> > >   user    1m27.113s
> > >   sys     2m41.036s
> > > 
> > > which is pretty horrible, it seems to me. Any benchmarking help is of
> > > course welcomed, as well as any other feedback.
> > 
> > it seems from the numbers that your system doesn't have enough RAM for 
> > this and is getting IO-bound?
> 
> Not the only problem, without I/O, he will go down to 4m8s (u+s) which 
> is still in the same order of magnitude.

probably not the only problem - but if we are lucky then his system was 
just thrashing within the kernel repository and then most of the overhead 
is the _unnecessary_ IO that happened due to that (which causes CPU 
overhead just as much). The dominant system time suggests so, to a 
certain degree. Maybe this is wishful thinking.

Ingo


Re: [ANNOUNCE] git-pasky-0.1

2005-04-10 Thread Willy Tarreau
On Sun, Apr 10, 2005 at 07:33:49PM +0200, Ingo Molnar wrote:
> 
> * Petr Baudis <[EMAIL PROTECTED]> wrote:
> 
> >   I will also need to do more testing on the linux kernel tree.
> > Committing patch-2.6.7 on 2.6.6 kernel and then diffing results in
> > 
> > $ time gitdiff.sh `parent-id` `tree-id` >p
> > real    5m37.434s
> > user    1m27.113s
> > sys     2m41.036s
> > 
> > which is pretty horrible, it seems to me. Any benchmarking help is of
> > course welcomed, as well as any other feedback.
> 
> it seems from the numbers that your system doesn't have enough RAM for 
> this and is getting IO-bound?

Not the only problem, without I/O, he will go down to 4m8s (u+s) which
is still in the same order of magnitude.
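
The 4m8s figure is simply the quoted user and sys times added together. A small sketch of that arithmetic in milliseconds, using only the three numbers from the timing above:

```shell
#!/bin/sh
# Sum the quoted user and sys times and compare the total with real time.
user_ms=$(( (1 * 60 + 27) * 1000 + 113 ))    # 1m27.113s
sys_ms=$((  (2 * 60 + 41) * 1000 + 36 ))     # 2m41.036s
real_ms=$(( (5 * 60 + 37) * 1000 + 434 ))    # 5m37.434s

cpu_ms=$(( user_ms + sys_ms ))               # time actually spent on the CPU
wait_ms=$(( real_ms - cpu_ms ))              # remainder: waiting, e.g. on I/O

printf 'cpu %dm%d.%03ds, not on CPU %dm%d.%03ds\n' \
    $(( cpu_ms / 60000 )) $(( cpu_ms % 60000 / 1000 )) $(( cpu_ms % 1000 )) \
    $(( wait_ms / 60000 )) $(( wait_ms % 60000 / 1000 )) $(( wait_ms % 1000 ))
```

So even with every wait removed, roughly three quarters of the wall-clock time would remain.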

willy



Re: [ANNOUNCE] git-pasky-0.1

2005-04-10 Thread Ingo Molnar

* Petr Baudis <[EMAIL PROTECTED]> wrote:

>   I will also need to do more testing on the linux kernel tree.
> Committing patch-2.6.7 on 2.6.6 kernel and then diffing results in
> 
>   $ time gitdiff.sh `parent-id` `tree-id` >p
>   real    5m37.434s
>   user    1m27.113s
>   sys     2m41.036s
> 
> which is pretty horrible, it seems to me. Any benchmarking help is of
> course welcomed, as well as any other feedback.

it seems from the numbers that your system doesn't have enough RAM for 
this and is getting IO-bound?

Ingo


Re: Re: Re: [ANNOUNCE] git-pasky-0.1

2005-04-10 Thread Petr Baudis
Dear diary, on Sun, Apr 10, 2005 at 10:38:11PM CEST, I got a letter
where Linus Torvalds <[EMAIL PROTECTED]> told me that...
> On Sun, 10 Apr 2005, Petr Baudis wrote:
> > 
> > It turns out to be the forks for doing all the cuts and such what is
> > bogging it down so awfully (doing diff-tree takes 0.48s ;-). I do about
> > 15 forks per change, I guess, and for some reason cut takes a lot of
> > time on its own.
> 
> Heh.
> 
> Can you pull my current repo, which has "diff-tree -R" that does what the
> name suggests, and which should be faster than the 0.48 sec you see..

Funnily enough, now after some more cache teasing it's ~0.185. Your one
still ~0.17, though. :/ (That might be because of the format changes,
though, since you do less printing now.) (BTW, all those measurements
are done on my AMD K6 walking on 1600MHz, 512M RAM, about 200M available
for caches.)

Just out of interest, did you have a look at my diff-tree -r
implementation and decide that you don't like it, or were you not aware
of it?

I will probably take most of your diff-tree change, but I'd prefer to do
the sha1-tree mapping directly in diff_tree().

> It may not matter a lot, since actually generating the diff from the file 
> contents is what is expensive, but remember my goal: I want the expense of 
> a diff-tree to be relative to the size of the diff, so that implies that 
> small diffs have to be basically instantaneous. So I care.

Me too, of course.

> So I just tried the 2.6.7->2.6.8 diff, and for me the new recursive
> diff-tree can generate the _list_ of files changed in zero time:
> 
>   real    0m0.079s
>   user    0m0.067s
>   sys     0m0.024s
> 
> but then _doing_ the diff is pretty expensive (in this case 3800+ files
> changed, so you have to unpack 7600+ objects - and even unpacking isn't
> the expensive part, the expense is literally in the diff operation
> itself).
> 
> Me, the stuff I automate is the small steps. Doing a single checkin. So
> that's the case I care about going fast, when a diff-tree will likely
> have maybe five files or something. That's why I want the small
> incremental cases to go fast - if it takes me a minute to generate a diff
> for a _release_, that's not a big deal. I make one release every other
> month, but I work with lots of small patches all the time.

I see.

> Anyway, with a fast diff-tree, you should be able to generate the list of 
> objects for a fast merge. That's next. 
> 
> (And by "merge", I of course mean "suck". I'm talking about the old CVS
> three-way merge, and you have to specify the common parent explicitly and
> it won't handle any renames or any other crud. But it would get us to 
> something that might actually be useful for simple things. Which is why 
> diff-tree is important - it gives the information about what to tell 
> merge).

I currently already do a merge when you track someone's source - it will
throw away your previous HEAD record though, so if you committed some
local changes after the previous pull, you will get orphaned commits and
the changes will turn to uncommitted ones. I have some ideas regarding
how to do it properly (and do any arbitrary merging, for that matter), I
hope to get to it as soon as I catch up with you. :-)

BTW, the three-way merge comes from RCS. That reminds me, is there any
tool which will take .rej files and throw them into the file to create
rcsmerge-like conflicts? Perhaps it's the fault of my bad tools, but I
much prefer working with inline rejects to .rej files (except for
actually noticing the rejects).

-- 
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
98% of the time I am right. Why worry about the other 3%.


Re: Re: Re: [ANNOUNCE] git-pasky-0.1

2005-04-10 Thread Linus Torvalds


On Mon, 11 Apr 2005, Petr Baudis wrote:
 
> I currently already do a merge when you track someone's source - it will
> throw away your previous HEAD record though

Not only that, it doesn't do what I consider a merge. 

A real merge should have two or more parents. The commit-tree command
already allows that: just add any arbitrary number of -p x  
switches (well, I think I limited it to 16 parents, but that's just a
totally random number, there's nothing in the file format or anything 
else that limits it).

So while you've merged my data, you've not actually merged my
revision history into your tree.

And the reason a real merge _has_ to show both parents properly is that 
unless you do that, you can never merge sanely another time without 
getting lots of clashes from the previous merge. So it's important that a 
merge really shows both trees it got data from.

This is, btw, also the reason I haven't merged with your tree - I want to 
get to the point where I really _can_ merge without throwing away the 
information. In fact, at this point I'd rather not merge with your tree at 
all, because I consider your tree to be corrupt thanks to lacking the 
merge history.

So you've done the data merge, but not the history merge.

And because you didn't do the history merge, there's no way to
automatically find out what point of my tree you merged _with_. See?

And since I have no way to see what point in time you merged with me, now
I can't generate a nice 3-way diff against the last common ancestor of
both of our trees.

So now I can't do a three-way merge with you based on any sane ancestor,
unless I start guessing which ancestor of mine you merged with. Now, that
guess is easy enough to do with a project like git which currently has
just a few tens of commits and effectively only two parallel development
trees, but the whole point is to get to a system where that isn't true..
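A rough sketch of why the recorded parents matter for finding the ancestor of a three-way merge. The commit graph here is a hypothetical dict mapping each commit name to its list of parents; the function names and the brute-force search are illustrative assumptions, not how any git tool does it.

```python
def ancestors(parents, commit):
    """All commits reachable from `commit` through parent links, itself included."""
    seen = set()
    stack = [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, []))
    return seen


def merge_bases(parents, a, b):
    """Common ancestors of a and b that are not ancestors of another common one."""
    common = ancestors(parents, a) & ancestors(parents, b)
    return sorted(
        c for c in common
        if not any(c != other and c in ancestors(parents, other)
                   for other in common)
    )


# Linus's line of development: L1 <- L2 <- L3. Petr forked at L1 (P1) and
# later merged L2 into commit M.
with_history = {"L1": [], "L2": ["L1"], "L3": ["L2"],
                "P1": ["L1"], "M": ["P1", "L2"]}
# Same data merge, but M records only one parent (no history merge).
data_only = dict(with_history, M=["P1"])
```

With both parents recorded, `merge_bases(with_history, "M", "L3")` finds L2, the point that was actually merged. With the single-parent commit the best that can be found is L1, so the next three-way merge would replay changes that are already in the tree.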

Linus


Re: [ANNOUNCE] git-pasky-0.1

2005-04-10 Thread Paul Jackson
Petr wrote:
> That reminds me, is there any
> tool which will take .rej files and throw them into the file to create
> rcsmerge-like conflicts?

  Check out 'wiggle'
http://www.cse.unsw.edu.au/~neilb/source/wiggle/

-- 
  I won't rest till it's the best ...
  Programmer, Linux Scalability
  Paul Jackson <[EMAIL PROTECTED]> 1.650.933.1373, 1.925.600.0401


Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1

2005-04-10 Thread Petr Baudis
Dear diary, on Mon, Apr 11, 2005 at 01:10:58AM CEST, I got a letter
where Linus Torvalds <[EMAIL PROTECTED]> told me that...
> On Mon, 11 Apr 2005, Petr Baudis wrote:
> > 
> > I currently already do a merge when you track someone's source - it will
> > throw away your previous HEAD record though
> 
> Not only that, it doesn't do what I consider a merge. 
> 
> A real merge should have two or more parents. The commit-tree command
> already allows that: just add any arbitrary number of -p x  
> switches (well, I think I limited it to 16 parents, but that's just a
> totally random number, there's nothing in the file format or anything 
> else that limits it).
> 
> So while you've merged my data, you've not actually merged my
> revision history into your tree.

Well, that's exactly what I was (am) going to do. :-) That's also why I
said that I (virtually) throw the local commits away now. Instead, if
there were any local commits, I will do git merge:

commit-tree $(write-tree) -p $local_head -p $tracked_tree

Note that I will need to make this two-phase - first applying the
changes, then doing the commit; between those two phases, the user
should resolve potential conflicts and check if the merge went right.

I think I will name the first phase git merge and the second phase will
be just git commit, and I will store the merge information in
.dircache/. (BTW, I think the directory name is pretty awful; what about
.git/ ?)

> And the reason a real merge _has_ to show both parents properly is that 
> unless you do that, you can never merge sanely another time without 
> getting lots of clashes from the previous merge. So it's important that a 
> merge really shows both trees it got data from.
> 
> This is, btw, also the reason I haven't merged with your tree - I want to 
> get to the point where I really _can_ merge without throwing away the 
> information. In fact, at this point I'd rather not merge with your tree at 
> all, because I consider your tree to be corrupt thanks to lacking the 
> merge history.
> 
> So you've done the data merge, but not the history merge.
> 
> And because you didn't do the history merge, there's no way to
> automatically find out what point of my tree you merged _with_. See?
> 
> And since I have no way to see what point in time you merged with me, now
> I can't generate a nice 3-way diff against the last common ancestor of
> both of our trees.
> 
> So now I can't do a three-way merge with you based on any sane ancestor,
> unless I start guessing which ancestor of mine you merged with. Now, that
> guess is easy enough to do with a project like git which currently has
> just a few tens of commits and effectively only two parallel development
> trees, but the whole point is to get to a system where that isn't true..

Well, I wanted to get the basic things working first before doing git
merge. (Especially since until recently, diff-tree was a PITA to work
with, and before that it didn't even exist.) If you want, I can rebuild
my tree with the merging done properly, after I have git merge working.

(BTW, it would be useful to have a tool which just blindly takes what
you give it on input and throws it to an object of given type; I will
need to construct arbitrary commits during the rebuild if I'm to keep
the correct dates.)

-- 
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
98% of the time I am right. Why worry about the other 3%.


Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1

2005-04-10 Thread Linus Torvalds


On Mon, 11 Apr 2005, Petr Baudis wrote:
 
> (BTW, it would be useful to have a tool which just blindly takes what
> you give it on input and throws it to an object of given type; I will
> need to construct arbitrary commits during the rebuild if I'm to keep
> the correct dates.)

Hah. That's what COMMITTER_NAME, COMMITTER_EMAIL and COMMITTER_DATE 
are there for.

There's two things to commits: when (and by whom) it was committed to a
tree, and when the changes were really done.

So set the COMMITTER_xxx things to the person/time you want to consider 
the _original_ one, and let commit-tree author you as the creator of the 
commit itself. The regular ChangeLog thing should only show the author 
and original time, but it's nice to see who created the commit itself.

I did this very much on purpose: see how I always try to attribute 
authorship in BK to the person who actually wrote the code. At the same 
time, I think it's interesting from a tracking standpoint to also see 
when/where that change got introduced into a tree.

I _tried_ to get this right in the sparse tree conversion. I won't 
guarantee that it's all correct, but the top commit in the sparse tree 
looks like this:

tree 67607f05a66e36b2f038c77cfb61350d2110f7e8
parent 9c59995fef9b52386e5f7242f44720a7aca287d7
author Christopher Li [EMAIL PROTECTED] Sat Apr  2 09:30:09 PST 2005
committer Linus Torvalds [EMAIL PROTECTED] Thu Apr  7 20:06:31 2005

...

exactly because I tracked when I committed it to the sparse tree 
_separately_ from tracking when it was created.

So when I re-create the sparse-tree, I'll also end up re-writing the 
committer information. And that's proper. That's really saying this 
sha1 object was created by Xxxx at time Xxxx.
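To make the author/committer split concrete, here is a minimal sketch that assembles the text of a commit object in the layout of the example above. The function name `format_commit`, the blank line before the message, and the e-mail placeholders are assumptions for illustration; this is not the commit-tree source.

```python
def format_commit(tree, parents, author, committer, message):
    """Build commit text: one tree line, zero or more parent lines,
    then the author and committer lines, a blank line, and the message."""
    lines = ["tree " + tree]
    # A real merge simply lists more than one parent.
    lines += ["parent " + p for p in parents]
    # Who wrote the change, and when it was originally made.
    lines.append("author " + author)
    # Who created this commit object, and when.
    lines.append("committer " + committer)
    return "\n".join(lines) + "\n\n" + message


commit_text = format_commit(
    tree="67607f05a66e36b2f038c77cfb61350d2110f7e8",
    parents=["9c59995fef9b52386e5f7242f44720a7aca287d7"],
    # Placeholder addresses; the real ones are elided in the archive.
    author="Christopher Li <author@example.org> Sat Apr  2 09:30:09 PST 2005",
    committer="Linus Torvalds <committer@example.org> Thu Apr  7 20:06:31 2005",
    message="...\n",
)
```

Re-creating a tree changes only the committer line of each commit; the author line keeps the original person and time.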

Btw, the COMMITTER_ environment variables are very confusingly
named. They actually go into the _author_ line in the commit object. I'm a
total retard, and I really don't know why I called it COMMITTER_xxx  
instead of AUTHOR_xxx.

Linus retard Torvalds


Re: Re: Re: [ANNOUNCE] git-pasky-0.1

2005-04-10 Thread Petr Baudis
Dear diary, on Sun, Apr 10, 2005 at 11:39:02PM CEST, I got a letter
where Linus Torvalds <[EMAIL PROTECTED]> told me that...
> On Sun, 10 Apr 2005, Linus Torvalds wrote:
> > 
> > Can you pull my current repo, which has diff-tree -R that does what the 
> > name suggests, and which should be faster than the 0.48 sec you see..
> 
> Actually, I changed things around. Everybody hated the "<" ">" lines, so I 
> put a changed thing on a line of its own with a "*" instead.
> 
> So you'd now see lines like
> 
>   *100644->100644 1874e031abf6631ea51cf6177b82a1e662f6183e->e8181df8499f165cacc6a0d8783be7143013d410 CREDITS
> 
> which means that the CREDITS file has changed, and it shows you the mode
> -> mode transition (that didn't change in this case) and the sha1 -> sha1
> transition.
> 
> So now it's always just one line per change. Furthermore, the filename is 
> always field 3, if you use spaces as delimiters, regardless of whether 
> it's a +/-/* field.

That's great, just when I finally managed to properly fix the xargs
boundary case in gitdiff-do (without throwing away the NUL-termination).
You know how to please people! ;-)

(Not that I'd have *anything* against the change. The logic is simpler
and you'll be actually able to work with diff-tree a little sanely.)

BTW, it is quite handy to have the entry type in the listing (guessing
that from mode in the script just doesn't feel right and doing explicit
cat-file kills the performance). I would also really prefer the fields
separated by tabs. It looks nicer on the screen (aligned, e.g. modes and
type are varsized), and is also easier to parse (cut defaults to tabs as
delimiters, for example).
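For scripts that consume the one-line-per-change output, a "*" line could be split as in the sketch below. It assumes the two transitions are written with "->" and that the three fields are separated by single spaces, as in the CREDITS example; the `Change` type and `parse_change` are hypothetical names, and "+" and "-" lines are deliberately not handled because their exact layout is not shown in this thread.

```python
from typing import NamedTuple


class Change(NamedTuple):
    old_mode: str
    new_mode: str
    old_sha1: str
    new_sha1: str
    path: str


def parse_change(line):
    """Parse one '*' line: '*<mode>-><mode> <sha1>-><sha1> <filename>'."""
    if not line.startswith("*"):
        raise ValueError("only '*' (changed) lines are handled in this sketch")
    # Split into at most three fields so a filename containing spaces survives.
    modes, sha1s, path = line[1:].split(" ", 2)
    old_mode, new_mode = modes.split("->")
    old_sha1, new_sha1 = sha1s.split("->")
    return Change(old_mode, new_mode, old_sha1, new_sha1, path)


example = parse_change(
    "*100644->100644 "
    "1874e031abf6631ea51cf6177b82a1e662f6183e->"
    "e8181df8499f165cacc6a0d8783be7143013d410 CREDITS"
)
```

With tab-separated fields, as suggested above, only the `split(" ", 2)` call would need to change.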

> So let's say you want to merge two trees (dst1 and dst2) from a common
> parent (src), what you would do is:
> 
>  - get the list of files to merge:
> 
>   diff-tree -R dst1 dst2 | tr '\0' '\n' > merge-files

...oh, I probably forgot to ask - why did you choose -R instead of -r?
It looks rather alien to me; if it starts by 'diff', my hand writes -r
without thinking.

>  - Which of those were changed by src -> dstX?
> 
>   diff-tree -R src dst1 | tr '\0' '\n' | join -j 3 - merge-files > dst1-change
>   diff-tree -R src dst2 | tr '\0' '\n' | join -j 3 - merge-files > dst2-change
> 
>  - Which of those are common to both? Let's see what the merge list is:
> 
>   join dst1-change dst2-change > merge-list
> 
> and hopefully you'd usually be working on a very small list of files by 
> then (everything else you'd just pick from one of the destination trees 
> directly - you've got the name, the sha-file, everything: no need to even 
> look at the data).

Ok, this looks reasonable. (Provided that I DWYM regarding the joins.)

> Does this sound sane? Pasky? Wanna try a "git merge" thing? Starting off
> with the user having to tell what the common parent tree is - we can try
> to do the "automatically find best common parent" crud later. THAT may be 
> expensive.

I will definitely try git merge, but maybe not tonight anymore
(it's already 1:32 here now).

-- 
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
98% of the time I am right. Why worry about the other 3%.


Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1

2005-04-10 Thread Petr Baudis
Dear diary, on Mon, Apr 11, 2005 at 01:46:50AM CEST, I got a letter
where Linus Torvalds <[EMAIL PROTECTED]> told me that...
> On Mon, 11 Apr 2005, Petr Baudis wrote:
> > 
> > (BTW, it would be useful to have a tool which just blindly takes what
> > you give it on input and throws it to an object of given type; I will
> > need to construct arbitrary commits during the rebuild if I'm to keep
> > the correct dates.)
> 
> Hah. That's what COMMITTER_NAME, COMMITTER_EMAIL and COMMITTER_DATE 
> are there for.
> 
> There's two things to commits: when (and by whom) it was committed to a
> tree, and when the changes were really done.
> 
> So set the COMMITTER_xxx things to the person/time you want to consider 
> the _original_ one, and let commit-tree author you as the creator of the 
> commit itself. The regular ChangeLog thing should only show the author 
> and original time, but it's nice to see who created the commit itself.

I already use those - look at my ChangeLog. (That's because for certain
reasons I'm working on git in a half-broken chrooted environment.)

When rebuilding the tree from scratch, I wanted to do it
transparently - that is, so that no one could notice that I rebuilt it,
since it effectively still _is_ the original tree from the data
standpoint, just the history flow is actually correct this time.

> Btw, the COMMITTER_ environment variables are very confusingly
> named. They actually go into the _author_ line in the commit object. I'm a
> total retard, and I really don't know why I called it COMMITTER_xxx  
> instead of AUTHOR_xxx.

So, who will fix it in his tree first! ;-)

-- 
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
98% of the time I am right. Why worry about the other 3%.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [ANNOUNCE] git-pasky-0.1

2005-04-10 Thread Randy.Dunlap
On Sun, 10 Apr 2005 16:23:11 -0700 Paul Jackson wrote:

| Petr wrote:
| > That reminds me, is there any
| > tool which will take .rej files and throw them into the file to create
| > rcsmerge-like conflicts?
| 
|   Check out 'wiggle'
| http://www.cse.unsw.edu.au/~neilb/source/wiggle/

or Chris Mason's 'rej' program:
ftp://ftp.suse.com/pub/people/mason/rej/


---
~Randy


GIT license (Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1)

2005-04-10 Thread Linus Torvalds


Btw, does anybody have strong opinions on the license? I didn't put in a 
COPYING file exactly because I was torn between GPLv2 and OSL2.1.

I'm inclined to go with GPLv2 just because it's the most common one, but I 
was wondering if anybody really had strong opinions. For example, I'd 
really make it v2 by default like the kernel, since I'm sure v3 will be 
fine, but regardless of how sure I am, I'm _not_ a gambling man.

Linus


Re: GIT license (Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1)

2005-04-10 Thread Petr Baudis
Dear diary, on Mon, Apr 11, 2005 at 02:20:52AM CEST, I got a letter
where Linus Torvalds <[EMAIL PROTECTED]> told me that...
> Btw, does anybody have strong opinions on the license? I didn't put in a 
> COPYING file exactly because I was torn between GPLv2 and OSL2.1.
> 
> I'm inclined to go with GPLv2 just because it's the most common one, but I 
> was wondering if anybody really had strong opinions. For example, I'd 
> really make it v2 by default like the kernel, since I'm sure v3 will be 
> fine, but regardless of how sure I am, I'm _not_ a gambling man.

Oh, I wanted to ask about this too. I'd prefer GPLv2 (I have no
problem with the version restriction, I usually do it too): it's the one
I'm most familiar with, and OSL appears to be incompatible with GPL (at
least FSF says so about OSL1.0), which might create various annoying
issues. I hate it when licenses get in my way and prevent me from
including some useful code.

-- 
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
98% of the time I am right. Why worry about the other 3%.


Re: Re: Re: [ANNOUNCE] git-pasky-0.1

2005-04-10 Thread Petr Baudis
Dear diary, on Sun, Apr 10, 2005 at 10:38:11PM CEST, I got a letter
where Linus Torvalds <[EMAIL PROTECTED]> told me that...
> ..snip..
> Can you pull my current repo, which has diff-tree -R that does what the 
> name suggests, and which should be faster than the 0.48 sec you see..

Am I just missing something, or does your diff-tree not handle
added/removed directories?

(Mine does! *hint* *hint* It also doesn't bother with dynamic
allocation, but someone might consider the static path buffer ugly.
Anyway, I hacked it with a plan to do a massive cleanup of the file
later.)

-- 
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
98% of the time I am right. Why worry about the other 3%.


Re: Re: Re: [ANNOUNCE] git-pasky-0.1

2005-04-10 Thread Linus Torvalds


On Mon, 11 Apr 2005, Petr Baudis wrote:

> Dear diary, on Sun, Apr 10, 2005 at 10:38:11PM CEST, I got a letter
> where Linus Torvalds <[EMAIL PROTECTED]> told me that...
> ..snip..
> > Can you pull my current repo, which has diff-tree -R that does what the 
> > name suggests, and which should be faster than the 0.48 sec you see..
> 
> Am I just missing something, or does your diff-tree not handle
> added/removed directories?

You're not missing anything, I did it that way on purpose. I thought it 
would be easier to do the expansion in the caller (who knows what it is 
they want to do with the end result).

But now that I look at merging, I realize that was actually the wrong
thing to do. A merge algorithm definitely wants to see the expanded tree,
since it will compare/join several of the diff-tree output things. 

So I'll either fix it or decide to just go with your version instead. I'm 
not overly proud. 
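What "seeing the expanded tree" means can be sketched as a recursive tree comparison that descends into added and removed directories instead of reporting them as a single entry. The nested-dict representation (a name maps to a blob sha1 string or to a sub-dict) and the function name are assumptions made for this example; neither diff-tree implementation is reproduced here.

```python
def diff_tree(old, new, prefix=""):
    """Yield (status, path) pairs, always expanded down to individual files."""
    for name in sorted(old.keys() | new.keys()):
        a, b = old.get(name), new.get(name)
        path = prefix + name
        if a == b:
            # Identical blob or subtree (stand-in for an equal sha1): skip it
            # without descending, so the cost tracks the size of the diff.
            continue
        a_dir, b_dir = isinstance(a, dict), isinstance(b, dict)
        if a_dir or b_dir:
            # A file replaced by a directory (or the reverse) is reported too.
            if a is not None and not a_dir:
                yield ("-", path)
            if b is not None and not b_dir:
                yield ("+", path)
            # Treat a missing or non-directory side as an empty tree, so
            # added/removed directories are expanded file by file.
            yield from diff_tree(a if a_dir else {}, b if b_dir else {},
                                 path + "/")
        elif a is None:
            yield ("+", path)
        elif b is None:
            yield ("-", path)
        else:
            yield ("*", path)


old_tree = {"Makefile": "s1", "doc": {"README": "s2"}}
new_tree = {"Makefile": "s3", "src": {"a.c": "s4", "lib": {"b.c": "s5"}}}
changes = list(diff_tree(old_tree, new_tree))
```

Because every entry names a file, the outputs of several such comparisons can be joined on the path directly, which is what the merge procedure needs.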

Linus


Re: GIT license (Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1)

2005-04-10 Thread Nur Hussein
> Btw, does anybody have strong opinions on the license? I didn't put in a 
> COPYING file exactly because I was torn between GPLv2 and OSL2.1.

I think GPLv2 would create the least amount of objection in the
community, so I'd probably want to go with that.

Nur Hussein