Re: GIT license (Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1)
Petr Baudis wrote:
> Dear diary, on Tue, Apr 12, 2005 at 11:50:48AM CEST, I got a letter
>
> Well, yes, but the last merge point search may not be so simple:
>
> A --1---2---6---7
> B    \   `-4-. /
> C     `-3----5'
>
> Now, when at 7, your last merge point is not 1, but 2.
>
> ...and this is obviously wrong, sorry. You would lose 3 this way.

Wouldn't the delta between 2 and 5 include any contribution from 3?

Chris
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: Re: Re: Re: Re: GIT license (Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1)
Dear diary, on Tue, Apr 12, 2005 at 11:50:48AM CEST, I got a letter
where Petr Baudis <[EMAIL PROTECTED]> told me that...
> Dear diary, on Tue, Apr 12, 2005 at 10:39:40AM CEST, I got a letter
> where Geert Uytterhoeven <[EMAIL PROTECTED]> told me that...
> > On Tue, 12 Apr 2005, Petr Baudis wrote:
> ..snip..
> > > Basically, when you look at merge(1) :
> > >
> > > SYNOPSIS
> > >     merge [ options ] file1 file2 file3
> > > DESCRIPTION
> > >     merge incorporates all changes that lead from file2 to file3
> > >     into file1.
> > >
> > > The only big problem is how to guess the best file2 when you give it
> > > file3 and file1.
> >
> > That's either the point just before you started modifying the file, or your
> > last merge point. Sounds simple, but if your SCM system doesn't track
> > merges, you're SOL...
>
> Well, yes, but the last merge point search may not be so simple:
>
> A --1---2---6---7
> B    \   `-4-. /
> C     `-3----5'
>
> Now, when at 7, your last merge point is not 1, but 2.

...and this is obviously wrong, sorry. You would lose 3 this way.

--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
98% of the time I am right. Why worry about the other 3%.
Re: Re: Re: Re: GIT license (Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1)
Dear diary, on Tue, Apr 12, 2005 at 10:39:40AM CEST, I got a letter
where Geert Uytterhoeven <[EMAIL PROTECTED]> told me that...
> On Tue, 12 Apr 2005, Petr Baudis wrote:
..snip..
> > Basically, when you look at merge(1) :
> >
> > SYNOPSIS
> >     merge [ options ] file1 file2 file3
> > DESCRIPTION
> >     merge incorporates all changes that lead from file2 to file3
> >     into file1.
> >
> > The only big problem is how to guess the best file2 when you give it
> > file3 and file1.
>
> That's either the point just before you started modifying the file, or your
> last merge point. Sounds simple, but if your SCM system doesn't track merges,
> you're SOL...

Well, yes, but the last merge point search may not be so simple:

A --1---2---6---7
B    \   `-4-. /
C     `-3----5'

Now, when at 7, your last merge point is not 1, but 2.

What I have proposed at the git mailing list was to have simple merge
tracking - a merges/branch1/branch2 directory structure which would store
merges from branch2 to branch1. Then, when merging say to branch3, you
traverse all of them and if any of the branch1/* commits is newer than
branch3/*, you update it.

The disadvantage is that you now need to strictly use gitmerge.sh to do
the merges - Linus' revtree solution is nicer in the regard that it works
without any explicit bookkeeping, and tracks any merges properly recorded
with commit-file; it is more complex and more expensive, though.

--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
98% of the time I am right. Why worry about the other 3%.
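[Editorial sketch] The "last merge point" search discussed above can be
illustrated on a small commit graph. This is only a sketch under stated
assumptions: the parent links in PARENTS are one reading of the ASCII diagram
(7 merges the tips 6 and 5), and the names `ancestors` and `merge_bases` are
hypothetical helpers, not anything from git or git-pasky.

```python
from typing import Dict, List, Set

# ASSUMPTION: one reading of the diagram, as commit -> list of parents.
# Commit 7 would be the merge of tips "6" and "5", so it is not listed.
PARENTS: Dict[str, List[str]] = {
    "1": [],
    "2": ["1"],
    "3": ["1"],
    "4": ["2"],
    "5": ["3", "4"],
    "6": ["2"],
}


def ancestors(commit: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Return the commit itself plus everything reachable via parent links."""
    seen: Set[str] = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, []))
    return seen


def merge_bases(a: str, b: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Common ancestors of a and b that are not ancestors of another one."""
    common = ancestors(a, parents) & ancestors(b, parents)
    return {
        c for c in common
        if not any(c != d and c in ancestors(d, parents) for d in common)
    }


bases = merge_bases("6", "5", PARENTS)
```

Under this reading both 1 and 2 are common ancestors of the two tips, and 2 is
the one kept because 1 is already contained in its history. Note that this
needs the parent links of every merge to be recorded, which is the point made
about tracking merges.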
Re: Re: Re: GIT license (Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1)
On Tue, 12 Apr 2005, Petr Baudis wrote:
> Dear diary, on Tue, Apr 12, 2005 at 03:20:18AM CEST, I got a letter
> where "Adam J. Richter" <[EMAIL PROTECTED]> told me that...
> > >Dear diary, on Mon, Apr 11, 2005 at 05:46:38PM CEST, I got a letter
> > >where "Adam J. Richter" <[EMAIL PROTECTED]> told me that...
> > >..snip..
> > >> Graydon Hoare. (By the way, I would prefer that git just punt to
> > >> user level programs for diff and merge when all of the versions
> > >> involved are different or at least have a very thin interface
> > >> for extending the facility, because I would like to do some character
> > >> based merge stuff.)
> > >..snip..
> >
> > >But this is what git already does. I agree it could do it even better,
> > >by checking environment variables for the appropriate tools (then you
> > >could use that to pass diff e.g. -p etc.).
> >
> > This message from Linus seemed to imply that git was going to get
> > its own 3-way merge code:
> >
> > | Then the bad news: the merge algorithm is going to suck. It's going to be
> > | just plain 3-way merge, the same RCS/CVS thing you've seen before. With no
> > | understanding of renames etc. I'll try to find the best parent to base the
> > | merge off of, although early testers may have to tell the piece of crud
> > | what the most recent common parent was.
>
> Well, from what I can read it says "just plain 3-way merge, the same
> RCS/CVS thing you've seen before". :-)
>
> Basically, when you look at merge(1) :
>
> SYNOPSIS
>     merge [ options ] file1 file2 file3
> DESCRIPTION
>     merge incorporates all changes that lead from file2 to file3
>     into file1.
>
> The only big problem is how to guess the best file2 when you give it
> file3 and file1.

That's either the point just before you started modifying the file, or your
last merge point. Sounds simple, but if your SCM system doesn't track merges,
you're SOL...

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [EMAIL PROTECTED]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
Re: Re: Re: GIT license (Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1)
Dear diary, on Tue, Apr 12, 2005 at 03:20:18AM CEST, I got a letter
where "Adam J. Richter" <[EMAIL PROTECTED]> told me that...
> >Dear diary, on Mon, Apr 11, 2005 at 05:46:38PM CEST, I got a letter
> >where "Adam J. Richter" <[EMAIL PROTECTED]> told me that...
> >..snip..
> >> Graydon Hoare. (By the way, I would prefer that git just punt to
> >> user level programs for diff and merge when all of the versions
> >> involved are different or at least have a very thin interface
> >> for extending the facility, because I would like to do some character
> >> based merge stuff.)
> >..snip..
>
> >But this is what git already does. I agree it could do it even better,
> >by checking environment variables for the appropriate tools (then you
> >could use that to pass diff e.g. -p etc.).
>
> This message from Linus seemed to imply that git was going to get
> its own 3-way merge code:
>
> | Then the bad news: the merge algorithm is going to suck. It's going to be
> | just plain 3-way merge, the same RCS/CVS thing you've seen before. With no
> | understanding of renames etc. I'll try to find the best parent to base the
> | merge off of, although early testers may have to tell the piece of crud
> | what the most recent common parent was.

Well, from what I can read it says "just plain 3-way merge, the same
RCS/CVS thing you've seen before". :-)

Basically, when you look at merge(1) :

SYNOPSIS
    merge [ options ] file1 file2 file3
DESCRIPTION
    merge incorporates all changes that lead from file2 to file3
    into file1.

The only big problem is how to guess the best file2 when you give it
file3 and file1.

--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
98% of the time I am right. Why worry about the other 3%.
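[Editorial sketch] The roles of file1, file2 and file3 in the merge(1) synopsis
can be shown with a deliberately coarse example. Unlike merge(1), which works
on lines, this sketch compares whole file contents only; the function name
`merge_whole_file` is hypothetical and exists just for this illustration.

```python
from typing import Optional


def merge_whole_file(file1: str, file2: str, file3: str) -> Optional[str]:
    """Whole-file three-way merge: file2 is the common base.

    Returns the merged content, or None when both sides changed the
    base in different ways (a conflict that needs a real line merge).
    """
    if file2 == file3:
        # Nothing leads from file2 to file3, so file1 stays as it is.
        return file1
    if file1 == file2:
        # file1 is still the base, so take all changes made in file3.
        return file3
    if file1 == file3:
        # Both sides made the same change.
        return file1
    return None


base = "hello\n"
ours = "hello\nours\n"
theirs = "hello\ntheirs\n"

unchanged_theirs = merge_whole_file(ours, base, base)
unchanged_ours = merge_whole_file(base, base, theirs)
conflict = merge_whole_file(ours, base, theirs)
```

The sketch also shows why guessing file2 matters: every branch of the decision
compares against the base, so a wrong base turns clean cases into conflicts or,
worse, into silently dropped changes.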
Re: Re: GIT license (Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1)
On Mon, 11 Apr 2005 20:45:38 +0200, Petr Baudis wrote:
> Hello,
> please do not trim the cc list so aggressively.

Sorry. I read the list from a web site that does not show the cc lists.
I'll try to cc more people from the relevant discussions though. On the
other hand, I've dropped Linus from this message, as it just points to
something he previously said.

>Dear diary, on Mon, Apr 11, 2005 at 05:46:38PM CEST, I got a letter
>where "Adam J. Richter" <[EMAIL PROTECTED]> told me that...
>..snip..
>> Graydon Hoare. (By the way, I would prefer that git just punt to
>> user level programs for diff and merge when all of the versions
>> involved are different or at least have a very thin interface
>> for extending the facility, because I would like to do some character
>> based merge stuff.)
>..snip..

>But this is what git already does. I agree it could do it even better,
>by checking environment variables for the appropriate tools (then you
>could use that to pass diff e.g. -p etc.).

This message from Linus seemed to imply that git was going to get
its own 3-way merge code:

| Then the bad news: the merge algorithm is going to suck. It's going to be
| just plain 3-way merge, the same RCS/CVS thing you've seen before. With no
| understanding of renames etc. I'll try to find the best parent to base the
| merge off of, although early testers may have to tell the piece of crud
| what the most recent common parent was.

( from http://marc.theaimsgroup.com/?l=linux-kernel&m=111320013100822&w=2 )

                   __   __
Adam J. Richter      \ /
[EMAIL PROTECTED]     | g g d r a s i l
Re: Re: GIT license (Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1)
Hello,

please do not trim the cc list so aggressively.

Dear diary, on Mon, Apr 11, 2005 at 05:46:38PM CEST, I got a letter
where "Adam J. Richter" <[EMAIL PROTECTED]> told me that...
..snip..
> Graydon Hoare. (By the way, I would prefer that git just punt to
> user level programs for diff and merge when all of the versions
> involved are different or at least have a very thin interface
> for extending the facility, because I would like to do some character
> based merge stuff.)
..snip..

But this is what git already does. I agree it could do it even better,
by checking environment variables for the appropriate tools (then you
could use that to pass diff e.g. -p etc.).

--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
98% of the time I am right. Why worry about the other 3%.
Re: GIT license (Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1)
* Petr Baudis:

>> Almost certainly, v3 will be incompatible with v2 because it adds
>> further restrictions. This means that your proposal would result in
>> software which is not redistributable by third parties.
>
> Hmm, what would be actually the point in introducing further
> restrictions? Anyone who then wants to get around them will just
> distribute the software with the "any later version" provision under
> GPLv2, and GPLv3 will have no impact except for new software with "v3 or
> any later version" provision. What am I missing?

Software continues to evolve. The copyright owners can relicense the
code base under v3, and use v3 for all subsequent changes to the
software.

The trouble with relicensing is that you have to contact all copyright
holders (or remove their code). This tends to be impossible in
long-running projects without contractual agreements between the
developers.
Re: GIT license (Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1)
On 2005-04-11, Linus Torvalds wrote:
>I'm inclined to go with GPLv2 just because it's the most common one [...]

You may want to use a file from GPL'ed monotone that implements
a substantial diff optimization described in the August 1989 paper
by Sun Wu, Udi Manber and Gene Myers ("An O(NP) Sequence Comparison
Algorithm"). According to the file, that implementation was a port
of some Scheme code written by Aubrey Jaffer to C++ by
Graydon Hoare. (By the way, I would prefer that git just punt to
user level programs for diff and merge when all of the versions
involved are different or at least have a very thin interface
for extending the facility, because I would like to do some character
based merge stuff.)

It looks to me like the anti-patent provisions of OSLv2.1
could be circumvented by an offender creating a separate company
to do patent litigation. So, I think you'll find that the software
reuse benefits (both to GIT and to other software projects) of
the more widely used GPL outweigh the anti-patent benefits of OSLv2.1.
Although I like the idea of anti-patent provisions, such as those
in OSLv2.1, I think mutual compatibility of free software is
probably more consequential, even from a purely political perspective.

Perhaps you might want to consider offering the code under
the distributor's choice of either license if you want to offer
the very minor benefits of slightly easier compliance to those who
do not litigate software patents, or, perhaps more importantly,
the ability of the software to be copied into OSLv2.1 projects
(if there are any).

                   __   __
Adam J. Richter      \ /
[EMAIL PROTECTED]     | g g d r a s i l
Re: Re: GIT license (Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1)
Dear diary, on Mon, Apr 11, 2005 at 10:40:00AM CEST, I got a letter
where Florian Weimer <[EMAIL PROTECTED]> told me that...
> * Ingo Molnar:
>
> > is there any fundamental problem with going with v2 right now, and then
> > once v3 is out and assuming it looks ok, all newly copyrightable bits
> > (new files, rewrites, substantial contributions, etc.) get a v3
> > copyright? (and the collection itself would be v3 too) That method
> > wouldn't make it fully v3 automatically once v3 is out, but with time
> > there would be enough v3 bits in it to make it essentially v3.
>
> Almost certainly, v3 will be incompatible with v2 because it adds
> further restrictions. This means that your proposal would result in
> software which is not redistributable by third parties.

Hmm, what would be actually the point in introducing further
restrictions? Anyone who then wants to get around them will just
distribute the software with the "any later version" provision under
GPLv2, and GPLv3 will have no impact except for new software with "v3 or
any later version" provision. What am I missing?

I've been doing a lot of LKML catching up, and I remember someone
suggesting using GPLv2 (for kernel, but should apply to git too), with a
provision to let someone trusted (Linus) decide when GPLv3 is out
whether you can use GPLv3 for the kernel too. Does it make sense? And is
it even legally doable without sending signed written documents to
Linus' tropical hacienda?

--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
98% of the time I am right. Why worry about the other 3%.
Re: GIT license (Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1)
* Ingo Molnar:

> is there any fundamental problem with going with v2 right now, and then
> once v3 is out and assuming it looks ok, all newly copyrightable bits
> (new files, rewrites, substantial contributions, etc.) get a v3
> copyright? (and the collection itself would be v3 too) That method
> wouldn't make it fully v3 automatically once v3 is out, but with time
> there would be enough v3 bits in it to make it essentially v3.

Almost certainly, v3 will be incompatible with v2 because it adds
further restrictions. This means that your proposal would result in
software which is not redistributable by third parties.
Re: GIT license (Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1)
* Linus Torvalds <[EMAIL PROTECTED]> wrote:

> Btw, does anybody have strong opinions on the license? I didn't put in
> a COPYING file exactly because I was torn between GPLv2 and OSL2.1.
>
> I'm inclined to go with GPLv2 just because it's the most common one,
> but I was wondering if anybody really had strong opinions. For
> example, I'd really make it "v2 by default" like the kernel, since I'm
> sure v3 will be fine, but regardless of how sure I am, I'm _not_ a
> gambling man.

Is there any fundamental problem with going with v2 right now, and then
once v3 is out and assuming it looks ok, all newly copyrightable bits
(new files, rewrites, substantial contributions, etc.) get a v3
copyright? (and the collection itself would be v3 too) That method
wouldn't make it fully v3 automatically once v3 is out, but with time
there would be enough v3 bits in it to make it essentially v3. This way
we wouldn't have to blanket trust v3 before having seen it, and wouldn't
be stuck at v2 either.

        Ingo
Re: GIT license (Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1)
> Btw, does anybody have strong opinions on the license? I didn't put in a
> COPYING file exactly because I was torn between GPLv2 and OSL2.1.

I think GPLv2 would create the least amount of objection in the
community, so I'd probably want to go with that.

Nur Hussein
Re: Re: Re: [ANNOUNCE] git-pasky-0.1
On Mon, 11 Apr 2005, Petr Baudis wrote:
>
> Dear diary, on Sun, Apr 10, 2005 at 10:38:11PM CEST, I got a letter
> where Linus Torvalds <[EMAIL PROTECTED]> told me that...
> ..snip..
> > Can you pull my current repo, which has "diff-tree -R" that does what the
> > name suggests, and which should be faster than the 0.48 sec you see..
>
> Am I just missing something, or your diff-tree doesn't handle
> added/removed directories?

You're not missing anything, I did it that way on purpose. I thought it
would be easier to do the expansion in the caller (who knows what it is
they want to do with the end result).

But now that I look at merging, I realize that was actually the wrong
thing to do. A merge algorithm definitely wants to see the expanded tree,
since it will compare/join several of the diff-tree output things.

So I'll either fix it or decide to just go with your version instead. I'm
not overly proud.

                Linus
Re: Re: Re: [ANNOUNCE] git-pasky-0.1
Dear diary, on Sun, Apr 10, 2005 at 10:38:11PM CEST, I got a letter
where Linus Torvalds <[EMAIL PROTECTED]> told me that...
..snip..
> Can you pull my current repo, which has "diff-tree -R" that does what the
> name suggests, and which should be faster than the 0.48 sec you see..

Am I just missing something, or your diff-tree doesn't handle
added/removed directories?

(Mine does! *hint* *hint* It also doesn't bother with dynamic
allocation, but someone might consider the static path buffer ugly.
Anyway, I hacked it with a plan to do a massive cleanup of the file
later.)

--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
98% of the time I am right. Why worry about the other 3%.
Re: GIT license (Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1)
Dear diary, on Mon, Apr 11, 2005 at 02:20:52AM CEST, I got a letter
where Linus Torvalds <[EMAIL PROTECTED]> told me that...
> Btw, does anybody have strong opinions on the license? I didn't put in a
> COPYING file exactly because I was torn between GPLv2 and OSL2.1.
>
> I'm inclined to go with GPLv2 just because it's the most common one, but I
> was wondering if anybody really had strong opinions. For example, I'd
> really make it "v2 by default" like the kernel, since I'm sure v3 will be
> fine, but regardless of how sure I am, I'm _not_ a gambling man.

Oh, I wanted to ask about this too. I'd mostly prefer GPLv2 (I have no
problem with the version restriction, I usually do it too), it's the one
I'm most familiar with and OSL appears to be incompatible with GPL (at
least FSF says so about OSL1.0), which might create various annoying
issues. I hate it when licenses get in my way and prevent me from
including some useful code.

--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
98% of the time I am right. Why worry about the other 3%.
GIT license (Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1)
Btw, does anybody have strong opinions on the license? I didn't put in a
COPYING file exactly because I was torn between GPLv2 and OSL2.1.

I'm inclined to go with GPLv2 just because it's the most common one, but I
was wondering if anybody really had strong opinions. For example, I'd
really make it "v2 by default" like the kernel, since I'm sure v3 will be
fine, but regardless of how sure I am, I'm _not_ a gambling man.

                Linus
Re: [ANNOUNCE] git-pasky-0.1
On Sun, 10 Apr 2005 16:23:11 -0700 Paul Jackson wrote:

| Petr wrote:
| > That reminds me, is there any
| > tool which will take .rej files and throw them into the file to create
| > rcsmerge-like conflicts?
|
| Check out 'wiggle'
| http://www.cse.unsw.edu.au/~neilb/source/wiggle/

or Chris Mason's 'rej' program: ftp://ftp.suse.com/pub/people/mason/rej/

---
~Randy
Re: Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1
Dear diary, on Mon, Apr 11, 2005 at 01:46:50AM CEST, I got a letter
where Linus Torvalds <[EMAIL PROTECTED]> told me that...
>
>
> On Mon, 11 Apr 2005, Petr Baudis wrote:
> >
> > (BTW, it would be useful to have a tool which just blindly takes what
> > you give it on input and throws it to an object of given type; I will
> > need to construct arbitrary commits during the rebuild if I'm to keep
> > the correct dates.)
>
> Hah. That's what "COMMITTER_NAME" "COMMITTER_EMAIL" and "COMMITTER_DATE"
> are there for.
>
> There's two things to commits: when (and by whom) it was committed to a
> tree, and when the changes were really done.
>
> So set the COMMITTER_xxx things to the person/time you want to consider
> the _original_ one, and let "commit-tree" author you as the creator of the
> commit itself. The regular "ChangeLog" thing should only show the author
> and original time, but it's nice to see who created the commit itself.

I already use those - look at my ChangeLog. (That's because for certain
reasons I'm working on git in a half-broken chrooted environment.)

When rebuilding the tree from scratch, I wanted to do it transparently -
that is, so that no one could notice that I rebuilt it, since it
effectively still _is_ the original tree from the data standpoint, just
the history flow is actually correct this time.

> Btw, the "COMMITTER_" environment variables are very confusingly
> named. They actually go into the _author_ line in the commit object. I'm a
> total retard, and I really don't know why I called it "COMMITTER_xxx"
> instead of "AUTHOR_xxx".

So, who will fix it in his tree first! ;-)

--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
98% of the time I am right. Why worry about the other 3%.
Re: Re: Re: [ANNOUNCE] git-pasky-0.1
Dear diary, on Sun, Apr 10, 2005 at 11:39:02PM CEST, I got a letter
where Linus Torvalds <[EMAIL PROTECTED]> told me that...
> On Sun, 10 Apr 2005, Linus Torvalds wrote:
> >
> > Can you pull my current repo, which has "diff-tree -R" that does what the
> > name suggests, and which should be faster than the 0.48 sec you see..
>
> Actually, I changed things around. Everybody hated the "<" ">" lines, so I
> put a changed thing on a line of its own with a "*" instead.
>
> So you'd now see lines like
>
>   *100644->100644 1874e031abf6631ea51cf6177b82a1e662f6183e->e8181df8499f165cacc6a0d8783be7143013d410 CREDITS
>
> which means that the CREDITS file has changed, and it shows you the mode
> -> mode transition (that didn't change in this case) and the sha1 -> sha1
> transition.
>
> So now it's always just one line per change. Furthermore, the filename is
> always field 3, if you use spaces as delimiters, regardless of whether
> it's a +/-/* field.

That's great, just when I finally managed to properly fix the xargs
boundary case in gitdiff-do (without throwing away the NUL-termination).
You know how to please people! ;-)

(Not that I'd have *anything* against the change. The logic is simpler
and you'll be actually able to work with diff-tree a little sanely.)

BTW, it is quite handy to have the entry type in the listing (guessing
that from mode in the script just doesn't feel right and doing explicit
cat-file kills the performance).

I would also really prefer the fields separated by tabs. It looks nicer
on the screen (aligned, e.g. modes and type are varsized), and is also
easier to parse (cut defaults to tabs as delimiters, for example).

> So let's say you want to merge two trees (dst1 and dst2) from a common
> parent (src), what you would do is:
>
> - get the list of files to merge:
>
>   diff-tree -R | tr '\0' '\n' > merge-files

...oh, I probably forgot to ask - why did you choose -R instead of -r?
It looks rather alien to me; if it starts by 'diff', my hand writes -r
without thinking.

> - Which of those were changed by -> ?
>
>   diff-tree -R | tr '\0' '\n' | join -j 3 - merge-files > dst1-change
>   diff-tree -R | tr '\0' '\n' | join -j 3 - merge-files > dst2-change
>
> - Which of those are common to both? Let's see what the merge list is:
>
>   join dst1-change dst2-change > merge-list
>
> and hopefully you'd usually be working on a very small list of files by
> then (everything else you'd just pick from one of the destination trees
> directly - you've got the name, the sha-file, everything: no need to even
> look at the data).

Ok, this looks reasonable. (Provided that I DWYM regarding the joins.)

> Does this sound sane? Pasky? Wanna try a "git merge" thing? Starting off
> with the user having to tell what the common parent tree is - we can try
> to do the "automatically find best common parent" crud later. THAT may be
> expensive.

I will definitely try "git merge", but maybe not this night anymore
(it's already 1:32 here now).

--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
98% of the time I am right. Why worry about the other 3%.
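[Editorial sketch] The join-based recipe quoted above can be mirrored in a few
lines of Python. This is a sketch under stated assumptions: records are
NUL-terminated, the first character is the +/-/* marker, and the filename is
the third space-separated field, as described in the mail. The layout of "+"
and "-" records is not shown in the thread, so only "*" records are used. The
names `parse_record`, `changed_paths` and `plan_merge`, the all-0/all-1 sha1
placeholders, and the Makefile and README entries are hypothetical.

```python
from typing import Dict, List, Tuple

Change = Tuple[str, str, str]  # (marker, mode field, sha1 field)


def parse_record(record: str) -> Tuple[str, Change]:
    """Split one record into (filename, (marker, modes, sha1s))."""
    marker = record[0]
    # maxsplit=2 keeps a filename that itself contains spaces intact.
    mode_field, sha_field, name = record[1:].split(" ", 2)
    return name, (marker, mode_field, sha_field)


def changed_paths(output: str) -> Dict[str, Change]:
    """Map filename -> change for NUL-terminated diff-tree style output."""
    changes: Dict[str, Change] = {}
    for record in output.split("\0"):
        if record:
            name, change = parse_record(record)
            changes[name] = change
    return changes


def plan_merge(dst1: Dict[str, Change],
               dst2: Dict[str, Change]) -> Dict[str, List[str]]:
    """Only files changed on both sides need a real content merge."""
    return {
        "take_dst1": sorted(dst1.keys() - dst2.keys()),
        "take_dst2": sorted(dst2.keys() - dst1.keys()),
        "merge": sorted(dst1.keys() & dst2.keys()),
    }


# The CREDITS record is the one quoted in the mail; the rest is made up.
CREDITS = ("*100644->100644 1874e031abf6631ea51cf6177b82a1e662f6183e"
           "->e8181df8499f165cacc6a0d8783be7143013d410 CREDITS")
FAKE = "*100644->100644 " + "0" * 40 + "->" + "1" * 40 + " "

src_to_dst1 = CREDITS + "\0" + FAKE + "Makefile\0"
src_to_dst2 = FAKE + "Makefile\0" + FAKE + "README\0"

plan = plan_merge(changed_paths(src_to_dst1), changed_paths(src_to_dst2))
```

As in the quoted recipe, everything outside the "merge" list can be picked from
one destination tree by sha1 alone, without looking at file contents.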
Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1
On Mon, 11 Apr 2005, Petr Baudis wrote:
>
> (BTW, it would be useful to have a tool which just blindly takes what
> you give it on input and throws it to an object of given type; I will
> need to construct arbitrary commits during the rebuild if I'm to keep
> the correct dates.)

Hah. That's what "COMMITTER_NAME" "COMMITTER_EMAIL" and "COMMITTER_DATE"
are there for.

There's two things to commits: when (and by whom) it was committed to a
tree, and when the changes were really done.

So set the COMMITTER_xxx things to the person/time you want to consider
the _original_ one, and let "commit-tree" author you as the creator of the
commit itself. The regular "ChangeLog" thing should only show the author
and original time, but it's nice to see who created the commit itself.

I did this very much on purpose: see how I always try to attribute
authorship in BK to the person who actually wrote the code. At the same
time, I think it's interesting from a tracking standpoint to also see
when/where that change got introduced into a tree.

I _tried_ to get this right in the sparse tree conversion. I won't
guarantee that it's all correct, but the top commit in the sparse tree
looks like this:

        tree 67607f05a66e36b2f038c77cfb61350d2110f7e8
        parent 9c59995fef9b52386e5f7242f44720a7aca287d7
        author Christopher Li <[EMAIL PROTECTED]> Sat Apr 2 09:30:09 PST 2005
        committer Linus Torvalds <[EMAIL PROTECTED]> Thu Apr 7 20:06:31 2005
        ...

exactly because I tracked when I committed it to the sparse tree
_separately_ from tracking when it was created.

So when I re-create the sparse-tree, I'll also end up re-writing the
"committer" information. And that's proper. That's really saying "this
sha1 object was created by Xxxx at time Xxxx".

Btw, the "COMMITTER_" environment variables are very confusingly
named. They actually go into the _author_ line in the commit object. I'm a
total retard, and I really don't know why I called it "COMMITTER_xxx"
instead of "AUTHOR_xxx".

                Linus "retard" Torvalds
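[Editorial sketch] The author/committer split described above can be read back
out of a commit header like the sparse example. This is a sketch that assumes
one "key value" header per line and a blank line before the commit message;
`parse_commit_header` is a hypothetical helper, and the message text in SAMPLE
is a placeholder because the mail elides it with "...".

```python
from typing import Dict, List, Union

Header = Dict[str, Union[str, List[str]]]


def parse_commit_header(text: str) -> Header:
    """Collect header fields; "parent" may repeat, so it becomes a list."""
    header: Header = {"parent": []}
    for line in text.splitlines():
        if not line:
            break  # ASSUMPTION: a blank line ends the header block
        key, _, value = line.partition(" ")
        if key == "parent":
            header["parent"].append(value)
        else:
            header[key] = value
    return header


# Header lines copied from the mail; the message body is a placeholder.
SAMPLE = (
    "tree 67607f05a66e36b2f038c77cfb61350d2110f7e8\n"
    "parent 9c59995fef9b52386e5f7242f44720a7aca287d7\n"
    "author Christopher Li <[EMAIL PROTECTED]> Sat Apr 2 09:30:09 PST 2005\n"
    "committer Linus Torvalds <[EMAIL PROTECTED]> Thu Apr 7 20:06:31 2005\n"
    "\n"
    "(commit message elided in the original mail)\n"
)

header = parse_commit_header(SAMPLE)
```

Keeping the two fields apart is what lets a rebuilt tree preserve who wrote a
change and when, while still recording who created the new commit object.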
Re: [ANNOUNCE] git-pasky-0.1
Petr wrote:
> That reminds me, is there any
> tool which will take .rej files and throw them into the file to create
> rcsmerge-like conflicts?

Check out 'wiggle'
  http://www.cse.unsw.edu.au/~neilb/source/wiggle/

--
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <[EMAIL PROTECTED]> 1.650.933.1373, 1.925.600.0401
Re: Re: Re: Re: [ANNOUNCE] git-pasky-0.1
Dear diary, on Mon, Apr 11, 2005 at 01:10:58AM CEST, I got a letter
where Linus Torvalds <[EMAIL PROTECTED]> told me that...
> On Mon, 11 Apr 2005, Petr Baudis wrote:
> >
> > I currently already do a merge when you track someone's source - it will
> > throw away your previous HEAD record though
>
> Not only that, it doesn't do what I consider a "merge".
>
> A real merge should have two or more parents. The "commit-tree" command
> already allows that: just add any arbitrary number of "-p x"
> switches (well, I think I limited it to 16 parents, but that's just a
> totally random number, there's nothing in the file format or anything
> else that limits it).
>
> So while you've merged my "data", you've not actually merged my
> revision history in your tree.

Well, that's exactly what I was (am) going to do. :-) That's also why I
said that I (virtually) throw the local commits away now. Instead, if
there were any local commits, I will do git merge:

	commit-tree $(write-tree) -p $local_head -p $tracked_tree

Note that I will need to make this two-phase - first applying the changes,
then doing the commit; between those two phases, the user should resolve
potential conflicts and check if the merge went right.

I think I will name the first phase git merge and the second phase will
be just git commit, and I will store the merge information in .dircache/.
(BTW, I think the directory name is pretty awful; what about .git/ ?)

> And the reason a real merge _has_ to show both parents properly is that
> unless you do that, you can never merge sanely another time without
> getting lots of clashes from the previous merge. So it's important that a
> merge really shows both trees it got data from.
>
> This is, btw, also the reason I haven't merged with your tree - I want to
> get to the point where I really _can_ merge without throwing away the
> information. In fact, at this point I'd rather not merge with your tree at
> all, because I consider your tree to be "corrupt" thanks to lacking the
> merge history.
>
> So you've done the data merge, but not the history merge.
>
> And because you didn't do the history merge, there's no way to
> automatically find out what point of my tree you merged _with_. See?
>
> And since I have no way to see what point in time you merged with me, now
> I can't generate a nice 3-way diff against the last common ancestor of
> both of our trees.
>
> So now I can't do a three-way merge with you based on any sane ancestor,
> unless I start guessing which ancestor of mine you merged with. Now, that
> "guess" is easy enough to do with a project like "git" which currently has
> just a few tens of commits and effectively only two parallel development
> trees, but the whole point is to get to a system where that isn't true..

Well, I've wanted to get the basic things working first before doing git
merge. (Especially since until recently, diff-tree was a PITA to work with,
and before that it didn't even exist.)

If you want, I can rebuild my tree and do the merging properly, after I
have git merge working.

(BTW, it would be useful to have a tool which just blindly takes what
you give it on input and throws it to an object of given type; I will
need to construct arbitrary commits during the rebuild if I'm to keep
the correct dates.)

--
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
98% of the time I am right. Why worry about the other 3%.
Re: Re: Re: [ANNOUNCE] git-pasky-0.1
On Mon, 11 Apr 2005, Petr Baudis wrote:
>
> I currently already do a merge when you track someone's source - it will
> throw away your previous HEAD record though

Not only that, it doesn't do what I consider a "merge".

A real merge should have two or more parents. The "commit-tree" command
already allows that: just add any arbitrary number of "-p x"
switches (well, I think I limited it to 16 parents, but that's just a
totally random number, there's nothing in the file format or anything
else that limits it).

So while you've merged my "data", you've not actually merged my
revision history in your tree.

And the reason a real merge _has_ to show both parents properly is that
unless you do that, you can never merge sanely another time without
getting lots of clashes from the previous merge. So it's important that a
merge really shows both trees it got data from.

This is, btw, also the reason I haven't merged with your tree - I want to
get to the point where I really _can_ merge without throwing away the
information. In fact, at this point I'd rather not merge with your tree at
all, because I consider your tree to be "corrupt" thanks to lacking the
merge history.

So you've done the data merge, but not the history merge.

And because you didn't do the history merge, there's no way to
automatically find out what point of my tree you merged _with_. See?

And since I have no way to see what point in time you merged with me, now
I can't generate a nice 3-way diff against the last common ancestor of
both of our trees.

So now I can't do a three-way merge with you based on any sane ancestor,
unless I start guessing which ancestor of mine you merged with. Now, that
"guess" is easy enough to do with a project like "git" which currently has
just a few tens of commits and effectively only two parallel development
trees, but the whole point is to get to a system where that isn't true..

		Linus
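Why a merge has to record both parents can be seen on a toy history. The following is only a sketch under stated assumptions: the commit names, the one-line-per-commit format ("commit parent...") and the helpers `ancestors` and `common_ancestors` are all made up for illustration and are not how git stores or walks history.

```shell
#!/bin/sh
# Toy history, one line per commit: "<commit> <parent>...". All names are made up.
# m1 and y1 both descend from "base". "merge1" records BOTH parents,
# "fake1" holds merged data but records only one parent.
HISTORY='base
m1 base
y1 base
merge1 m1 y1
fake1 m1
y2 y1'

# Print a commit and all of its ancestors, sorted, one per line.
ancestors() {
    set -- "$1"                 # work list lives in the positional parameters
    seen=
    while [ $# -gt 0 ]; do
        c=$1
        shift
        case " $seen " in
            *" $c "*) continue ;;   # already visited
        esac
        seen="$seen $c"
        # Look up the recorded parents of $c and queue them.
        parents=$(printf '%s\n' "$HISTORY" | awk -v c="$c" '$1 == c { $1 = ""; print }')
        set -- "$@" $parents
    done
    printf '%s\n' $seen | sort
}

# Print the commits that are ancestors of both arguments.
common_ancestors() {
    theirs=" $(ancestors "$2" | tr '\n' ' ')"
    for c in $(ancestors "$1"); do
        case "$theirs" in
            *" $c "*) echo "$c" ;;
        esac
    done
}

echo "real merge:" $(common_ancestors merge1 y2)
echo "data-only merge:" $(common_ancestors fake1 y2)
```

With both parents recorded, y1 shows up as a common ancestor, so a later merge with y2 only has to consider what changed after y1. With the data-only merge the history offers nothing newer than "base", and the changes from y1 would be merged a second time.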
Re: Re: Re: [ANNOUNCE] git-pasky-0.1
Dear diary, on Sun, Apr 10, 2005 at 10:38:11PM CEST, I got a letter
where Linus Torvalds <[EMAIL PROTECTED]> told me that...
> On Sun, 10 Apr 2005, Petr Baudis wrote:
> >
> > It turns out to be the forks for doing all the cuts and such what is
> > bogging it down so awfully (doing diff-tree takes 0.48s ;-). I do about
> > 15 forks per change, I guess, and for some reason cut takes a lot of
> > time on its own.
>
> Heh.
>
> Can you pull my current repo, which has "diff-tree -R" that does what the
> name suggests, and which should be faster than the 0.48 sec you see..

Funnily enough, now after some more cache teasing it's ~0.185. Yours is
still ~0.17, though. :/ (That might be because of the format changes,
though, since you do less printing now.)

(BTW, all those measurements are done on my AMD K6 walking on 1600MHz,
512M RAM, about 200M available for caches.)

Just out of interest, did you have a look at my diff-tree -r
implementation and decide that you don't like it, or weren't you aware of
it? I will probably take most of your diff-tree change, but I'd prefer to
do the sha1->tree mapping directly in diff_tree().

> It may not matter a lot, since actually generating the diff from the file
> contents is what is expensive, but remember my goal: I want the expense of
> a diff-tree to be relative to the size of the diff, so that implies that
> small diffs have to be basically instantaneous. So I care.

Me too, of course.

> So I just tried the 2.6.7->2.6.8 diff, and for me the new recursive
> "diff-tree" can generate the _list_ of files changed in zero time:
>
>	real	0m0.079s
>	user	0m0.067s
>	sys	0m0.024s
>
> but then _doing_ the diff is pretty expensive (in this case 3800+ files
> changed, so you have to unpack 7600+ objects - and even unpacking isn't
> the expensive part, the expense is literally in the diff operation
> itself).
>
> Me, the stuff I automate is the small steps. Doing a single checkin. So
> that's the case I care about going fast, when a "diff-tree" will likely
> have maybe five files or something. That's why I want the small
> incremental cases to go fast - if it takes me a minute to generate a diff
> for a _release_, that's not a big deal. I make one release every other
> month, but I work with lots of small patches all the time.

I see.

> Anyway, with a fast diff-tree, you should be able to generate the list of
> objects for a fast "merge". That's next.
>
> (And by "merge", I of course mean "suck". I'm talking about the old CVS
> three-way merge, and you have to specify the common parent explicitly and
> it won't handle any renames or any other crud. But it would get us to
> something that might actually be useful for simple things. Which is why
> "diff-tree" is important - it gives the information about what to tell
> merge).

I currently already do a merge when you track someone's source - it will
throw away your previous HEAD record though, so if you committed some
local changes after the previous pull, you will get orphaned commits and
the changes will turn to uncommitted ones. I have some ideas regarding how
to do it properly (and do any arbitrary merging, for that matter), I hope
to get to it as soon as I catch up with you. :-)

BTW, the three-way merge comes from RCS.

That reminds me, is there any tool which will take .rej files and throw
them into the file to create rcsmerge-like conflicts? Perhaps it's the
fault of my bad tools, but I much prefer working with inline rejects to
working with .rej files (except for actually noticing the rejects).

--
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
98% of the time I am right. Why worry about the other 3%.
Re: Re: [ANNOUNCE] git-pasky-0.1
On Sun, 10 Apr 2005, Linus Torvalds wrote:
>
> Can you pull my current repo, which has "diff-tree -R" that does what the
> name suggests, and which should be faster than the 0.48 sec you see..

Actually, I changed things around. Everybody hated the "<" ">" lines, so I
put a changed thing on a line of its own with a "*" instead.

So you'd now see lines like

	*100644->100644 1874e031abf6631ea51cf6177b82a1e662f6183e->e8181df8499f165cacc6a0d8783be7143013d410 CREDITS

which means that the CREDITS file has changed, and it shows you the
mode -> mode transition (that didn't change in this case) and the
sha1 -> sha1 transition.

So now it's always just one line per change. Furthermore, the filename is
always field 3, if you use spaces as delimiters, regardless of whether
it's a +/-/* field.

So let's say you want to merge two trees (dst1 and dst2) from a common
parent (src), what you would do is:

 - get the list of files to merge:

	diff-tree -R <dst1> <dst2> | tr '\0' '\n' > merge-files

 - Which of those were changed by <src> -> <dstX>?

	diff-tree -R <src> <dst1> | tr '\0' '\n' | join -j 3 - merge-files > dst1-change
	diff-tree -R <src> <dst2> | tr '\0' '\n' | join -j 3 - merge-files > dst2-change

 - Which of those are common to both? Let's see what the merge list is:

	join dst1-change dst2-change > merge-list

and hopefully you'd usually be working on a very small list of files by
then (everything else you'd just pick from one of the destination trees
directly - you've got the name, the sha-file, everything: no need to even
look at the data).

Does this sound sane? Pasky? Wanna try a "git merge" thing? Starting off
with the user having to tell what the common parent tree is - we can try
to do the "automatically find best common parent" crud later. THAT may be
expensive.

(Btw, this is why I think "diff-tree" is more important than actually
generating the real diff itself - the above uses diff-tree three times
just to cut down to the point where _hopefully_ you don't actually need
to generate very many diffs at all. So I want "diff-tree" to be really
fast, even if it then can take a minute to actually generate a big diff
between releases etc).

		Linus
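The per-file decision behind the merge list above can be sketched without touching any file contents. This is an illustration only: `pick_version` is a hypothetical helper, and the short ids stand in for the sha1 of one file in src, dst1 and dst2.

```shell
#!/bin/sh
# pick_version is a hypothetical helper: given the sha1 of one file in the
# common parent (src) and in the two destination trees, decide what a
# merge has to do for that file without looking at the data.
pick_version() {
    src=$1 dst1=$2 dst2=$3
    if [ "$dst1" = "$dst2" ]; then
        echo "same $dst1"          # both sides agree (or neither changed)
    elif [ "$src" = "$dst1" ]; then
        echo "take-dst2 $dst2"     # only dst2 changed the file
    elif [ "$src" = "$dst2" ]; then
        echo "take-dst1 $dst1"     # only dst1 changed the file
    else
        echo "merge $dst1 $dst2"   # both changed it: real 3-way merge needed
    fi
}

pick_version aaaa aaaa bbbb    # prints: take-dst2 bbbb
pick_version aaaa cccc bbbb    # prints: merge cccc bbbb
```

Only files that end up in the last branch correspond to the merge-list; everything else can be picked straight from one of the destination trees by its sha1.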
Re: Re: Re: [ANNOUNCE] git-pasky-0.1
Dear diary, on Sun, Apr 10, 2005 at 09:13:19PM CEST, I got a letter
where Willy Tarreau <[EMAIL PROTECTED]> told me that...
> On Sun, Apr 10, 2005 at 08:45:22PM +0200, Petr Baudis wrote:
>
> > It turns out to be the forks for doing all the cuts and such what is
> > bogging it down so awfully (doing diff-tree takes 0.48s ;-). I do about
> > 15 forks per change, I guess, and for some reason cut takes a lot of
> > time on its own.
> >
> > I've rewritten the cuts with the use of bash arrays and other smart
> > stuff. I somehow don't feel comfortable using this and prefer the
> > old-fashioned ways, but it would be plain unusable without this.
>
> I've encountered the same problem in a config-generation script a while
> ago. Fortunately, bash provides enough ways to remove most of the forks,
> but the result is less portable.
>
> I've downloaded your code, but it does not compile here because of the
> tv_nsec fields in struct stat (2.4, glibc 2.2), so I cannot use it to
> get the most up to date version to take a look at the script. Basically,

Ok, I decided to stop this nsec madness (since it broke show-diff anyway
at least on my ext3), and you get it only if you pass -DNSEC to CFLAGS
now. Hope this fixes things for you. :-)

BTW, I regularly update the public copy as accessible on the web.

> all the 'cut' and 'sed' can be removed, as well as the 'dirname'. You
> can also call mkdir only if the dirs don't exist. I really think you
> should end up with only one fork in the loop to call 'diff'.

You still need to extract the file by cat-file too. ;-) And rm the files
after it compares them (so that we don't fill /tmp with crap like certain
awful programs like to do).

But I will conditionalize the mkdir calls, thanks for the suggestion - I
think that's the last bit to be squeezed from this loop (I'll still check
on the read proposal - I considered it before and turned it down for some
reason, can't remember why anymore, though).

Thanks,

--
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
98% of the time I am right. Why worry about the other 3%.
Re: [ANNOUNCE] git-pasky-0.1
Good lord - you don't need to use arrays for this.

The old-fashioned ways have their ways. Both the 'set' command and the
'read' command can split args and assign to distinct variable names.

Try something like the following:

	diff-tree -r $id1 $id2 |
	    sed -e '/^</ { N; s/\n>/ /; }' -e 's/./& /' |
	    while read op mode1 sha1 name1 mode2 sha2 name2
	    do
		... various common stuff ...
		case "$op" in
		"+")
		    ...
		    ;;
		"-")
		    ...
		    ;;
		"<")
		    test "$name1" = "$name2" || die "mismatched names"
		    label1=$(mkbanner "$loc1" $id1 "$name1" $mode1 $sha1)
		    label2=$(mkbanner "$loc2" $id2 "$name1" $mode2 $sha2)
		    diff -L "$label1" -L "$label2" -u "$loc1" "$loc2"
		    ;;
		esac
	    done

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[EMAIL PROTECTED]> 1.650.933.1373, 1.925.600.0401
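The two splitting mechanisms named in this mail can be tried in isolation. A minimal sketch, assuming a made-up input line in an "op mode sha1 name" layout; the variable names are chosen for illustration only.

```shell
#!/bin/sh
# A made-up input line in an "op mode sha1 name" layout (an assumption,
# not the exact diff-tree output format).
line='+ 100644 1874e031abf6631ea51cf6177b82a1e662f6183e CREDITS'

# Variant 1: "read" splits on whitespace and assigns to named variables.
# The here-document feeds the line in, so no external command such as
# cut is started.
read -r op mode sha name <<EOF
$line
EOF
echo "read: op=$op mode=$mode name=$name"

# Variant 2: "set --" splits the same way into $1, $2, ...
set -- $line
set_op=$1
set_name=$4
echo "set: op=$set_op name=$set_name"
```

Both variants rely on the fields containing no whitespace; a file name with spaces would need a different delimiter or the remainder collected in the last variable of "read".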
Re: Re: [ANNOUNCE] git-pasky-0.1
On Sun, 10 Apr 2005, Petr Baudis wrote:
>
> It turns out to be the forks for doing all the cuts and such what is
> bogging it down so awfully (doing diff-tree takes 0.48s ;-). I do about
> 15 forks per change, I guess, and for some reason cut takes a lot of
> time on its own.

Heh.

Can you pull my current repo, which has "diff-tree -R" that does what the
name suggests, and which should be faster than the 0.48 sec you see..

It may not matter a lot, since actually generating the diff from the file
contents is what is expensive, but remember my goal: I want the expense of
a diff-tree to be relative to the size of the diff, so that implies that
small diffs have to be basically instantaneous. So I care.

So I just tried the 2.6.7->2.6.8 diff, and for me the new recursive
"diff-tree" can generate the _list_ of files changed in zero time:

	real	0m0.079s
	user	0m0.067s
	sys	0m0.024s

but then _doing_ the diff is pretty expensive (in this case 3800+ files
changed, so you have to unpack 7600+ objects - and even unpacking isn't
the expensive part, the expense is literally in the diff operation
itself).

Me, the stuff I automate is the small steps. Doing a single checkin. So
that's the case I care about going fast, when a "diff-tree" will likely
have maybe five files or something. That's why I want the small
incremental cases to go fast - if it takes me a minute to generate a diff
for a _release_, that's not a big deal. I make one release every other
month, but I work with lots of small patches all the time.

Anyway, with a fast diff-tree, you should be able to generate the list of
objects for a fast "merge". That's next.

(And by "merge", I of course mean "suck". I'm talking about the old CVS
three-way merge, and you have to specify the common parent explicitly and
it won't handle any renames or any other crud. But it would get us to
something that might actually be useful for simple things. Which is why
"diff-tree" is important - it gives the information about what to tell
merge).

		Linus
Re: [ANNOUNCE] git-pasky-0.1
On Sun, April 10, 2005 12:55 pm, Linus Torvalds said:

> Larry was ok with the idea to make my export format actually be natively
> supported by BK (ie the same way you have "bk export -tpatch"), but
> Tridge wanted to instead get at the native data and be difficult about
> it. As a result, I can now not only not use BK any more, but we also
> don't have a nice export format from BK.
>
> Yeah, I'm a bit bitter about it.

Linus,

With all due respect, Larry could have dealt with this years ago and
removed the motivation for Tridge and others to pursue reverse
engineering. Instead he chose to insult and question the motives of
everyone that wanted open-source access to the Linux history data.

The blame for the current situation falls firmly on the choice to use a
closed-source SCM for Linux and the actions of the company that owned it.

Sean
Re: Re: [ANNOUNCE] git-pasky-0.1
On Sun, Apr 10, 2005 at 08:45:22PM +0200, Petr Baudis wrote:

> It turns out to be the forks for doing all the cuts and such what is
> bogging it down so awfully (doing diff-tree takes 0.48s ;-). I do about
> 15 forks per change, I guess, and for some reason cut takes a lot of
> time on its own.
>
> I've rewritten the cuts with the use of bash arrays and other smart
> stuff. I somehow don't feel comfortable using this and prefer the
> old-fashioned ways, but it would be plain unusable without this.

I've encountered the same problem in a config-generation script a while
ago. Fortunately, bash provides enough ways to remove most of the forks,
but the result is less portable.

I've downloaded your code, but it does not compile here because of the
tv_nsec fields in struct stat (2.4, glibc 2.2), so I cannot use it to
get the most up to date version to take a look at the script. Basically,
all the 'cut' and 'sed' can be removed, as well as the 'dirname'. You
can also call mkdir only if the dirs don't exist. I really think you
should end up with only one fork in the loop to call 'diff'.

> Now I'm down to
>
>	real	1m21.440s
>	user	0m32.374s
>	sys	0m42.200s
>
> and I kinda doubt if it is possible to cut this much down. Almost no
> disk activity, I have almost everything cached by now, apparently.

It is very common to cut times by a factor of 10 or more when replacing
common unix tools by pure shell. Dynamic library initialization also
takes a lot of time nowadays, and probably you have localisation which
is big too. Sometimes, just wiping a few variables at the top of the
shell might remove some useless overhead.

> Anyway, you can git pull to get the optimized version.
>
> Thanks for the help,

Willy
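The fork-saving replacements suggested in this mail can be sketched with plain POSIX parameter expansion. This is an illustration under assumptions, not the actual gitdiff.sh code: the path and the scratch directory name are made up.

```shell
#!/bin/sh
path='drivers/net/tulip/de2104x.c'   # example path, chosen for illustration

# dirname replacement: strip the shortest "/..." suffix.
# (Unlike dirname, a path without any slash is returned unchanged, not as ".".)
dir=${path%/*}
# basename replacement: strip the longest ".../" prefix.
file=${path##*/}
# Extension without cut or sed: strip everything up to the last dot.
ext=${file##*.}

echo "$dir $file $ext"

# Create the directory only when it is missing, so mkdir is not forked
# for every file that lives in an already existing directory.
tmp=${TMPDIR:-/tmp}/gitdiff-demo.$$
[ -d "$tmp/$dir" ] || mkdir -p "$tmp/$dir"
[ -d "$tmp/$dir" ] && made=yes
rm -rf "$tmp"
```

None of the three expansions starts a process, so a loop over thousands of changed files pays only for the commands that really have to run, such as diff.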
Re: Re: [ANNOUNCE] git-pasky-0.1
Dear diary, on Sun, Apr 10, 2005 at 07:45:12PM CEST, I got a letter
where Ingo Molnar <[EMAIL PROTECTED]> told me that...
>
> * Willy Tarreau <[EMAIL PROTECTED]> wrote:
>
> > > > I will also need to do more testing on the linux kernel tree.
> > > > Committing patch-2.6.7 on 2.6.6 kernel and then diffing results in
> > > >
> > > >	$ time gitdiff.sh `parent-id` `tree-id` >p
> > > >	real	5m37.434s
> > > >	user	1m27.113s
> > > >	sys	2m41.036s
> > > >
> > > > which is pretty horrible, it seems to me. Any benchmarking help is of
> > > > course welcomed, as well as any other feedback.
> > >
> > > it seems from the numbers that your system doesn't have enough RAM for
> > > this and is getting IO-bound?
> >
> > Not the only problem, without I/O, he will go down to 4m8s (u+s) which
> > is still in the same order of magnitude.
>
> probably not the only problem - but if we are lucky then his system was
> just thrashing within the kernel repository and then most of the overhead
> is the _unnecessary_ IO that happened due to that (which causes CPU
> overhead just as much). The dominant system time suggests so, to a
> certain degree. Maybe this is wishful thinking.

It turns out to be the forks for doing all the cuts and such what is
bogging it down so awfully (doing diff-tree takes 0.48s ;-). I do about
15 forks per change, I guess, and for some reason cut takes a lot of
time on its own.

I've rewritten the cuts with the use of bash arrays and other smart
stuff. I somehow don't feel comfortable using this and prefer the
old-fashioned ways, but it would be plain unusable without this.

Now I'm down to

	real	1m21.440s
	user	0m32.374s
	sys	0m42.200s

and I kinda doubt if it is possible to cut this much down. Almost no
disk activity, I have almost everything cached by now, apparently.

Anyway, you can git pull to get the optimized version.

Thanks for the help,

--
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
98% of the time I am right. Why worry about the other 3%.
Re: [ANNOUNCE] git-pasky-0.1
* Willy Tarreau <[EMAIL PROTECTED]> wrote:

> > > I will also need to do more testing on the linux kernel tree.
> > > Committing patch-2.6.7 on 2.6.6 kernel and then diffing results in
> > >
> > >	$ time gitdiff.sh `parent-id` `tree-id` >p
> > >	real	5m37.434s
> > >	user	1m27.113s
> > >	sys	2m41.036s
> > >
> > > which is pretty horrible, it seems to me. Any benchmarking help is of
> > > course welcomed, as well as any other feedback.
> >
> > it seems from the numbers that your system doesn't have enough RAM for
> > this and is getting IO-bound?
>
> Not the only problem, without I/O, he will go down to 4m8s (u+s) which
> is still in the same order of magnitude.

probably not the only problem - but if we are lucky then his system was
just thrashing within the kernel repository and then most of the overhead
is the _unnecessary_ IO that happened due to that (which causes CPU
overhead just as much). The dominant system time suggests so, to a
certain degree. Maybe this is wishful thinking.

	Ingo
Re: [ANNOUNCE] git-pasky-0.1
On Sun, Apr 10, 2005 at 07:33:49PM +0200, Ingo Molnar wrote:
>
> * Petr Baudis <[EMAIL PROTECTED]> wrote:
>
> > I will also need to do more testing on the linux kernel tree.
> > Committing patch-2.6.7 on 2.6.6 kernel and then diffing results in
> >
> >	$ time gitdiff.sh `parent-id` `tree-id` >p
> >	real	5m37.434s
> >	user	1m27.113s
> >	sys	2m41.036s
> >
> > which is pretty horrible, it seems to me. Any benchmarking help is of
> > course welcomed, as well as any other feedback.
>
> it seems from the numbers that your system doesn't have enough RAM for
> this and is getting IO-bound?

Not the only problem, without I/O, he will go down to 4m8s (u+s) which
is still in the same order of magnitude.

willy
Re: [ANNOUNCE] git-pasky-0.1
* Petr Baudis <[EMAIL PROTECTED]> wrote:

> I will also need to do more testing on the linux kernel tree.
> Committing patch-2.6.7 on 2.6.6 kernel and then diffing results in
>
>	$ time gitdiff.sh `parent-id` `tree-id` >p
>	real	5m37.434s
>	user	1m27.113s
>	sys	2m41.036s
>
> which is pretty horrible, it seems to me. Any benchmarking help is of
> course welcomed, as well as any other feedback.

it seems from the numbers that your system doesn't have enough RAM for
this and is getting IO-bound?

	Ingo
Re: [ANNOUNCE] git-pasky-0.1
On Sun, 10 Apr 2005, Petr Baudis wrote:
>
> Where did you get the sparse git database from, Linus? (BTW, it
> would be nice to get sparse.git with the directories as separate.)

Back when we were trying to figure out how to avert the BK disaster, one
of Tridge's concerns (and, in my opinion, the only really valid one) was
that you couldn't get the BK data in some SCM-independent way.

So I wrote some very preliminary scripts (on top of BK itself) to extract
the data, to show that BK could generate an SCM-neutral file format (a
very stupid one and horribly useless for anything but interoperability,
but still...). I was hoping that that would convince Tridge that trying
to muck around with the internal BK file format was not worth it, and
avert the BK trainwreck.

Larry was ok with the idea to make my export format actually be natively
supported by BK (ie the same way you have "bk export -tpatch"), but
Tridge wanted to instead get at the native data and be difficult about
it. As a result, I can now not only not use BK any more, but we also
don't have a nice export format from BK.

Yeah, I'm a bit bitter about it.

Anyway, the sparse data came out of my hack. It's very inefficient, and I
estimated that doing the same for the kernel would have taken ten solid
days of conversion, mainly because my hack was really just that: a quick
hack to show that BK could do it. Larry could have done it a lot better.

I'll re-generate the sparse git-database at some point (and I'll probably
do so from the old GIT database itself, rather than re-generating it from
my old BK data).

		Linus
[ANNOUNCE] git-pasky-0.1
Hello,

so I "released" git-pasky-0.1, my set of patches and scripts upon
Linus' git, aimed at human usability and to an extent an SCM-like usage.

You can get it at

	http://pasky.or.cz/~pasky/dev/git/git-pasky-base.tar.bz2

and after unpacking and building (make) do

	git pull pasky

to get the latest changes from my branch. If you already have some git
from my branch which can do pulling, you can bring yourself up to date
by doing just

	gitpull.sh pasky

(but this style of usage is deprecated now).

Please see the README for some details regarding usage etc. You can find
the changes from the last announcement in the ChangeLog (the previous
announcement corresponds to commit id
5125d089ad862f16a306b4942155092e1dce1c2d). The most important changes are
probably the addition of recursive diff, and making git ignore the nsec
of ctime and mtime, since it is totally unreliable and likes to taint
random files as modified.

My near future plans include especially some merge support; I think it
should be rather easy, actually. I'll also add some simple tagging
mechanism. I've decided to postpone the file moving detection, since
there's no big demand for it now. ;-)

I will also need to do more testing on the linux kernel tree.
Committing patch-2.6.7 on 2.6.6 kernel and then diffing results in

	$ time gitdiff.sh `parent-id` `tree-id` >p
	real	5m37.434s
	user	1m27.113s
	sys	2m41.036s

which is pretty horrible, it seems to me. Any benchmarking help is of
course welcomed, as well as any other feedback.

BTW, what would be the best (most complete) source for the BK tree
metadata? Should I dig it from the BKCVS gateway, or is there a better
source? Where did you get the sparse git database from, Linus? (BTW, it
would be nice to get sparse.git with the directories as separate.)

Have fun,

--
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
98% of the time I am right. Why worry about the other 3%.