Re: [rfc] git: combo-blobs

2005-04-12 Thread Barry K. Nathan
On Mon, Apr 11, 2005 at 06:33:58PM +0200, Ingo Molnar wrote:
> ok. Meanwhile i found another counter-argument: the average committed
> file size is 36K, which with gzip -9 would compress down to roughly 8K,
> with the commit message being another block. That's 2+1 blocks used per
> commit, while
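
The "2+1 blocks" figure in the quoted estimate can be reproduced with a little arithmetic. This is only a sketch: the 4 KiB filesystem block size is an assumption (the message does not state one), and `blocks_used` is a hypothetical helper name.

```python
import math

BLOCK_SIZE = 4096  # assumption: 4 KiB filesystem blocks (not stated in the thread)


def blocks_used(size_bytes: int, block_size: int = BLOCK_SIZE) -> int:
    """Whole blocks needed to store an object of size_bytes (at least one)."""
    return max(1, math.ceil(size_bytes / block_size))


compressed_file = 8 * 1024  # "roughly 8K" after gzip -9, per the message
commit_message = 1          # any small object still occupies one block

file_blocks = blocks_used(compressed_file)
message_blocks = blocks_used(commit_message)
print(f"{file_blocks}+{message_blocks} blocks per commit")
```

With a different block size the split changes, so the result depends entirely on that assumption.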

Re: [rfc] git: combo-blobs

2005-04-11 Thread Linus Torvalds
On Mon, 11 Apr 2005, Linus Torvalds wrote:
> > bk changes -R
> >
> > bk changes -L
>
> You'd download all the sha1 objects (they don't actually do anything to
> _your_ state - they only show the possible other states), and then it's a
> "simple thing" to generate a full tree of your local HEAD

Re: Re: [rfc] git: combo-blobs

2005-04-11 Thread Petr Baudis
Dear diary, on Mon, Apr 11, 2005 at 08:13:19PM CEST, I got a letter where Chris Wedgwood <[EMAIL PROTECTED]> told me that...
> On Mon, Apr 11, 2005 at 09:01:51AM -0700, Linus Torvalds wrote:
>
> > I disagree. Yes, the thing is designed to be replicated, so most of
> > the time the easiest thing to do is

Re: [rfc] git: combo-blobs

2005-04-11 Thread Linus Torvalds
On Mon, 11 Apr 2005, Chris Wedgwood wrote:
>
> On Mon, Apr 11, 2005 at 09:01:51AM -0700, Linus Torvalds wrote:
>
> > I disagree. Yes, the thing is designed to be replicated, so most of
> > the time the easiest thing to do is to just rsync with another copy.
>
> It's not clear how any of this is going

Re: [rfc] git: combo-blobs

2005-04-11 Thread Chris Wedgwood
On Mon, Apr 11, 2005 at 09:01:51AM -0700, Linus Torvalds wrote:
> I disagree. Yes, the thing is designed to be replicated, so most of
> the time the easiest thing to do is to just rsync with another copy.

It's not clear how any of this is going to give me something like bk changes -R or

Re: [rfc] git: combo-blobs

2005-04-11 Thread Paul Jackson
Ingo wrote:
> actually, git would just include by reference the previous blob.

Ok - kind of like a patch blob. I can see now where under some conditions this saves space.

I agree with the conclusion this thread has already reached. Keep it simple.

--
I won't rest till it's the

Re: [rfc] git: combo-blobs

2005-04-11 Thread Ingo Molnar
* Linus Torvalds <[EMAIL PROTECTED]> wrote:
> > Also, with a 'replicate the full object on every 8th commit'
> > rule the risk would be somewhat mitigated as well.
>
> ..but not the complexity.
>
> The fact is, I want to trust this thing. Dammit, one reason I like GIT
> is that I can mentally

Re: [rfc] git: combo-blobs

2005-04-11 Thread Linus Torvalds
On Mon, 11 Apr 2005, Ingo Molnar wrote:
>
> if a repository is corrupted then it pretty much needs to be dropped
> anyway.

I disagree. Yes, the thing is designed to be replicated, so most of the time the easiest thing to do is to just rsync with another copy. But dammit, I don't want to just

Re: [rfc] git: combo-blobs

2005-04-11 Thread Ingo Molnar
* Ingo Molnar <[EMAIL PROTECTED]> wrote:
>
> * Linus Torvalds <[EMAIL PROTECTED]> wrote:
>
> > > to construct the combo blob later on, we do have to unpack sched.c (and
> > > if it's already a combo-blob that is not cached then we'd have to unpack
> > > all parents until we arrive at some full blob).

Re: [rfc] git: combo-blobs

2005-04-11 Thread Ingo Molnar
* Linus Torvalds <[EMAIL PROTECTED]> wrote:
> > to construct the combo blob later on, we do have to unpack sched.c (and
> > if it's already a combo-blob that is not cached then we'd have to unpack
> > all parents until we arrive at some full blob).
>
> I really don't want to have this. Having chains of

Re: [rfc] git: combo-blobs

2005-04-11 Thread Linus Torvalds
On Mon, 11 Apr 2005, Ingo Molnar wrote:
>
> to construct the combo blob later on, we do have to unpack sched.c (and
> if it's already a combo-blob that is not cached then we'd have to unpack
> all parents until we arrive at some full blob).

I really don't want to have this. Having chains of

Re: [rfc] git: combo-blobs

2005-04-11 Thread Ingo Molnar
* Ingo Molnar <[EMAIL PROTECTED]> wrote:
> here are some stats: of the last 34160 files modified in the Linux
> kernel tree in the past 1 year, the file sizes total to 1 GB, and the
> average file-size per file committed is 31220 bytes. The changes
> themselves amount to:
>
> 22404 files changed,

Re: [rfc] git: combo-blobs

2005-04-11 Thread Ingo Molnar
here are some stats: of the last 34160 files modified in the Linux kernel tree in the past 1 year, the file sizes total to 1 GB, and the average file-size per file committed is 31220 bytes. The changes themselves amount to:

 22404 files changed, 1996494 insertions(+), 1396644 deletions(-)
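
As a quick sanity check, the figures in this message can be multiplied out. The sketch below only redoes the arithmetic on the numbers quoted above; the variable names are illustrative and nothing here was measured independently.

```python
# Figures copied from the message; nothing here is re-measured.
files_committed = 34160
avg_bytes_per_file = 31220
insertions = 1996494
deletions = 1396644

# Total size of all committed file versions, stored as full copies.
total_bytes = files_committed * avg_bytes_per_file
total_gib = total_bytes / 2**30

# Net growth of the tree in lines over the same period.
net_lines = insertions - deletions

print(f"total: {total_bytes} bytes (about {total_gib:.2f} GiB)")
print(f"net lines added: {net_lines}")
```

The product comes out just under 1 GiB, which agrees with the "1 GB" total in the message.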

Re: [rfc] git: combo-blobs

2005-04-11 Thread Ingo Molnar
* Paul Jackson <[EMAIL PROTECTED]> wrote:
> Hmmm ... I have this strong sense that I am about 2 hours away from
> smacking my forehead and groaning "Duh - so that's what Ingo meant!"
>
> However, one must play out one's destiny.
>
> Could you provide an example scenario, which results in the creation

Re: [rfc] git: combo-blobs

2005-04-11 Thread Paul Jackson
Hmmm ... I have this strong sense that I am about 2 hours away from smacking my forehead and groaning "Duh - so that's what Ingo meant!"

However, one must play out one's destiny.

Could you provide an example scenario, which results in the creation of a combo-blob? The best I can come up with is

[rfc] git: combo-blobs

2005-04-11 Thread Ingo Molnar
i think all of the 'repository size' and 'bandwidth' concerns could be solved via a new (and pretty much simple and transparent) object type: the 'combo-blob'. Summary: This is a space/bandwidth-efficient blob that 'includes' arbitrary portions of (one, two, or more) simple blobs by
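
To make the idea concrete, here is a minimal sketch of how such an object might be represented and unpacked. Everything in it is an assumption for illustration: the proposal's actual on-disk format is not shown in this excerpt, and the names `Literal`, `Include`, `unpack`, the `(sha1, offset, length)` segment layout, the dict-based object store, and the chain limit of 8 (borrowed from the "every 8th commit" remark elsewhere in the thread) are all hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class Literal:
    """Bytes stored directly inside the combo-blob."""
    data: bytes


@dataclass(frozen=True)
class Include:
    """A byte range taken by reference from another blob (hypothetical layout)."""
    sha1: str
    offset: int
    length: int


Segment = Union[Literal, Include]
# Toy object store: a sha1 maps either to a full blob or to a combo-blob.
Store = Dict[str, Union[bytes, List[Segment]]]


def sha1_hex(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()


def unpack(store: Store, sha1: str, max_chain: int = 8) -> bytes:
    """Expand an object to full content, following includes to full blobs."""
    obj = store[sha1]
    if isinstance(obj, bytes):
        return obj  # already a full blob
    if max_chain == 0:
        raise ValueError("combo-blob chain too long")
    out = bytearray()
    for seg in obj:
        if isinstance(seg, Literal):
            out += seg.data
        else:
            base = unpack(store, seg.sha1, max_chain - 1)
            out += base[seg.offset:seg.offset + seg.length]
    return bytes(out)


store: Store = {}
old = b"int a;\nint b;\nint c;\n"
old_id = sha1_hex(old)
store[old_id] = old

# The new version changes only the middle line, so it reuses two ranges of old.
new = b"int a;\nint x;\nint c;\n"
new_id = sha1_hex(new)
store[new_id] = [Include(old_id, 0, 7), Literal(b"int x;\n"), Include(old_id, 14, 7)]

restored = unpack(store, new_id)
print(restored == new)
```

The recursion in `unpack` is where the concern raised in the replies comes from: reading one object may require reading every object it includes, down to some full blob.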
