Re: [PATCH 2/6] commit: add generation number to struct commmit
On Wed, Apr 04, 2018 at 12:17:06AM +0100, Ramsay Jones wrote: > >> Is there any reason to believe this would be too small of a value in the > >> future? Or is a 32 bit unsigned good enough? > > > > The linux kernel took ~10 years to produce 500k commits. Even assuming > > those were all linear (and they're not), that gives us ~80,000 years of > > leeway. So even if the pace of development speeds up or we have a > > quicker project, it still seems we have a pretty reasonable safety > > margin. > > I didn't read the patches closely, but isn't it ~20,000 years? > > Given that '#define GENERATION_NUMBER_MAX 0x3FFF', that is. ;-) What, I'm supposed to read the patches before responding? Heresy. -Peff
Re: [PATCH 2/6] commit: add generation number to struct commmit
On 03/04/18 19:28, Jeff King wrote: > On Tue, Apr 03, 2018 at 11:05:36AM -0700, Brandon Williams wrote: > >> On 04/03, Derrick Stolee wrote: >>> The generation number of a commit is defined recursively as follows: >>> >>> * If a commit A has no parents, then the generation number of A is one. >>> * If a commit A has parents, then the generation number of A is one >>> more than the maximum generation number among the parents of A. >>> >>> Add a uint32_t generation field to struct commit so we can pass this >> >> Is there any reason to believe this would be too small of a value in the >> future? Or is a 32 bit unsigned good enough? > > The linux kernel took ~10 years to produce 500k commits. Even assuming > those were all linear (and they're not), that gives us ~80,000 years of > leeway. So even if the pace of development speeds up or we have a > quicker project, it still seems we have a pretty reasonable safety > margin. I didn't read the patches closely, but isn't it ~20,000 years? Given that '#define GENERATION_NUMBER_MAX 0x3FFF', that is. ;-) ATB, Ramsay Jones
Re: [PATCH 2/6] commit: add generation number to struct commmit
On Tue, Apr 3, 2018 at 11:28 AM, Jeff King wrote: > On Tue, Apr 03, 2018 at 11:05:36AM -0700, Brandon Williams wrote: > >> On 04/03, Derrick Stolee wrote: >> > The generation number of a commit is defined recursively as follows: >> > >> > * If a commit A has no parents, then the generation number of A is one. >> > * If a commit A has parents, then the generation number of A is one >> > more than the maximum generation number among the parents of A. >> > >> > Add a uint32_t generation field to struct commit so we can pass this >> >> Is there any reason to believe this would be too small of a value in the >> future? Or is a 32 bit unsigned good enough? > > The linux kernel took ~10 years to produce 500k commits. Even assuming > those were all linear (and they're not), ... which you meant in terms of DAG, where a linear history is the worst case for generation numbers. I first read it the other way round, as the best case w.r.t. timing ~/linux$ git log --oneline |wc -l 721223 $ git log --oneline --since 2012 |wc -l 421853 $ git log --oneline --since 2011 |wc -l 477155 The number of commits is growing exponentially, though the exponential part is very small and the YoY growth can be estimated using linear interpolation. In linux, the release is a natural synchronization point IIUC as well as on a regular schedule. So an interesting question to ask there would be whether the delta in generation number goes up over time, or if the DAG just gets wider (=more parallel) > that gives us ~80,000 years of > leeway. So even if the pace of development speeds up or we have a > quicker project, it still seems we have a pretty reasonable safety > margin. Thanks for the estimate. Stefan
Re: [PATCH 2/6] commit: add generation number to struct commmit
On 04/03, Jeff King wrote: > On Tue, Apr 03, 2018 at 11:05:36AM -0700, Brandon Williams wrote: > > > On 04/03, Derrick Stolee wrote: > > > The generation number of a commit is defined recursively as follows: > > > > > > * If a commit A has no parents, then the generation number of A is one. > > > * If a commit A has parents, then the generation number of A is one > > > more than the maximum generation number among the parents of A. > > > > > > Add a uint32_t generation field to struct commit so we can pass this > > > > Is there any reason to believe this would be too small of a value in the > > future? Or is a 32 bit unsigned good enough? > > The linux kernel took ~10 years to produce 500k commits. Even assuming > those were all linear (and they're not), that gives us ~80,000 years of > leeway. So even if the pace of development speeds up or we have a > quicker project, it still seems we have a pretty reasonable safety > margin. > > -Peff I figured as much, but just wanted to check since the windows folks seems to produce commits pretty quickly. -- Brandon Williams
Re: [PATCH 2/6] commit: add generation number to struct commmit
On 4/3/2018 2:28 PM, Jeff King wrote: On Tue, Apr 03, 2018 at 11:05:36AM -0700, Brandon Williams wrote: On 04/03, Derrick Stolee wrote: The generation number of a commit is defined recursively as follows: * If a commit A has no parents, then the generation number of A is one. * If a commit A has parents, then the generation number of A is one more than the maximum generation number among the parents of A. Add a uint32_t generation field to struct commit so we can pass this Is there any reason to believe this would be too small of a value in the future? Or is a 32 bit unsigned good enough? The linux kernel took ~10 years to produce 500k commits. Even assuming those were all linear (and they're not), that gives us ~80,000 years of leeway. So even if the pace of development speeds up or we have a quicker project, it still seems we have a pretty reasonable safety margin. That, and larger projects do not have linear histories. Despite having almost 2 million reachable commits, the Windows repository has maximum generation number ~100,000. Thanks, -Stolee
Re: [PATCH 2/6] commit: add generation number to struct commmit
On Tue, Apr 03, 2018 at 11:05:36AM -0700, Brandon Williams wrote: > On 04/03, Derrick Stolee wrote: > > The generation number of a commit is defined recursively as follows: > > > > * If a commit A has no parents, then the generation number of A is one. > > * If a commit A has parents, then the generation number of A is one > > more than the maximum generation number among the parents of A. > > > > Add a uint32_t generation field to struct commit so we can pass this > > Is there any reason to believe this would be too small of a value in the > future? Or is a 32 bit unsigned good enough? The linux kernel took ~10 years to produce 500k commits. Even assuming those were all linear (and they're not), that gives us ~80,000 years of leeway. So even if the pace of development speeds up or we have a quicker project, it still seems we have a pretty reasonable safety margin. -Peff
Re: [PATCH 2/6] commit: add generation number to struct commmit
On Tue, 3 Apr 2018 12:51:39 -0400 Derrick Stolee wrote: > The generation number of a commit is defined recursively as follows: > > * If a commit A has no parents, then the generation number of A is one. > * If a commit A has parents, then the generation number of A is one > more than the maximum generation number among the parents of A. > > Add a uint32_t generation field to struct commit so we can pass this > information to revision walks. We use two special values to signal > the generation number is invalid: > > GENERATION_NUMBER_UNDEF 0x > GENERATION_NUMBER_NONE 0 > > The first (_UNDEF) means the generation number has not been loaded or > computed. The second (_NONE) means the generation number was loaded > from a commit graph file that was stored before generation numbers > were computed. > > Signed-off-by: Derrick Stolee This looks straightforward and correct, thanks. I think some of the description above should appear as code comments. > +#define GENERATION_NUMBER_UNDEF 0x > +#define GENERATION_NUMBER_NONE 0 I would include the description above here as documentation, and would replace "was stored before generation numbers were computed" by "was written by a version of Git that did not support generation numbers".
Re: [PATCH 2/6] commit: add generation number to struct commmit
On 04/03, Derrick Stolee wrote: > The generation number of a commit is defined recursively as follows: > > * If a commit A has no parents, then the generation number of A is one. > * If a commit A has parents, then the generation number of A is one > more than the maximum generation number among the parents of A. > > Add a uint32_t generation field to struct commit so we can pass this Is there any reason to believe this would be too small of a value in the future? Or is a 32 bit unsigned good enough? > information to revision walks. We use two special values to signal > the generation number is invalid: > > GENERATION_NUMBER_UNDEF 0x > GENERATION_NUMBER_NONE 0 > > The first (_UNDEF) means the generation number has not been loaded or > computed. The second (_NONE) means the generation number was loaded > from a commit graph file that was stored before generation numbers > were computed. > > Signed-off-by: Derrick Stolee > --- > alloc.c| 1 + > commit-graph.c | 2 ++ > commit.h | 3 +++ > 3 files changed, 6 insertions(+) > > diff --git a/alloc.c b/alloc.c > index cf4f8b61e1..1a62e85ac3 100644 > --- a/alloc.c > +++ b/alloc.c > @@ -94,6 +94,7 @@ void *alloc_commit_node(void) > c->object.type = OBJ_COMMIT; > c->index = alloc_commit_index(); > c->graph_pos = COMMIT_NOT_FROM_GRAPH; > + c->generation = GENERATION_NUMBER_UNDEF; > return c; > } > > diff --git a/commit-graph.c b/commit-graph.c > index 1fc63d541b..d24b947525 100644 > --- a/commit-graph.c > +++ b/commit-graph.c > @@ -264,6 +264,8 @@ static int fill_commit_in_graph(struct commit *item, > struct commit_graph *g, uin > date_low = get_be32(commit_data + g->hash_len + 12); > item->date = (timestamp_t)((date_high << 32) | date_low); > > + item->generation = get_be32(commit_data + g->hash_len + 8) >> 2; > + > pptr = &item->parents; > > edge_value = get_be32(commit_data + g->hash_len); > diff --git a/commit.h b/commit.h > index e57ae4b583..3cadd386f3 100644 > --- a/commit.h > +++ b/commit.h > @@ -10,6 +10,8 @@ > #include "pretty.h" > > #define COMMIT_NOT_FROM_GRAPH 0x > +#define GENERATION_NUMBER_UNDEF 0x > +#define GENERATION_NUMBER_NONE 0 > > struct commit_list { > struct commit *item; > @@ -24,6 +26,7 @@ struct commit { > struct commit_list *parents; > struct tree *tree; > uint32_t graph_pos; > + uint32_t generation; > }; > > extern int save_commit_buffer; > -- > 2.17.0.rc0 > -- Brandon Williams