Re: [PATCH v2 07/10] commit-graph.txt: update future work

2018-04-13 Thread Jakub Narebski
Derrick Stolee  writes:

> On 4/12/2018 5:12 AM, Junio C Hamano wrote:
>> Derrick Stolee  writes:
>>
>>> +Here is a diagram to visualize the shape of the full commit graph, and
>>> +how different generation numbers relate:
>>> +
>>> ++-+
>>> +| GENERATION_NUMBER_INFINITY = 0x |
>>> ++-+
>>> +   ||  ^
>>> +   ||  |
>>> +   |+--+
>>> +   | [gen(A) = gen(B)]
>>> +   V
>>> ++-+
>>> +| 0 < commit->generation < 0x4000 |
>>> ++-+
>>> +   ||  ^
>>> +   ||  |
>>> +   |+--+
>>> +   |[gen(A) > gen(B)]
>>> +   V
>>> ++-+
>>> +| GENERATION_NUMBER_ZERO = 0  |
>>> ++-+
>>> +|  ^
>>> +|  |
>>> ++--+
>>> +[gen(A) = gen(B)]
>>
>> It may be just me but all I can read out of the above is that

It's not just you.

>> commit->generation may store 0x, a value between 0 and
>> 0x4000, or 0.  I cannot quite tell what the notation [gen(A)
>>  gen(B)] is trying to say.  I am guessing "Two generation
>> numbers within the 'valid' range can be compared" is what the second
>> one is trying to say, but it is much less interesting to know that
>> two infinities compare equal than how generation numbers from
>> different classes compare, which cannot be depicted in the above
>> notation, I am afraid.  For example, don't we want to say that a
>> commit with INF can never be reached by a commit with a valid
>> generation number, or something like that?
>
> My intention with the arrows was to demonstrate where parent
> relationships can go, and the generation-number relation between a
> commit A with parent B. Clearly, this diagram is less than helpful.

Perhaps the following table would make the information clearer (perhaps
in addition to the above graph, but without "gen(A) {cmp} gen(B)"
arrows).

I assume that it is possible to have both GENERATION_NUMBER_ZERO and non
zero generation numbers in one repo, perhaps via alternates.  I also
assume that A != B, and that generation numbers (both set, and 0s) are
transitivelu closed under reachability.

gen(A) \   commit B ->   | gen(B)
\-\  |
commit A   \ | 0x | larger   | smaller | 0x
\++--+-+
0x   | =>  > >
0 < larger  < 0x4000 | < N  = n> >
0 < smaller < 0x4000 | < N  < N= n   >
0x   | < N  < N< N   =

The "<", "=", ">" denotes result of comparison between gen(A) and gen(B).

Generation numbers create a negative-cut filter: "N" and "n" denote
situation where we know from gen(A) and gen(B) that B is not reachable
from A.

As can be seen if we use gen(A) < gen(B) as cutoff, we don't need to
treat "infinity" and "zero" in a special way.


Best,
-- 
Jakub Narębski


Re: [PATCH v2 07/10] commit-graph.txt: update future work

2018-04-12 Thread Derrick Stolee

On 4/12/2018 5:12 AM, Junio C Hamano wrote:

Derrick Stolee  writes:


+Here is a diagram to visualize the shape of the full commit graph, and
+how different generation numbers relate:
+
++-+
+| GENERATION_NUMBER_INFINITY = 0x |
++-+
+   ||  ^
+   ||  |
+   |+--+
+   | [gen(A) = gen(B)]
+   V
++-+
+| 0 < commit->generation < 0x4000 |
++-+
+   ||  ^
+   ||  |
+   |+--+
+   |[gen(A) > gen(B)]
+   V
++-+
+| GENERATION_NUMBER_ZERO = 0  |
++-+
+|  ^
+|  |
++--+
+[gen(A) = gen(B)]

It may be just me but all I can read out of the above is that
commit->generation may store 0x, a value between 0 and
0x4000, or 0.  I cannot quite tell what the notation [gen(A)
 gen(B)] is trying to say.  I am guessing "Two generation
numbers within the 'valid' range can be compared" is what the second
one is trying to say, but it is much less interesting to know that
two infinities compare equal than how generation numbers from
different classes compare, which cannot be depicted in the above
notation, I am afraid.  For example, don't we want to say that a
commit with INF can never be reached by a commit with a valid
generation number, or something like that?


My intention with the arrows was to demonstrate where parent 
relationships can go, and the generation-number relation between a 
commit A with parent B. Clearly, this diagram is less than helpful.





  Design Details
  --
  
@@ -98,17 +141,12 @@ Future Work

  - The 'commit-graph' subcommand does not have a "verify" mode that is
necessary for integration with fsck.
  
-- The file format includes room for precomputed generation numbers. These

-  are not currently computed, so all generation numbers will be marked as
-  0 (or "uncomputed"). A later patch will include this calculation.
-
  - After computing and storing generation numbers, we must make graph
walks aware of generation numbers to gain the performance benefits they
enable. This will mostly be accomplished by swapping a commit-date-ordered
priority queue with one ordered by generation number. The following
-  operations are important candidates:
+  operation is an important candidate:

Good.




Re: [PATCH v2 07/10] commit-graph.txt: update future work

2018-04-12 Thread Junio C Hamano
Derrick Stolee  writes:

> +Here is a diagram to visualize the shape of the full commit graph, and
> +how different generation numbers relate:
> +
> ++-+
> +| GENERATION_NUMBER_INFINITY = 0x |
> ++-+
> + ||  ^
> + ||  |
> + |+--+
> + | [gen(A) = gen(B)]
> + V
> ++-+
> +| 0 < commit->generation < 0x4000 |
> ++-+
> + ||  ^
> + ||  |
> + |+--+
> + |[gen(A) > gen(B)]
> + V
> ++-+
> +| GENERATION_NUMBER_ZERO = 0  |
> ++-+
> +  |  ^
> +  |  |
> +  +--+
> +  [gen(A) = gen(B)]

It may be just me but all I can read out of the above is that
commit->generation may store 0x, a value between 0 and
0x4000, or 0.  I cannot quite tell what the notation [gen(A)
 gen(B)] is trying to say.  I am guessing "Two generation
numbers within the 'valid' range can be compared" is what the second
one is trying to say, but it is much less interesting to know that
two infinities compare equal than how generation numbers from
different classes compare, which cannot be depicted in the above
notation, I am afraid.  For example, don't we want to say that a
commit with INF can never be reached by a commit with a valid
generation number, or something like that?

>  Design Details
>  --
>  
> @@ -98,17 +141,12 @@ Future Work
>  - The 'commit-graph' subcommand does not have a "verify" mode that is
>necessary for integration with fsck.
>  
> -- The file format includes room for precomputed generation numbers. These
> -  are not currently computed, so all generation numbers will be marked as
> -  0 (or "uncomputed"). A later patch will include this calculation.
> -
>  - After computing and storing generation numbers, we must make graph
>walks aware of generation numbers to gain the performance benefits they
>enable. This will mostly be accomplished by swapping a commit-date-ordered
>priority queue with one ordered by generation number. The following
> -  operations are important candidates:
> +  operation is an important candidate:

Good.