Re: [PATCH 1/5] trailer: use singly-linked list, not doubly

2016-10-12 Thread Jeff King
On Wed, Oct 12, 2016 at 05:38:14PM +0200, Christian Couder wrote:

> On Wed, Oct 12, 2016 at 3:23 AM, Jonathan Tan  
> wrote:
> > Use singly-linked lists (instead of doubly-linked lists) in trailer to
> > keep track of arguments (whether implicit from configuration or explicit
> > from the command line) and trailer items.
> >
> > This change significantly reduces the code length and simplifies the code.
> 
> It's true that the code can be simplified a lot by using a
> singly-linked list, but if we already had and used some generic
> functions or macros to handle doubly-linked list, I doubt there would
> be a significant simplification (as the generic code could not be
> deleted in this case).

We didn't have such generic macros when you wrote the trailer code
originally, but we do now, in list.h (they come from the kernel's
doubly-linked list implementation). I used them recently in a series and
found them pretty pleasant and complete.

Maybe it's worth trying the conversion here to see if it simplifies the
code.

> > There are now fewer pointers to be manipulated, but most trailer
> > manipulations now require seeking from beginning to end, so there might
> > be a slight net decrease in performance; however the number of trailers
> > is usually small (10 to 15 at the most) so this should not cause a big
> > impact.
> 
> By default we append new trailers at the end of the trailer list, so a
> singly-linked list is theoretically not well suited at all for
> handling trailer lists.
> 
> You say that at most there is a small number of trailers, and that may
> be true indeed for the Linux kernel and most projects these days, but
> I am not sure it is a good assumption to make in general.

I agree. As somebody who has fixed quite a number of accidentally
quadratic cases in the number of refs, width of the commit graph, etc,
over the years, these things have a way of creeping up or finding
pathological cases.

You _can_ append to a singly linked list in O(1) if you keep a tail
pointer, but I think using list.h would be even nicer.

-Peff


Re: [PATCH 1/5] trailer: use singly-linked list, not doubly

2016-10-12 Thread Christian Couder
On Wed, Oct 12, 2016 at 3:23 AM, Jonathan Tan  wrote:
> Use singly-linked lists (instead of doubly-linked lists) in trailer to
> keep track of arguments (whether implicit from configuration or explicit
> from the command line) and trailer items.
>
> This change significantly reduces the code length and simplifies the code.

It's true that the code can be simplified a lot by using a
singly-linked list, but if we already had and used some generic
functions or macros to handle doubly-linked list, I doubt there would
be a significant simplification (as the generic code could not be
deleted in this case).

> There are now fewer pointers to be manipulated, but most trailer
> manipulations now require seeking from beginning to end, so there might
> be a slight net decrease in performance; however the number of trailers
> is usually small (10 to 15 at the most) so this should not cause a big
> impact.

By default we append new trailers at the end of the trailer list, so a
singly-linked list is theoretically not well suited at all for
handling trailer lists.

You say that at most there is a small number of trailers, and that may
be true indeed for the Linux kernel and most projects these days, but
I am not sure it is a good assumption to make in general.

For example if some projects use or start using "CC:  *" trailers and
tools to automatically append such trailers (perhaps using 'git
interpret-trailers' for that purpose by the way) based on people who
touched the same code, then it could very well be a common thing to
have 20 or more trailers on patches/commits for these projects.

There could also be automated testing tools that add their own
"Tested-by: *" trailers, and projects that require many people to add
their "Reviewed-by: *" trailers to each patch/commit. And in general
with millions of users, it is not very safe to make assumptions that
they will all be "reasonable" in the way they use a feature.

Another thing is that when Git started, I doubt many people would have
thought that there would often be more than just one or two
"Signed-off-by: *" trailer, and now 10 or 15 doesn't seem unthinkable.
In fact 'git interpret-trailers' has been made precisely because there
are more and more trailers, so there is more and more a need for a
tool to properly add and manage them.

We recently had a discussion on the list to increase the default
abbreviation length because it was not foreseen at the beginning of
Git that people would need a larger number of characters as projects
grow.

So even for the Linux kernel, I am not sure that it is safe to assume
that the number of trailers will not grow much over the years,
especially if we work on the tool that can make it easy to easily and
automatically add them.

Thanks,
Christian.


Re: [PATCH 1/5] trailer: use singly-linked list, not doubly

2016-10-12 Thread Junio C Hamano
Jonathan Tan  writes:

> Use singly-linked lists (instead of doubly-linked lists) in trailer to
> keep track of arguments (whether implicit from configuration or explicit
> from the command line) and trailer items.
>
> This change significantly reduces the code length and simplifies the code.
> There are now fewer pointers to be manipulated, but most trailer
> manipulations now require seeking from beginning to end, so there might
> be a slight net decrease in performance; however the number of trailers
> is usually small (10 to 15 at the most) so this should not cause a big
> impact.

It is overall a very good change, but can you split this into two
independent patches?  s/struct trailer_item/const &/ sprinkled all
over the place is more or less unrelated change and it is very
distracting to see the primary change of the way lists are handled.