Re: What's cooking in git.git (Aug 2016, #02; Thu, 4)

2016-08-08 Thread Mike Hommey
On Mon, Aug 08, 2016 at 08:48:55AM +0200, Torsten Bögershausen wrote:
> On 2016-08-05 01.32, Junio C Hamano wrote:
> > Mike Hommey  writes:
> []
> 
> >> What kind of support are you expecting?
> > 
> > The only rationale I recall you justifying this series was that this
> > makes the resulting code easier to read, but I do not recall other
> > people agreeing with you, and I do not particularly see the
> > resulting code easier to follow.
> > 
> If that helps:
> I can offer to make a code review,
> and come back at the end of the week or so.

You already did review that code.

Mike
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What's cooking in git.git (Aug 2016, #02; Thu, 4)

2016-08-08 Thread Torsten Bögershausen
On 2016-08-05 01.32, Junio C Hamano wrote:
> Mike Hommey  writes:
[]

>> What kind of support are you expecting?
> 
> The only rationale I recall you justifying this series was that this
> makes the resulting code easier to read, but I do not recall other
> people agreeing with you, and I do not particularly see the
> resulting code easier to follow.
> 
If that helps:
I can offer to make a code review,
and come back at the end of the week or so.


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What's cooking in git.git (Aug 2016, #02; Thu, 4)

2016-08-07 Thread Johannes Schindelin
Hi Junio,

On Thu, 4 Aug 2016, Junio C Hamano wrote:

> * js/import-tars-hardlinks (2016-08-03) 1 commit
>  - import-tars: support hard links
> 
>  "import-tars" fast-import script (in contrib/) used to ignore a
>  hardlink target and replaced it with an empty file, which has been
>  corrected to record the same blob as the other file the hardlink is
>  shared with.

For the record: this came up, and is required, for my CI work on Windows,
as many MSYS2 packages contain hard-linked files (think about *all* the
builtins in Git, for example). I will use Git to synchronize MSYS2 setups
between build agents.

Ciao,
Dscho
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What's cooking in git.git (Aug 2016, #02; Thu, 4)

2016-08-05 Thread Eric Wong
Jeff King  wrote:
> On Fri, Aug 05, 2016 at 08:26:30AM +, Eric Wong wrote:
> 
> > > I'm not sure which mallocs you mean. I allocate one struct per node,
> > > which seems like a requirement for a linked list. If you mean holding an
> > > extra list struct around an existing pointer (rather than shoving the
> > > prev/next pointers into the pointed-to- item), then yes, we could do
> > > that. But it feels like a bit dirty, since the point of the list is
> > > explicitly to provide an alternate ordering over an existing set of
> > > items.
> > 
> > This pattern to avoid that one malloc-per-node using list_entry
> > (container_of) is actually a common idiom in the Linux kernel
> > and Userspace RCU (URCU).  Fwiw, I find it less error-prone and
> > easier-to-follow than the "void *"-first-element thing we do
> > with hashmap.
> 
> My big problem with it is that it gets confusing when a particular
> struct is a member of several lists. We have had bugs in git where
> a struct ended up being used in two different lists, but accidentally
> using the same "next" pointer.

It might actually be easier since you would rarely (if ever)
touch the "next"/"prev" fields in your code.  This encourages
users to give meaningful names to list_head fields.

> So you need one "list_head" for each list that your struct may be a part
> of. Sometimes that's simple, but it's awkward when the code which wants
> the list is different than the code which "owns" the struct. Besides
> leaking concerns across modules, the struct may not want to pay the
> memory price for storing pointers for all of the possible lists it could
> be a member of.

Yes, the key is this list is flexible enough to be used either way:

/* there are millions of these structs in practice */
struct common_struct {
struct list_head hot_ent;
...
};

/* and only a handful of these */
struct rarely_used_wrapper {
struct list_head cold_ent;
struct common_struct *common;
...
};

> For instance, I think it would be a mistake to convert the current
> commit_list code to something like this.

Of course, often a doubly-linked list is not needed or the extra
pointer is too expensive.  Linux/URCU have hlist for this reason.

I'm no expert on git internals, either; but there can be
readability improvements, too.  For example, I find http-walker.c
is easier-to-follow after 94e99012fc7a
("http-walker: reduce O(n) ops with doubly-linked list")
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What's cooking in git.git (Aug 2016, #02; Thu, 4)

2016-08-05 Thread Jeff King
On Fri, Aug 05, 2016 at 08:26:30AM +, Eric Wong wrote:

> > I'm not sure which mallocs you mean. I allocate one struct per node,
> > which seems like a requirement for a linked list. If you mean holding an
> > extra list struct around an existing pointer (rather than shoving the
> > prev/next pointers into the pointed-to- item), then yes, we could do
> > that. But it feels like a bit dirty, since the point of the list is
> > explicitly to provide an alternate ordering over an existing set of
> > items.
> 
> This pattern to avoid that one malloc-per-node using list_entry
> (container_of) is actually a common idiom in the Linux kernel
> and Userspace RCU (URCU).  Fwiw, I find it less error-prone and
> easier-to-follow than the "void *"-first-element thing we do
> with hashmap.

My big problem with it is that it gets confusing when a particular
struct is a member of several lists. We have had bugs in git where
a struct ended up being used in two different lists, but accidentally
using the same "next" pointer.

So you need one "list_head" for each list that your struct may be a part
of. Sometimes that's simple, but it's awkward when the code which wants
the list is different than the code which "owns" the struct. Besides
leaking concerns across modules, the struct may not want to pay the
memory price for storing pointers for all of the possible lists it could
be a member of.

For instance, I think it would be a mistake to convert the current
commit_list code to something like this.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What's cooking in git.git (Aug 2016, #02; Thu, 4)

2016-08-05 Thread Eric Wong
Jeff King  wrote:
> On Fri, Aug 05, 2016 at 08:02:31AM +, Eric Wong wrote:
> 
> > > I just introduced another doubly-linked list in [1]. It adds some MRU
> > > features on top of the list, but it could in theory be built on top of a
> > > generic doubly-linked list.
> > 
> > Yes, and you'd be avoiding the extra mallocs and be able to use
> > list_entry (aka `container_of`) so it could be faster, too.
> 
> I'm not sure which mallocs you mean. I allocate one struct per node,
> which seems like a requirement for a linked list. If you mean holding an
> extra list struct around an existing pointer (rather than shoving the
> prev/next pointers into the pointed-to- item), then yes, we could do
> that. But it feels like a bit dirty, since the point of the list is
> explicitly to provide an alternate ordering over an existing set of
> items.

This pattern to avoid that one malloc-per-node using list_entry
(container_of) is actually a common idiom in the Linux kernel
and Userspace RCU (URCU).  Fwiw, I find it less error-prone and
easier-to-follow than the "void *"-first-element thing we do
with hashmap.

> It also doesn't make a big difference for my use case. All I really care
> about is the speed of delete-from-middle-and-insert-at-front, which is
> trivially O(1) and involves no mallocs.
> 
> > I was thinking packed_git could also be a doubly-linked list
> > anyways since it would allow easier removal of unlinked pack
> > entries.  My use case would be long-running "cat-file --batch"
> > processes being able to detect unlinked packs after someone
> > else runs GC.
> 
> We never remove packed_git structs, but it is not because of the list
> data structure. We may be holding open mmaps to packs that are deleted
> and continue using them. And in some cases other code may even hold
> pointers to our packed_git structs. So you'd have to figure out some
> memory ownership questions first.

Yes, it's easier to replace a running process once in a while :)
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What's cooking in git.git (Aug 2016, #02; Thu, 4)

2016-08-05 Thread Jeff King
On Fri, Aug 05, 2016 at 08:02:31AM +, Eric Wong wrote:

> > I just introduced another doubly-linked list in [1]. It adds some MRU
> > features on top of the list, but it could in theory be built on top of a
> > generic doubly-linked list.
> 
> Yes, and you'd be avoiding the extra mallocs and be able to use
> list_entry (aka `container_of`) so it could be faster, too.

I'm not sure which mallocs you mean. I allocate one struct per node,
which seems like a requirement for a linked list. If you mean holding an
extra list struct around an existing pointer (rather than shoving the
prev/next pointers into the pointed-to- item), then yes, we could do
that. But it feels like a bit dirty, since the point of the list is
explicitly to provide an alternate ordering over an existing set of
items.

It also doesn't make a big difference for my use case. All I really care
about is the speed of delete-from-middle-and-insert-at-front, which is
trivially O(1) and involves no mallocs.

> I was thinking packed_git could also be a doubly-linked list
> anyways since it would allow easier removal of unlinked pack
> entries.  My use case would be long-running "cat-file --batch"
> processes being able to detect unlinked packs after someone
> else runs GC.

We never remove packed_git structs, but it is not because of the list
data structure. We may be holding open mmaps to packs that are deleted
and continue using them. And in some cases other code may even hold
pointers to our packed_git structs. So you'd have to figure out some
memory ownership questions first.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What's cooking in git.git (Aug 2016, #02; Thu, 4)

2016-08-05 Thread Eric Wong
(Fixed Nico's address)

Jeff King  wrote:
> On Thu, Aug 04, 2016 at 11:34:35PM +, Eric Wong wrote:
> > Junio C Hamano  wrote:
> > > [Graduated to "master"]
> > 
> > > * ew/http-walker (2016-07-18) 4 commits
> > >   (merged to 'next' on 2016-07-18 at a430a97)
> > >  + list: avoid incompatibility with *BSD sys/queue.h
> > >   (merged to 'next' on 2016-07-13 at 8585c03)
> > >  + http-walker: reduce O(n) ops with doubly-linked list
> > 
> > Yay!  This finally introduces the Linux kernel linked list
> > into git.  I'm not sure if it's worth the effort to introduce
> > cleanup commits to start using it in places where we already
> > have doubly-linked list implementations:
> > 
> > (+Cc Nicolas and Lukas)
> > * sha1_file.c delta_base_cache_lru is open codes this
> > * builtin/pack-redundant.c could probably be adapted, too
> >  ... any more?
> 
> I just introduced another doubly-linked list in [1]. It adds some MRU
> features on top of the list, but it could in theory be built on top of a
> generic doubly-linked list.

Yes, and you'd be avoiding the extra mallocs and be able to use
list_entry (aka `container_of`) so it could be faster, too.

I was thinking packed_git could also be a doubly-linked list
anyways since it would allow easier removal of unlinked pack
entries.  My use case would be long-running "cat-file --batch"
processes being able to detect unlinked packs after someone
else runs GC.

> It's also possible the delta-base-cache stuff could build on top of that
> same code, but I didn't look closely at it.
> 
> -Peff
> 
> [1] http://public-inbox.org/git/20160729040659.gc22...@sigill.intra.peff.net/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What's cooking in git.git (Aug 2016, #02; Thu, 4)

2016-08-05 Thread Jeff King
On Thu, Aug 04, 2016 at 11:34:35PM +, Eric Wong wrote:

> Junio C Hamano  wrote:
> > [Graduated to "master"]
> 
> > * ew/http-walker (2016-07-18) 4 commits
> >   (merged to 'next' on 2016-07-18 at a430a97)
> >  + list: avoid incompatibility with *BSD sys/queue.h
> >   (merged to 'next' on 2016-07-13 at 8585c03)
> >  + http-walker: reduce O(n) ops with doubly-linked list
> 
> Yay!  This finally introduces the Linux kernel linked list
> into git.  I'm not sure if it's worth the effort to introduce
> cleanup commits to start using it in places where we already
> have doubly-linked list implementations:
> 
> (+Cc Nicolas and Lukas)
> * sha1_file.c delta_base_cache_lru is open codes this
> * builtin/pack-redundant.c could probably be adapted, too
>  ... any more?

I just introduced another doubly-linked list in [1]. It adds some MRU
features on top of the list, but it could in theory be built on top of a
generic doubly-linked list.

It's also possible the delta-base-cache stuff could build on top of that
same code, but I didn't look closely at it.

-Peff

[1] http://public-inbox.org/git/20160729040659.gc22...@sigill.intra.peff.net/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What's cooking in git.git (Aug 2016, #02; Thu, 4)

2016-08-05 Thread Junio C Hamano
Eric Wong  writes:

> Yay!  This finally introduces the Linux kernel linked list
> into git.  I'm not sure if it's worth the effort to introduce
> cleanup commits to start using it in places where we already
> have doubly-linked list implementations:
>
> (+Cc Nicolas and Lukas)
> * sha1_file.c delta_base_cache_lru is open codes this
> * builtin/pack-redundant.c could probably be adapted, too
>  ... any more?
>
> And there may be other places where we have performance problems
> walking singly-linked lists and would be better off on a
> doubly-linked one (or even just readability ones).

Sounds like a set of fun projects.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What's cooking in git.git (Aug 2016, #02; Thu, 4)

2016-08-04 Thread Mike Hommey
On Thu, Aug 04, 2016 at 04:32:02PM -0700, Junio C Hamano wrote:
> Mike Hommey  writes:
> 
> > On Thu, Aug 04, 2016 at 03:28:55PM -0700, Junio C Hamano wrote:
> >> * mh/connect (2016-06-06) 10 commits
> >>  - connect: [host:port] is legacy for ssh
> >>  ...
> >>  - connect: document why we sometimes call get_port after get_host_and_port
> >> 
> >>  Rewrite Git-URL parsing routine (hopefully) without changing any
> >>  behaviour.
> >> 
> >>  It has been two months without any support.  We may want to discard
> >>  this.
> >
> > What kind of support are you expecting?
> 
> The only rationale I recall you justifying this series was that this
> makes the resulting code easier to read, but I do not recall other
> people agreeing with you, and I do not particularly see the
> resulting code easier to follow.

FWIW, IIRC, Torsten agreed it did.

Mike
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What's cooking in git.git (Aug 2016, #02; Thu, 4)

2016-08-04 Thread Eric Wong
Junio C Hamano  wrote:
> [Graduated to "master"]

> * ew/http-walker (2016-07-18) 4 commits
>   (merged to 'next' on 2016-07-18 at a430a97)
>  + list: avoid incompatibility with *BSD sys/queue.h
>   (merged to 'next' on 2016-07-13 at 8585c03)
>  + http-walker: reduce O(n) ops with doubly-linked list

Yay!  This finally introduces the Linux kernel linked list
into git.  I'm not sure if it's worth the effort to introduce
cleanup commits to start using it in places where we already
have doubly-linked list implementations:

(+Cc Nicolas and Lukas)
* sha1_file.c delta_base_cache_lru is open codes this
* builtin/pack-redundant.c could probably be adapted, too
 ... any more?

And there may be other places where we have performance problems
walking singly-linked lists and would be better off on a
doubly-linked one (or even just readability ones).



>  cf. 
>  cf. 

Re: What's cooking in git.git (Aug 2016, #02; Thu, 4)

2016-08-04 Thread Junio C Hamano
"Philip Oakley"  writes:

>> Updates in 4/8 ("give headings") is reported to break formatting?
>> cf. <57913c97.1030...@xiplink.com>
>
> Just to say I haven't forgotten.

OK.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What's cooking in git.git (Aug 2016, #02; Thu, 4)

2016-08-04 Thread Junio C Hamano
Mike Hommey  writes:

> On Thu, Aug 04, 2016 at 03:28:55PM -0700, Junio C Hamano wrote:
>> * mh/connect (2016-06-06) 10 commits
>>  - connect: [host:port] is legacy for ssh
>>  ...
>>  - connect: document why we sometimes call get_port after get_host_and_port
>> 
>>  Rewrite Git-URL parsing routine (hopefully) without changing any
>>  behaviour.
>> 
>>  It has been two months without any support.  We may want to discard
>>  this.
>
> What kind of support are you expecting?

The only rationale I recall you justifying this series was that this
makes the resulting code easier to read, but I do not recall other
people agreeing with you, and I do not particularly see the
resulting code easier to follow.

> FWIW, I have WIP cleaning up the code further, tha obviously depends on
> this series.

As this is not even in 'next', your cleaning-up does not have to
depend on it.  It can be part of a reroll, of course.

By the way, "discarding" is not equal to "rejecting".  The latter is
"it is a bad thing to do, don't even come back with a new version".
It is just "This hasn't made any progress, and it is not ready for
'next', either. Keeping it in 'pu' is eating my time without giving
much benefit to the project at this point".
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What's cooking in git.git (Aug 2016, #02; Thu, 4)

2016-08-04 Thread Philip Oakley

From: "Junio C Hamano" 



* po/range-doc (2016-07-20) 8 commits
- doc: revisions - clarify reachability examples
- doc: revisions - define `reachable`
- doc: gitrevisions - clarify 'latter case' is revision walk
- doc: gitrevisions - use 'reachable' in page description
- doc: give headings for the two and three dot notations
- doc: show the actual left, right, and boundary marks
- doc: revisions - name the left and right sides
- doc: use 'symmetric difference' consistently

Clarify various ways to specify the "revision ranges" in the
documentation.

Updates in 4/8 ("give headings") is reported to break formatting?
cf. <57913c97.1030...@xiplink.com>




Just to say I haven't forgotten.

The format breakage was analysed by Jeff (12 July 2016 23:12) and looks to 
be internal to the man viewer.


There's still a suggestion surrounding one of the headings to sort (which 
has gone around in circles;-).


Also I now think I have a sensible idea for the ^@/! examples (i.e. use 
the Loeliger illustration) - that'd make if 9 in the series (was originally 
2).


Plus ensure all comments across V1-4 have been attended to (including yours 
of 12 July 2016 18:04).


--
Philip 


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What's cooking in git.git (Aug 2016, #02; Thu, 4)

2016-08-04 Thread Mike Hommey
On Thu, Aug 04, 2016 at 03:28:55PM -0700, Junio C Hamano wrote:
> * mh/connect (2016-06-06) 10 commits
>  - connect: [host:port] is legacy for ssh
>  - connect: move ssh command line preparation to a separate function
>  - connect: actively reject git:// urls with a user part
>  - connect: change the --diag-url output to separate user and host
>  - connect: make parse_connect_url() return the user part of the url as a 
> separate value
>  - connect: group CONNECT_DIAG_URL handling code
>  - connect: make parse_connect_url() return separated host and port
>  - connect: re-derive a host:port string from the separate host and port 
> variables
>  - connect: call get_host_and_port() earlier
>  - connect: document why we sometimes call get_port after get_host_and_port
> 
>  Rewrite Git-URL parsing routine (hopefully) without changing any
>  behaviour.
> 
>  It has been two months without any support.  We may want to discard
>  this.

What kind of support are you expecting?

FWIW, I have WIP cleaning up the code further, tha obviously depends on
this series.

Mike
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html