Re: What's cooking in git.git (Dec 2017, #02; Thu, 7)

2017-12-13 Thread Jeff Hostetler



On 12/12/2017 4:30 AM, Christian Couder wrote:
> On Thu, Dec 7, 2017 at 7:04 PM, Junio C Hamano  wrote:
>>
>> * jh/object-filtering (2017-12-05) 9 commits
>>   (merged to 'next' on 2017-12-05 at 3a56b51085)
>>  + rev-list: support --no-filter argument
>>  + list-objects-filter-options: support --no-filter
>>  + list-objects-filter-options: fix 'keword' typo in comment
>>   (merged to 'next' on 2017-11-27 at e5008c3b28)
>>  + pack-objects: add list-objects filtering
>>  + rev-list: add list-objects filtering support
>>  + list-objects: filter objects in traverse_commit_list
>>  + oidset: add iterator methods to oidset
>>  + oidmap: add oidmap iterator methods
>>  + dir: allow exclusions from blob in addition to file
>>  (this branch is used by jh/fsck-promisors and jh/partial-clone.)
>>
>>  In preparation for implementing narrow/partial clone, the object
>>  walking machinery has been taught a way to tell it to "filter" some
>>  objects from enumeration.
>>
>>
>> * jh/fsck-promisors (2017-12-05) 12 commits
>>  - gc: do not repack promisor packfiles
>>  - rev-list: support termination at promisor objects
>>  - fixup: sha1_file: add TODO
>>  - fixup: sha1_file: convert gotos to break/continue
>>  - sha1_file: support lazily fetching missing objects
>>  - introduce fetch-object: fetch one promisor object
>>  - index-pack: refactor writing of .keep files
>>  - fsck: support promisor objects as CLI argument
>>  - fsck: support referenced promisor objects
>>  - fsck: support refs pointing to promisor objects
>>  - fsck: introduce partialclone extension
>>  - extension.partialclone: introduce partial clone extension
>>  (this branch is used by jh/partial-clone; uses jh/object-filtering.)
>>
>>  In preparation for implementing narrow/partial clone, the machinery
>>  for checking object connectivity used by gc and fsck has been
>>  taught that a missing object is OK when it is referenced by a
>>  packfile specially marked as coming from trusted repository that
>>  promises to make them available on-demand and lazily.
>
> I am currently working on integrating this series with my external odb
> series
> (https://public-inbox.org/git/20170916080731.13925-1-chrisc...@tuxfamily.org/).
>
> Instead of using an "extension.partialclone" config variable, an odb
> will be configured like using an "odb..promisorRemote" (the
> name might still change) config variable. Other odbs could still be
> configured using "odb..scriptCommand" and
> "odb..subprocessCommand".
>
> The current work is still very much WIP and some tests fail, but you
> can take a look there:
>
> https://github.com/chriscool/git/tree/gl-promisor-external-odb440



In our current V6 patch series, Jonathan Tan and I are using the
extension.partialclone config variable for 2 purposes.  First, to
indicate a change in the repository format and stop non-aware clients
(older versions of git.exe) from operating on the repo -- since they
won't know how to handle missing objects.   Second, to name the remote
to help the client later demand load missing objects.  This is a current
limitation (we only support one promisor remote), so this second usage
may change.  I haven't had time to look at your branch yet, so I can't
comment on how it might help/solve our second usage, but we do need to
keep the first usage in mind.

Jeff


Re: What's cooking in git.git (Dec 2017, #02; Thu, 7)

2017-12-12 Thread Philip Oakley

From: "Christian Couder"
> On Thu, Dec 7, 2017 at 7:04 PM, Junio C Hamano  wrote:
>>
>> * jh/object-filtering (2017-12-05) 9 commits
>>   (merged to 'next' on 2017-12-05 at 3a56b51085)
>>  + rev-list: support --no-filter argument
>>  + list-objects-filter-options: support --no-filter
>>  + list-objects-filter-options: fix 'keword' typo in comment
>>   (merged to 'next' on 2017-11-27 at e5008c3b28)
>>  + pack-objects: add list-objects filtering
>>  + rev-list: add list-objects filtering support
>>  + list-objects: filter objects in traverse_commit_list
>>  + oidset: add iterator methods to oidset
>>  + oidmap: add oidmap iterator methods
>>  + dir: allow exclusions from blob in addition to file
>>  (this branch is used by jh/fsck-promisors and jh/partial-clone.)
>>
>>  In preparation for implementing narrow/partial clone, the object
>>  walking machinery has been taught a way to tell it to "filter" some
>>  objects from enumeration.
>>
>>
>> * jh/fsck-promisors (2017-12-05) 12 commits
>>  - gc: do not repack promisor packfiles
>>  - rev-list: support termination at promisor objects
>>  - fixup: sha1_file: add TODO
>>  - fixup: sha1_file: convert gotos to break/continue
>>  - sha1_file: support lazily fetching missing objects
>>  - introduce fetch-object: fetch one promisor object
>>  - index-pack: refactor writing of .keep files
>>  - fsck: support promisor objects as CLI argument
>>  - fsck: support referenced promisor objects
>>  - fsck: support refs pointing to promisor objects
>>  - fsck: introduce partialclone extension
>>  - extension.partialclone: introduce partial clone extension
>>  (this branch is used by jh/partial-clone; uses jh/object-filtering.)
>>
>>  In preparation for implementing narrow/partial clone, the machinery
>>  for checking object connectivity used by gc and fsck has been
>>  taught that a missing object is OK when it is referenced by a
>>  packfile specially marked as coming from trusted repository that
>>  promises to make them available on-demand and lazily.
>
> I am currently working on integrating this series with my external odb
> series
> (https://public-inbox.org/git/20170916080731.13925-1-chrisc...@tuxfamily.org/).

I too had seen that, as currently configured, the 'partialClone' could be
seen as a method for using the remote as if it were an object database (odb)
that was part of an 'always on-line' capability. However, I'm cautious about
locking out the original DVCS capability of being off-line relative to some,
or all, remotes and still needing to work in 'airplane mode'.

It should be OK for the local narrowClone (my term) to be totally off-line
for a while and still be able to work when back on line with other suitable
remotes, even after the original remote has gone.

> Instead of using an "extension.partialclone" config variable, an odb
> will be configured like using an "odb..promisorRemote" (the
> name might still change) config variable. Other odbs could still be
> configured using "odb..scriptCommand" and
> "odb..subprocessCommand".

The future work Jeff had indicated, IIRC, should be able to cope with
multiple promisor remotes, which it is to be hoped this could handle. I'm not
sure how the odb code would handle a partial failure where a partition of
the odb stops being available.

> The current work is still very much WIP and some tests fail, but you
> can take a look there:
>
> https://github.com/chriscool/git/tree/gl-promisor-external-odb440

--
Philip 



Re: What's cooking in git.git (Dec 2017, #02; Thu, 7)

2017-12-12 Thread Christian Couder
On Thu, Dec 7, 2017 at 7:04 PM, Junio C Hamano  wrote:
>
> * jh/object-filtering (2017-12-05) 9 commits
>   (merged to 'next' on 2017-12-05 at 3a56b51085)
>  + rev-list: support --no-filter argument
>  + list-objects-filter-options: support --no-filter
>  + list-objects-filter-options: fix 'keword' typo in comment
>   (merged to 'next' on 2017-11-27 at e5008c3b28)
>  + pack-objects: add list-objects filtering
>  + rev-list: add list-objects filtering support
>  + list-objects: filter objects in traverse_commit_list
>  + oidset: add iterator methods to oidset
>  + oidmap: add oidmap iterator methods
>  + dir: allow exclusions from blob in addition to file
>  (this branch is used by jh/fsck-promisors and jh/partial-clone.)
>
>  In preparation for implementing narrow/partial clone, the object
>  walking machinery has been taught a way to tell it to "filter" some
>  objects from enumeration.
>
>
> * jh/fsck-promisors (2017-12-05) 12 commits
>  - gc: do not repack promisor packfiles
>  - rev-list: support termination at promisor objects
>  - fixup: sha1_file: add TODO
>  - fixup: sha1_file: convert gotos to break/continue
>  - sha1_file: support lazily fetching missing objects
>  - introduce fetch-object: fetch one promisor object
>  - index-pack: refactor writing of .keep files
>  - fsck: support promisor objects as CLI argument
>  - fsck: support referenced promisor objects
>  - fsck: support refs pointing to promisor objects
>  - fsck: introduce partialclone extension
>  - extension.partialclone: introduce partial clone extension
>  (this branch is used by jh/partial-clone; uses jh/object-filtering.)
>
>  In preparation for implementing narrow/partial clone, the machinery
>  for checking object connectivity used by gc and fsck has been
>  taught that a missing object is OK when it is referenced by a
>  packfile specially marked as coming from trusted repository that
>  promises to make them available on-demand and lazily.

I am currently working on integrating this series with my external odb
series 
(https://public-inbox.org/git/20170916080731.13925-1-chrisc...@tuxfamily.org/).

Instead of using an "extension.partialclone" config variable, an odb
will be configured like using an "odb..promisorRemote" (the
name might still change) config variable. Other odbs could still be
configured using "odb..scriptCommand" and
"odb..subprocessCommand".

The current work is still very much WIP and some tests fail, but you
can take a look there:

https://github.com/chriscool/git/tree/gl-promisor-external-odb440


Re: What's cooking in git.git (Dec 2017, #02; Thu, 7)

2017-12-08 Thread Junio C Hamano
Johannes Schindelin  writes:

> We might want to consider using a saner Continuous Testing workflow, to
> avoid re-testing (and re-finding) breakages in individual patch series
> just because a completely unrelated patch got updated.
>
> I mean, yes, it seemed like a good idea a long time ago to have One Branch
> that contains All The Patch Series Currently Cooking, back when our most
> reliable (because only) test facilities were poor humans.
>
> But we see how many more subtle bugs are spotted nowadays where Git's
> source code is tested automatically on a growing number of Operating
> System/CPU architecture "coordinates", and it is probably time to save
> some human resources.
>
> How about testing the individual branches instead?

We would benefit from both, so not "instead", but "in addition"
would make more sense.

Even if a topic passes a test in isolation, the job of the developer
who originally did that topic does not end there, as the topic may
break in presence of other topics in flight when tested together
with them, and because a project is a team effort, we expect those
familiar with the topics involved in such a breakage to all
participate in diagnosing and fixing.  

Ideally, in addition to the tips of these integration branches, and
in addition to the tips of topics, it would be nicer if we can test
individual new commits.  When we see the tip of 'pu' updated from A
to B, then

git rev-list --no-merges A..B

would give us all individual non-merge commits that have been added,
and assuming that we have already tested commits back when the tip
was at A, these are the only commits that need testing to see what
is broken in the new round.

I do not know how easy it is to arrange something like that, though.
What we currently run with Travis lets us limit the number of jobs
to the number of tentative integration branches; a scheme like that
would require quite a lot more test cycles, triggered by a single
pushout.

(Unscientific numbers)

$ git rev-list --count --no-merges pu@{48.hours}..pu@{24.hours}
10
$ git rev-list --count --no-merges pu@{24.hours}..pu
37
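
The per-commit testing idea above could be sketched roughly as follows.
This is only an illustration, not anything the project actually runs: the
function name test_new_commits and its calling convention are made up
here, and the only Git feature it relies on is the rev-list invocation
already shown (plus --reverse, to visit the oldest commit first).

```shell
#!/bin/sh
# Hypothetical helper (not part of Git): run a command once for each
# non-merge commit that was added between two tips of a branch.
# Usage: test_new_commits <old-tip> <new-tip> <command> [args...]
# The commit name is appended as the last argument of <command>.
test_new_commits () {
	old=$1 new=$2
	shift 2
	failed=0
	# --reverse lists the oldest commit first, so the first FAIL line
	# points at the earliest broken commit in the new round.
	for commit in $(git rev-list --reverse --no-merges "$old..$new")
	do
		if "$@" "$commit"
		then
			echo "ok   $commit"
		else
			echo "FAIL $commit"
			failed=1
		fi
	done
	return $failed
}
```

In a real setup the command passed in would presumably check the commit
out into a scratch worktree and run the test suite there; that part is
left out because it depends on the CI environment. The cost concern
stays the same: one test cycle per listed commit.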



Re: What's cooking in git.git (Dec 2017, #02; Thu, 7)

2017-12-08 Thread Junio C Hamano
Christian Couder  writes:

> On Thu, Dec 7, 2017 at 7:04 PM, Junio C Hamano  wrote:
>
>
>> * cc/skip-to-optional-val (2017-12-07) 7 commits
>>  - t4045: test 'diff --relative' for real
>>  - t4045: reindent to make helpers readable
>>  - diff: use skip-to-optional-val in parsing --relative
>>  - diff: use skip_to_optional_val_default()
>>  - diff: use skip_to_optional_val()
>>  - index-pack: use skip_to_optional_val()
>>  - git-compat-util: introduce skip_to_optional_val()
>>
>>  Introduce a helper to simplify code to parse a common pattern that
>>  expects either "--key" or "--key=".
>>
>>  Even though I queued fixes for "diff --relative" on top, it may
>>  still want a final reroll to make it harder to misuse by allowing
>>  NULL at the valp part of the argument.
>
> Yeah, I already implemented that and it will be in the next v3 version.

Good.  I am hoping that you've followed the discussion on the tests,
where all of us agreed that the approach taken by Jacob's one is
preferable over what is queued above?

>> Also s/_val/_arg/.
>
> I am not sure that is a good idea, because it could suggest that the
> functions are designed to parse only command option arguments, while
> they can be used to parse any "key=val" string where "key" is also
> allowed.
>
>>  cf. 
>>  cf. 
>
> It doesn't look like s/_val/_arg/ was discussed in the above messages.

It came from your statement that was made before the thread, where
you said you'll rename it to use arg after I said I suspect that arg
would make more sense than val.

https://public-inbox.org/git/CAP8UFD2OSsqzhyAL-QG1TOowB-xgbf=kc9whre+flc+0j1x...@mail.gmail.com/


Thanks.


Re: What's cooking in git.git (Dec 2017, #02; Thu, 7)

2017-12-08 Thread Christian Couder
On Thu, Dec 7, 2017 at 7:04 PM, Junio C Hamano  wrote:


> * cc/skip-to-optional-val (2017-12-07) 7 commits
>  - t4045: test 'diff --relative' for real
>  - t4045: reindent to make helpers readable
>  - diff: use skip-to-optional-val in parsing --relative
>  - diff: use skip_to_optional_val_default()
>  - diff: use skip_to_optional_val()
>  - index-pack: use skip_to_optional_val()
>  - git-compat-util: introduce skip_to_optional_val()
>
>  Introduce a helper to simplify code to parse a common pattern that
>  expects either "--key" or "--key=".
>
>  Even though I queued fixes for "diff --relative" on top, it may
>  still want a final reroll to make it harder to misuse by allowing
>  NULL at the valp part of the argument.

Yeah, I already implemented that and it will be in the next v3 version.

> Also s/_val/_arg/.

I am not sure that is a good idea, because it could suggest that the
functions are designed to parse only command option arguments, while
they can be used to parse any "key=val" string where "key" is also
allowed.

>  cf. 
>  cf. 

It doesn't look like s/_val/_arg/ was discussed in the above messages.


Re: What's cooking in git.git (Dec 2017, #02; Thu, 7)

2017-12-08 Thread Johannes Schindelin
Hi,

On Fri, 8 Dec 2017, Torsten Bögershausen wrote:

> > * tb/check-crlf-for-safe-crlf (2017-11-27) 1 commit
> >   (merged to 'next' on 2017-12-05 at 7adaa1fe01)
> >  + convert: tighten the safe autocrlf handling
> > 
> >  The "safe crlf" check incorrectly triggered for contents that does
> >  not use CRLF as line endings, which has been corrected.
> > 
> >  Broken on Windows???
> >  cf. 
> 
> Yes, broken on Windows. A fix is coming the next days.

We might want to consider using a saner Continuous Testing workflow, to
avoid re-testing (and re-finding) breakages in individual patch series
just because a completely unrelated patch got updated.

I mean, yes, it seemed like a good idea a long time ago to have One Branch
that contains All The Patch Series Currently Cooking, back when our most
reliable (because only) test facilities were poor humans.

But we see how many more subtle bugs are spotted nowadays where Git's
source code is tested automatically on a growing number of Operating
System/CPU architecture "coordinates", and it is probably time to save
some human resources.

How about testing the individual branches instead?

This would save me a ton of time, as bisecting is just too expensive given
the scattered base commits of the branches smooshed into `pu`. (There has
been a new Git/Error.pm breakage in pu for about a week that I simply have
not gotten around to, or better put: that I did not want to tackle given the
time commitment.)

Ciao,
Dscho

Re: What's cooking in git.git (Dec 2017, #02; Thu, 7)

2017-12-07 Thread Torsten Bögershausen
> * tb/check-crlf-for-safe-crlf (2017-11-27) 1 commit
>   (merged to 'next' on 2017-12-05 at 7adaa1fe01)
>  + convert: tighten the safe autocrlf handling
> 
>  The "safe crlf" check incorrectly triggered for contents that does
>  not use CRLF as line endings, which has been corrected.
> 
>  Broken on Windows???
>  cf. 

Yes, broken on Windows. A fix is coming the next days.


What's cooking in git.git (Dec 2017, #02; Thu, 7)

2017-12-07 Thread Junio C Hamano
Here are the topics that have been cooking.  Commits prefixed with
'-' are only in 'pu' (proposed updates) while commits prefixed with
'+' are in 'next'.  The ones marked with '.' do not appear in any of
the integration branches, but I am still holding onto them.

You can find the changes described here in the integration branches
of the repositories listed at

http://git-blame.blogspot.com/p/git-public-repositories.html

--
[Graduated to "master"]

* ac/complete-pull-autostash (2017-11-22) 1 commit
  (merged to 'next' on 2017-11-27 at 802d204eda)
 + completion: add --autostash and --no-autostash to pull

 The shell completion (in contrib/) learned that "git pull" can take
 the "--autostash" option.


* bw/protocol-v1 (2017-10-17) 11 commits
  (merged to 'next' on 2017-11-27 at 55040d09ec)
 + Documentation: document Extra Parameters
 + ssh: introduce a 'simple' ssh variant
 + i5700: add interop test for protocol transition
 + http: tell server that the client understands v1
 + connect: tell server that the client understands v1
 + connect: teach client to recognize v1 server response
 + upload-pack, receive-pack: introduce protocol version 1
 + daemon: recognize hidden request arguments
 + protocol: introduce protocol extension mechanisms
 + pkt-line: add packet_write function
 + connect: in ref advertisement, shallows are last
 (this branch is used by jn/ssh-wrappers.)

 A new mechanism to upgrade the wire protocol in place is proposed
 and demonstrated that it works with the older versions of Git
 without harming them.


* cc/git-packet-pm (2017-11-22) 2 commits
  (merged to 'next' on 2017-11-27 at 1527ab3519)
 + Git/Packet.pm: use 'if' instead of 'unless'
 + Git/Packet: clarify that packet_required_key_val_read allows EOF

 Code clean-up.


* cc/perf-run-config (2017-09-24) 9 commits
  (merged to 'next' on 2017-11-27 at d75a2469eb)
 + perf: store subsection results in "test-results/$GIT_PERF_SUBSECTION/"
 + perf/run: show name of rev being built
 + perf/run: add run_subsection()
 + perf/run: update get_var_from_env_or_config() for subsections
 + perf/run: add get_subsections()
 + perf/run: add calls to get_var_from_env_or_config()
 + perf/run: add GIT_PERF_DIRS_OR_REVS
 + perf/run: add get_var_from_env_or_config()
 + perf/run: add '--config' option to the 'run' script


* hm/config-parse-expiry-date (2017-11-18) 1 commit
  (merged to 'next' on 2017-11-27 at 20014f5541)
 + config: add --expiry-date

 "git config --expiry-date gc.reflogexpire" can read "2.weeks" from
 the configuration and report it as a timestamp, just like "--int"
 would read "1k" and report 1024, to help consumption by scripts.
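
 A rough sketch of how a script might use this, assuming a Git new
 enough to have the option. The key "example.size" and the function
 name are made up for illustration, and the function writes to the
 configuration of the current repository, so it is best tried in a
 scratch one.

```shell
#!/bin/sh
# Sketch only: store human-friendly values, then read them back in a
# form that is easy for scripts to consume.
# "example.size" is a hypothetical key, not a real Git setting.
show_config_conversions () {
	git config gc.reflogexpire 2.weeks
	git config example.size 1k
	# Prints the expiry as a timestamp (seconds since the epoch).
	git config --expiry-date gc.reflogexpire
	# Prints the size with the "k" suffix expanded.
	git config --int example.size
}
```

 The first line of output depends on when the command is run; the
 second is the number with its unit suffix expanded.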


* jk/fewer-pack-rescan (2017-11-22) 5 commits
  (merged to 'next' on 2017-11-27 at 2c35a2d831)
 + sha1_file: fast-path null sha1 as a missing object
 + everything_local: use "quick" object existence check
 + p5551: add a script to test fetch pack-dir rescans
 + t/perf/lib-pack: use fast-import checkpoint to create packs
 + p5550: factor out nonsense-pack creation

 Internally we use 0{40} as a placeholder object name to signal the
 codepath that there is no such object (e.g. the fast-forward check
 while "git fetch" stores a new remote-tracking ref says "we know
 there is no 'old' thing pointed at by the ref, as we are creating
 it anew" by passing 0{40} for the 'old' side), and expect that a
 codepath to locate an in-core object to return NULL as a sign that
 the object does not exist.  A look-up for an object that does not
 exist however is quite costly with a repository with large number
 of packfiles.  This access pattern has been optimized.


* jn/reproducible-build (2017-11-22) 3 commits
  (merged to 'next' on 2017-11-27 at 6ae6946f8c)
 + Merge branch 'jn/reproducible-build' of ../git-gui into jn/reproducible-build
 + git-gui: sort entries in optimized tclIndex
 + generate-cmdlist: avoid non-deterministic output

 The build procedure has been taught to avoid some unnecessary
 instability in the build products.


* jn/ssh-wrappers (2017-11-21) 9 commits
  (merged to 'next' on 2017-11-27 at 00a2bb7a3c)
 + connect: correct style of C-style comment
 + ssh: 'simple' variant does not support --port
 + ssh: 'simple' variant does not support -4/-6
 + ssh: 'auto' variant to select between 'ssh' and 'simple'
 + connect: split ssh option computation to its own function
 + connect: split ssh command line options into separate function
 + connect: split git:// setup into a separate function
 + connect: move no_fork fallback to git_tcp_connect
 + ssh test: make copy_ssh_wrapper_as clean up after itself
 (this branch uses bw/protocol-v1.)

 The ssh-variant 'simple' introduced earlier broke existing
 installations by not passing --port/-4/-6 and not diagnosing an
 attempt to pass these as an error.  Instead, default to
 automatically detect how compatible the GIT_SSH/GIT_SSH_COMMAND is
 to OpenSSH convention and then error out an invocation to make it
 easier to