Re: [L10N] Kickoff of translation for Git 2.14.0 round 1

2017-07-17 Thread Kaartic Sivaraam
Hello,

As a heads up (and because I have been pulled into this thread :)), I
wanted to bring to the notice of translators that the commit,

b884244c8 commit-template: remove outdated notice about explicit paths

*removes* a message ("Explicit paths specified without -i or -o;
assuming --only paths...") that's currently being translated.

-- 
Kaartic


Re: reftable: new ref storage format

2017-07-17 Thread Michael Haggerty
On Sun, Jul 16, 2017 at 12:43 PM, Shawn Pearce  wrote:
> On Sun, Jul 16, 2017 at 10:33 AM, Michael Haggerty  
> wrote:
>> * Do you propose to store *all* references (i.e., including the
>> references that we call pseudorefs, like `HEAD`, `FETCH_HEAD`, etc) in
>> reftables, or only the references under `refs/`? If the former, then
>> you need to consider that some pseudorefs contain additional
>> information besides the SHA-1 or linked-to reference. If the latter,
>> then you could get some additional compression by making the `refs/`
>> prefix implicit rather than storing it in the "restart" records
>> explicitly.
>
> Great question I didn't have an answer for. I was planning to store
> HEAD, but not FETCH_HEAD/MERGE_HEAD/etc. Mostly because our storage
> system at $DAY_JOB doesn't have a place to store HEAD itself; it's
> buried down in the reference system. So reftable has to do the job.
>
> The file format for reftable describes a type 0x3 which is just a
> length delimited byte string. Provided that the data fits into a
> single block, reftable can store these larger files with their
> auxiliary data.
>
> I'm open to the idea of HEAD and friends being outside of reftable in
> a git-core accessed repository, but the backend storage I want to run
> reftable on would likely need to store HEAD inside reftable.
>
>>   Personally, I don't think it makes sense to store pseudorefs in
>> reftables. HEAD, though it doesn't include any supplemental
>> information, is read so frequently that it seems like a bad idea to
>> have to go through the lookup process rather than storing it in a
>> separate flat file. Moreover, HEAD is written very *infrequently*, so
>> (absent special treatment) it would tend to sink deep in the reftable
>> stack, making reads even more expensive.
>
> That is a very fair argument for keeping HEAD outside.
>
> A counter-argument is that HEAD is written very frequently by following its
> indirect pointer to the branch (e.g. refs/heads/master). HEAD is
> consequently reflogged very frequently. reftable stores the logs
> inside the table shards. HEAD could be floated onto the top on every
> write to accompany its log record.

On second thought, the idea of having HEAD (or maybe all pseudorefs)
in the same system would open a few interesting possibilities that
derive from having a global, atomic view of all references:

1. We could store backlinks from references to the symbolic references
that refer to them. This would allow us to update the reflogs for
symbolic refs properly. (Currently, there is special-case code to
update the reflogs for HEAD when the reference that it points at is
modified, but not for other symrefs.)

2. We could store "peeled" versions of symbolic refs. These would have
to be updated whenever the pointed-at reference is updated, but that
would have two nice advantages: HEAD would usually be resolvable based
on the top reftable in the stack, and it would be resolvable in one
step (without having to follow the symref explicitly).
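
(To make item 2 concrete, a record holding a symref together with a cached
"peeled" value might look roughly like the struct below; the layout and
field names are invented for illustration and are not part of the reftable
proposal.)

    /*
     * Illustration only; not part of the reftable format.  A symref
     * record that also caches the object id its target currently
     * resolves to, so that HEAD can be resolved from the top of the
     * stack in a single step.
     */
    struct peeled_symref_record {
            const char *refname;            /* e.g. "HEAD" */
            const char *target;             /* e.g. "refs/heads/master" */
            unsigned char peeled[20];       /* object id the target resolved to */
    };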

> [...]
>> * The tuning parameter number_of_restarts currently trades off space
>> (for the full refnames and the restart_offsets) against the need to
>> read and parse more ref_records to get the full refnames. ISTM that
>> this tradeoff could be made less painful by defining a blockwide
>> prefix that is omitted from the refnames as used in the restarts. So
>> the full refname would change from
>>
>>   this_name = prior_name[0..prefix_length] + suffix
>>
>>   to
>>
>>   this_name = block_prefix + prior_name[0..prefix_length] + suffix
>>
>>   I would expect this to allow more frequent restarts at lower space
>> cost.
>
> I've been on the fence about the value of this. It makes the search
> with restarts more difficult to implement, but does allow shrinking a
> handful of very popular prefixes like "refs/" and "refs/pulls/" in
> some blocks.
>
> An older format of reftable used only a block_prefix, and could not
> get nearly as good compression, because too many blocks contained
> references with different prefixes.

It's clear that using *only* a block_prefix wouldn't yield as good
compression as your proposed scheme. And it feels plausible that when
using 64k blocks, a block_prefix wouldn't help much.
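
(For readers following along, a small sketch of the name reconstruction
being discussed; the helper and its signature are invented for
illustration, not taken from any reftable code.)

    #include <stdlib.h>
    #include <string.h>

    /*
     * Reassemble a full refname under the proposed scheme:
     *   this_name = block_prefix + prior_name[0..prefix_length] + suffix
     */
    static char *reconstruct_name(const char *block_prefix,
                                  const char *prior_name, size_t prefix_length,
                                  const char *suffix)
    {
            size_t bp = strlen(block_prefix), sfx = strlen(suffix);
            char *name = malloc(bp + prefix_length + sfx + 1);

            if (!name)
                    return NULL;
            memcpy(name, block_prefix, bp);
            memcpy(name + bp, prior_name, prefix_length);
            memcpy(name + bp + prefix_length, suffix, sfx);
            name[bp + prefix_length + sfx] = '\0';
            return name;
    }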

I'm still not quite resigned to non-Google users wanting to use blocks
as large as 64k, but (short of doing actual experiments, yuck!) I
can't estimate whether it would make any detectable difference in the
real world.

On the other end of the spectrum, I might mention that the
shared-storage "network.git" repositories that we use at GitHub often
have a colossal number of references (basically, the sum of the number
of references in all of the forks in a "repository network", including
some hidden references that users don't see). For example, one
"network.git" repository has 56M references(!). Mercifully, we
currently only have to access these repositories for batch jobs, but,
given a better reference 

Re: [PATCH] PRItime: wrap PRItime for better l10n compatibility

2017-07-17 Thread Jiang Xin
2017-07-18 1:10 GMT+08:00 Junio C Hamano :
> Jiang Xin  writes:
>
>> Commit cb71f8bdb5 ("PRItime: introduce a new "printf format" for
>> timestamps", 2017-04-21) does not play well with the i18n framework. The
>> static string concatenation cannot be correctly interpreted by
>> gettext utilities, such as xgettext.
>>
>> Wrap PRItime in a format_raw_time() function, so that we can extract
>> correct l10n messages into "po/git.pot".
>>
>> Reported-by: Jean-Noël Avila 
>> Signed-off-by: Jiang Xin 
>> ...
>> @@ -191,6 +200,15 @@ struct date_mode *date_mode_from_type(enum 
>> date_mode_type type)
>>   return 
>>  }
>>
>> +const char *format_raw_time(timestamp_t time)
>> +{
>> + static struct strbuf time_buf = STRBUF_INIT;
>> +
>> + strbuf_reset(&time_buf);
>> + strbuf_addf(&time_buf, "%"PRItime, time);
>> + return time_buf.buf;
>> +}
>
> Hmm.
>
> Two potential issues are:
>
>  - After this patch, there still are quite a lot of
>
> printf("time is %"PRItime" ...\n", timestamp)
>
>    so it is unclear to programmers when they are required to use
>    format_raw_time(), and it makes the change/churn larger when an
>    existing message needs to be marked for translation.
>
>  - The static struct strbuf here is a cheap way to avoid leaks, but
>    at the same time it is unfriendly to threaded code.  We could
>    instead do:
>
> void append_PRItime(struct strbuf *buf, timestamp_t time);
>
>    to fix that trivially, but the damage to the caller obviously is
>    much larger going this way.
>
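
(For reference, a minimal sketch of the append_PRItime() variant suggested
above; purely illustrative, not part of any posted patch.)

    #include "cache.h"
    #include "strbuf.h"

    /*
     * The caller owns the strbuf, so no static buffer is shared and the
     * helper stays friendly to threaded code.
     */
    void append_PRItime(struct strbuf *buf, timestamp_t time)
    {
            strbuf_addf(buf, "%"PRItime, time);
    }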

I wonder if we can replace the original %lu for timestamp with PRIuMAX
instead.  PRIuMAX works fine with gettext utils.

-- 
Jiang Xin


Re: [L10N] Kickoff of translation for Git 2.14.0 round 1

2017-07-17 Thread Jiang Xin
2017-07-18 0:06 GMT+08:00 Johannes Schindelin :
> Hi,
>
> On Mon, 17 Jul 2017, Jiang Xin wrote:
>
>> 2017-07-16 3:30 GMT+08:00 Jean-Noël Avila :
>> >
>> >
>> > A few remarks on i18n:
>> >
>> >  * commit cb71f8bdb5 ("PRItime: introduce a new "printf format" for
>> > timestamps") does not play well with the i18n framework. The static string
>> > concatenation cannot be correctly interpreted by msgmerge. I don't know
>> > how we can combine variable format indicators with translatable strings.
>> >
>>
>> We can add a new wrapper for raw timestamp like:
>>
>> +const char *format_raw_time(timestamp_t time)
>> +{
>> +   static struct strbuf time_buf = STRBUF_INIT;
>> +
>> +   strbuf_reset(&time_buf);
>> +   strbuf_addf(&time_buf, "%"PRItime, time);
>> +   return time_buf.buf;
>> +}
>>
>>
>> , and replace macro PRItime in i18n messages with format_raw_time
>> wrapper, like this:
>>
>> -   strbuf_addf(, Q_("%"PRItime" year",
>> "%"PRItime" years", years), years);
>> +   strbuf_addf(, Q_("%s year", "%s years",
>> years), format_raw_time(years));
>
> That would come at the price of complexifying the code just to accommodate
> a translation tool.
>
> How do you gentle people deal with PRIuMAX?

Can we just use PRIuMAX instead of the macro PRItime?  PRIuMAX can be
handled properly by the gettext utilities (xgettext knows the standard
<inttypes.h> PRI* macros), while PRItime cannot.

If we replace PRItime with PRIuMAX like this:

-Q_("%"PRItime" second ago", "%"PRItime"
seconds ago", diff), diff);
+Q_("%"PRIuMAX" second ago", "%"PRIuMAX"
seconds ago", diff), diff);

The above l10n message can be extracted properly by running `make pot`:

#: date.c:122
#, c-format
msgid "%<PRIuMAX> second ago"
msgid_plural "%<PRIuMAX> seconds ago"
msgstr[0] ""
msgstr[1] ""

> Ciao,
> Dscho



-- 
Jiang Xin


[PATCH] t8008: rely on rev-parse'd HEAD instead of sha1 value

2017-07-17 Thread Stefan Beller
Remove hard-coded sha1 values; obtain the values using 'git rev-parse HEAD',
which should be future-proof regardless of the hash function used.

Signed-off-by: Stefan Beller 
---
 t/t8008-blame-formats.sh | 28 +++-
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/t/t8008-blame-formats.sh b/t/t8008-blame-formats.sh
index 92c8e792d1..49cac4b9af 100755
--- a/t/t8008-blame-formats.sh
+++ b/t/t8008-blame-formats.sh
@@ -12,22 +12,25 @@ test_expect_success 'setup' '
echo c >>file &&
echo d >>file &&
test_tick &&
-   git commit -a -m two
+   git commit -a -m two &&
+   ID1=$(git rev-parse HEAD^) &&
+   shortID1="^$(git rev-parse HEAD^ |cut -c 1-7)" &&
+   ID2=$(git rev-parse HEAD) &&
+   shortID2="$(git rev-parse HEAD |cut -c 1-8)"
 '
 
-cat >expect <<'EOF'
-^baf5e0b (A U Thor 2005-04-07 15:13:13 -0700 1) a
-8825379d (A U Thor 2005-04-07 15:14:13 -0700 2) b
-8825379d (A U Thor 2005-04-07 15:14:13 -0700 3) c
-8825379d (A U Thor 2005-04-07 15:14:13 -0700 4) d
+cat >expect expect <


Re: [PATCH v2 1/1] sha1_file: Add support for downloading blobs on demand

2017-07-17 Thread Jonathan Tan
On Mon, 17 Jul 2017 16:09:17 -0400
Ben Peart  wrote:

> > Is this change meant to ensure that Git code that operates on loose
> > objects directly (bypassing storage-agnostic functions such as
> > sha1_object_info_extended() and has_sha1_file()) still work? If yes,
> > this patch appears incomplete (for example, read_loose_object() needs to
> > be changed too), and this seems like a difficult task - in my patch set
> > [1], I ended up deciding to create a separate type of storage and
> > instead looked at the code that operates on *packed* objects directly
> > (because there were fewer such methods) to ensure that they would work
> > correctly in the presence of a separate type of storage.
> > 
> 
> Yes, with this set of patches, we've been running successfully on 
> completely sparse clones (no commits, trees, or blobs) for several 
> months.  read_loose_object() is only called by fsck when it is 
> enumerating existing loose objects so does not need to be updated.

Ah, that's good to know. I think such an analysis (of the other
loose-related functions) in the commit message would be useful, like I
did for the packed-related functions [1].

[1] 
https://public-inbox.org/git/34efd9e9936fdab331655f5a33a098a72dc134f4.1499800530.git.jonathanta...@google.com/

> We have a few thousand developers making ~100K commits per week so in 
> our particular usage, I'm fairly confident it works correctly.  That 
> said, it is possible there is some code path I've missed. :)

I think that code paths like the history-searching ones ("git log -S",
for example) should still work offline if possible - one of my ideas is
to have these commands take a size threshold parameter so that we do not
need to fetch large blobs during the invocation. (Hence my preference
for size information to be already available to the repo.)
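
(Purely as an illustration of that idea; nothing like this exists in the
patch set, and all names below are made up.)

    #include "cache.h"

    /*
     * Hypothetical sketch: decide whether a missing-but-promised blob
     * should be fetched during a history search, given a user-supplied
     * size threshold.
     */
    static int should_fetch_missing_blob(const struct object_id *oid,
                                         unsigned long size,
                                         unsigned long size_threshold)
    {
            if (has_object_file(oid))
                    return 0;       /* already present locally; nothing to fetch */
            return size <= size_threshold;
    }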

Aside from that, does fsck of a partial repo work? Ability to fsck seems
quite important (or, at least, useful). I tried updating fsck to support
omitting commits and trees (in addition to blobs), and it seems
relatively involved (although I didn't look into it deeply yet).

(Also, that investigation also made me realize that, in my patch set, I
didn't handle the case where a tag references a blob - fsck doesn't work
when the blob is missing, even if it is promised. That's something for
me to look into.)


Re: [PATCH] RFC: A new type of symbolic refs

2017-07-17 Thread Stefan Beller
On Mon, Jul 17, 2017 at 2:48 PM, Junio C Hamano  wrote:
> I _think_ a symbolic HEAD that points upwards to the gitlink entry in
> the superproject's index is the easiest to understand and it is
> something we can define a clear and useful semantics for.

ok, let's start with that. Once we have that we can see if that also
makes sense for the non-submodule/superproject world.

> When a recursive checkout of a branch 'foo' is made in the
> superproject, the index in the superproject would name the commit in
> the submodule to be checked out.  We traditionally detach the HEAD
> at the submodule to that commit, but instead we could say "check the
> index of the superproject to see where the HEAD should be pointing
> at" in the submodule.  Either way, immediately after such a
> recursive checkout, "git status" inside the submodule would find
> that the HEAD points at the commit recorded in the 'foo' branch of
> the superproject and things are clean.

Right.

> After you work in the submodule and make a commit, an equivalent of
> "git update-ref HEAD" is done behind the scene to update HEAD in the
> submodule.  In the traditional world, that is done to detached HEAD
> and nothing else changes, but if the symref HEAD points upwards into
> the index of the superproject, what needs to be done is very obvious;
> we do "git add submodule" in the superproject.  And this is not just
> limited to creating a commit in the submodule.  "reset --hard HEAD~2"
> in the submodule to rewind the HEAD by two commits would also be an
> update to HEAD and through the symref-ness of the HEAD should result
> in an update to the index of the superproject.

agreed. we'd find out what HEAD~2 is and then write the gitlink in the
superproject index to be that commit.

> However, I do not think there is a good explanation of what it should
> mean when this new-style symbolic HEAD points at a commit in the
> superproject, whether it's limited to its HEAD or a tip of an arbitrary
> branch that may not even be checked out.

The proposal only inspects the currently staged value of a gitlink; there
is no way to reference a specific commit or (non-checked-out) branch,
so it is not possible to point at them with the given read syntax.
So in case the superproject changes its HEAD (such as checking
out a different branch, or detaching the HEAD), the submodule's
HEAD changes automatically, as the superproject's tree may change.


>  These are not something we can easily
> change without affecting wider context.  Our submodule, when we make
> a new commit, may be ready to advance, but our superproject and
> other submodules may not be ready to be included in a new commit in
> the superproject.

That is an interesting problem, as un-staging the submodule
is not as easy as un-staging a change in a file. Assume we
have a layout as

  git init super
  git -C super init sub
  echo change >>super/sub/file
  git -C super/sub commit -a -m "submodule changes"

then the following behaviors seem reasonable:

git -C super reset --soft sub
  As the superproject's index is not touched, neither are the
  submodule's HEAD and working tree. The submodule
  stays as-is.

git -C super reset --mixed sub
  As this resets the index of the superproject, but not the working
  tree, I'd expect the submodule's HEAD to be detached, keeping
  the new content alive, such that it can be added as
git -C super add --re-attach-with-new-symref-mode sub

git -C super reset --hard sub
  This resets the index and the working tree in the superproject,
  so the submodule's commit with the new content would be lost;
  the submodule's HEAD is still the new symbolic ref.  The sub's
  working tree is dirty and contains the change unless
  --recurse-submodules is given in addition, which would also
  wipe the sub's working directory.

> So I think the idea this patch illustrates is on to something
> interesting and potentially useful, but I am not sure if it makes
> sense to tie it to anything but the index of the superproject.

Hypothetical use case:

  Your project targets multiple platforms and you are not interested
  in producing a perfect history like we do, but you never
  change a commit once created (for policy/legal reasons).
  Naturally this produces a lot of duds.
  Create a gitlink for each platform that points to the latest
  commit known to work for that platform. Point HEAD to the
  repository *itself* and to the gitlink of the platform that you want
  to follow. After each fetch, checking out HEAD gives you the latest
  commit known to work for your platform.  (This hypothetical HEAD
  reference to itself is entertaining for sure.)

> Even if we limit ourselves to pointing at the index of the
> superproject, there probably are a handful of interesting issues
> that need to be clarified (not in the sense of "this and that issues
> exist, so this won't be a useful feature", but in the sense of "we'd
> be able to do these 

Re: [PATCH] doc: reformat the paragraph containing the 'cut-line'

2017-07-17 Thread Junio C Hamano
Kaartic Sivaraam  writes:

> The paragraph that describes the 'scissors' cleanup mode of
> 'commit' had the 'cut-line' in the middle of a sentence. This
> made it possible for the line to get wrapped on smaller windows.
> This shouldn't be the case as it makes it hard for the user to
> understand the structure of the cut-line.
>
> Reformat the paragraph to make the 'cut-line' stand on a line of
> its own, thus distinguishing it from the rest of the paragraph.
> This further prevents it from getting wrapped to some extent.
>
> Signed-off-by: Kaartic Sivaraam 
> ---
>  Documentation/git-commit.txt | 11 ++-
>  1 file changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/git-commit.txt b/Documentation/git-commit.txt
> index afb06adba..2c0d48bab 100644
> --- a/Documentation/git-commit.txt
> +++ b/Documentation/git-commit.txt
> @@ -196,11 +196,12 @@ whitespace::
>  verbatim::
>   Do not change the message at all.
>  scissors::
> - Same as `whitespace`, except that everything from (and
> - including) the line
> - "`# ------------------------ >8 ------------------------`"
> - is truncated if the message is to be edited. "`#`" can be
> - customized with core.commentChar.
> + Same as `whitespace` except that everything from (and including)
> +the line found below is truncated, if the message is to be edited.
> +"`#`" can be customized with core.commentChar.

Is there some funny indentation setting involved?  Why aren't these
lines aligning?

> +
> +# ------------------------ >8 ------------------------
> +

Does an empty line before this one screw up the asciidoc formatting?

If the long line is problematic, I think it is safer to perhaps
spell it out, perhaps something like

... except that everything is ignored down from the
"scissors" line that begins with a core.commentChar (`#` by
default) followed by many dashes `-`, followed by the
scissors mark ` >8 `, followed by many dashes `-`.

???

>  default::
>   Same as `strip` if the message is to be edited.
>   Otherwise `whitespace`.


Re: [PATCH] doc: camelCase the i18n config variables to improve readability

2017-07-17 Thread Junio C Hamano
Kaartic Sivaraam  writes:

> The i18n config variables used weren't readable as they were in
> the crude form of how git stores/uses its config variables.
>
> Improve their readability by replacing them with camelCased versions
> of the config variables, as it doesn't have any impact on their usage.
>
> Signed-off-by: Kaartic Sivaraam 
> ---

Thanks; this makes sense.

>  Documentation/i18n.txt | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/i18n.txt b/Documentation/i18n.txt
> index 2dd79db5c..7e36e5b55 100644
> --- a/Documentation/i18n.txt
> +++ b/Documentation/i18n.txt
> @@ -42,11 +42,11 @@ mind.
>  +
>  
>  [i18n]
> - commitencoding = ISO-8859-1
> + commitEncoding = ISO-8859-1
>  
>  +
>  Commit objects created with the above setting record the value
> -of `i18n.commitencoding` in its `encoding` header.  This is to
> +of `i18n.commitEncoding` in its `encoding` header.  This is to
>  help other people who look at them later.  Lack of this header
>  implies that the commit log message is encoded in UTF-8.
>  
> @@ -54,15 +54,15 @@ implies that the commit log message is encoded in UTF-8.
>`encoding` header of a commit object, and try to re-code the
>log message into UTF-8 unless otherwise specified.  You can
>specify the desired output encoding with
> -  `i18n.logoutputencoding` in `.git/config` file, like this:
> +  `i18n.logOutputEncoding` in `.git/config` file, like this:
>  +
>  
>  [i18n]
> - logoutputencoding = ISO-8859-1
> + logOutputEncoding = ISO-8859-1
>  
>  +
>  If you do not have this configuration variable, the value of
> -`i18n.commitencoding` is used instead.
> +`i18n.commitEncoding` is used instead.
>  
>  Note that we deliberately chose not to re-code the commit log
>  message when a commit is made to force UTF-8 at the commit


Re: What's cooking in git.git (Jul 2017, #04; Thu, 13)

2017-07-17 Thread Junio C Hamano
Santiago Torres  writes:

> Other projects such as notmuch opted for a solution that's similar to
> what I had suggested[1], but I wonder if it's even necessary to do.
> There is already a fix on the master branch of gnupg[2], which I imagine
> will show up to the next version of gpg2.
>
> I don't think it would make sense to fix anything on our side, unless we
> want to be extra sure the test suite is not leaking agents for all gpg
> versions (including these minor versions). 

I am not sure if it is merely "if it's even necessary"; if there are
two tests running in parallel, with their own separate
$TRASH_DIRECTORY, and one of them says "kill the agent" at the
beginning, would it affect the other test, depending on the timing?

I would imagine that the sockets are kept per GNUPGHOME and they are
not going to interfere, so if that is the case, I do not think we
mind helping folks with buggy versions of GnuPG by having a "let's
be cautious and kill a leftover agent before starting to test"
patch, as long as the reason why we do so is clearly understood and
documented.

Thanks for digging down to the root cause.



Re: [PATCH] rebase: make resolve message clearer for inexperienced users

2017-07-17 Thread Junio C Hamano
William Duclot  writes:

> -$(gettext 'When you have resolved this problem, run "git rebase --continue".
> -If you prefer to skip this patch, run "git rebase --skip" instead.
> -To check out the original branch and stop rebasing, run "git rebase 
> --abort".')
> +$(gettext 'Resolve all conflicts manually, mark them as resolved with
> +"git add/rm ", then run "git rebase --continue".
> +You can instead skip this commit: run "git rebase --skip".
> +To abort and get back to the state before "git rebase -i", run "git rebase 
> --abort".')

This may be my fault, but we do not need to limit this to "rebase
-i"; I'll drop " -i" from this line while queuing.

Thanks.


Re: [PATCH] RFC: A new type of symbolic refs

2017-07-17 Thread Junio C Hamano
Stefan Beller  writes:

> +int read_external_symref(struct strbuf *from, struct strbuf *out)
> +{
> + struct child_process cp = CHILD_PROCESS_INIT;
> + const char *repo, *gitlink;
> + int hint, code;
> + struct strbuf **split = strbuf_split(from, 0);
> + struct strbuf cmd_out = STRBUF_INIT;
> +
> + if (!split[0] || !split[1])
> + return -1;
> +
> + repo = split[0]->buf + 5; /* skip 'repo:' */
> + gitlink = split[1]->buf;
> +
> + argv_array_pushl(,
> + "ignored-first-arg",
> + "-C", repo,
> + "ls-tree", "-z", "HEAD", "--", gitlink, NULL);
> +
> + /*
> +  * 17 accounts for '16 commit ',
> +  * the \t before path and trailing \0.
> +  */
> + hint = 17 + GIT_SHA1_HEXSZ + split[1]->len;
> + code = capture_command(&cp, &cmd_out, hint);
> +
> + strbuf_release(split[0]);
> + strbuf_release(split[1]);
> +
> + if (!code) {
> + strbuf_reset(out);
> + strbuf_add(out, cmd_out.buf + strlen("16 commit "),
> +GIT_SHA1_HEXSZ);
> + } else
> + return -1;
> +
> + return 0;
> +}

This may help the initial checkout, but to be useful after that, we
need to define what happens when an equivalent of "git update-ref
HEAD" is done in the submodule repository, when HEAD is pointing
elsewhere.  The above only shows read-only operation, which is not
all that interesting.

I _think_ a symbolic HEAD that points upwards to the gitlink entry in
the superproject's index is the easiest to understand and it is
something we can define a clear and useful semantics for.

When a recursive checkout of a branch 'foo' is made in the
superproject, the index in the superproject would name the commit in
the submodule to be checked out.  We traditionally detach the HEAD
at the submodule to that commit, but instead we could say "check the
index of the superproject to see where the HEAD should be pointing
at" in the submodule.  Either way, immediately after such a
recursive checkout, "git status" inside the submodule would find
that the HEAD points at the commit recorded in the 'foo' branch of
the superproject and things are clean.  

After you work in the submodule and make a commit, an equivalent of
"git update-ref HEAD" is done behind the scene to update HEAD in the
submodule.  In the traditional world, that is done to detached HEAD
and nothing else changes, but if the symref HEAD points upwards into
the index of the superproject, what needs to be done is very obvious;
we do "git add submodule" in the superproject.  And this is not just
limited to creating a commit in the submodule.  "reset --hard HEAD~2"
in the submodule to rewind the HEAD by two commits would also be an
update to HEAD and through the symref-ness of the HEAD should result
in an update to the index of the superproject.

However, I do not think there is a good explanation of what it should
mean when this new-style symbolic HEAD points at a commit in the
superproject, whether it's limited to its HEAD or a tip of an arbitrary
branch that may not even be checked out.  These are not something we can easily
change without affecting wider context.  Our submodule, when we make
a new commit, may be ready to advance, but our superproject and
other submodules may not be ready to be included in a new commit in
the superproject.

So I think the idea this patch illustrates is on to something
interesting and potentially useful, but I am not sure if it makes
sense to tie it to anything but the index of the superproject.

Even if we limit ourselves to pointing at the index of the
superproject, there probably are a handful of interesting issues
that need to be clarified (not in the sense of "this and that issues
exist, so this won't be a useful feature", but in the sense of "we'd
be able to do these useful things using this feature, and we need to
fill in more details"), such as:

 - Making new commits in the submodule available to the upstream.
   Just like a detached HEAD in the submodule, this is not tied to
   any concrete branch, and it is unclear how a recursive "push"
   from the superproject should propagate the changes to the
   upstream of the submodule;

 - Switching between branches that bind the same commit for the
   submodule in the superproject would work just like switching
   between branches that record the same blob for a path, i.e. it
   will carry forward a local modification.

 - The index entry in the superproject may now have to get involved
   in fsck and reachability study in the submodule as reachability
   root.  A corollary to this is that submodules behave more
   similarly to regular blobs wrt "git reset --hard" in the
   superproject, which is a good thing.  "git -C submodule commit &&
   git reset --hard" will create a new commit in the submodule, add
   it to the index of the superproject, and then lose that change
   from the index of the 

Re: What's cooking in git.git (Jul 2017, #04; Thu, 13)

2017-07-17 Thread Santiago Torres
> > I'll dig into this. This sounds a way more reasonable approach.
> 
> Thanks.  Another thing that may help, if it turns out that we do
> want to let agent run when it wants to

I did some digging into the reason why this was happening. It turns
out there is a bug in gnupg. As of gpg 2.1.21, all operations require an
agent running (BTW --no-use-agent is effectively a no-op now).


Given that in versions 2.1.19-21 they changed the way sockets are
created (they are now stored under /run/user/UID/gnupg/...), deleting the
$GNUPGHOME directory wouldn't remove the agent socket, and would leave an
agent instance per test running, possibly forever. E.g., make test would
result in the following:

santiago at ~ ✔ pgrep -a gpg-agent
632 gpg-agent --homedir /git/t/trash directory.t6050-replace/gpghome 
--use-standard-socket --daemon
1192 /usr/bin/gpg-agent --supervised
2939 gpg-agent --homedir /git/t/trash 
directory.t5801-remote-helpers/gpghome --use-standard-socket --daemon
4656 gpg-agent --homedir /git/t/trash directory.t6300-for-each-ref/gpghome 
--use-standard-socket --daemon
5427 gpg-agent --homedir /git/t/trash directory.t7510-signed-commit/gpghome 
--use-standard-socket --daemon
5898 gpg-agent --homedir /git/t/trash 
directory.t6302-for-each-ref-filter/gpghome --use-standard-socket --daemon
7747 gpg-agent --homedir /git/t/trash directory.t7003-filter-branch/gpghome 
--use-standard-socket --daemon
12922 gpg-agent --homedir /git/t/trash directory.t7600-merge/gpghome 
--use-standard-socket --daemon
13572 gpg-agent --homedir /git/t/trash directory.t7004-tag/gpghome 
--use-standard-socket --daemon
14521 gpg-agent --homedir /git/t/trash directory.t5534-push-signed/gpghome 
--use-standard-socket --daemon
16563 gpg-agent --homedir /git/t/trash 
directory.t5541-http-push-smart/gpghome --use-standard-socket --daemon
17853 gpg-agent --homedir /git/t/trash directory.t7030-verify-tag/gpghome 
--use-standard-socket --daemon
29858 gpg-agent --homedir /git/t/trash 
directory.t7612-merge-verify-signatures/gpghome --use-standard-socket --daemon
31100 gpg-agent --homedir /git/t/trash directory.t4202-log/gpghome 
--use-standard-socket --daemon

Other projects such as notmuch opted for a solution that's similar to
what I had suggested[1], but I wonder if it's even necessary to do.
There is already a fix on the master branch of gnupg[2], which I imagine
will show up in the next version of gpg2.

I don't think it would make sense to fix anything on our side, unless we
want to be extra sure the test suite is not leaking agents for all gpg
versions (including these minor versions). 

What is your take?

Thanks!
-Santiago.

[1] https://paste.debian.net/976970/
[2] https://dev.gnupg.org/T3218


signature.asc
Description: PGP signature


Re: [PATCH] check-ref-format: require a repository for --branch

2017-07-17 Thread Marko Kungla
I guess the point is only that it should not hit a BUG. In my case,
in the example I gave, I scan through the directories to check
repository status; one of the tasks makes use of check-ref-format.
Since it may hit a directory which is not a git repository, it should
not report a BUG error, right?

On Mon, Jul 17, 2017 at 7:27 PM, Jonathan Nieder  wrote:
> Hi,
>
> Jeff King wrote:
>> On Fri, Jul 14, 2017 at 02:03:13PM -0400, Jeff King wrote:
>
>>> So I think the patch below is probably the right direction.
>>
>> And here it is with a real commit message, if this is what we want to
>> do.
> [...]
>> --- a/t/t1402-check-ref-format.sh
>> +++ b/t/t1402-check-ref-format.sh
>> @@ -161,6 +161,10 @@ test_expect_success 'check-ref-format --branch from 
>> subdir' '
>>   test "$refname" = "$sha1"
>>  '
>>
>> +test_expect_success 'check-ref-format --branch from non-repo' '
>> + test_must_fail nongit git check-ref-format --branch @{-1}
>> +'
>> +
>>  valid_ref_normalized() {
>>   prereq=
>>   case $1 in
>
> I don't think it's right.  Today I can do
>
> $ cd /tmp
> $ git check-ref-format --branch master
> master
>
> You might wonder why I'd ever do such a thing.  But that's what "git
> check-ref-format --branch" is for --- it is for taking a <branch>
> argument and turning it into a branch name.  For example, if you have
> a script with an $opt_branch variable that defaults to "master", it
> may do
>
> resolved_branch=$(git check-ref-format --branch "$opt_branch")
>
> even though it is in a mode that is not going to have to use
> $resolved_branch and it is not running from a repository.
>
> Thanks and hope that helps,
> Jonathan



-- 
Mail:marko.kun...@gmail.com
Phone: +31 (0) 6 2546 0117


Re: [PATCH v2 2/3] trailers: export action enums and corresponding lookup functions

2017-07-17 Thread Junio C Hamano
Christian Couder  writes:

> On Thu, Jul 13, 2017 at 12:21 AM, Paolo Bonzini  wrote:
>
>> diff --git a/trailer.h b/trailer.h
>> index e90ba1270..f306bf059 100644
>> --- a/trailer.h
>> +++ b/trailer.h
>> @@ -1,11 +1,33 @@
>>  #ifndef TRAILER_H
>>  #define TRAILER_H
>>
>> +enum action_where {
>> +   WHERE_END,
>> +   WHERE_AFTER,
>> +   WHERE_BEFORE,
>> +   WHERE_START
>> +};
>> +enum action_if_exists {
>> +   EXISTS_ADD_IF_DIFFERENT_NEIGHBOR,
>> +   EXISTS_ADD_IF_DIFFERENT,
>> +   EXISTS_ADD,
>> +   EXISTS_REPLACE,
>> +   EXISTS_DO_NOTHING
>> +};
>> +enum action_if_missing {
>> +   MISSING_ADD,
>> +   MISSING_DO_NOTHING
>> +};
>
> As these enums are now in trailer.h, maybe more specific names like
> "trailer_action_where" instead of "action_where" would be better.
>
>>  struct trailer_opts {
>> int in_place;
>> int trim_empty;
>>  };
>>
>> +int set_where(enum action_where *item, const char *value);
>> +int set_if_exists(enum action_if_exists *item, const char *value);
>> +int set_if_missing(enum action_if_missing *item, const char *value);
>
> "trailer_" should perhaps be added at the beginning of the names of
> the above functions too.

All sensible suggestions.  Thanks.


Re: [ANNOUNCE] Git v2.14.0-rc0

2017-07-17 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason  writes:

> On Thu, Jul 13 2017, Junio C. Hamano jotted:
>
> Proposed improvements for the release notes (is this a good way to
> propose RelNotes changes?)

Thanks.  You could also throw a patch just like any bugfix/update
to documentation, I would think.

> I think this may explain it better:
>
>  * The "[includeIf "gitdir:$dir"] path=..." mechanism introduced in
>2.13.0 would canonicalize the path of the gitdir being
>matched.
>
>Therefore it wouldn't match e.g. "gitdir:~/work/*" against a repo in
>"~/work/main" if ~/work was a symlink to "/mnt/storage/work".
>
>Now we match both the resolved canonical path and what "pwd" would
>show. The include will happen if either one matches.

Will use this (and some others) verbatim ;-)  Thanks.

>>  * Update "perl-compatible regular expression" support to enable JIT
>>and also allow linking with the newer PCRE v2 library.
>
> At the risk of advertising work I've done too much, I think it makes
> sense to split this into two separate and somewhat more verbose items:

As I shoot for a shorter summary, going into too much detail in
these entries is not welcome.

However, an exception is the top part of the release notes where we
discuss backward incompatible changes etc. that helps people to
decide the deployment strategy.  Encouraging migration from v1 to v2
belongs there, I would think.


Re: [PATCH] check for search permission

2017-07-17 Thread Junio C Hamano
Hielke Christian Braun  writes:

> ---
>  gitweb/gitweb.perl | 2 ++
>  1 file changed, 2 insertions(+)

Thanks for trying to help and welcome to Git development community.
But

 (1) Please double-check the title of your change.  Imagine that the
 title appears in a list of 600 other commits that goes in a
 single release in "git shortlog --no-merges" output.  Does it
 tell readers of the list what the change is about?  We cannot
 even guess that it is about the project list that appears in
 gitweb output.

 (2) Please explain what problem this is trying to solve; that is
 what the blank space before the "---" line we see up above is for.
 What happens in the current code, under what condition, when we
 do not apply this patch, and why is it a bad thing to happen?
 Once we apply this patch, in what way does the situation get
 improved?

 (3) Please sign-off your patch (see SubmittingPatches in
 Documentation).

Thanks.

>
> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> index 3d4a8ee27c96a..9208f42ed1753 100755
> --- a/gitweb/gitweb.perl
> +++ b/gitweb/gitweb.perl
> @@ -3071,6 +3071,8 @@ sub git_get_projects_list {
>   return if (m!^[/.]$!);
>   # only directories can be git repositories
>   return unless (-d $_);
> + # need search permission
> + return unless (-x $_);
>   # don't traverse too deep (Find is super slow 
> on os x)
>   # $project_maxdepth excludes depth of 
> $projectroot
>   if (($File::Find::name =~ tr!/!!) - $pfxdepth > 
> $project_maxdepth) {
>
> --
> https://github.com/git/git/pull/384


Re: [PATCH] commit: check for empty message before the check for untouched template

2017-07-17 Thread Junio C Hamano
Kaartic Sivaraam  writes:

> The check for whether the template given to 'git commit' is untouched
> is done before the empty message check. This results in a wrong error
> message being displayed in the following case. When the user removes
> everything in template completely to abort the commit he is shown the
> "template untouched" error which is wrong. He should be shown the
> "empty message" error.
>
> Do the empty message check before checking for an untouched template
> thus fixing this issue.
>
> Signed-off-by: Kaartic Sivaraam 
> ---
>  builtin/commit.c | 12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)

Makes sense to me.

> diff --git a/builtin/commit.c b/builtin/commit.c
> index aff6bf7aa..9c299d792 100644
> --- a/builtin/commit.c
> +++ b/builtin/commit.c
> @@ -1736,19 +1736,19 @@ int cmd_commit(int argc, const char **argv, const 
> char *prefix)
>   if (verbose || /* Truncate the message just before the diff, if any. */
>   cleanup_mode == CLEANUP_SCISSORS)
>   strbuf_setlen(&sb, wt_status_locate_end(sb.buf, sb.len));
> -
>   if (cleanup_mode != CLEANUP_NONE)
>   strbuf_stripspace(&sb, cleanup_mode == CLEANUP_ALL);
> - if (template_untouched() && !allow_empty_message) {
> - rollback_index_files();
> - fprintf(stderr, _("Aborting commit; you did not edit the 
> message.\n"));
> - exit(1);
> - }
> +
>   if (message_is_empty() && !allow_empty_message) {
>   rollback_index_files();
>   fprintf(stderr, _("Aborting commit due to empty commit 
> message.\n"));
>   exit(1);
>   }
> + if (template_untouched() && !allow_empty_message) {
> + rollback_index_files();
> + fprintf(stderr, _("Aborting commit; you did not edit the 
> message.\n"));
> + exit(1);
> + }
>  
>   if (amend) {
>   const char *exclude_gpgsig[2] = { "gpgsig", NULL };


[PATCH v2 10/10] git.c: ignore pager.* when launching builtin as dashed external

2017-07-17 Thread Martin Ågren
When running, e.g., `git -c alias.bar=foo bar`, we expand the alias and
execute `git-foo` as a dashed external. This is true even if git foo is
a builtin. That is on purpose, and is motivated in a comment which was
added in commit 441981bc ("git: simplify environment save/restore
logic", 2016-01-26).

Shortly before we launch a dashed external, and unless we have already
found out whether we should use a pager, we check `pager.foo`. This was
added in commit 92058e4d ("support pager.* for external commands",
2011-08-18). If the dashed external is a builtin, this does not match
that commit's intention and is arguably wrong, since it would be cleaner
if we let the "dashed external builtin" handle `pager.foo`.

This has not mattered in practice, but a recent patch taught `git-tag`
to ignore `pager.tag` under certain circumstances. But, when started
using an alias, it doesn't get the chance to do so, as outlined above.
That recent patch added a test to document this breakage.

Do not check `pager.foo` before launching a builtin as a dashed
external, i.e., if we recognize the name of the external as a builtin.
Change the test to use `test_expect_success`.

Signed-off-by: Martin Ågren 
---
One could address this in run_argv(), by making the second call to
execv_dashed_external() conditional on "!is_builtin()" whereas a builtin
would be started as "git foo". (Possibly after unrolling and cleaning up
the "while (1)"-loop.) That seems like the wrong fix for this particular
issue, but might be a wanted change on its own -- or maybe not --, since
it would mean one could relay, e.g., "-c baz" to "git -c baz foo" (but
only for builtins...).
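
A rough sketch of that alternative (ignoring the pager handling and other
details of the real run_argv(), and only meant to illustrate the idea, not
to be applied) could look like this:

    static int run_argv(int *argcp, const char ***argv)
    {
            int done_alias = 0;

            while (1) {
                    /*
                     * Try builtins both before and after alias expansion;
                     * after expansion this runs the builtin in-process
                     * instead of spawning "git-foo".
                     */
                    if (!done_alias || is_builtin(**argv))
                            handle_builtin(*argcp, *argv);

                    /* Fall back to a dashed external only for non-builtins. */
                    if (!is_builtin(**argv))
                            execv_dashed_external(*argv);

                    if (done_alias)
                            break;
                    if (!handle_alias(argcp, argv))
                            break;
                    done_alias = 1;
            }

            return done_alias;
    }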

 t/t7006-pager.sh | 2 +-
 git.c| 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/t/t7006-pager.sh b/t/t7006-pager.sh
index df258c5d4..8b2ffb1aa 100755
--- a/t/t7006-pager.sh
+++ b/t/t7006-pager.sh
@@ -201,7 +201,7 @@ test_expect_success TTY 'git tag -a respects --paginate' '
test -e paginated.out
 '
 
-test_expect_failure TTY 'git tag as alias ignores pager.tag with -a' '
+test_expect_success TTY 'git tag as alias ignores pager.tag with -a' '
# git-tag will be launched as a dashed external, which
# 1) is the source of a potential bug, and
# 2) is why we use test_config and not -c.
diff --git a/git.c b/git.c
index 82ac2a092..6b6d9f68e 100644
--- a/git.c
+++ b/git.c
@@ -559,7 +559,7 @@ static void execv_dashed_external(const char **argv)
if (get_super_prefix())
die("%s doesn't support --super-prefix", argv[0]);
 
-   if (use_pager == -1)
+   if (use_pager == -1 && !is_builtin(argv[0]))
use_pager = check_pager_config(argv[0]);
commit_pager_choice();
 
-- 
2.14.0.rc0



[PATCH v2 08/10] tag: respect `pager.tag` in list-mode only

2017-07-17 Thread Martin Ågren
Using, e.g., `git -c pager.tag tag -a new-tag` results in errors such as
"Vim: Warning: Output is not to a terminal" and a garbled terminal. A
user who makes use of `git tag -a` and `git tag -l` will probably choose
not to configure `pager.tag` or to set it to "no", so that `git tag -a`
will actually work, at the cost of not getting the pager with `git tag
-l`.

Teach git tag to only respect `pager.tag` when running in list-mode.
Update the documentation and update tests.

If an alias is used to run `git tag -a`, then `pager.tag` will still be
respected. Document this known breakage. It will be fixed in a later
commit. Add a similar test for `-l`, which works as it should.

Noticed-by: Anatoly Borodin 
Suggested-by: Jeff King 
Signed-off-by: Martin Ågren 
---
 Documentation/git-tag.txt |  3 +++
 t/t7006-pager.sh  | 25 +
 builtin/tag.c |  3 ++-
 3 files changed, 26 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-tag.txt b/Documentation/git-tag.txt
index 1eb15afa1..875d135e0 100644
--- a/Documentation/git-tag.txt
+++ b/Documentation/git-tag.txt
@@ -205,6 +205,9 @@ it in the repository configuration as follows:
 signingKey = 
 -
 
+`pager.tag` is only respected when listing tags, i.e., when `-l` is
+used or implied.
+See linkgit:git-config[1].
 
 DISCUSSION
 --
diff --git a/t/t7006-pager.sh b/t/t7006-pager.sh
index e7430bc93..a357436e1 100755
--- a/t/t7006-pager.sh
+++ b/t/t7006-pager.sh
@@ -187,18 +187,35 @@ test_expect_success TTY 'git tag -a defaults to not 
paging' '
! test -e paginated.out
 '
 
-test_expect_success TTY 'git tag -a respects pager.tag' '
+test_expect_success TTY 'git tag -a ignores pager.tag' '
test_when_finished "git tag -d newtag" &&
rm -f paginated.out &&
test_terminal git -c pager.tag tag -am message newtag &&
-   test -e paginated.out
+   ! test -e paginated.out
 '
 
 test_expect_success TTY 'git tag -a respects --paginate' '
test_when_finished "git tag -d newtag" &&
rm -f paginated.out &&
-   test_terminal git -c pager.tag=false --paginate \
-   tag -am message newtag &&
+   test_terminal git --paginate tag -am message newtag &&
+   test -e paginated.out
+'
+
+test_expect_failure TTY 'git tag as alias ignores pager.tag with -a' '
+   # git-tag will be launched as a dashed external, which
+   # 1) is the source of a potential bug, and
+   # 2) is why we use test_config and not -c.
+   test_when_finished "git tag -d newtag" &&
+   rm -f paginated.out &&
+   test_config pager.tag true &&
+   test_terminal git -c alias.t=tag t -am message newtag &&
+   ! test -e paginated.out
+'
+
+test_expect_success TTY 'git tag as alias respects pager.tag with -l' '
+   rm -f paginated.out &&
+   test_config pager.tag true &&
+   test_terminal git -c alias.t=tag t -l &&
test -e paginated.out
 '
 
diff --git a/builtin/tag.c b/builtin/tag.c
index 9eda434fb..5ad1af252 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -461,7 +461,8 @@ int cmd_tag(int argc, const char **argv, const char *prefix)
cmdmode = 'l';
}
 
-   setup_auto_pager("tag", 0);
+   if (cmdmode == 'l')
+   setup_auto_pager("tag", 0);
 
if ((create_tag_object || force) && (cmdmode != 0))
usage_with_options(git_tag_usage, options);
-- 
2.14.0.rc0



[PATCH v2 07/10] tag: handle `pager.tag`-configuration within the builtin

2017-07-17 Thread Martin Ågren
Use the mechanisms introduced in two earlier patches to ignore
`pager.tag` in git.c and let the `git tag` builtin handle it on its own.

This is in preparation for the next patch, where we will want to ignore
`pager.tag` unless we run in list-mode. For this reason, place the call
to setup_auto_pager() after the options have been parsed and we have
decided whether we are in list-mode.

No functional change is intended. That said, there is a window between
where the pager is started before and after this patch, and if an error
occurs within this window, as of this patch the error message might not
be paged where it would have been paged before. Since
operation-parsing has to happen inside this window, a difference can be
seen with, e.g., `git -c pager.tag="echo pager is used" tag
--unknown-option`. This change in paging-behavior should be acceptable
since it only affects erroneous usages.

Signed-off-by: Martin Ågren 
---
 builtin/tag.c | 2 ++
 git.c | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/builtin/tag.c b/builtin/tag.c
index 01154ea8d..9eda434fb 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -461,6 +461,8 @@ int cmd_tag(int argc, const char **argv, const char *prefix)
cmdmode = 'l';
}
 
+   setup_auto_pager("tag", 0);
+
if ((create_tag_object || force) && (cmdmode != 0))
usage_with_options(git_tag_usage, options);
 
diff --git a/git.c b/git.c
index 66832f232..82ac2a092 100644
--- a/git.c
+++ b/git.c
@@ -466,7 +466,7 @@ static struct cmd_struct commands[] = {
{ "stripspace", cmd_stripspace },
{ "submodule--helper", cmd_submodule__helper, RUN_SETUP | 
SUPPORT_SUPER_PREFIX},
{ "symbolic-ref", cmd_symbolic_ref, RUN_SETUP },
-   { "tag", cmd_tag, RUN_SETUP },
+   { "tag", cmd_tag, RUN_SETUP | DELAY_PAGER_CONFIG },
{ "unpack-file", cmd_unpack_file, RUN_SETUP },
{ "unpack-objects", cmd_unpack_objects, RUN_SETUP },
{ "update-index", cmd_update_index, RUN_SETUP },
-- 
2.14.0.rc0



[PATCH v2 09/10] tag: change default of `pager.tag` to "on"

2017-07-17 Thread Martin Ågren
The previous patch taught `git tag` to only respect `pager.tag` in
list-mode. That patch left the default value of `pager.tag` at "off".

After that patch, it makes sense to let the default value be "on"
instead, since it will help with listing many tags, but will not hurt
users of `git tag -a` as it would have before. Make that change. Update
documentation and tests.

Signed-off-by: Martin Ågren 
---
 Documentation/git-tag.txt |  2 +-
 t/t7006-pager.sh  | 28 ++--
 builtin/tag.c |  2 +-
 3 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/Documentation/git-tag.txt b/Documentation/git-tag.txt
index 875d135e0..d97aad343 100644
--- a/Documentation/git-tag.txt
+++ b/Documentation/git-tag.txt
@@ -206,7 +206,7 @@ it in the repository configuration as follows:
 -
 
 `pager.tag` is only respected when listing tags, i.e., when `-l` is
-used or implied.
+used or implied. The default is to use a pager.
 See linkgit:git-config[1].
 
 DISCUSSION
diff --git a/t/t7006-pager.sh b/t/t7006-pager.sh
index a357436e1..df258c5d4 100755
--- a/t/t7006-pager.sh
+++ b/t/t7006-pager.sh
@@ -134,16 +134,16 @@ test_expect_success TTY 'configuration can enable pager 
(from subdir)' '
}
 '
 
-test_expect_success TTY 'git tag -l defaults to not paging' '
+test_expect_success TTY 'git tag -l defaults to paging' '
rm -f paginated.out &&
test_terminal git tag -l &&
-   ! test -e paginated.out
+   test -e paginated.out
 '
 
 test_expect_success TTY 'git tag -l respects pager.tag' '
rm -f paginated.out &&
-   test_terminal git -c pager.tag tag -l &&
-   test -e paginated.out
+   test_terminal git -c pager.tag=false tag -l &&
+   ! test -e paginated.out
 '
 
 test_expect_success TTY 'git tag -l respects --no-pager' '
@@ -152,32 +152,32 @@ test_expect_success TTY 'git tag -l respects --no-pager' '
! test -e paginated.out
 '
 
-test_expect_success TTY 'git tag with no args defaults to not paging' '
+test_expect_success TTY 'git tag with no args defaults to paging' '
# no args implies -l so this should page like -l
rm -f paginated.out &&
test_terminal git tag &&
-   ! test -e paginated.out
+   test -e paginated.out
 '
 
 test_expect_success TTY 'git tag with no args respects pager.tag' '
# no args implies -l so this should page like -l
rm -f paginated.out &&
-   test_terminal git -c pager.tag tag &&
-   test -e paginated.out
+   test_terminal git -c pager.tag=no tag &&
+   ! test -e paginated.out
 '
 
-test_expect_success TTY 'git tag --contains defaults to not paging' '
+test_expect_success TTY 'git tag --contains defaults to paging' '
# --contains implies -l so this should page like -l
rm -f paginated.out &&
test_terminal git tag --contains &&
-   ! test -e paginated.out
+   test -e paginated.out
 '
 
 test_expect_success TTY 'git tag --contains respects pager.tag' '
# --contains implies -l so this should page like -l
rm -f paginated.out &&
-   test_terminal git -c pager.tag tag --contains &&
-   test -e paginated.out
+   test_terminal git -c pager.tag=false tag --contains &&
+   ! test -e paginated.out
 '
 
 test_expect_success TTY 'git tag -a defaults to not paging' '
@@ -214,9 +214,9 @@ test_expect_failure TTY 'git tag as alias ignores pager.tag 
with -a' '
 
 test_expect_success TTY 'git tag as alias respects pager.tag with -l' '
rm -f paginated.out &&
-   test_config pager.tag true &&
+   test_config pager.tag false &&
test_terminal git -c alias.t=tag t -l &&
-   test -e paginated.out
+   ! test -e paginated.out
 '
 
 # A colored commit log will begin with an appropriate ANSI escape
diff --git a/builtin/tag.c b/builtin/tag.c
index 5ad1af252..ea83df5e1 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -462,7 +462,7 @@ int cmd_tag(int argc, const char **argv, const char *prefix)
}
 
if (cmdmode == 'l')
-   setup_auto_pager("tag", 0);
+   setup_auto_pager("tag", 1);
 
if ((create_tag_object || force) && (cmdmode != 0))
usage_with_options(git_tag_usage, options);
-- 
2.14.0.rc0



[PATCH v2 06/10] t7006: add tests for how git tag paginates

2017-07-17 Thread Martin Ågren
Using, e.g., `git -c pager.tag tag -a new-tag` results in errors such as
"Vim: Warning: Output is not to a terminal" and a garbled terminal. A
user who makes use of `git tag -a` and `git tag -l` will probably choose
not to configure `pager.tag` or to set it to "no", so that `git tag -a`
will actually work, at the cost of not getting the pager with `git tag
-l`.

Since we're about to change how `git tag` respects `pager.tag`, add tests
around this, including how the configuration is ignored if --no-pager or
--paginate are used.

Construct tests with a few different subcommands. First, use -l. Second,
use "no arguments" and --contains, since those imply -l. (There are
more arguments which imply -l, but using these two should be enough.)
Third, use -a as a representative for "not -l".

Make one of the tests demonstrate the behavior mentioned above, where
`git tag -a` respects `pager.tag`. Actually, the tests use `git tag -am`
so no editor is launched, but that is irrelevant, since we just want to
see whether the pager is used or not.

Signed-off-by: Martin Ågren 
---
 t/t7006-pager.sh | 68 
 1 file changed, 68 insertions(+)

diff --git a/t/t7006-pager.sh b/t/t7006-pager.sh
index 20b4d83c2..e7430bc93 100755
--- a/t/t7006-pager.sh
+++ b/t/t7006-pager.sh
@@ -134,6 +134,74 @@ test_expect_success TTY 'configuration can enable pager 
(from subdir)' '
}
 '
 
+test_expect_success TTY 'git tag -l defaults to not paging' '
+   rm -f paginated.out &&
+   test_terminal git tag -l &&
+   ! test -e paginated.out
+'
+
+test_expect_success TTY 'git tag -l respects pager.tag' '
+   rm -f paginated.out &&
+   test_terminal git -c pager.tag tag -l &&
+   test -e paginated.out
+'
+
+test_expect_success TTY 'git tag -l respects --no-pager' '
+   rm -f paginated.out &&
+   test_terminal git -c pager.tag --no-pager tag -l &&
+   ! test -e paginated.out
+'
+
+test_expect_success TTY 'git tag with no args defaults to not paging' '
+   # no args implies -l so this should page like -l
+   rm -f paginated.out &&
+   test_terminal git tag &&
+   ! test -e paginated.out
+'
+
+test_expect_success TTY 'git tag with no args respects pager.tag' '
+   # no args implies -l so this should page like -l
+   rm -f paginated.out &&
+   test_terminal git -c pager.tag tag &&
+   test -e paginated.out
+'
+
+test_expect_success TTY 'git tag --contains defaults to not paging' '
+   # --contains implies -l so this should page like -l
+   rm -f paginated.out &&
+   test_terminal git tag --contains &&
+   ! test -e paginated.out
+'
+
+test_expect_success TTY 'git tag --contains respects pager.tag' '
+   # --contains implies -l so this should page like -l
+   rm -f paginated.out &&
+   test_terminal git -c pager.tag tag --contains &&
+   test -e paginated.out
+'
+
+test_expect_success TTY 'git tag -a defaults to not paging' '
+   test_when_finished "git tag -d newtag" &&
+   rm -f paginated.out &&
+   test_terminal git tag -am message newtag &&
+   ! test -e paginated.out
+'
+
+test_expect_success TTY 'git tag -a respects pager.tag' '
+   test_when_finished "git tag -d newtag" &&
+   rm -f paginated.out &&
+   test_terminal git -c pager.tag tag -am message newtag &&
+   test -e paginated.out
+'
+
+test_expect_success TTY 'git tag -a respects --paginate' '
+   test_when_finished "git tag -d newtag" &&
+   rm -f paginated.out &&
+   test_terminal git -c pager.tag=false --paginate \
+   tag -am message newtag &&
+   test -e paginated.out
+'
+
 # A colored commit log will begin with an appropriate ANSI escape
 # for the first color; the text "commit" comes later.
 colorful() {
-- 
2.14.0.rc0



[PATCH v2 01/10] builtin.h: take over documentation from api-builtin.txt

2017-07-17 Thread Martin Ågren
Delete Documentation/technical/api-builtin.txt and move its content
verbatim into builtin.h. Just wrap it in /* ... */. In order to make
the move obviously correct, do not change any formatting, not even to
format the comment into Git's preferred style. That will be done in a
follow-up patch.

Signed-off-by: Martin Ågren 
---
 Documentation/technical/api-builtin.txt | 73 ---
 builtin.h   | 76 +
 2 files changed, 76 insertions(+), 73 deletions(-)
 delete mode 100644 Documentation/technical/api-builtin.txt

diff --git a/Documentation/technical/api-builtin.txt 
b/Documentation/technical/api-builtin.txt
deleted file mode 100644
index 22a39b929..0
--- a/Documentation/technical/api-builtin.txt
+++ /dev/null
@@ -1,73 +0,0 @@
-builtin API
-===
-
-Adding a new built-in
--
-
-There are 4 things to do to add a built-in command implementation to
-Git:
-
-. Define the implementation of the built-in command `foo` with
-  signature:
-
-   int cmd_foo(int argc, const char **argv, const char *prefix);
-
-. Add the external declaration for the function to `builtin.h`.
-
-. Add the command to the `commands[]` table defined in `git.c`.
-  The entry should look like:
-
-   { "foo", cmd_foo,  },
-+
-where options is the bitwise-or of:
-
-`RUN_SETUP`::
-   If there is not a Git directory to work on, abort.  If there
-   is a work tree, chdir to the top of it if the command was
-   invoked in a subdirectory.  If there is no work tree, no
-   chdir() is done.
-
-`RUN_SETUP_GENTLY`::
-   If there is a Git directory, chdir as per RUN_SETUP, otherwise,
-   don't chdir anywhere.
-
-`USE_PAGER`::
-
-   If the standard output is connected to a tty, spawn a pager and
-   feed our output to it.
-
-`NEED_WORK_TREE`::
-
-   Make sure there is a work tree, i.e. the command cannot act
-   on bare repositories.
-   This only makes sense when `RUN_SETUP` is also set.
-
-. Add `builtin/foo.o` to `BUILTIN_OBJS` in `Makefile`.
-
-Additionally, if `foo` is a new command, there are 3 more things to do:
-
-. Add tests to `t/` directory.
-
-. Write documentation in `Documentation/git-foo.txt`.
-
-. Add an entry for `git-foo` to `command-list.txt`.
-
-. Add an entry for `/git-foo` to `.gitignore`.
-
-
-How a built-in is called
-
-
-The implementation `cmd_foo()` takes three parameters, `argc`, `argv,
-and `prefix`.  The first two are similar to what `main()` of a
-standalone command would be called with.
-
-When `RUN_SETUP` is specified in the `commands[]` table, and when you
-were started from a subdirectory of the work tree, `cmd_foo()` is called
-after chdir(2) to the top of the work tree, and `prefix` gets the path
-to the subdirectory the command started from.  This allows you to
-convert a user-supplied pathname (typically relative to that directory)
-to a pathname relative to the top of the work tree.
-
-The return value from `cmd_foo()` becomes the exit status of the
-command.
diff --git a/builtin.h b/builtin.h
index 498ac80d0..51cb0249d 100644
--- a/builtin.h
+++ b/builtin.h
@@ -6,6 +6,82 @@
 #include "cache.h"
 #include "commit.h"
 
+/*
+builtin API
+===
+
+Adding a new built-in
+-
+
+There are 4 things to do to add a built-in command implementation to
+Git:
+
+. Define the implementation of the built-in command `foo` with
+  signature:
+
+   int cmd_foo(int argc, const char **argv, const char *prefix);
+
+. Add the external declaration for the function to `builtin.h`.
+
+. Add the command to the `commands[]` table defined in `git.c`.
+  The entry should look like:
+
+   { "foo", cmd_foo,  },
++
+where options is the bitwise-or of:
+
+`RUN_SETUP`::
+   If there is not a Git directory to work on, abort.  If there
+   is a work tree, chdir to the top of it if the command was
+   invoked in a subdirectory.  If there is no work tree, no
+   chdir() is done.
+
+`RUN_SETUP_GENTLY`::
+   If there is a Git directory, chdir as per RUN_SETUP, otherwise,
+   don't chdir anywhere.
+
+`USE_PAGER`::
+
+   If the standard output is connected to a tty, spawn a pager and
+   feed our output to it.
+
+`NEED_WORK_TREE`::
+
+   Make sure there is a work tree, i.e. the command cannot act
+   on bare repositories.
+   This only makes sense when `RUN_SETUP` is also set.
+
+. Add `builtin/foo.o` to `BUILTIN_OBJS` in `Makefile`.
+
+Additionally, if `foo` is a new command, there are 3 more things to do:
+
+. Add tests to `t/` directory.
+
+. Write documentation in `Documentation/git-foo.txt`.
+
+. Add an entry for `git-foo` to `command-list.txt`.
+
+. Add an entry for `/git-foo` to `.gitignore`.
+
+
+How a built-in is called
+
+
+The implementation `cmd_foo()` takes three parameters, `argc`, `argv,
+and `prefix`.  The first two are similar to 

[PATCH v2 03/10] builtin.h: document SUPPORT_SUPER_PREFIX

2017-07-17 Thread Martin Ågren
Commit 74866d75 ("git: make super-prefix option", 2016-10-07) introduced
SUPPORT_SUPER_PREFIX as a builtin flag without documenting it. The next
patch will add another flag, so document SUPPORT_SUPER_PREFIX, thereby
bringing the documentation up to date with the available flags.

While at it, correct '3 more things to do' to '4 more things to do'.

Signed-off-by: Martin Ågren 
---
 builtin.h | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/builtin.h b/builtin.h
index 62f22b547..7bcc08456 100644
--- a/builtin.h
+++ b/builtin.h
@@ -51,9 +51,13 @@
  * on bare repositories.
  * This only makes sense when `RUN_SETUP` is also set.
  *
+ * `SUPPORT_SUPER_PREFIX`::
+ *
+ * The builtin supports `--super-prefix`.
+ *
  * . Add `builtin/foo.o` to `BUILTIN_OBJS` in `Makefile`.
  *
- * Additionally, if `foo` is a new command, there are 3 more things to do:
+ * Additionally, if `foo` is a new command, there are 4 more things to do:
  *
  * . Add tests to `t/` directory.
  *
-- 
2.14.0.rc0



[PATCH v2 05/10] git.c: provide setup_auto_pager()

2017-07-17 Thread Martin Ågren
The previous patch introduced a way for builtins to declare that they
will take responsibility for handling the `pager.foo`-config item. (See
the commit message of that patch for why that could be useful.)

Provide setup_auto_pager(), which builtins can call in order to handle
`pager.<cmd>`, including possibly starting the pager. Make this function
do nothing if a pager has already been started, as indicated by
use_pager or pager_in_use().

Whenever this function is called from a builtin, git.c will already have
called commit_pager_choice(). Since commit_pager_choice() treats the
special value -1 as "punt" or "not yet decided", it is not a problem
that we might end up calling commit_pager_choice() once in git.c and
once (or more) in the builtin. Make the new function use -1 in the same
way and document it as "punt".

Don't add any users of setup_auto_pager just yet; one will follow in
a later patch.
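
For illustration only (no caller is added by this patch; the snippet
below shows a hypothetical later user), a builtin that wants the pager
only in its list mode could do:

	if (cmdmode == 'l')
		setup_auto_pager("tag", 1);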

Signed-off-by: Martin Ågren 
---
 builtin.h |  9 +
 git.c | 10 ++
 2 files changed, 19 insertions(+)

diff --git a/builtin.h b/builtin.h
index 4186635de..3ca4a53a8 100644
--- a/builtin.h
+++ b/builtin.h
@@ -113,6 +113,15 @@ struct fmt_merge_msg_opts {
 extern int fmt_merge_msg(struct strbuf *in, struct strbuf *out,
 struct fmt_merge_msg_opts *);
 
+/**
+ * If a builtin has DELAY_PAGER_CONFIG set, the builtin should call this early
+ * when it wishes to respect the `pager.foo`-config. The `cmd` is the name of
+ * the builtin, e.g., "foo". If a paging-choice has already been setup, this
+ * does nothing. The default in `def` should be 0 for "pager off", 1 for "pager
+ * on" or -1 for "punt".
+ */
+extern void setup_auto_pager(const char *cmd, int def);
+
 extern int is_builtin(const char *s);
 
 extern int cmd_add(int argc, const char **argv, const char *prefix);
diff --git a/git.c b/git.c
index 79195ebbd..66832f232 100644
--- a/git.c
+++ b/git.c
@@ -33,6 +33,16 @@ static void commit_pager_choice(void) {
}
 }
 
+void setup_auto_pager(const char *cmd, int def)
+{
+   if (use_pager != -1 || pager_in_use())
+   return;
+   use_pager = check_pager_config(cmd);
+   if (use_pager == -1)
+   use_pager = def;
+   commit_pager_choice();
+}
+
 static int handle_options(const char ***argv, int *argc, int *envchanged)
 {
const char **orig_argv = *argv;
-- 
2.14.0.rc0



[PATCH v2 02/10] builtin.h: format documentation-comment properly

2017-07-17 Thread Martin Ågren
The previous commit moved technical documentation verbatim into
builtin.h. Prefix all the lines with '*' to follow the coding
guidelines.

Remove a '+' which was needed when the information was formatted for
AsciiDoc. Similarly, change "::" to ":".

Signed-off-by: Martin Ågren 
---
 builtin.h | 146 +++---
 1 file changed, 73 insertions(+), 73 deletions(-)

diff --git a/builtin.h b/builtin.h
index 51cb0249d..62f22b547 100644
--- a/builtin.h
+++ b/builtin.h
@@ -7,79 +7,79 @@
 #include "commit.h"
 
 /*
-builtin API
-===
-
-Adding a new built-in
--
-
-There are 4 things to do to add a built-in command implementation to
-Git:
-
-. Define the implementation of the built-in command `foo` with
-  signature:
-
-   int cmd_foo(int argc, const char **argv, const char *prefix);
-
-. Add the external declaration for the function to `builtin.h`.
-
-. Add the command to the `commands[]` table defined in `git.c`.
-  The entry should look like:
-
-   { "foo", cmd_foo,  },
-+
-where options is the bitwise-or of:
-
-`RUN_SETUP`::
-   If there is not a Git directory to work on, abort.  If there
-   is a work tree, chdir to the top of it if the command was
-   invoked in a subdirectory.  If there is no work tree, no
-   chdir() is done.
-
-`RUN_SETUP_GENTLY`::
-   If there is a Git directory, chdir as per RUN_SETUP, otherwise,
-   don't chdir anywhere.
-
-`USE_PAGER`::
-
-   If the standard output is connected to a tty, spawn a pager and
-   feed our output to it.
-
-`NEED_WORK_TREE`::
-
-   Make sure there is a work tree, i.e. the command cannot act
-   on bare repositories.
-   This only makes sense when `RUN_SETUP` is also set.
-
-. Add `builtin/foo.o` to `BUILTIN_OBJS` in `Makefile`.
-
-Additionally, if `foo` is a new command, there are 3 more things to do:
-
-. Add tests to `t/` directory.
-
-. Write documentation in `Documentation/git-foo.txt`.
-
-. Add an entry for `git-foo` to `command-list.txt`.
-
-. Add an entry for `/git-foo` to `.gitignore`.
-
-
-How a built-in is called
-
-
-The implementation `cmd_foo()` takes three parameters, `argc`, `argv,
-and `prefix`.  The first two are similar to what `main()` of a
-standalone command would be called with.
-
-When `RUN_SETUP` is specified in the `commands[]` table, and when you
-were started from a subdirectory of the work tree, `cmd_foo()` is called
-after chdir(2) to the top of the work tree, and `prefix` gets the path
-to the subdirectory the command started from.  This allows you to
-convert a user-supplied pathname (typically relative to that directory)
-to a pathname relative to the top of the work tree.
-
-The return value from `cmd_foo()` becomes the exit status of the
-command.
+ * builtin API
+ * ===
+ *
+ * Adding a new built-in
+ * -
+ *
+ * There are 4 things to do to add a built-in command implementation to
+ * Git:
+ *
+ * . Define the implementation of the built-in command `foo` with
+ *   signature:
+ *
+ * int cmd_foo(int argc, const char **argv, const char *prefix);
+ *
+ * . Add the external declaration for the function to `builtin.h`.
+ *
+ * . Add the command to the `commands[]` table defined in `git.c`.
+ *   The entry should look like:
+ *
+ * { "foo", cmd_foo,  },
+ *
+ * where options is the bitwise-or of:
+ *
+ * `RUN_SETUP`:
+ * If there is not a Git directory to work on, abort.  If there
+ * is a work tree, chdir to the top of it if the command was
+ * invoked in a subdirectory.  If there is no work tree, no
+ * chdir() is done.
+ *
+ * `RUN_SETUP_GENTLY`:
+ * If there is a Git directory, chdir as per RUN_SETUP, otherwise,
+ * don't chdir anywhere.
+ *
+ * `USE_PAGER`:
+ *
+ * If the standard output is connected to a tty, spawn a pager and
+ * feed our output to it.
+ *
+ * `NEED_WORK_TREE`:
+ *
+ * Make sure there is a work tree, i.e. the command cannot act
+ * on bare repositories.
+ * This only makes sense when `RUN_SETUP` is also set.
+ *
+ * . Add `builtin/foo.o` to `BUILTIN_OBJS` in `Makefile`.
+ *
+ * Additionally, if `foo` is a new command, there are 3 more things to do:
+ *
+ * . Add tests to `t/` directory.
+ *
+ * . Write documentation in `Documentation/git-foo.txt`.
+ *
+ * . Add an entry for `git-foo` to `command-list.txt`.
+ *
+ * . Add an entry for `/git-foo` to `.gitignore`.
+ *
+ *
+ * How a built-in is called
+ * 
+ *
+ * The implementation `cmd_foo()` takes three parameters, `argc`, `argv,
+ * and `prefix`.  The first two are similar to what `main()` of a
+ * standalone command would be called with.
+ *
+ * When `RUN_SETUP` is specified in the `commands[]` table, and when you
+ * were started from a subdirectory of the work tree, `cmd_foo()` is called
+ * after chdir(2) to the top of the work tree, and `prefix` gets the path
+ * to the 

[PATCH v2 04/10] git.c: let builtins opt for handling `pager.foo` themselves

2017-07-17 Thread Martin Ågren
Before launching a builtin git foo and unless mechanisms with precedence
are in use, we check for and handle the `pager.foo` config. This is done
without considering exactly how git foo is being used, and indeed, git.c
cannot (and should not) know what the arguments to git foo are supposed
to achieve.

In practice this means that, e.g., `git -c pager.tag tag -a new-tag`
results in errors such as "Vim: Warning: Output is not to a terminal"
and a garbled terminal. A user who makes use of `git tag -a` and `git
tag -l` will probably choose not to configure `pager.tag` or to set it
to "no", so that `git tag -a` will actually work, at the cost of not
getting the pager with `git tag -l`.

To allow individual builtins to make more informed decisions about when
to respect `pager.foo`, introduce a flag DELAY_PAGER_CONFIG. If the flag
is set, do not check `pager.foo`.
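
For illustration (a hypothetical entry, not something this patch adds),
a builtin would opt in via its entry in the `commands[]` table:

	{ "tag", cmd_tag, RUN_SETUP | DELAY_PAGER_CONFIG },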

Do not check for DELAY_PAGER_CONFIG in `execv_dashed_external()`. That
call site is arguably wrong, although in a way that is not yet visible,
and will be changed in a slightly different direction in a later patch.

Don't add any users of DELAY_PAGER_CONFIG just yet; one will follow in a
later patch.

Suggested-by: Jeff King 
Signed-off-by: Martin Ågren 
---
 builtin.h | 8 
 git.c | 4 +++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/builtin.h b/builtin.h
index 7bcc08456..4186635de 100644
--- a/builtin.h
+++ b/builtin.h
@@ -55,6 +55,14 @@
  *
  * The builtin supports `--super-prefix`.
  *
+ * `DELAY_PAGER_CONFIG`::
+ *
+ * If RUN_SETUP or RUN_SETUP_GENTLY is set, git.c normally handles
+ * the `pager.`-configuration. If this flag is used, git.c
+ * will skip that step, instead allowing the builtin to make a
+ * more informed decision, e.g., by ignoring `pager.` for
+ * certain subcommands.
+ *
  * . Add `builtin/foo.o` to `BUILTIN_OBJS` in `Makefile`.
  *
  * Additionally, if `foo` is a new command, there are 4 more things to do:
diff --git a/git.c b/git.c
index 489aab4d8..79195ebbd 100644
--- a/git.c
+++ b/git.c
@@ -283,6 +283,7 @@ static int handle_alias(int *argcp, const char ***argv)
  */
 #define NEED_WORK_TREE (1<<3)
 #define SUPPORT_SUPER_PREFIX   (1<<4)
+#define DELAY_PAGER_CONFIG (1<<5)
 
 struct cmd_struct {
const char *cmd;
@@ -306,7 +307,8 @@ static int run_builtin(struct cmd_struct *p, int argc, 
const char **argv)
prefix = setup_git_directory_gently(_ok);
}
 
-   if (use_pager == -1 && p->option & (RUN_SETUP | 
RUN_SETUP_GENTLY))
+   if (use_pager == -1 && p->option & (RUN_SETUP | 
RUN_SETUP_GENTLY) &&
+   !(p->option & DELAY_PAGER_CONFIG))
use_pager = check_pager_config(p->cmd);
if (use_pager == -1 && p->option & USE_PAGER)
use_pager = 1;
-- 
2.14.0.rc0



[PATCH v2 00/10] tag: only respect `pager.tag` in list-mode

2017-07-17 Thread Martin Ågren
This is the second version of "[PATCH 0/7] tag: more fine-grained
pager-configuration" [1]. That series introduced `pager.tag.list` to
address the fact that `pager.tag` can be useful with `git tag -l` but
actively hostile with `git tag -a`. Thanks to Junio, Peff and Brandon
for helpful feedback.

After that feedback, v2 drops `pager.tag.list` and instead teaches
`git tag` to only consider `pager.tag` in list-mode, as suggested by
Peff.

Patches 1-3/10 replace patch 1/7. They move Documentation/technical/
api-builtin.txt into builtin.h, tweak the formatting and bring it up to
date. I may have gone overboard making this 3 patches...

Patches 4-7/10 correspond to patches 2-5/7. `setup_auto_pager()' is now
much simpler since we do not need to handle "tag.list" with a clever
fallback strategy. IGNORE_PAGER_CONFIG is now called DELAY_PAGER_CONFIG.
I now check with pager_in_use() and I moved the handling of `pager.tag`
a bit further down.

Patches 8-9/10 teach `git tag` to only respect `pager.tag` in list-mode
and flip the default value for that config to "on".

Patch 10/10 is somewhat similar to a hunk in patch 2/7, but is now a
bug-fix instead of a feature. It teaches `execv_dashed_external()` not
to check `pager.foo` when launching `git-foo` where foo is a builtin.
I waffled about where to put this patch. Putting it earlier in the
series as a preparatory step, I couldn't come up with a way of writing a
test. So patch 8/10 introduces a `test_expect_failure` which this patch
then fixes.

Martin

[1] https://public-inbox.org/git/cover.1499723297.git.martin.ag...@gmail.com/T/

Martin Ågren (10):
  builtin.h: take over documentation from api-builtin.txt
  builtin.h: format documentation-comment properly
  builtin.h: document SUPPORT_SUPER_PREFIX
  git.c: let builtins opt for handling `pager.foo` themselves
  git.c: provide setup_auto_pager()
  t7006: add tests for how git tag paginates
  tag: handle `pager.tag`-configuration within the builtin
  tag: respect `pager.tag` in list-mode only
  tag: change default of `pager.tag` to "on"
  git.c: ignore pager.* when launching builtin as dashed external

 Documentation/git-tag.txt   |  3 +
 Documentation/technical/api-builtin.txt | 73 -
 t/t7006-pager.sh| 85 +
 builtin.h   | 97 +
 builtin/tag.c   |  3 +
 git.c   | 18 +-
 6 files changed, 203 insertions(+), 76 deletions(-)
 delete mode 100644 Documentation/technical/api-builtin.txt

-- 
2.14.0.rc0



Re: [PATCH v2 1/1] sha1_file: Add support for downloading blobs on demand

2017-07-17 Thread Ben Peart



On 7/17/2017 2:06 PM, Jonathan Tan wrote:

About the difference between this patch and my patch set [1], besides
the fact that this patch does not spawn separate processes for each
missing object, which does seem like an improvement to me, this patch
(i) does not use a list of promised objects (but instead communicates
with the hook for each missing object), and (ii) provides backwards
compatibility with other Git code (that does not know about promised
objects) in a different way.

The costs and benefits of (i) are being discussed here [2]. As for (ii),
I still think that my approach is better - I have commented more about
this below.

Maybe the best approach is a combination of both our approaches.


Yes, in the context of the promised objects model patch series, the 
value of this patch series is primarily as a sample of how to use the 
sub-process mechanism to create a versioned interface for retrieving 
objects.




[1] 
https://public-inbox.org/git/34efd9e9936fdab331655f5a33a098a72dc134f4.1499800530.git.jonathanta...@google.com/

[2] 
https://public-inbox.org/git/20170713123951.5cab1...@twelve2.svl.corp.google.com/

On Fri, 14 Jul 2017 09:26:51 -0400
Ben Peart  wrote:


+
+packet: git> command=get
+packet: git> sha1=0a214a649e1b3d5011e14a3dc227753f2bd2be05
+packet: git> 
+


It would be useful to have this command support more than one SHA-1, so
that hooks that know how to batch can do so.



I agree.  Since nothing was using that capability yet, I decided to keep 
it simple and not add support for a feature that wasn't being used. The 
reason the interface is versioned is so that if/when something does need 
that capability, it can be added.



+static int subprocess_map_initialized;
+static struct hashmap subprocess_map;


The documentation of "tablesize" in "struct hashmap" states that it can
be used to check if the hashmap is initialized, so
subprocess_map_initialized is probably unnecessary.



Nice.  That will make things a little simpler.


  static int check_and_freshen(const unsigned char *sha1, int freshen)
  {
-   return check_and_freshen_local(sha1, freshen) ||
-  check_and_freshen_nonlocal(sha1, freshen);
+   int ret;
+   int already_retried = 0;
+
+retry:
+   ret = check_and_freshen_local(sha1, freshen) ||
+   check_and_freshen_nonlocal(sha1, freshen);
+   if (!ret && core_virtualize_objects && !already_retried) {
+   already_retried = 1;
+   if (!read_object_process(sha1))
+   goto retry;
+   }
+
+   return ret;
  }


Is this change meant to ensure that Git code that operates on loose
objects directly (bypassing storage-agnostic functions such as
sha1_object_info_extended() and has_sha1_file()) still work? If yes,
this patch appears incomplete (for example, read_loose_object() needs to
be changed too), and this seems like a difficult task - in my patch set
[1], I ended up deciding to create a separate type of storage and
instead looked at the code that operates on *packed* objects directly
(because there were fewer such methods) to ensure that they would work
correctly in the presence of a separate type of storage.



Yes, with this set of patches, we've been running successfully on 
completely sparse clones (no commits, trees, or blobs) for several 
months.  read_loose_object() is only called by fsck when it is 
enumerating existing loose objects so does not need to be updated.


We have a few thousand developers making ~100K commits per week so in 
our particular usage, I'm fairly confident it works correctly.  That 
said, it is possible there is some code path I've missed. :)



[1] 
https://public-inbox.org/git/34efd9e9936fdab331655f5a33a098a72dc134f4.1499800530.git.jonathanta...@google.com/



Re: reftable [v2]: new ref storage format

2017-07-17 Thread Stefan Beller
On Mon, Jul 17, 2017 at 12:04 PM, Shawn Pearce  wrote:

> A block must have at least one restart in it, the first ref_record
> must be a restart. So number_of_restarts in the tail of the block can
> be 0, which implies 1 restart (number_of_restarts + 1), and the first
> restart is required at the first ref_record. :)

Hah! I assumed the first entry to not be recorded because it is always
a restart by definition of the file format, so it could be omitted in the
restart_offset list, but that would complicate the implementation, such
that including it makes sense.

>
>> When starting to write a block, we need to know exactly how large
>> the ref_records* and restart offsets need to be to put the
>> number_of_restarts at the position as promised via block_len.
>> This sounds complicated unless I missed the obvious.
>
> Correct. The writer needs to compute the block size before it writes
> the block. It does so by buffering the block contents until its
> approximately full, then fixes block_len, and flushes the block.

So that is another trade off for determining the block size. "How much
can I buffer?"

>> Going by this, would it rather make sense to omit the block_len
>> and then scan backwards from *block_size-1 to find the first non-NUL
>> and that will be the number_of_restarts?
>
> Not quite. On small reftable files the "physical" block may be shared
> with a log block ('g'). We need to be able to reliably find the end of the
> ref block ('r'), without padding between the two blocks.

I'd need to reread the proposal to understand this bit as I assumed that
each block starts at a multiple of block_size. However we could choose
block_size such that there is no padding between 'r' and 'g'. Ok,
makes sense.

> The time field is also prefix compressed as part of the ref name's
> prefix compression. So there is no need to move to the complexity of a
> varint or anything else.

I agree, that is why you explicitly said that the key is
ref_name '\0' reverse_int32( time_sec )

Note (as found out in discussion with jrnieder@): The size of the integer
is determined by the suffix length encoding and the preceding '\0',
such that the file format allows arbitrary integer size. So instead of
pretending we can only do 32 bit here, just say 'uint' ?


Re: reftable [v2]: new ref storage format

2017-07-17 Thread Junio C Hamano
Shawn Pearce  writes:

> This is an updated draft after discussion on list with Peff, Michael
> Haggerty, and Dave Borowitz.
>
> You can read a rendered version of this here:
> https://googlers.googlesource.com/sop/jgit/+/reftable/Documentation/technical/reftable.md
>
> Biggest changes from the first proposal are:
>
> - reflog is now integrated into reftable
> - block headers slightly different
> - Peff's stack management idea is used
> - Michael's compaction idea is used

Just a few comments.

> A variable number of 4-byte `restart_offset` values follows the
> records.  Offsets are relative to the start of the block (0 in first
> block to include file header) and refer to the first byte of any
> `ref_record` whose name has not been prefixed compressed.  Readers can
> start linear scans from any of these records.

It is unclear what "0 in first block to include file header" wants
to say.  Do I write "8" if I want to express the offset of the first
record in the first block, or do I write "0"?

> The 2-byte `number_of_restarts + 1` stores the number of entries in
> the `restart_offset` list.

It is unclear whose responsibility it is to add this "1".  Does this
mean a reader thinks there is one entry in the restart table when it
sees "0" in the number_of_restarts field (hence you can have max
65536 entries in total)?

> Readers can use the restart count to binary search between restarts
> before starting a linear scan.  The `number_of_restarts` field must be
> the last 2 bytes of the block as specified by `block_len` from the
> block header.

Does the new term "restart count" mean the same thing as
number_of_restarts?

> ### Log block format
>
> A log block is written as:
>
> 'g'
> uint24( block_len )
> zlib_deflate {
>   log_record*
>   int32( restart_offset )*
>   int16( number_of_restarts )
> }
>
> Log blocks look similar to ref blocks, except `block_type = 'g'`.

Is there a general recommended strategy for writers to choose how
many entries to include in a single physical block?  I understand
that the deflated whole must fit in the physical block whose length
is defined in the footer of the whole file, and in general you would
not know how small the data deflates down to before compressing,
right?

> Log record keys are structured as:
>
> ref_name '\0' reverse_int32( time_sec )
>
> where `time_sec` is the update time in seconds since the epoch.  The
> `reverse_int32` function inverts the value so that lexicographical ordering
> of the network byte order time sorts more recent records first:
>
> reverse_int(int32 t) {
>   return 0xffffffff - t;
> }

Is 2038 an issue, or by that time we'd all be retired together with
this file format and it won't be our problem?

As the file format uses delta compression with restarts, a reader
needs to sequentially scan some bounded number of entries to get the
contents of a specific entry anyway, so I am wondering if it is
worth storing a longer timestamp in varint() for a restart entry
and expressing the timestamp on delta entries as a difference to the
previous entry.
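
For what a varint-encoded timestamp could look like, here is one common
little-endian base-128 scheme (illustrative only; the proposal does not
commit to this particular encoding):

	#include <stdint.h>

	/* Low 7 bits per byte; the high bit marks "more bytes follow". */
	static int encode_varint(uint8_t *out, uint64_t v)
	{
		int n = 0;
		do {
			uint8_t b = v & 0x7f;
			v >>= 7;
			out[n++] = b | (v ? 0x80 : 0);
		} while (v);
		return n;
	}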

> ### Log index
>
> The log index stores the log key (`refname \0 reverse_int32(time_sec)`)
> for the last log record of every log block in the file, supporting
> bounded-time lookup.

This assumes that timestamps never wildly rewind in the reflog,
doesn't it?  Is that a sensible assumption?

Or does "the last log record" in the above mean "the log record with
largest timestamp?  ... ah, not that is still not sufficient.  You'd
still need to assume that timestamps on entries in one log block must
be all older than the ones on entries in later log blocks.  Hmph...

Also it is not clear to me how reflogs for two refs would be
intermixed in the log blocks, and what log keys for the entries must
be recorded in the log index, to facilitate efficient lookup.  Is it
assumed that a consecutive sequence of log blocks record reflogs for
the same ref, before the sequence of log blocks switch to record
reflogs for another ref, or something?

> A log index block must be written if 2 or more log blocks are written
> to the file.  If present, the log index appears after the first log
> block.  There is no padding used to align the log index to block
> alignment.
>
> Log index format is identical to ref index, except the keys are 5
> bytes longer to include `'\0'` and the 4-byte `reverse_int32(time)`.
> Records use `block_offset` to refer to the start of a log block.

I am assuming that we do not care about being able to quickly
determine master@{24028}; I would imagine that it wouldn't be too
hard to add an index to help such query, but I offhand would not
know the details until I figure out how the format handles reflog
entries for multiple refs first.


Documentation lingering inconsistency between working directory and working tree

2017-07-17 Thread Yazeed Sabri
Hi all,

I have been reading the git documentation for a while now and came to
realize that in a
lot of commands the documentation is ambiguous when it comes to the difference
between "working directory" and "working tree". I'm aware of the
changes to status
in this commit for example:


commit 2a0e6cdedab306eccbd297c051035c13d0266343
Author: Lars Vogel 
Date:   Thu Jun 9 20:19:30 2016 +0200

Use "working tree" instead of "working directory" for git status

Working directory can be easily confused with the current directory.
In one of my patches I already updated the usage of working
directory
with working tree for the man page but I noticed that git status
also
uses this incorrect term.

Signed-off-by: Lars Vogel 
Signed-off-by: Junio C Hamano 

diff --git a/wt-status.c b/wt-status.c
index 4f27bd62a..4ce4e35ac 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -1554,7 +1554,7 @@ void wt_status_print(struct wt_status *s)
else
printf(_("nothing to commit\n"));
} else
-   printf(_("nothing to commit, working directory
clean\n"));
+   printf(_("nothing to commit, working tree
clean\n"));
}
 }



and more frequently used commands like add are consistent as well. But a
grep on the documentation to a pull I did today, shows that a lot of
commands are still
not updated accordingly.

This brings me to my next point, so what is the precise definition of
each concept? Is
working directory just an outdated mistake?

Was the change in status above made to solve the confusion and denote
that status
is unique per-working tree given its HEAD? If this is the case, then
is gc unique
per-working tree too, or is it done on the "database" like fsck? Same
idea with stash,
does stash care on what working tree it was called? If not, does it
make sense for
it to do so?


Best,
Yazeed

-- 



[PATCH] check for search permission

2017-07-17 Thread Hielke Christian Braun
---
 gitweb/gitweb.perl | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 3d4a8ee27c96a..9208f42ed1753 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -3071,6 +3071,8 @@ sub git_get_projects_list {
return if (m!^[/.]$!);
# only directories can be git repositories
return unless (-d $_);
+   # need search permission
+   return unless (-x $_);
# don't traverse too deep (Find is super slow 
on os x)
# $project_maxdepth excludes depth of 
$projectroot
if (($File::Find::name =~ tr!/!!) - $pfxdepth > 
$project_maxdepth) {

--
https://github.com/git/git/pull/384


Re: [PATCH] RFC: A new type of symbolic refs

2017-07-17 Thread Stefan Beller
On Sun, Jul 16, 2017 at 6:04 AM, Philip Oakley  wrote:

>
> If I understand this correctly, the new type is the 'starts_with(buf, 
> "repo:"))'.
>
> It just wasn't obvious from the text that the new type is "repo:" as you
> never spell it out in the commit message. Should it be included in the
> message?
>
> Have I understood correctly?

Yes, you understood correctly.

The idea is to allow submodules to "borrow" its refs from the superproject.
My original idea was to only borrow HEAD from the superproject, but this
can be extended to more branches.

By having a more generic way of borrowing refs from another repo, the
other repo can also be something other than the superproject (I don't know
the use case for this yet, but let's not have a too-specific proposal).

Let's see how this RFC fares in further interest from the community.


Re: [PATCH/RFC] diff/read-cache: don't assume empty files will filter to empty

2017-07-17 Thread Junio C Hamano
Andrei Purdea  writes:

> diff --git a/read-cache.c b/read-cache.c
> index 2121b6e7b..ca306993c 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -262,9 +262,8 @@ static int ce_match_stat_basic(const struct
> cache_entry *ce, struct stat *st)
>   changed |= match_stat_data(>ce_stat_data, st);
>
>   /* Racily smudged entry? */
> - if (!ce->ce_stat_data.sd_size) {
> - if (!is_empty_blob_sha1(ce->oid.hash))
> - changed |= DATA_CHANGED;
> + if (ce->ce_stat_data.sd_size == (unsigned int)-1) {
> + changed |= DATA_CHANGED;
>   }
>
>   return changed;
> @@ -2028,7 +2027,7 @@ static void ce_smudge_racily_clean_entry(struct
> cache_entry *ce)
>   * file, and never calls us, so the cached size information
>   * for "frotz" stays 6 which does not match the filesystem.
>   */
> - ce->ce_stat_data.sd_size = 0;
> + ce->ce_stat_data.sd_size = (unsigned int)-1;
>   }
>  }

This is not a good idea.  A change like this will break existing
installations where a zeroed size is understood as "not verified as
clean".

Is it infeasible to fix your clean-smudge filter pair to clean
and/or smudge zero-sized payload to empty?  After all, it does not
help to offload the storing of a blob that is known to be empty and
then read that emptiness back from the network or whatever external
service when the end result is known to be zero bytes anyway, no?



Re: reftable [v2]: new ref storage format

2017-07-17 Thread Shawn Pearce
On Mon, Jul 17, 2017 at 11:53 AM, Stefan Beller  wrote:
> On Mon, Jul 17, 2017 at 8:01 AM, Shawn Pearce  wrote:
>
>> A ref block is written as:
>>
>> 'r'
>> uint24 ( block_len )
>> ref_record*
>> uint32( restart_offset )*
>> uint16( number_of_restarts )
>> padding?
>>
> ...
>>
>> The 2-byte `number_of_restarts + 1` stores the number of entries in
>> the `restart_offset` list.
>
> This means uint16( number_of_restarts ) cannot be 0, but has to be 1
> to indicate no restarts?

A block must have at least one restart in it, the first ref_record
must be a restart. So number_of_restarts in the tail of the block can
be 0, which implies 1 restart (number_of_restarts + 1), and the first
restart is required at the first ref_record. :)
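
In reader terms that could look something like this (a minimal sketch;
it only assumes the count is stored in the last 2 bytes of the block,
in network byte order, as the value minus one):

	#include <stddef.h>
	#include <stdint.h>

	/* block points at the 'r' byte; block_len is taken from the header. */
	static int restart_count(const uint8_t *block, size_t block_len)
	{
		uint16_t stored = ((uint16_t)block[block_len - 2] << 8)
				| block[block_len - 1];
		return stored + 1;	/* stored value 0 means one restart */
	}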

> When starting to write a block, we need to know exactly how large
> the ref_records* and restart offsets need to be to put the
> number_of_restarts at the position as promised via block_len.
> This sounds complicated unless I missed the obvious.

Correct. The writer needs to compute the block size before it writes
the block. It does so by buffering the block contents until its
approximately full, then fixes block_len, and flushes the block.

> Going by this, would it rather make sense to omit the block_len
> and then scan backwards from *block_size-1 to find the first non-NUL
> and that will be the number_of_restarts?

Not quite. On small reftable files the "physical" block may be shared
with a log block ('g'). We need to be able to reliably find the end of the
ref block ('r'), without padding between the two blocks.

> Or we could allow additional padding between ref_record and
> restart_offsets, such that the writer implementation has wiggle room
> for the restarting logic.

I had that in an older format description, and removed it. Placing the
padding at the end of the block was simpler for the reader and writer
implementation to handle.


>> Log record keys are structured as:
>>
>> ref_name '\0' reverse_int32( time_sec )
>>
>> where `time_sec` is the update time in seconds since the epoch.
>
> The epoch ends 2038, which is in 21 years. (Did you just volunteer
> to fixup the time issues in 20 years?)
> If possible I'd prefer a date to be encoded with more range available.

Good point. However, I think in 20 years we'll see 2 more hash
functions for Git, and we can bump reftable to v2 and expand the field
to 8 bytes.

> The ref names itself are compressed, would we also want to compress
> the timing information?

The time field is also prefix compressed as part of the ref name's
prefix compression. So there is no need to move to the complexity of a
varint or anything else.


>> ### Update transactions
>>
>> Although reftables are immutable, mutations are supported by writing a
>> new reftable and atomically appending it to the stack:
>>
>> 1. Atomically create `stack.lock`
>> 2. Copy current stack to `stack.lock`.
>> 3. Prepare new reftable in temp file `.refXXX`.
>>Include log entries.
>> 4. Rename (3) to `${sha1}.ref`.
>> 5. Append `${sha1}.ref` to `stack.lock`
>> 6. Atomically rename `stack.lock` to `stack`.
>
> In case 3.+4. becomes time consuming, it can be prepared outside
> the lock, such that inside the lock we'd only need to check
> for contention of refs? For example if I'd want to update one ref and
> another writer wants to update another ref, we'd both be preparing
> a new reftable containing one ref and its log entry, then one of us obtains
> the lock and writes. The second writer would then need to inspect
> the delta of the stack and see if any ref that they want to change
> was touched.

Excellent point, it reduces the contention window for non-conflicting
writes. I will update this section with your input, thank you Stefan.
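
As a concrete reference for step 1 ("Atomically create `stack.lock`"),
O_CREAT|O_EXCL gives the required atomicity; a minimal sketch, with
invented names and error handling reduced to a failure return:

	#include <fcntl.h>
	#include <limits.h>
	#include <stdio.h>

	/* Returns an open fd holding the lock, or -1 if another writer has it. */
	static int acquire_stack_lock(const char *refs_dir)
	{
		char path[PATH_MAX];

		snprintf(path, sizeof(path), "%s/stack.lock", refs_dir);
		return open(path, O_WRONLY | O_CREAT | O_EXCL, 0666);
	}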


Re: reftable [v2]: new ref storage format

2017-07-17 Thread Stefan Beller
On Mon, Jul 17, 2017 at 8:01 AM, Shawn Pearce  wrote:

> A ref block is written as:
>
> 'r'
> uint24 ( block_len )
> ref_record*
> uint32( restart_offset )*
> uint16( number_of_restarts )
> padding?
>
...
>
> The 2-byte `number_of_restarts + 1` stores the number of entries in
> the `restart_offset` list.

This means uint16( number_of_restarts ) cannot be 0, but has to be 1
to indicate no restarts?

Why do we need to be non-NUL here, but the padding is all NUL?

When starting to write a block, we need to know exactly how large
the ref_records* and restart offsets need to be to put the
number_of_restarts at the position as promised via block_len.
This sounds complicated unless I missed the obvious.

Going by this, would it rather make sense to omit the block_len
and then scan backwards from *block_size-1 to find the first non-NUL
and that will be the number_of_restarts?

Or we could allow additional padding between ref_record and
restart_offsets, such that the writer implementation has wiggle room
for the restarting logic.

>
>  log record
>
> Log record keys are structured as:
>
> ref_name '\0' reverse_int32( time_sec )
>
> where `time_sec` is the update time in seconds since the epoch.

The epoch ends 2038, which is in 21 years. (Did you just volunteer
to fixup the time issues in 20 years?)
If possible I'd prefer a date to be encoded with more range available.

>  The
> `reverse_int32` function inverts the value so that lexicographical ordering
> of the network byte order time sorts more recent records first:
>
> reverse_int(int32 t) {
>   return 0xffffffff - t;
> }
>
> Log records have a similar starting structure to ref and index
> records, utilizing the same prefix compression scheme applied to the
> key described above.

The ref names itself are compressed, would we also want to compress
the timing information? The time_sec could be a varint indicating a delta
to the previous change of a ref, or fixed to the epoch if it is the first change
of that ref.

Just to be clear, the ordering here would be

  refs/heads/maint 
  refs/heads/maint 
  ...
  refs/heads/master 
  refs/heads/master 

such that refs that have more than one entry in their reflog in a given
reftable file would have perfect prefix compression for the ref?
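
To make the key layout concrete, here is a small sketch of composing
such a key (big-endian encoding of the inverted timestamp; buffer
handling simplified, names invented):

	#include <stdint.h>
	#include <string.h>

	static uint32_t reverse_int32(uint32_t t)
	{
		return 0xffffffff - t;
	}

	/* Writes "refname '\0' reverse_int32(time_sec)"; returns key length. */
	static size_t format_log_key(uint8_t *buf, const char *refname,
				     uint32_t time_sec)
	{
		size_t n = strlen(refname);
		uint32_t r = reverse_int32(time_sec);

		memcpy(buf, refname, n);
		buf[n] = '\0';
		buf[n + 1] = (r >> 24) & 0xff;
		buf[n + 2] = (r >> 16) & 0xff;
		buf[n + 3] = (r >> 8) & 0xff;
		buf[n + 4] = r & 0xff;
		return n + 5;
	}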

> ### Update transactions
>
> Although reftables are immutable, mutations are supported by writing a
> new reftable and atomically appending it to the stack:
>
> 1. Atomically create `stack.lock`
> 2. Copy current stack to `stack.lock`.
> 3. Prepare new reftable in temp file `.refXXX`.
>Include log entries.
> 4. Rename (3) to `${sha1}.ref`.
> 5. Append `${sha1}.ref` to `stack.lock`
> 6. Atomically rename `stack.lock` to `stack`.

In case 3.+4. becomes time consuming, it can be prepared outside
the lock, such that inside the lock we'd only need to check
for contention of refs? For example if I'd want to update one ref and
another writer wants to update another ref, we'd both be preparing
a new reftable containing one ref and its log entry, then one of us obtains
the lock and writes. The second writer would then need to inspect
the delta of the stack and see if any ref that they want to change
was touched.

> ### Compaction
>
> A partial stack of reftables can be compacted by merging references
> using a straightforward merge join across reftables, selecting the
> most recent value for output, and omitting deleted references that do
> not appear in remaining, lower reftables.
>
> For sake of illustration, assume the stack currently consists of
> reftable files (from oldest to newest): A, B, C, and D. The compactor
> is going to compact B and C, leaving A and D alone.
>
> 1.  Obtain lock `stack.lock` and read the `stack` file.
> 2.  Obtain locks `B.lock` and `C.lock`.
> Ownership of these locks prevents other processes from trying
> to compact these files.
> 3.  Release `stack.lock`.
> 4.  Compact `B` and `C` in temp file `.refXXX`.
> 5.  Reacquire lock `stack.lock`.
> 6.  Verify that `B` and `C` are still in the stack, in that order. This
> should always be the case, assuming that other processes are adhering
> to the locking protocol.
> 7.  Rename `.refXXX` to `X`.
> 8.  Write the new stack to `stack.lock`, replacing `B` and `C` with `X`.
> 9.  Atomically rename `stack.lock` to `stack`.
> 10. Delete `B` and `C`, perhaps after a short sleep to avoid forcing
> readers to backtrack.
>
> This strategy permits compactions to proceed independently of updates.

10. could be deferred to gc as well. auto gc would need to learn about
the number of loose reftables in that case.

Thanks,
Stefan


Re: Detect invalid submodule names from script?

2017-07-17 Thread Joachim Durchholz

Am 17.07.2017 um 19:49 schrieb Stefan Beller:

On Mon, Jul 17, 2017 at 4:17 AM, Joachim Durchholz  wrote:

Hi all

I'm hacking some script that calls into git, and I need to detect whether a
repository was configured with a submodule name that will work on "git
submodule init" and friends.


There is no such thing as "git submodule valid-name" unfortunately.
Looking through "git submodule add", I think it is safe to assume
that any string valid as a subsection in git-config is a valid submodule name.


That's what I have been thinking myself.


Our man page says:

 Subsection names are case sensitive and can contain any characters
 except newline (doublequote " and backslash can be included by escaping
 them as \" and \\, respectively).

I am not sure about the quality of submodule shell code to handle the quotations
for double quote and backslash correctly, so I would suggest not using them,
either.


The quality is pretty dubious for path names (where I have torture tests 
in place).
I haven't done these tests for module names yet. I doubt it's going to 
look prettier though.



I *can* run a git init and see whether it works, but I need to be 100% sure
that the error was due to an invalid submodule name and not something else.
Bonus points for every version of git for which it works.


I do not think Git offers a universal solution across versions except actually
running "submodule init" and then making an educated guess if the error
comes from bad naming or something else.


Yeah, I think that's the way to go - clone into a local temp directory, 
set up the submodule to a sane module and subdirectory name, and if that 
works, test what happens with the original configuration.


I'd have preferred something less elaborate though.
E.g. SVN has a --xml option that will output everything in XML format. 
(Not useful for shell scripting but easy for almost any other scripting 
language.)



This sounds like you're taking user input or otherwise untrustworthy data
as submodule names?


I'm auto-downloading submodules as configured in 3rd-party repositories.
Which means untrusted data is copied to the user's home directory, so I 
definitely need to make sure that nothing nasty can happen.


My current code is a /bin/sh script.
Given the security/reliability/robustness problems I keep encountering, 
the temptation to rewrite this in Python has been steadily growing, 
though I'm not quite there yet.


Regards,
Jo


Re: [PATCH v2 1/1] sha1_file: Add support for downloading blobs on demand

2017-07-17 Thread Jonathan Tan
About the difference between this patch and my patch set [1], besides
the fact that this patch does not spawn separate processes for each
missing object, which does seem like an improvement to me, this patch
(i) does not use a list of promised objects (but instead communicates
with the hook for each missing object), and (ii) provides backwards
compatibility with other Git code (that does not know about promised
objects) in a different way.

The costs and benefits of (i) are being discussed here [2]. As for (ii),
I still think that my approach is better - I have commented more about
this below.

Maybe the best approach is a combination of both our approaches.

[1] 
https://public-inbox.org/git/34efd9e9936fdab331655f5a33a098a72dc134f4.1499800530.git.jonathanta...@google.com/

[2] 
https://public-inbox.org/git/20170713123951.5cab1...@twelve2.svl.corp.google.com/

On Fri, 14 Jul 2017 09:26:51 -0400
Ben Peart  wrote:

> +
> +packet: git> command=get
> +packet: git> sha1=0a214a649e1b3d5011e14a3dc227753f2bd2be05
> +packet: git> 
> +

It would be useful to have this command support more than one SHA-1, so
that hooks that know how to batch can do so.
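
For example, a batched request could look something like this (purely
illustrative; the interface as posted takes a single sha1 per request):

	packet: git> command=get
	packet: git> sha1=0a214a649e1b3d5011e14a3dc227753f2bd2be05
	packet: git> sha1=ce013625030ba8dba906f756967f9e9ca394464a
	packet: git> 0000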

> +static int subprocess_map_initialized;
> +static struct hashmap subprocess_map;

The documentation of "tablesize" in "struct hashmap" states that it can
be used to check if the hashmap is initialized, so
subprocess_map_initialized is probably unnecessary.
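
A minimal sketch of what that could look like (lazy initialization keyed
off tablesize; `cmd2process_cmp` stands in for whatever comparison
function the patch actually registers):

	static struct hashmap subprocess_map;

	static void lazy_init_subprocess_map(void)
	{
		if (!subprocess_map.tablesize)
			hashmap_init(&subprocess_map, cmd2process_cmp, 0);
	}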

>  static int check_and_freshen(const unsigned char *sha1, int freshen)
>  {
> - return check_and_freshen_local(sha1, freshen) ||
> -check_and_freshen_nonlocal(sha1, freshen);
> + int ret;
> + int already_retried = 0;
> +
> +retry:
> + ret = check_and_freshen_local(sha1, freshen) ||
> + check_and_freshen_nonlocal(sha1, freshen);
> + if (!ret && core_virtualize_objects && !already_retried) {
> + already_retried = 1;
> + if (!read_object_process(sha1))
> + goto retry;
> + }
> +
> + return ret;
>  }

Is this change meant to ensure that Git code that operates on loose
objects directly (bypassing storage-agnostic functions such as
sha1_object_info_extended() and has_sha1_file()) still work? If yes,
this patch appears incomplete (for example, read_loose_object() needs to
be changed too), and this seems like a difficult task - in my patch set
[1], I ended up deciding to create a separate type of storage and
instead looked at the code that operates on *packed* objects directly
(because there were fewer such methods) to ensure that they would work
correctly in the presence of a separate type of storage.

[1] 
https://public-inbox.org/git/34efd9e9936fdab331655f5a33a098a72dc134f4.1499800530.git.jonathanta...@google.com/


Re: [RFC PATCH 0/3] Partial clone: promised blobs (formerly "missing blobs")

2017-07-17 Thread Jonathan Nieder
Hi Philip,

Philip Oakley wrote:
> From: "Jonathan Tan" 

>> These patches are part of a set of patches implementing partial clone,
>> as you can see here:
>>
>> https://github.com/jonathantanmy/git/tree/partialclone
[...]
> If I understand correctly, this method doesn't give any direct user
> visibility of missing blobs in the file system. Is that correct?
>
> I was hoping that eventually the various 'on demand' approaches
> would still allow users to continue to work as they go off-line such
> that they can see directly (in the FS) where the missing blobs (and
> trees) are located, so that they can continue to commit new work on
> existing files.
>
> I had felt that some sort of 'gitlink' should be present (human
> readable) as a place holder for the missing blob/tree. e.g.
> 'gitblob: 1234abcd' (showing the missing oid, just like sub-modules
> can do - it's no different really).

That's a reasonable thing to want, but it's a little different from
the use cases that partial clone work so far has aimed to support.
They are:

 A. Avoiding downloading all blobs (and likely trees as well) that are
not needed in the current operation (e.g. checkout).  This blends
well with the sparse checkout feature, which allows the current
checkout to be fairly small in a large repository.

GVFS uses a trick that makes it a little easier to widen a sparse
checkout upon access of a directory.  But the same building blocks
should work fine with a sparse checkout that has been set up
explicitly.

 B. Avoiding downloading large blobs, except for those needed in the
current operation (e.g. checkout).

When not using sparse checkout, the main benefit out of the box is
avoiding downloading *historical versions* of large blobs.

It sounds like you are looking for a sort of placeholder outside the
sparse checkout area.  In a way, that's orthogonal to these patches:
even if you have all relevant blobs, you may want to avoid inflating
them to check them out and reading them to compare to the index (i.e.
the usual benefits of sparse checkout).  In a sparse checkout, you
still might like to be able to get a listing of files outside the
sparse area (which you can get with "git ls-tree") and you may even
want to be able to get such a listing with plain "ls" (as with your
proposal).

Thanks and hope that helps,
Jonathan


Re: What's cooking in git.git (Jul 2017, #04; Thu, 13)

2017-07-17 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason  writes:

> Let's see what *nix does:
>
> $ rm -rf /tmp/{master,backup}; mkdir /tmp/master && cd /tmp/master && mv 
> /tmp/{master,backup} ; file /tmp/{master,backup}
>
> Similarly to that, when you're on "master" "git branch --move backup"
> could have left you on an orphan branch, but it doesn't, it's the
> equivalent of "mv && cd" in *nix terms.

And with the same analogy,

mkdir /tmp/master
cd /tmp/master
cp -R /tmp/master /tmp/backup
pwd

would show you that you are still in /tmp/master.  Which is quite
different from what happens with "mv".  So what's your point?

In any case, you already realize that it is a crazy and convoluted
example nobody does in real life to go into one directory *and*
rename it while you are inside it, don't you?

The reason why I do not think it is wise to move to the backup
branch when you did

git checkout master
git branch --copy master backup

is not because of that crazy analogy, though.  It is primarily that
I think one of the major reasons people want to do a "copy" to a
backup is to keep the pre-modification state of a thing that they
may be able to later come back to when it turns out to be needed.
For that workflow, with the above "branch --copy", your user is
planning to make a change that is not usually kept track of by Git
to the "master" branch (perhaps branch.*.description is changed,
perhaps the remote-tracking branch it is set to integrate with is
changed, etc.), but the user is unsure, and preparing a backup that
can be used with "git branch -M backup master" to come back to.

You can do all that even if your "branch --copy" initially takes you
to the "backup" branch you just created, but then the user has to
immediately come back to "master" to start doing the major surgery
on "master", for which the "copy" was a preparation to recover.

> So since our --move is really --move-and-checkout I think it would be
> confusing to introduce a --copy sister option that has the semantics of
> --copy-no-checkout instead of a corresponding --copy-and-checkout.

Our --move is *NOT* --move-and-checkout.

git checkout master
git branch naster
git branch -m naster aster

will *not* move "naster" to "aster" and check it out.  During the
whole exercise, you will stay on "master".  

When you rename the current branch so that the name of the current
branch will no longer exist in the system, you _could_ detach the
HEAD, or you _could_ move the current branch to the new name.  There
is no sensible alternative other than these two, but either way, you
need to have a special case for renaming the current branch.  It's
just the latter is more useful in practice and that is the only
reason why it moves _you_ along with the current branch if you
happen to be on that branch being renamed.

I do not see much reason why such a special case has to apply to
"copy".  The source of the copy is not going away, and in the
"backup" scenario, moving away from the thing that is backed up, in
preparation for further work that may have to be reverted, is
actively counter-productive.

There however _is_ an opposite use case.  You may want to start
working on a _new_ branch that is more similarly set up to what your
current branch is, _and_ you want the new branch to start at your
current branch.  But I think that should be done by adding an
enhanced mode of "checkout -b".  IOW, I think

git checkout master
git checkout -b --with-configuration naster [master]

is a very sensible thing to desire; if "master" is set to integrate
with refs/remotes/origin/master, the new "naster" branch may want to
integrate with refs/remotes/origin/naster (instead of the local
"master", which is what the traditional "checkout -b" would do), for
example.  Or you could do the same thing with

git branch --copy master naster
git checkout naster

if your "branch --copy" does not switch out of the current branch.






Re: Detect invalid submodule names from script?

2017-07-17 Thread Stefan Beller
On Mon, Jul 17, 2017 at 4:17 AM, Joachim Durchholz  wrote:
> Hi all
>
> I'm hacking some script that calls into git, and I need to detect whether a
> repository was configured with a submodule name that will work on "git
> submodule init" and friends.

There is no such thing as "git submodule valid-name" unfortunately.
Looking through "git submodule add", I think it is safe to assume
that any string valid as a subsection in git-config is a valid submodule name.
Our man page says:

Subsection names are case sensitive and can contain any characters
except newline (doublequote " and backslash can be included by escaping
them as \" and \\, respectively).

I am not sure about the quality of submodule shell code to handle the quotations
for double quote and backslash correctly, so I would suggest not using them,
either.
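
As a rough sketch of that conservative check (a hypothetical helper, not
something git itself provides):

	#include <string.h>

	/* Reject empty names and names containing newline, '"' or '\'. */
	static int submodule_name_looks_safe(const char *name)
	{
		return *name && !strpbrk(name, "\n\"\\");
	}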

> I *can* run a git init and see whether it works, but I need to be 100% sure
> that the error was due to an invalid submodule name and not something else.
> Bonus points for every version of git for which it works.

I do not think Git offers a universal solution across versions except actually
running "submodule init" and then making an educated guess if the error
comes from bad naming or something else.

This sounds like you're taking user input or otherwise untrustworthy data
as submodule names?

Thanks,
Stefan

>
> Any suggestions?
> Thanks!
>
> Regards,
> Jo


Re: [PATCH 04/33] notes: make get_note return pointer to struct object_id

2017-07-17 Thread Brandon Williams
On 07/15, René Scharfe wrote:
> Am 30.05.2017 um 19:30 schrieb Brandon Williams:
> > @@ -392,7 +392,7 @@ static int add(int argc, const char **argv, const char 
> > *prefix)
> > const char *object_ref;
> > struct notes_tree *t;
> > unsigned char object[20], new_note[20];
> > -   const unsigned char *note;
> > +   const struct object_id *note;
> > struct note_data d = { 0, 0, NULL, STRBUF_INIT };
> > struct option options[] = {
> > { OPTION_CALLBACK, 'm', "message", , N_("message"),
> 
> In between here, note can be set to NULL...
> 
> > @@ -453,7 +453,7 @@ static int add(int argc, const char **argv, const char 
> > *prefix)
> > sha1_to_hex(object));
> > }
> >   
> > -   prepare_note_data(object, , note);
> > +   prepare_note_data(object, , note->hash);
> 
> ... which we then dereference here.
> 
> > @@ -598,13 +598,13 @@ static int append_edit(int argc, const char **argv, 
> > const char *prefix)
> > t = init_notes_check(argv[0], NOTES_INIT_WRITABLE);
> > note = get_note(t, object);
> >   
> > -   prepare_note_data(object, , edit ? note : NULL);
> > +   prepare_note_data(object, , edit && note ? note->hash : NULL);
> 
> Here a NULL check was added; we need a similar one above as well.
> 
> -- >8 --
> Subject: [PATCH] notes: don't access hash of NULL object_id pointer
> 
> Check if note is NULL, as we already do for different purposes a few
> lines above, and pass a NULL pointer to prepare_note_data() in that
> case instead of trying to access the hash member.

Looks good, thanks for catching this!

> 
> Found with Clang's UBSan.
> 
> Signed-off-by: Rene Scharfe 
> ---
> The third parameter of prepare_note_data() could easily be turned into
> an object_id pointer (and it should), but this patch is meant to be a
> minimal fix.
> 
>  builtin/notes.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/builtin/notes.c b/builtin/notes.c
> index 77573cf1ea..4303848e04 100644
> --- a/builtin/notes.c
> +++ b/builtin/notes.c
> @@ -456,7 +456,7 @@ static int add(int argc, const char **argv, const char 
> *prefix)
>   oid_to_hex());
>   }
>  
> - prepare_note_data(&object, &d, note->hash);
> + prepare_note_data(&object, &d, note ? note->hash : NULL);
>   if (d.buf.len || allow_empty) {
>   write_note_data(&d, new_note.hash);
>   if (add_note(t, &object, &new_note, combine_notes_overwrite))
> -- 
> 2.13.3

-- 
Brandon Williams


Re: [RFC PATCH 0/3] Partial clone: promised blobs (formerly "missing blobs")

2017-07-17 Thread Ben Peart



On 7/16/2017 11:23 AM, Philip Oakley wrote:

From: "Jonathan Tan" 
Sent: Tuesday, July 11, 2017 8:48 PM

These patches are part of a set of patches implementing partial clone,
as you can see here:

https://github.com/jonathantanmy/git/tree/partialclone

In that branch, clone with batch checkout works, as you can see in the
README. The code and tests are generally done, but some patches are
still missing documentation and commit messages.

These 3 patches implement the foundational concept - formerly known as
"missing blobs" in the "missing blob manifest", I decided to call them
"promised blobs". The repo knows their object names and sizes. It also
does not have the blobs themselves, but can be configured to know how to
fetch them.

If I understand correctly, this method doesn't give any direct user 
visibility of missing blobs in the file system. Is that correct?


That is correct



I was hoping that eventually the various 'on demand' approaches would 
still allow users to continue to work as they go off-line such that they 
can see directly (in the FS) where the missing blobs (and trees) are 
located, so that they can continue to commit new work on existing files.




This is a challenge as git assumes all objects are always available 
(that is a key design principle of a DVCS), so any missing object is 
considered a corruption that typically results in a call to "die."


The GVFS solution gets around this by ensuring any missing object is 
retrieved on behalf of git so that it never sees it as missing.  The 
obvious tradeoff is that this requires a network connection so the 
object can be retrieved.


I had felt that some sort of 'gitlink' should be present (human readable) 
as a placeholder for the missing blob/tree, e.g. 'gitblob: 1234abcd' 
(showing the missing oid), just like sub-modules can do - it's no 
different really.




We explored that option briefly but when you have a large number of 
files, even writing out some sort of place holder can take a very long 
time.  In fact, since the typical source file is relatively small (a few 
kilobytes), writing out a placeholder doesn't save much time vs just 
writing out the actual file contents.


Another challenge is that even if there is a placeholder written to 
disk, you still need a network connection to retrieve the actual 
contents if/when it is needed.


I'm concerned that the various GVFS extensions haven't fully achieved a 
separation of concerns surrounding the DVCS capability for 
on-line/off-line conversion as comms drop in and out. The GVFS looks 
great for a fully networked, always-on environment, but it would be 
good to also have that separation for those who (will) have shallow/narrow 
clones that may also need to work with a local upstream that is also 
shallow/narrow.




You are correct that this hasn't been tackled yet. It is a challenging 
problem. I can envision something along the lines of what was done for 
the shallow clone feature where there are distinct ways to change the 
set of objects that are available but that would hopefully come in some 
future patch series.



--
Philip
I wanted to at least get my thoughts into the discussion before it all 
passes by.



An older version of these patches was sent as a single demonstration
patch in versions 1 to 3 of [1]. In there, Junio suggested that I have
only one file containing missing blob information. I have made that
suggested change in this version.

One thing remaining is to add a repository extension [2] so that older
versions of Git fail immediately instead of trying to read missing
blobs, but I thought I'd send these first in order to get some initial
feedback.

[1] 
https://public-inbox.org/git/cover.1497035376.git.jonathanta...@google.com/ 


[2] Documentation/technical/repository-version.txt

Jonathan Tan (3):
 promised-blob, fsck: introduce promised blobs
 sha1-array: support appending unsigned char hash
 sha1_file: add promised blob hook support

Documentation/config.txt   |   8 ++
Documentation/gitrepository-layout.txt |   8 ++
Makefile   |   1 +
builtin/cat-file.c |   9 ++
builtin/fsck.c |  13 +++
promised-blob.c| 170 
+

promised-blob.h|  27 ++
sha1-array.c   |   7 ++
sha1-array.h   |   1 +
sha1_file.c|  44 ++---
t/t3907-promised-blob.sh   |  65 +
t/test-lib-functions.sh|   6 ++
12 files changed, 345 insertions(+), 14 deletions(-)
create mode 100644 promised-blob.c
create mode 100644 promised-blob.h
create mode 100755 t/t3907-promised-blob.sh

--
2.13.2.932.g7449e964c-goog





Re: [PATCH] push: disable lazy --force-with-lease by default

2017-07-17 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason  writes:

> Once Francesco's patch is also applied (or some version thereof) you can
> then set push.AlwaysForceWithLease to make --force mean
> --force-with-lease, which is disabled by default.
>
> I think this is really crappy UX design. Now in some future version of
> Git I need to set a config option *and* type a longer option name to get
> behavior that's an improvement over --force.

After thinking about this a bit longer, I changed my mind.

I do not think a configuration option that turns "push --force" into
anything but "unconditionally force, I mean it" is acceptable;
otherwise the use of "--force" as the last resort that always does what
the option means will be broken, and I do not think you can ask
everybody who has scripts that use it to temporarily disable the
configuration in them.

What we could do is to improve the "--force-with-lease" that does
not say what exact object should be there at the remote for the
non-ff push to go through.  In another thread there was a discussion
to improve the logic to ignore the remote-tracking branch and instead
use the reflog of the _source_ of the push, and I think that is a
lot saner and safer heuristic than what we currently have.

After that happens, we probably could give the option an even shorter
synonym.  But even then, it is unwise to use "--force"
as that shorter synonym.

Thanks.


Re: [PATCH] check-ref-format: require a repository for --branch

2017-07-17 Thread Jonathan Nieder
Hi,

Jeff King wrote:
> On Fri, Jul 14, 2017 at 02:03:13PM -0400, Jeff King wrote:

>> So I think the patch below is probably the right direction.
>
> And here it is with a real commit message, if this is what we want to
> do.
[...]
> --- a/t/t1402-check-ref-format.sh
> +++ b/t/t1402-check-ref-format.sh
> @@ -161,6 +161,10 @@ test_expect_success 'check-ref-format --branch from 
> subdir' '
>   test "$refname" = "$sha1"
>  '
>  
> +test_expect_success 'check-ref-format --branch from non-repo' '
> + test_must_fail nongit git check-ref-format --branch @{-1}
> +'
> +
>  valid_ref_normalized() {
>   prereq=
>   case $1 in

I don't think it's right.  Today I can do

$ cd /tmp
$ git check-ref-format --branch master
master

You might wonder why I'd ever do such a thing.  But that's what "git
check-ref-format --branch" is for --- it is for taking a 
argument and turning it into a branch name.  For example, if you have
a script with an $opt_branch variable that defaults to "master", it
may do

resolved_branch=$(git check-ref-format --branch "$opt_branch")

even though it is in a mode that is not going to have to use
$resolved_branch and it is not running from a repository.

Thanks and hope that helps,
Jonathan


Re: [PATCH] commit & merge: modularize the empty message validator

2017-07-17 Thread Junio C Hamano
Christian Brabandt  writes:

> On Do, 13 Jul 2017, Junio C Hamano wrote:
>
>> I think many people know about and do use the "delete all lines"
>> (i.e. ":1,$d" in vi, or \M-< \C-SPC \M-> \C-w in Emacs) to abort out
>> of a commit or a merge.  I just do not think it is likely for them
>> to leave Sign-off lines and remove everything else, which is more
>> work than to delete everything, hence my reaction.
>
> In Vim you can also abort the commit message using :cq which exits the 
> editor with an error code.

Sure, but it's not like we are trying to come up with an education
material to teach people how to abort their commit in progress in
this discussion, so I do not quite see a relevance of your comment
to the topic at hand here.


Re: [PATCH] PRItime: wrap PRItime for better l10n compatibility

2017-07-17 Thread Junio C Hamano
Jiang Xin  writes:

> Commit cb71f8bdb5 ("PRItime: introduce a new "printf format" for
> timestamps", 2017-04-21) does not play well with i18n framework. The
> static string concatenation cannot be correctly interpreted by
> gettext utilities, such as xgettext.
>
> Wrap PRItime in format_raw_time() function, so that we can extract
> correct l10n messages into "po/git.pot".
>
> Reported-by: Jean-Noël Avila 
> Signed-off-by: Jiang Xin 
> ...
> @@ -191,6 +200,15 @@ struct date_mode *date_mode_from_type(enum 
> date_mode_type type)
>   return &mode;
>  }
>  
> +const char *format_raw_time(timestamp_t time)
> +{
> + static struct strbuf time_buf = STRBUF_INIT;
> +
> + strbuf_reset(&time_buf);
> + strbuf_addf(&time_buf, "%"PRItime, time);
> + return time_buf.buf;
> +}

Hmm.

Two potential issues are:

 - After this patch, there still are quite a few

printf("time is %"PRItime" ...\n", timestamp)

   so programmers carry the burden of remembering when
   format_raw_time() is required, which is not clear-cut, and it
   makes the change/churn larger when an existing message needs to
   be marked for translation.

 - The static struct strbuf here is a cheap way to avoid leaks, but
   at the same time it is unfriendly to threaded code.  We could
   instead do:

void append_PRItime(struct strbuf *buf, timestamp_t time);

   to fix that trivially, but the damage to the caller obviously is
   much larger going this way (a rough sketch follows below).
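
For illustration, a rough sketch of that helper; the function itself is
hypothetical, while strbuf_addf(), timestamp_t and PRItime are the
existing Git ones:

#include "cache.h"	/* for timestamp_t, PRItime and the strbuf API */

/*
 * Hypothetical helper sketched from the signature mentioned above;
 * it does not exist in Git.
 */
void append_PRItime(struct strbuf *buf, timestamp_t time)
{
	strbuf_addf(buf, "%"PRItime, time);
}

A caller would then format into a strbuf it owns instead of sharing a
static buffer, at the cost of touching every call site.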



Re: [L10N] Kickoff of translation for Git 2.14.0 round 1

2017-07-17 Thread Johannes Schindelin
Hi,

On Mon, 17 Jul 2017, Jiang Xin wrote:

> 2017-07-16 3:30 GMT+08:00 Jean-Noël Avila :
> >
> >
> > A few remarks on i18n:
> >
> >  * commit cb71f8bdb5 ("PRItime: introduce a new "printf format" for
> > timestamps") does not play well with i18n framework. The static string
> > concatenation cannot be correctly interpreted by msgmerge. I don't know
> > how we can combine variable format indicators with translatable strings.
> >
> 
> We can add a new wrapper for raw timestamp like:
> 
> +const char *format_raw_time(timestamp_t time)
> +{
> +   static struct strbuf time_buf = STRBUF_INIT;
> +
> +   strbuf_reset(&time_buf);
> +   strbuf_addf(&time_buf, "%"PRItime, time);
> +   return time_buf.buf;
> +}
> 
> 
> , and replace macro PRItime in i18n messages with format_raw_time
> wrapper, like this:
> 
> -   strbuf_addf(, Q_("%"PRItime" year",
> "%"PRItime" years", years), years);
> +   strbuf_addf(, Q_("%s year", "%s years",
> years), format_raw_time(years));

That would come at the price of complexifying the code just to accommodate
a translation tool.

How do you gentle people deal with PRIuMAX?

Ciao,
Dscho

Re: [L10N] af.po: New translation workflow

2017-07-17 Thread Jiang Xin
2017-07-17 1:37 GMT+08:00 Jacques Viviers :
> Hi all,
>
> I have started a new translation af.po at:
> (https://github.com/jbviviers/git-po)
>
> The translation is far from complete. Should I wait until it is all
> done before I submit the patch, or is it better to submit as I go
> along?
>

After you complete most of the translations, send a PR to the l10n
coordinator's repo (git-l10n/git-po on GitHub).

> Although it is unlikely that someone else is busy with the same thing,
> it would still be better to somehow notify other potential translators
> that this work is in progress.
>

A work-in-progress (WIP) branch will be created to let others know
that the Afrikaans l10n is a work in progress.

> There are a few of my colleagues who indicated they would help with
> this translation. How does that collaboration work? Do I merge
> everyone's contributions as I see fit and then send in the set of
> patches?
>

You can review contributions using pull requests on your GitHub repo
(jbviviers/git-po), and squash them into one or several commits before
sending a pull request to the l10n coordinator's repo (git-l10n/git-po
on GitHub).

> Some advice would be much appreciated.
>
> Kind regards,
> Jacques



-- 
Jiang Xin


[PATCH] doc: camelCase the i18n config variables to improve readability

2017-07-17 Thread Kaartic Sivaraam
The i18n config variables used weren't readable as they were in
the crude form in which git stores/uses its config variables.

Improve their readability by replacing them with camelCased versions
of the config variables, as this doesn't have any impact on their usage.

Signed-off-by: Kaartic Sivaraam 
---
 Documentation/i18n.txt | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/Documentation/i18n.txt b/Documentation/i18n.txt
index 2dd79db5c..7e36e5b55 100644
--- a/Documentation/i18n.txt
+++ b/Documentation/i18n.txt
@@ -42,11 +42,11 @@ mind.
 +
 
 [i18n]
-   commitencoding = ISO-8859-1
+   commitEncoding = ISO-8859-1
 
 +
 Commit objects created with the above setting record the value
-of `i18n.commitencoding` in its `encoding` header.  This is to
+of `i18n.commitEncoding` in its `encoding` header.  This is to
 help other people who look at them later.  Lack of this header
 implies that the commit log message is encoded in UTF-8.
 
@@ -54,15 +54,15 @@ implies that the commit log message is encoded in UTF-8.
   `encoding` header of a commit object, and try to re-code the
   log message into UTF-8 unless otherwise specified.  You can
   specify the desired output encoding with
-  `i18n.logoutputencoding` in `.git/config` file, like this:
+  `i18n.logOutputEncoding` in `.git/config` file, like this:
 +
 
 [i18n]
-   logoutputencoding = ISO-8859-1
+   logOutputEncoding = ISO-8859-1
 
 +
 If you do not have this configuration variable, the value of
-`i18n.commitencoding` is used instead.
+`i18n.commitEncoding` is used instead.
 
 Note that we deliberately chose not to re-code the commit log
 message when a commit is made to force UTF-8 at the commit
-- 
2.13.3.973.gaaf39e6bd.dirty



[PATCH] doc: reformat the paragraph containing the 'cut-line'

2017-07-17 Thread Kaartic Sivaraam
The paragraph that describes the 'scissors' cleanup mode of
'commit' had the 'cut-line' in the middle of a sentence. This
made it possible for the line to get wrapped on smaller windows.
This shouldn't be the case, as it makes it hard for the user to
understand the structure of the cut-line.

Reformat the paragraph to make the 'cut-line' stand on a line of
its own, thus distinguishing it from the rest of the paragraph.
This also prevents it from getting wrapped, to some extent.

Signed-off-by: Kaartic Sivaraam 
---
 Documentation/git-commit.txt | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-commit.txt b/Documentation/git-commit.txt
index afb06adba..2c0d48bab 100644
--- a/Documentation/git-commit.txt
+++ b/Documentation/git-commit.txt
@@ -196,11 +196,12 @@ whitespace::
 verbatim::
Do not change the message at all.
 scissors::
-   Same as `whitespace`, except that everything from (and
-   including) the line
-   "`#  >8 `"
-   is truncated if the message is to be edited. "`#`" can be
-   customized with core.commentChar.
+   Same as `whitespace` except that everything from (and including)
+the line found below is truncated, if the message is to be edited.
+"`#`" can be customized with core.commentChar.
+
+# ------------------------ >8 ------------------------
+
 default::
Same as `strip` if the message is to be edited.
Otherwise `whitespace`.
-- 
2.13.3.973.gaaf39e6bd.dirty



[PATCH] commit: check for empty message before the check for untouched template

2017-07-17 Thread Kaartic Sivaraam
The check for whether the template given to 'git commit' is untouched
is done before the empty message check. This results in a wrong error
message being displayed in the following case: when the user removes
everything in the template to abort the commit, he is shown the
"template untouched" error, which is wrong. He should be shown the
"empty message" error.

Do the empty message check before checking for an untouched template
thus fixing this issue.

Signed-off-by: Kaartic Sivaraam 
---
 builtin/commit.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index aff6bf7aa..9c299d792 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1736,19 +1736,19 @@ int cmd_commit(int argc, const char **argv, const char 
*prefix)
if (verbose || /* Truncate the message just before the diff, if any. */
cleanup_mode == CLEANUP_SCISSORS)
strbuf_setlen(, wt_status_locate_end(sb.buf, sb.len));
-
if (cleanup_mode != CLEANUP_NONE)
strbuf_stripspace(, cleanup_mode == CLEANUP_ALL);
-   if (template_untouched() && !allow_empty_message) {
-   rollback_index_files();
-   fprintf(stderr, _("Aborting commit; you did not edit the 
message.\n"));
-   exit(1);
-   }
+
if (message_is_empty() && !allow_empty_message) {
rollback_index_files();
fprintf(stderr, _("Aborting commit due to empty commit 
message.\n"));
exit(1);
}
+   if (template_untouched() && !allow_empty_message) {
+   rollback_index_files();
+   fprintf(stderr, _("Aborting commit; you did not edit the 
message.\n"));
+   exit(1);
+   }
 
if (amend) {
const char *exclude_gpgsig[2] = { "gpgsig", NULL };
-- 
2.13.3.973.gaaf39e6bd.dirty



[PATCH] PRItime: wrap PRItime for better l10n compatibility

2017-07-17 Thread Jiang Xin
Commit cb71f8bdb5 ("PRItime: introduce a new "printf format" for
timestamps", 2017-04-21) does not play well with i18n framework. The
static string concatenation cannot be correctly interpreted by
gettext utilities, such as xgettext.

Wrap PRItime in format_raw_time() function, so that we can extract
correct l10n messages into "po/git.pot".

Reported-by: Jean-Noël Avila 
Signed-off-by: Jiang Xin 
---
 archive-zip.c  |  4 ++--
 builtin/blame.c|  4 ++--
 builtin/fsck.c |  3 ++-
 builtin/log.c  |  4 ++--
 builtin/receive-pack.c |  4 ++--
 builtin/rev-parse.c|  2 +-
 cache.h|  1 +
 date.c | 63 --
 fetch-pack.c   |  2 +-
 refs/files-backend.c   |  6 ++---
 upload-pack.c  |  2 +-
 vcs-svn/fast_export.c  |  8 ---
 12 files changed, 63 insertions(+), 40 deletions(-)

diff --git a/archive-zip.c b/archive-zip.c
index e8913e5a26..92df24815e 100644
--- a/archive-zip.c
+++ b/archive-zip.c
@@ -600,8 +600,8 @@ static void dos_time(timestamp_t *timestamp, int *dos_date, 
int *dos_time)
struct tm *t;
 
if (date_overflows(*timestamp))
-   die("timestamp too large for this system: %"PRItime,
-   *timestamp);
+   die("timestamp too large for this system: %s",
+   format_raw_time(*timestamp));
time = (time_t)*timestamp;
t = localtime(&time);
*timestamp = time;
diff --git a/builtin/blame.c b/builtin/blame.c
index bda1a78726..97bd99dd51 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -245,11 +245,11 @@ static int emit_one_suspect_detail(struct blame_origin 
*suspect, int repeat)
get_commit_info(suspect->commit, &ci, 1);
printf("author %s\n", ci.author.buf);
printf("author-mail %s\n", ci.author_mail.buf);
-   printf("author-time %"PRItime"\n", ci.author_time);
+   printf("author-time %s\n", format_raw_time(ci.author_time));
printf("author-tz %s\n", ci.author_tz.buf);
printf("committer %s\n", ci.committer.buf);
printf("committer-mail %s\n", ci.committer_mail.buf);
-   printf("committer-time %"PRItime"\n", ci.committer_time);
+   printf("committer-time %s\n", format_raw_time(ci.committer_time));
printf("committer-tz %s\n", ci.committer_tz.buf);
printf("summary %s\n", ci.summary.buf);
if (suspect->commit->object.flags & UNINTERESTING)
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 99dea7adf6..1f6fc674c3 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -407,7 +407,8 @@ static void fsck_handle_reflog_oid(const char *refname, 
struct object_id *oid,
if (timestamp && name_objects)
add_decoration(fsck_walk_options.object_names,
obj,
-   xstrfmt("%s@{%"PRItime"}", refname, 
timestamp));
+   xstrfmt("%s@{%s}", refname,
+   format_raw_time(timestamp)));
obj->used = 1;
mark_object_reachable(obj);
} else {
diff --git a/builtin/log.c b/builtin/log.c
index c6362cf92e..83b303f8af 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -920,8 +920,8 @@ static void get_patch_ids(struct rev_info *rev, struct 
patch_ids *ids)
 static void gen_message_id(struct rev_info *info, char *base)
 {
struct strbuf buf = STRBUF_INIT;
-   strbuf_addf(, "%s.%"PRItime".git.%s", base,
-   (timestamp_t) time(NULL),
+   strbuf_addf(, "%s.%s.git.%s", base,
+   format_raw_time((timestamp_t) time(NULL)),

git_committer_info(IDENT_NO_NAME|IDENT_NO_DATE|IDENT_STRICT));
info->message_id = strbuf_detach(&buf, NULL);
 }
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index cabdc55e09..75f26a1bc5 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -460,12 +460,12 @@ static char *prepare_push_cert_nonce(const char *path, 
timestamp_t stamp)
struct strbuf buf = STRBUF_INIT;
unsigned char sha1[20];
 
-   strbuf_addf(, "%s:%"PRItime, path, stamp);
+   strbuf_addf(, "%s:%s", path, format_raw_time(stamp));
hmac_sha1(sha1, buf.buf, buf.len, cert_nonce_seed, 
strlen(cert_nonce_seed));;
strbuf_release(&buf);
 
/* RFC 2104 5. HMAC-SHA1-80 */
-   strbuf_addf(, "%"PRItime"-%.*s", stamp, 20, sha1_to_hex(sha1));
+   strbuf_addf(, "%s-%.*s", format_raw_time(stamp), 20, 
sha1_to_hex(sha1));
return strbuf_detach(&buf, NULL);
 }
 
diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
index c78b7b33d6..c55872196f 100644
--- a/builtin/rev-parse.c
+++ b/builtin/rev-parse.c
@@ -219,7 +219,7 @@ static void show_datestring(const char *flag, const char 
*datestr)
/* date handling requires both flags and revs 

reftable [v2]: new ref storage format

2017-07-17 Thread Shawn Pearce
This is an updated draft after discussion on list with Peff, Michael
Haggerty, and Dave Borowitz.

You can read a rendered version of this here:
https://googlers.googlesource.com/sop/jgit/+/reftable/Documentation/technical/reftable.md

Biggest changes from the first proposal are:

- reflog is now integrated into reftable
- block headers slightly different
- Peff's stack management idea is used
- Michael's compaction idea is used


## Overview

### Problem statement

Some repositories contain a lot of references (e.g.  android at 866k,
rails at 31k).  The existing packed-refs format takes up a lot of
space (e.g.  62M), and does not scale with additional references.
Lookup of a single reference requires linearly scanning the file.

Atomic pushes modifying multiple references require copying the
entire packed-refs file, which can be a considerable amount of data
moved (e.g. 62M in, 62M out) for even small transactions (2 refs
modified).

Repositories with many loose references occupy a large number of disk
blocks from the local file system, as each reference is its own file
storing 41 bytes (and another file for the corresponding reflog).
This negatively affects the number of inodes available when a large
number of repositories are stored on the same filesystem.  Readers can
be penalized due to the larger number of syscalls required to traverse
and read the `$GIT_DIR/refs` directory.

### Objectives

- Near constant time lookup for any single reference, even when the
  repository is cold and not in process or kernel cache.
- Efficient lookup of an entire namespace, such as `refs/tags/`.
- Occupy less disk space for large repositories.
- Support atomic push `O(size_of_update)` operations.
- Combine reflog storage with ref storage.

### Description

A reftable file is a portable binary file format customized for
reference storage. References are sorted, enabling linear scans,
binary search lookup, and range scans.

Storage in the file is organized into blocks.  Prefix compression
is used within a single block to reduce disk space.  Block size is
tunable by the writer.
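
The record encoding itself is not shown in this excerpt; purely as an
illustration of the prefix-compression idea, assuming nothing about
reftable's actual byte layout:

#include <stdio.h>
#include <string.h>

/*
 * Illustration only: store each name as (length of prefix shared with
 * the previous name in the block, remaining suffix).  Sorted names
 * compress well because neighbors share long prefixes.
 */
static void emit_prefix_compressed(const char *prev, const char *name)
{
	size_t common = 0;

	while (prev[common] && prev[common] == name[common])
		common++;
	printf("prefix_len=%zu suffix=\"%s\"\n", common, name + common);
}

int main(void)
{
	emit_prefix_compressed("", "refs/heads/maint");
	emit_prefix_compressed("refs/heads/maint", "refs/heads/master");
	emit_prefix_compressed("refs/heads/master", "refs/tags/v2.13.0");
	return 0;
}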

### Performance

Space used, packed-refs vs. reftable:

repository  | packed-refs | reftable | % original | avg ref
------------|------------:|---------:|-----------:|---------:
android     |      62.2 M |   27.7 M |      44.4% | 33 bytes
rails       |       1.8 M |  896.2 K |      47.6% | 29 bytes
git         |      78.7 K |   27.9 K |      40.0% | 43 bytes
git (heads) |       332 b |    222 b |      66.9% | 37 bytes

Scan (read 866k refs) and lookup (single ref from 866k refs):

format      |    scan |        lookup
------------|--------:|--------------:
packed-refs |  380 ms | 375420.0 usec
reftable    |  125 ms |     42.3 usec

Space used for 149,932 log entries on 43,061 refs,
reflog vs. reftable:

format        |  size | avg log
--------------|------:|-----------:
$GIT_DIR/logs | 173 M | 1209 bytes
reftable      |   4 M |   30 bytes

## Details

### Peeling

References in a reftable are always peeled.

### Reference name encoding

Reference names should be encoded with UTF-8.

### Ordering

Blocks are lexicographically ordered by their first reference.

### Directory/file conflicts

The reftable format accepts both `refs/heads/foo` and
`refs/heads/foo/bar` as distinct references.

This property is useful for retaining log records in reftable, but may
confuse versions of Git using the `$GIT_DIR/refs` directory tree to
maintain references.  Users of reftable may choose to continue to
reject `foo` and `foo/bar` type conflicts to prevent problems for
peers.
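
A minimal sketch of such a compatibility check (not part of the format
itself):

#include <string.h>

/*
 * A minimal sketch: a caller that wants to keep rejecting "foo" vs
 * "foo/bar" style conflicts for compatibility with $GIT_DIR/refs
 * could use a check along these lines.  For example,
 * ref_names_df_conflict("refs/heads/foo", "refs/heads/foo/bar")
 * returns 1.
 */
static int ref_names_df_conflict(const char *a, const char *b)
{
	size_t la = strlen(a), lb = strlen(b);

	if (la > lb) {
		const char *tmp = a;
		a = b;
		b = tmp;
		la = lb;
	}
	/* conflict if the shorter name is a directory prefix of the longer */
	return !strncmp(a, b, la) && b[la] == '/';
}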

## File format

### Structure

A reftable file has the following basic structure:

first_block {
  header
  first_ref_block
}
ref_blocks*
ref_index?
log_blocks*
log_index?
footer

### Block size

The `block_size` is arbitrarily determined by the writer, and does not
have to be a power of 2.  The block size must be larger than the
longest reference name or log entry used in the repository, as
references cannot span blocks.

Powers of two that are friendly to the virtual memory system or
filesystem (such as 4k or 8k) are recommended.  Larger sizes
(64k) can yield better compression.

The largest block size is `16777215` bytes (15.99 MiB).
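
A minimal sketch of a writer-side check for these constraints; the
`longest_record` parameter (longest reference name or log entry to be
stored) is an assumption of the example, not something defined by the
format:

#include <stdint.h>

#define REFTABLE_MAX_BLOCK_SIZE 16777215 /* largest value a uint24 can hold */

/*
 * Sketch only: a block must be able to hold the longest record and
 * its size must fit in the header's uint24 field.
 */
static int reftable_block_size_ok(uint32_t block_size, uint32_t longest_record)
{
	return block_size > longest_record &&
	       block_size <= REFTABLE_MAX_BLOCK_SIZE;
}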

### Header

An 8-byte header appears at the beginning of the file:

'\1REF'
uint8( version_number = 1 )
uint24( block_size )

The `block_size` and all other uint fields are in network byte order.
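
A minimal sketch of reading this header; error handling and the
surrounding I/O are left out:

#include <stdint.h>
#include <string.h>

/*
 * Sketch only: returns the block_size on success, 0 on a bad magic or
 * an unsupported version_number.
 */
static uint32_t reftable_parse_header(const unsigned char hdr[8])
{
	if (memcmp(hdr, "\1REF", 4))
		return 0;
	if (hdr[4] != 1)
		return 0;
	/* uint24 block_size in network (big-endian) byte order */
	return ((uint32_t)hdr[5] << 16) | ((uint32_t)hdr[6] << 8) | hdr[7];
}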

### First ref block

The first ref block shares the same block as the file header, and is 8
bytes smaller than all other blocks in the file.  The first block
immediately begins after the file header, at offset 8.

### Ref block format

A ref block is written as:

'r'
uint24 ( block_len )
ref_record*
uint32( restart_offset )*
uint16( number_of_restarts )
padding?

Blocks begin with `block_type = 'r'` and a 3-byte `block_len` which
encodes the number of bytes in the block up to, but not including the
optional 

Detect invalid submodule names from script?

2017-07-17 Thread Joachim Durchholz

Hi all

I'm hacking some script that calls into git, and I need to detect 
whether a repository was configured with a submodule name that will work 
on "git submodule init" and friends.
I *can* run a git init and see whether it works, but I need to be 100% 
sure that the error was due to an invalid submodule name and not 
something else. Bonus points for every version of git for which it works.


Any suggestions?
Thanks!

Regards,
Jo


Re: [PATCH] commit & merge: modularize the empty message validator

2017-07-17 Thread Christian Brabandt

On Do, 13 Jul 2017, Junio C Hamano wrote:

> I think many people know about and do use the "delete all lines"
> (i.e. ":1,$d" in vi, or \M-< \C-SPC \M-> \C-w in Emacs) to abort out
> of a commit or a merge.  I just do not think it is likely for them
> to leave Sign-off lines and remove everything else, which is more
> work than to delete everything, hence my reaction.

In Vim you can also abort the commit message using :cq which exits the 
editor with an error code.

Best,
Christian
-- 
Das Werk soll den Meister loben.


Re: Proposal

2017-07-17 Thread Elodie Azeez
Hello,

Compliment of the day to you. I am Mrs. Elodie, I am sending this brief
letter to solicit your partnership to transfer $5.5 million US Dollars into
your account for investment in your country. I shall send you more information
and procedures when I receive positive response from you.

Thank you.
Mrs. Elodie Azeez


Can You Handle This For Me ?

2017-07-17 Thread Mrs.Susan Permintel
Hello Friend,
Greetings to you. I know that this message might come to you as surprise  
because we don't know each other nor have we ever met before but accept  it 
with an open heart I'm Mrs. Suzan Permintel and  I have been suffering from 
ovarian cancer disease and the doctor says that I have  just few days to leave. 
I am from (Paris) France, but based in Africa Burkina Faso since eight years 
ago as a business woman dealing with gold  exportation and cotton Sales.

Now, that I am about to end the race like this, without any family members and 
no child. I have $2.8 Million US DOLLARS in BANK OF AFRICA  (B.O.A) Burkina 
Faso which I instructed the Bank to give St Andrews  Missionary and Home Kizito 
Orphanage in Burkina Faso. But my mind is not  at rest because I am writing 
this letter now through the help of my  computer beside my sick bed.
I have instructed the bank to transfer the fund to you as a foreigner  that 
will apply to the bank after I have gone that they should release  the fund to 
him/her, but you will assure me that you will take 50  percent of the fund and 
50 percent to the orphanage as I instructed for  my heart to rest in peace. 
Respond to me immediately for further details  since I have just few days to 
end my life due to the ovarian cancer  disease, hoping you will understand my 
situation and give favorable  attention.

Thank you and God bless you,
Mrs.Suzan Permintel


Draft of Git Rev News edition 29

2017-07-17 Thread Christian Couder
Hi,

A draft of a new Git Rev News edition is available here:

  https://github.com/git/git.github.io/blob/master/rev_news/drafts/edition-29.md

Everyone is welcome to contribute in any section either by editing the
above page on GitHub and sending a pull request, or by commenting on
this GitHub issue:

  https://github.com/git/git.github.io/issues/251

You can also reply to this email.

In general all kinds of contribution, for example proofreading,
suggestions for articles or links, help on the issues in GitHub, and
so on, are very much appreciated.

I tried to cc everyone who appears in this edition, but maybe I missed
some people, sorry about that.

Thomas, Jakub, Markus and myself plan to publish this edition on
Wednesday July 19th.

Thanks,
Christian.