[no subject]

2017-07-20 Thread nima.alavi...@gmail.com


Sent from my Huawei mobile phone

What's cooking in git.git (Jul 2017, #06; Thu, 20)

2017-07-20 Thread Junio C Hamano
Here are the topics that have been cooking.  Commits prefixed with
'-' are only in 'pu' (proposed updates) while commits prefixed with
'+' are in 'next'.  The ones marked with '.' do not appear in any of
the integration branches, but I am still holding onto them.

Tagging of -rc1 is delayed, waiting for a resolution on the l10n
issues around the '"%"PRItime' custom format specifier, which
naturally cannot be handled nicely by the gettext(1) suite.  I think
what is queued on 'pu' based on Dscho's suggestion is a usable
workaround, and I plan to use it unless I hear better ideas in the
coming 10+ hours.

You can find the changes described here in the integration branches
of the repositories listed at

http://git-blame.blogspot.com/p/git-public-repositories.html

--
[Graduated to "master"]

* ew/fd-cloexec-fix (2017-07-17) 1 commit
  (merged to 'next' on 2017-07-18 at a3de1b1998)
 + set FD_CLOEXEC properly when O_CLOEXEC is not supported

 Portability/fallback fix.


* jk/build-with-asan (2017-07-17) 1 commit
  (merged to 'next' on 2017-07-18 at f92636c616)
 + Makefile: allow combining UBSan with other sanitizers

 A recent update made it easier to use "-fsanitize=" option while
 compiling but supported only one sanitize option.  Allow more than
 one to be combined, joined with a comma, like "make SANITIZE=foo,bar".


* jk/test-copy-bytes-fix (2017-07-17) 1 commit
  (merged to 'next' on 2017-07-18 at c32c264e96)
 + t: handle EOF in test_copy_bytes()

 A test fix.


* js/alias-case-sensitivity (2017-07-17) 2 commits
  (merged to 'next' on 2017-07-18 at 31641a39f2)
 + alias: compare alias name *case-insensitively*
 + t1300: demonstrate that CamelCased aliases regressed

 A recent update broke an alias that contained an uppercase letter.


* mt/p4-parse-G-output (2017-07-13) 3 commits
  (merged to 'next' on 2017-07-18 at e065b689d4)
 + git-p4: filter for {'code':'info'} in p4CmdList
 + git-p4: parse marshal output "p4 -G" in p4 changes
 + git-p4: git-p4 tests with p4 triggers

 Use "p4 -G" to make "p4 changes" output more Python-friendly
 to parse.

--
[New Topics]

* bw/push-options-recursively-to-submodules (2017-07-20) 1 commit
 - submodule--helper: teach push-check to handle HEAD

 "git push --recurse-submodules $there HEAD:$target" was not
 propagated down to the submodules, but now it is.

 Will merge to and cook in 'next'.


* jc/http-sslkey-and-ssl-cert-are-paths (2017-07-20) 1 commit
  (merged to 'next' on 2017-07-20 at 5489304b99)
 + http.c: http.sslcert and http.sslkey are both pathnames

 The http.{sslkey,sslCert} configuration variables are to be
 interpreted as a pathname that honors "~[username]/" prefix, but
 weren't, which has been fixed.

 Will cook in 'next'.
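
Concretely, the fix means a "~/" value in these variables is
tilde-expanded like other path-typed configuration.  A small sketch
(the config file and certificate paths below are made up for
illustration):

```shell
# Sketch: http.sslCert and http.sslKey now go through git's pathname
# expansion, so a leading "~/" refers to the user's home directory.
git config --file example.cfg http.sslCert '~/certs/client.pem'
git config --file example.cfg http.sslKey '~/certs/client.key'
# "git config --path" applies the same tilde expansion that the
# pathname-typed reader in http.c now uses:
git config --path --file example.cfg --get http.sslKey
```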


* jc/po-pritime-fix (2017-07-20) 1 commit
 - Makefile: help gettext tools to cope with our custom PRItime format

 We started using "%" PRItime, imitating "%" PRIuMAX and friends, as
 a way to format the internal timestamp value, but this does not
 play well with gettext(1) i18n framework, and causes "make pot"
 that is run by the l10n coordinator to create a broken po/git.pot
 file.  This is a possible workaround for that problem.

 Will fast-track to 2.14-rc1 once we hear a positive result.
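
The substitution idea behind the queued workaround can be sketched
like this (a minimal illustration only; the file name is made up, and
the real change lives in the Makefile rule that generates po/git.pot):

```shell
# Sketch: xgettext cannot classify the custom "%"PRItime specifier,
# but it understands the standard <inttypes.h> macros, so extract
# strings from a copy in which PRItime is rewritten to PRIuMAX.
cat >example.c <<'EOF'
printf(_("%"PRItime" seconds ago"), t);
EOF
sed -e 's/PRItime/PRIuMAX/g' example.c >example.c.po-input
grep PRIuMAX example.c.po-input
```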


* jt/fsck-code-cleanup (2017-07-20) 2 commits
  (merged to 'next' on 2017-07-20 at f7045a8c47)
 + object: remove "used" field from struct object
 + fsck: remove redundant parse_tree() invocation

 Code clean-up.

 Will cook in 'next'.


* rs/pack-objects-pbase-cleanup (2017-07-20) 1 commit
  (merged to 'next' on 2017-07-20 at a6b618559b)
 + pack-objects: remove unnecessary NULL check

 Code clean-up.

 Will cook in 'next'.


* st/lib-gpg-kill-stray-agent (2017-07-20) 1 commit
  (merged to 'next' on 2017-07-20 at 8ea68c483f)
 + t: lib-gpg: flush gpg agent on startup

 Some versions of GnuPG fail to kill the gpg-agent they auto-spawned,
 and such a left-over agent can interfere with a test.  Work it
 around by attempting to kill one before starting a new test.

 Will cook in 'next'.

--
[Stalled]

* mg/status-in-progress-info (2017-05-10) 2 commits
 - status --short --inprogress: spell it as --in-progress
 - status: show in-progress info for short status

 "git status" learns an option to report various operations
 (e.g. "merging") that the user is in the middle of.

 cf. 


* nd/worktree-move (2017-04-20) 6 commits
 - worktree remove: new command
 - worktree move: refuse to move worktrees with submodules
 - worktree move: accept destination as directory
 - worktree move: new command
 - worktree.c: add update_worktree_location()
 - worktree.c: add validate_worktree()

 "git worktree" learned move and remove subcommands.

 Expecting a reroll.
 cf. <20170420101024.7593-1-pclo...@gmail.com>
 cf. <20170421145916.mknekgqzhxffu...@sigill.intra.peff.net>
 cf. 

Re: [PATCH 21/28] commit_packed_refs(): use a staging file separate from the lockfile

2017-07-20 Thread Jonathan Nieder
Junio C Hamano wrote:
> Michael Haggerty  writes:

>> We will want to be able to hold the lockfile for `packed-refs` even
>> after we have activated the new values. So use a separate tempfile,
>> `packed-refs.new`, as a place to stage the new contents of the
>> `packed-refs` file. For now this is all done within
>> `commit_packed_refs()`, but that will change shortly.
>>
>> Signed-off-by: Michael Haggerty 
>> ---
>
> The layout created by "contrib/workdir/git-new-workdir" will be
> broken by this line of change.  "git worktree" is supposed to know
> that refs/packed-refs is a shared thing and lives in common-dir,
> so it shouldn't be affected.
>
> Do we care about the ancient layout that used symlinks to emulate
> the more modern worktree one?

I think we do care.  In the context of people's changing workflows,
"git worktree" is a relatively new tool.  Breaking the older
git-new-workdir (and tools that emulate it) would affect a large
number of users that don't necessarily know how to clean up the
result.

Thanks,
Jonathan


Re: [PATCH v3 00/30] Create a reference backend for packed refs

2017-07-20 Thread Jonathan Nieder
+cc: dawalker, who reported the bug
Stefan Beller wrote:

> We have a user that reports:
>
>   The issue is for users who have a mirrored repository, "git pack-refs"
>   now overwrites the .git/packed-refs symlink instead of following it and
>   replacing the file it points to.
>
> I suspect this series to be at fault, as the bug report came in a day after
> we deployed next containing these changes.
>
> Do symlinks and packed-refs ring a bell for this series?

contrib/workdir/git-new-workdir installs packed-refs as a symlink.
The reported scenario was with another tool that does something
similar for similar reasons.

Dave Walker wrote:

> In the meantime, since this is linked to "git gc", it can crop up
> nearly at any time you modify things from a mirror. I'd recommend
> extreme care until this is sorted out, and it's probably safest to
> avoid using the mirror for branch-modifying operations.
[...]
> The change at fault is this one:
> https://github.com/gitster/git/commit/42dfa7ecef22191b004862fb56074b408c94fc97

That's "commit_packed_refs(): use a staging file separate from the
lockfile", 2017-06-23, which would indeed appear to explain the
symptoms.

I'll try to make a reproduction recipe.
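
A reproduction recipe along those lines might look like the following
sketch (hedged: on a git with the regression the last step would
report "symlink replaced"; on unaffected versions it reports "symlink
preserved"):

```shell
# Sketch: emulate a git-new-workdir-style layout in which
# .git/packed-refs is a symlink to a shared file, then check whether
# "git pack-refs" writes through the symlink or clobbers it.
git init -q repo
cd repo
git -c user.name=t -c user.email=t@example.com \
	commit -q --allow-empty -m initial
git pack-refs --all                       # creates .git/packed-refs
mv .git/packed-refs ../shared-packed-refs
ln -s ../../shared-packed-refs .git/packed-refs
git branch side                           # a new loose ref to pack
git pack-refs --all
if test -h .git/packed-refs
then
	echo "symlink preserved"
else
	echo "symlink replaced"          # the reported regression
fi
```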

Thanks,
Jonathan


Re: [PATCH 21/28] commit_packed_refs(): use a staging file separate from the lockfile

2017-07-20 Thread Junio C Hamano
Michael Haggerty  writes:

> We will want to be able to hold the lockfile for `packed-refs` even
> after we have activated the new values. So use a separate tempfile,
> `packed-refs.new`, as a place to stage the new contents of the
> `packed-refs` file. For now this is all done within
> `commit_packed_refs()`, but that will change shortly.
>
> Signed-off-by: Michael Haggerty 
> ---

The layout created by "contrib/workdir/git-new-workdir" will be
broken by this line of change.  "git worktree" is supposed to know
that refs/packed-refs is a shared thing and lives in common-dir,
so it shouldn't be affected.

Do we care about the ancient layout that used symlinks to emulate
the more modern worktree one?


Re: [PATCH v3 00/30] Create a reference backend for packed refs

2017-07-20 Thread Stefan Beller
On Wed, Jul 5, 2017 at 2:12 AM, Jeff King  wrote:
> On Sat, Jul 01, 2017 at 08:30:38PM +0200, Michael Haggerty wrote:
>
>> This is v3 of a patch series creating a `packed_ref_store` reference
>> backend. Thanks to Peff and Junio for their comments about v2 [1].
>>
>> Changes since v2:
>>
>> * Delete some debugging `cat` commands in t1408.
>>
>> * Add some tests of reading packed-refs files with bogus contents.
>>
>> * When reporting corruption in packed-refs files, distinguish between
>>   unterminated lines and other corruption.
>>
>> * Fixed a typo in a commit message.
>
> Thanks. I just quickly re-reviewed based on the diff from v2, and it
> looks good to me.
>
> -Peff

We have a user that reports:

  The issue is for users who have a mirrored repository, "git pack-refs"
  now overwrites the .git/packed-refs symlink instead of following it and
  replacing the file it points to.

I suspect this series to be at fault, as the bug report came in a day after
we deployed next containing these changes.

Do symlinks and packed-refs ring a bell for this series?

Thanks,
Stefan


Re: [PATCH v2 00/10] tag: only respect `pager.tag` in list-mode

2017-07-20 Thread Junio C Hamano
Martin Ågren  writes:

> This is the second version of "[PATCH 0/7] tag: more fine-grained
> pager-configuration" [1]. That series introduced `pager.tag.list` to
> address the fact that `pager.tag` can be useful with `git tag -l` but
> actively hostile with `git tag -a`. Thanks to Junio, Peff and Brandon
> for helpful feedback.
>
> After that feedback, v2 drops `pager.tag.list` and instead teaches
> `git tag` to only consider `pager.tag` in list-mode, as suggested by
> Peff.
>
> Patches 1-3/10 replace patch 1/7. They move Documentation/technical/
> api-builtin.txt into builtin.h, tweak the formatting and bring it up to
> date. I may have gone overboard making this 3 patches...
>
> Patches 4-7/10 correspond to patches 2-5/7. `setup_auto_pager()' is now
> much simpler since we do not need to handle "tag.list" with a clever
> fallback strategy. IGNORE_PAGER_CONFIG is now called DELAY_PAGER_CONFIG.
> I now check with pager_in_use() and I moved the handling of `pager.tag`
> a bit further down.

I tend to agree with you that 1-3/10 may be better off being a
single patch (or 3/10 dropped, as Brandon is working on losing it
nearby).  I would have expected 7-8/10 to be a single patch, as by
the time a reader reaches 07/10, because of the groundwork laid by
04-06/10, it is obvious that the general direction is to allow the
caller, i.e. cmd_tag(), to make a call to setup_auto_pager() only in
some but not all circumstances, and 07/10 being faithful to the
original behaviour (only to be updated in 08/10) is somewhat
counterintuitive.  It is not wrong per se; it was just unexpected.

> Patches 8-9/10 teach `git tag` to only respect `pager.tag` in list-mode
> and flip the default value for that config to "on".
>
> Patch 10/10 is somewhat similar to a hunk in patch 2/7, but is now a
> bug-fix instead of a feature. It teaches `execv_dashed_external()` not
> to check `pager.foo` when launching `git-foo` where foo is a builtin.
> I waffled about where to put this patch. Putting it earlier in the
> series as a preparatory step, I couldn't come up with a way of writing a
> test. So patch 8/10 introduces a `test_expect_failure` which this patch
> then fixes.

I haven't thought about ramifications of 9-10/10 to make a comment
yet, but overall the series was a pleasant read.

Thanks.
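
The end state described by the series can be sketched as follows (an
illustration of the described behavior, not a transcript of the
patches; the tag name is made up):

```shell
# After the series: pager.tag is consulted only when "git tag" is in
# list mode, and its default becomes "on".
git -c pager.tag=true tag -l              # pager config honored here
git -c pager.tag=true tag -a -m v1 v1.0   # ...but ignored here
```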


Re: Fwd: New Defects reported by Coverity Scan for git

2017-07-20 Thread Junio C Hamano
René Scharfe  writes:

> We could remove that NULL check -- it's effectively just a shortcut.
> But how would that improve safety?  Well, if the array is unallocated
> (NULL) and _num is greater than zero we'd get a segfault without it,
> and thus would notice it.  That check currently papers over such a
> hypothetical bug.  Makes sense?
>
> -- >8 --
> Subject: [PATCH] pack-objects: remove unnecessary NULL check
>
> If done_pbase_paths is NULL then done_pbase_paths_num must be zero and
> done_pbase_path_pos() returns -1 without accessing the array, so the
> check is not necessary.
>
> If the invariant was violated then the check would make sure we keep
> on going and allocate the necessary amount of memory in the next
> ALLOC_GROW call.  That sounds nice, but all array entries except for
> one would contain garbage data.
>
> If the invariant was violated without the check we'd get a segfault in
> done_pbase_path_pos(), i.e. an observable crash, alerting us of the
> presence of a bug.
>
> Currently there is no such bug: Only the functions check_pbase_path()
> and cleanup_preferred_base() change pointer and counter, and both make
> sure to keep them in sync.  Get rid of the check anyway to allow us to
> see if later changes introduce such a defect, and to simplify the code.
>
> Detected by Coverity Scan.
>
> Signed-off-by: Rene Scharfe 
> ---

It's always amusing to see that removal of a conditional codepath
results in a better chance of finding possible invariant
breakers, as we often think that belt-and-suspenders safety would
require more conditionals and asserts ;-)



>  builtin/pack-objects.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
> index e730b415bf..c753e9237a 100644
> --- a/builtin/pack-objects.c
> +++ b/builtin/pack-objects.c
> @@ -1289,7 +1289,7 @@ static int done_pbase_path_pos(unsigned hash)
>  
>  static int check_pbase_path(unsigned hash)
>  {
> - int pos = (!done_pbase_paths) ? -1 : done_pbase_path_pos(hash);
> + int pos = done_pbase_path_pos(hash);
>   if (0 <= pos)
>   return 1;
>   pos = -pos - 1;


Re: [PATCH] sha1_file: use access(), not lstat(), if possible

2017-07-20 Thread Junio C Hamano
Jonathan Tan  writes:

> In sha1_loose_object_info(), use access() (indirectly invoked through
> has_loose_object()) instead of lstat() if we do not need the on-disk
> size, as it should be faster on Windows [1].

That sounds as if Windows is the only thing that matters.  "It is
faster in general, and is much faster on Windows" would have been
more convincing, and "It isn't slower, and is much faster on
Windows" would also have been OK.  Do we have any measurement, or
this patch does not yield any measuable gain?  

By the way, the special casing of disk_sizep (which is only used by
the batch-check feature of cat-file) is somewhat annoying with or
without this patch, but this change makes it even more so by adding
an extra indentation level.  I cannot think of a way to make it less
annoying offhand, and I do not think this change needs to address it
in any way, but I am mentioning this as a hint to bystanders who may
want to find something small that can be cleaned up ;-)

Thanks.

>
> [1] https://public-inbox.org/git/alpine.DEB.2.21.1.1707191450570.4193@virtualbox/
>
> Signed-off-by: Jonathan Tan 
> ---
> Thanks for the information - here's a patch. Do you, by any chance, know
> of a web page (or similar thing) that I can cite for this?
> ---
>  sha1_file.c | 21 ++---
>  1 file changed, 10 insertions(+), 11 deletions(-)
>
> diff --git a/sha1_file.c b/sha1_file.c
> index fca165f13..81962b019 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -2920,20 +2920,19 @@ static int sha1_loose_object_info(const unsigned char *sha1,
>  
>   /*
>* If we don't care about type or size, then we don't
> -  * need to look inside the object at all. Note that we
> -  * do not optimize out the stat call, even if the
> -  * caller doesn't care about the disk-size, since our
> -  * return value implicitly indicates whether the
> -  * object even exists.
> +  * need to look inside the object at all. We only check
> +  * for its existence.
>*/
>   if (!oi->typep && !oi->typename && !oi->sizep && !oi->contentp) {
> - const char *path;
> - struct stat st;
> - if (stat_sha1_file(sha1, , ) < 0)
> - return -1;
> - if (oi->disk_sizep)
> + if (oi->disk_sizep) {
> + const char *path;
> + struct stat st;
> + if (stat_sha1_file(sha1, , ) < 0)
> + return -1;
>   *oi->disk_sizep = st.st_size;
> - return 0;
> + return 0;
> + }
> + return has_loose_object(sha1) ? 0 : -1;
>   }
>  
>   map = map_sha1_file(sha1, );


Re: [PATCH v6 00/10] The final building block for a faster rebase -i

2017-07-20 Thread Junio C Hamano
Johannes Schindelin  writes:

> Changes since v5:
>
> - replaced a get_sha1() call by a get_oid() call already.
>
> - adjusted to hashmap API changes

Applying this to the tip of 'master' yields exactly the same result
as merging the previous round js/rebase-i-final to the tip of
'master' and then applying merge-fix/js/rebase-i-final to adjust to
the codebase, so the net effect of this reroll is none.  Which is a
good sign, as it means there wasn't any rebase mistake and the evil
merge we've been carrying was a good one.

But at the same time, I prefer to avoid rebasing to newer 'master'
until the codebase starts drifting too far apart, or until a new
feature release is made out of newer 'master'.  This is primarily
because I want dates on commits to mean something---namely, "this
change hasn't seen a need to be updated for 'oops, that was wrong'
since this date".  This use of commit dates as 'priority date'
matters much less for a topic not in 'next', but as a general
principle, my workflow tries to preserve commit dates for all
topics.

For the above reason, I may hold onto this patch series in my inbox
without actually updating js/rebase-i-final topic until the current
cycle is over; please do not mistake it as this new reroll being
ignored.

Thanks.


Re: [RFC PATCH v2 1/4] object: remove "used" field from struct object

2017-07-20 Thread Junio C Hamano
Jonathan Tan  writes:

> The "used" field in struct object is only used by builtin/fsck. Remove
> that field and modify builtin/fsck to use a flag instead.
>
> Signed-off-by: Jonathan Tan 
> ---
>  builtin/fsck.c | 24 ++--
>  object.c   |  1 -
>  object.h   |  2 +-
>  3 files changed, 15 insertions(+), 12 deletions(-)

I vaguely recall trying to do this myself a few years ago.  We can
easily spot places within this file that do things like this:

> - obj->flags = HAS_OBJ;

and correctly update it to this:

> + obj->flags &= ~(REACHABLE | SEEN);
> + obj->flags |= HAS_OBJ;

but I didn't do so because I was hesitant having to validate, and
having to maintain the invariant forever, that anything called from
these codepaths is always careful not to clobber the USED bit.

This looks to me like a reasonable preparatory clean-up that should
be doable, and that should be done well before the main part of the
series.  Will queue, together with your other fsck change.

Thanks.


Re: [RFC PATCH v2 4/4] sha1_file: support promised object hook

2017-07-20 Thread Jonathan Tan
On Thu, 20 Jul 2017 16:58:16 -0400
Ben Peart  wrote:

> >> This is meant as a temporary measure to ensure that all Git commands
> >> work in such a situation. Future patches will update some commands to
> >> either tolerate promised objects (without invoking the hook) or be more
> >> efficient in invoking the promised objects hook.
> 
> I agree that making git more tolerant of promised objects if possible 
> and precomputing a list of promised objects required to complete a 
> particular command and downloading them with a single request are good 
> optimizations to add over time.

That's good to know!

> has_sha1_file also takes a hash "whether local or in an alternate object 
> database, and whether packed or loose" but never calls 
> sha1_object_info_extended.  As a result, we had to add support in 
> check_and_freshen to download missing objects to get proper behavior in 
> all cases.  I don't think this will work correctly without it.

Thanks for the attention to detail. Is this before or after commit
e83e71c ("sha1_file: refactor has_sha1_file_with_flags", 2017-06-26)?


Re: [RFC PATCH v2 2/4] promised-object, fsck: introduce promised objects

2017-07-20 Thread Jonathan Tan
On Thu, 20 Jul 2017 15:58:51 -0400
Ben Peart  wrote:

> On 7/19/2017 8:21 PM, Jonathan Tan wrote:
> > Currently, Git does not support repos with very large numbers of objects
> > or repos that wish to minimize manipulation of certain blobs (for
> > example, because they are very large) very well, even if the user
> > operates mostly on part of the repo, because Git is designed on the
> > assumption that every referenced object is available somewhere in the
> > repo storage.
> > 
> 
> Great to see this idea making progress. Making git able to gracefully 
> handle partial clones (beyond the existing shallow clone support) is a 
> key piece of dealing with very large objects and repos.

Thanks.

> > As a first step to reducing this problem, introduce the concept of
> > promised objects. Each Git repo can contain a list of promised objects
> > and their sizes (if blobs) at $GIT_DIR/objects/promised. This patch
> > contains functions to query them; functions for creating and modifying
> > that file will be introduced in later patches.
> 
> If I'm reading this patch correctly, for a repo to successfully pass 
> "git fsck" either the object or a promise must exist for everything fsck 
> checks.  From the documentation for fsck it says "git fsck defaults to 
> using the index file, all SHA-1 references in refs namespace, and all 
> reflogs (unless --no-reflogs is given) as heads." Doesn't this then 
> imply objects or promises must exist for all objects referenced by any 
> of these locations?
> 
> We're currently in the hundreds of millions of objects on some of our 
> repos so even downloading the promises for all the objects in the index 
> is unreasonable as it is gigabytes of data and growing.

For the index to contain all the files, the repo must already have
downloaded all the trees for HEAD (at least). The trees collectively
contain entries for all the relevant blobs. We need one promise for each
blob, and the size of a promise is comparable to the size of a tree
entry, so the size (of download and storage) needed would be just double
of what we would need if we didn't need promises. This is still only
linear growth, unless you have found that the absolute numbers are too
large?

Also, if the index is ever changed to not have one entry for every file,
we also wouldn't need one promise for every file.

> I think we should have a flag (off by default) that enables someone to 
> say that promised objects are optional. If the flag is set, 
> "is_promised_object" will return success and pass the OBJ_ANY type and a 
> size of -1.
> 
> Nothing today is using the size and in the two places where the object 
> type is being checked for consistency (fsck_cache_tree and 
> fsck_handle_ref) the test can add a test for OBJ_ANY as well.
> 
> This will enable very large numbers of objects to be omitted from the 
> clone without triggering a download of the corresponding number of 
> promised objects.

Eventually I plan to use the size when implementing parameters for
history-searching commands (e.g. "git log -S"), but it's true that
that's in the future.

Allowing promised objects to be optional would indeed solve the issue of
downloading too many promises. It would make the code more complicated,
but I'm not sure by how much.

For example, in this fsck patch, the easiest way I could think of to
have promised objects was to introduce a 3rd state, called "promised",
of "struct object" - one in which the type is known, but we don't have
access to the full "struct commit" or equivalent. And thus fsck could
assume that if the "struct object" is "parsed" or "promised", the type
is known. Having optional promised objects would require that we let
this "promised" state have a type of OBJ_UNKNOWN (or something like
that) - maybe that would be fine, but I haven't looked into this in
detail.

> > A repository that is missing an object but has that object promised is not
> > considered to be in error, so also teach fsck this. As part of doing
> > this, object.{h,c} has been modified to generate "struct object" based
> > on only the information available to promised objects, without requiring
> > the object itself.
> 
> In your work on this, did you investigate if there are other commands 
> (ie repack/gc) that will need to learn about promised objects? Have you 
> had a chance (or have plans) to hack up the test suite so that it runs 
> all tests with promised objects and see what (if anything) breaks?

In one of the subsequent patches, I tried to ensure that all
object-reading functions in sha1_file.c somewhat work (albeit slowly)
in the presence of promised objects - that would cover the functionality
of the other commands. As for hacking up the test suite to run with
promised objects, that would be ideal, but I haven't figured out how to
do that yet.


[PATCH v2] t: lib-gpg: flush gpg agent on startup

2017-07-20 Thread santiago
From: Santiago Torres 

When running gpg-relevant tests, a gpg-daemon is spawned for each
GNUPGHOME used. This daemon may stay running after the test and cache
file descriptors for the trash directories, even after the trash
directory is removed. This leads to ENOENT errors when attempting to
create files if tests are run multiple times.

Add a cleanup script to force flushing the gpg-agent for that GNUPGHOME
(if any) before setting up the GPG-relevant environment.

Helped-by: Junio C Hamano 
Signed-off-by: Santiago Torres 
---
 t/lib-gpg.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/t/lib-gpg.sh b/t/lib-gpg.sh
index ec2aa8f68..43679a4c6 100755
--- a/t/lib-gpg.sh
+++ b/t/lib-gpg.sh
@@ -31,6 +31,7 @@ then
chmod 0700 ./gpghome &&
GNUPGHOME="$(pwd)/gpghome" &&
export GNUPGHOME &&
+   (gpgconf --kill gpg-agent 2>&1 >/dev/null || : ) &&
gpg --homedir "${GNUPGHOME}" 2>/dev/null --import \
"$TEST_DIRECTORY"/lib-gpg/keyring.gpg &&
gpg --homedir "${GNUPGHOME}" 2>/dev/null --import-ownertrust \
-- 
2.13.3



Re: Handling of paths

2017-07-20 Thread Victor Toni
2017-07-20 22:30 GMT+02:00 Junio C Hamano :
>
> I've read the function again and I think the attached patch covers
> everything that ought to be a filename.
>
Your swift reaction is very much appreciated.
With the background you gave, I had just started to create a patch
myself, only to see that you had already finished it.

Thanks a lot!

Best regards,
Victor


Re: [RFC PATCH v2 4/4] sha1_file: support promised object hook

2017-07-20 Thread Ben Peart



On 7/20/2017 2:23 PM, Stefan Beller wrote:

On Wed, Jul 19, 2017 at 5:21 PM, Jonathan Tan  wrote:

Teach sha1_file to invoke a hook whenever an object is requested and
unavailable but is promised. The hook is a shell command that can be
configured through "git config"; this hook takes in a list of hashes and
writes (if successful) the corresponding objects to the repo's local
storage.

The usage of the hook can be suppressed through a flag when invoking
has_object_file_with_flags() and other similar functions.
parse_or_promise_object() in object.c requires this functionality, and
has been modified to use it.

This is meant as a temporary measure to ensure that all Git commands
work in such a situation. Future patches will update some commands to
either tolerate promised objects (without invoking the hook) or be more
efficient in invoking the promised objects hook.
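
The hook described above would have roughly this shape (a purely
hypothetical sketch: the one-hash-per-line stdin protocol and the
"fetch-one-object" helper are illustrative assumptions, not the pipe
protocol that the RFC's read-object-protocol.txt actually specifies):

```shell
# Hypothetical sketch: read one missing object hash per line on
# stdin, try to bring each into the local object store, and exit
# non-zero if any of them could not be fetched.
cat >read-object-hook <<'EOF'
#!/bin/sh
status=0
while read hash
do
	fetch-one-object "$hash" || status=1
done
exit $status
EOF
chmod +x read-object-hook
```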


I agree that making git more tolerant of promised objects if possible 
and precomputing a list of promised objects required to complete a 
particular command and downloading them with a single request are good 
optimizations to add over time.




In order to determine the code changes in sha1_file.c necessary, I
investigated the following:
  (1) functions in sha1_file that take in a hash, without the user
  regarding how the object is stored (loose or packed)
  (2) functions in sha1_file that operate on packed objects (because I
  need to check callers that know about the loose/packed distinction
  and operate on both differently, and ensure that they can handle
  the concept of objects that are neither loose nor packed)

(1) is handled by the modification to sha1_object_info_extended().

For (2), I looked at for_each_packed_object and at the packed-related
functions that take in a hash. For for_each_packed_object, the callers
either already work or are fixed in this patch:
  - reachable - only to find recent objects
  - builtin/fsck - already knows about promised objects
  - builtin/cat-file - fixed in this commit

Callers of the other functions do not need to be changed:
  - parse_pack_index
- http - indirectly from http_get_info_packs
  - find_pack_entry_one
- this searches a single pack that is provided as an argument; the
  caller already knows (through other means) that the sought object
  is in a specific pack
  - find_sha1_pack
- fast-import - appears to be an optimization to not store a
  file if it is already in a pack
- http-walker - to search through a struct alt_base
- http-push - to search through remote packs
  - has_sha1_pack
- builtin/fsck - already knows about promised objects
- builtin/count-objects - informational purposes only (check if loose
  object is also packed)
- builtin/prune-packed - check if object to be pruned is packed (if
  not, don't prune it)
- revision - used to exclude packed objects if requested by user
- diff - just for optimization



has_sha1_file also takes a hash "whether local or in an alternate object 
database, and whether packed or loose" but never calls 
sha1_object_info_extended.  As a result, we had to add support in 
check_and_freshen to download missing objects to get proper behavior in 
all cases.  I don't think this will work correctly without it.



An alternative design that I considered but rejected:

  - Adding a hook whenever a packed object is requested, not on any
object.  That is, whenever we attempt to search the packfiles for an
object, if it is missing (from the packfiles and from the loose
object storage), to invoke the hook (which must then store it as a
packfile), open the packfile the hook generated, and report that the
object is found in that new packfile. This reduces the amount of
analysis needed (in that we only need to look at how packed objects
are handled), but requires that the hook generate packfiles (or for
sha1_file to pack whatever loose objects are generated), creating one
packfile for each missing object and potentially very many packfiles
that must be linearly searched. This may be tolerable now for repos
that only have a few missing objects (for example, repos that only
want to exclude large blobs), and might be tolerable in the future if
we have batching support for the most commonly used commands, but is
not tolerable now for repos that exclude a large amount of objects.

Helped-by: Ben Peart 
Signed-off-by: Jonathan Tan 
---
  Documentation/config.txt |   8 +
  Documentation/gitrepository-layout.txt   |   8 +
  Documentation/technical/read-object-protocol.txt | 102 
  builtin/cat-file.c   |   9 ++
  cache.h  |   2 +
  object.c |   3 +-
  promised-object.c| 194 

Re: Handling of paths

2017-07-20 Thread Charles Bailey
On Thu, Jul 20, 2017 at 01:30:52PM -0700, Junio C Hamano wrote:
> 
> I've read the function again and I think the attached patch covers
> everything that ought to be a filename.
> 
> By the way, to credit you, do you prefer your bloomberg or hashpling
> address?

The patch looks good to me.

It's not critical which address you credit.

I mark patches which result from my work at Bloomberg with my Bloomberg
email address and anything that I do entirely outside of work with my
hashpling address, although I will tend to use my hashpling email for
all communications because it co-operates with the mailing list
conventions a lot better.

In this case, this is a follow on from a cbaile...@bloomberg.net patch
so crediting that address seems the more appropriate option.

Charles.


Re: Binary files

2017-07-20 Thread Junio C Hamano
Igor Djordjevic  writes:

> On 20/07/2017 09:41, Volodymyr Sendetskyi wrote:
>> It is known, that git handles badly storing binary files in its
>> repositories at all.
>> This is especially about large files: even without any changes to
>> these files, their copies are snapshotted on each commit. So even
>> repositories with a small amount of code can grow very fast in size
>> if they contain some large binary files. Alongside this, SVN is
>> much better about that, because it makes changes to the server version
>> of a file only if some changes were done.
>
> You already got some proposals on what you could try for making large 
> binary files handling easier, but I just wanted to comment on this 
> part of your message, as it doesn't seem to be correct.

All correct.  Thanks.


Re: [PATCH 0/6] 2.14 RelNotes improvements

2017-07-20 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason   writes:

> Here's a few patches to improve the relnotes. I started just writing
> 6/6 since I think (I don't care about the wording) that we should in
> some way mention the items in the list in the 6/6 commit message.
>
> Along the way I noticed a few more missing things.
>
> Ævar Arnfjörð Bjarmason (6):
>   RelNotes: mention "log: add -P as a synonym for --perl-regexp"
>   RelNotes: mention "log: make --regexp-ignore-case work with
> --perl-regexp"
>   RelNotes: mention "sha1dc: optionally use sha1collisiondetection as a
> submodule"
>   RelNotes: mention that PCRE v2 exposes the same syntax
>   RelNotes: remove duplicate mention of PCRE v2
>   RelNotes: add more notes about PCRE in 2.14

Thanks.  1-3/6 went straight to 'master'.  I am not outright
rejecting the remainder, but I do not think these are release notes
material---if they need to be told, they should be in a part of the
regular documentation, and I suspect that they already are in your
series.


Re: Fwd: New Defects reported by Coverity Scan for git

2017-07-20 Thread René Scharfe
Am 18.07.2017 um 19:23 schrieb Junio C Hamano:
> Stefan Beller  writes:
> 
>> I looked at this report for a while. My current understanding:
>> * its detection was triggered by including rs/move-array,
>>f331ab9d4c (use MOVE_ARRAY, 2017-07-15)
>> * But it is harmless, because the scan logic does not understand
>>how ALLOC_GROW works. It assumes that
>>done_pbase_paths_alloc can be larger
>>than done_pbase_paths_num + 1, while done_pbase_paths
>>is NULL, such that the memory allocation is not triggered.
>>If that were the case, then we have 2 subsequent dereferences
>>of a NULL pointer right after that. But by inspecting the use
>>of _alloc and _num the initial assumption does not seem possible.
> 
> Yes, it does appear that way.  ALLOC_GROW() calls REALLOC_ARRAY()
> which safely can realloc NULL to specified size via xrealloc().

MOVE_ARRAY is passing its pointer arguments to memmove(); all it adds
is a check for (done_pbase_paths_num - pos - 1) being zero.  I don't
understand how that change can make it more likely for one of the
pointers to be NULL.

I guess the first message ('Comparing "done_pbase_paths" to null
implies that "done_pbase_paths" might be null.') has to be understood
as an explanation of how the checker arrived at the second one?

We could remove that NULL check -- it's effectively just a shortcut.
But how would that improve safety?  Well, if the array is unallocated
(NULL) and _num is greater than zero we'd get a segfault without it,
and thus would notice it.  That check currently papers over such a
hypothetical bug.  Makes sense?

-- >8 --
Subject: [PATCH] pack-objects: remove unnecessary NULL check

If done_pbase_paths is NULL then done_pbase_paths_num must be zero and
done_pbase_path_pos() returns -1 without accessing the array, so the
check is not necessary.

If the invariant was violated then the check would make sure we keep
on going and allocate the necessary amount of memory in the next
ALLOC_GROW call.  That sounds nice, but all array entries except for
one would contain garbage data.

If the invariant was violated without the check we'd get a segfault in
done_pbase_path_pos(), i.e. an observable crash, alerting us of the
presence of a bug.

Currently there is no such bug: Only the functions check_pbase_path()
and cleanup_preferred_base() change pointer and counter, and both make
sure to keep them in sync.  Get rid of the check anyway to allow us to
see if later changes introduce such a defect, and to simplify the code.

Detected by Coverity Scan.

Signed-off-by: Rene Scharfe 
---
 builtin/pack-objects.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index e730b415bf..c753e9237a 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1289,7 +1289,7 @@ static int done_pbase_path_pos(unsigned hash)
 
 static int check_pbase_path(unsigned hash)
 {
-   int pos = (!done_pbase_paths) ? -1 : done_pbase_path_pos(hash);
+   int pos = done_pbase_path_pos(hash);
if (0 <= pos)
return 1;
pos = -pos - 1;
-- 
2.13.3


Re: Handling of paths

2017-07-20 Thread Junio C Hamano
Charles Bailey  writes:

> On Thu, Jul 20, 2017 at 12:42:40PM -0700, Junio C Hamano wrote:
>> Victor Toni  writes:
>> 
>> > What's unexpected is that paths used for sslKey or sslCert are treated
>> > differently insofar as they are expected to be absolute.
>> > Relative paths (whether with or without "~") don't work.
>> 
>> It appears that only two of these among four were made aware of the
>> "~[username]/" prefix in bf9acba2 ("http: treat config options
>> sslCAPath and sslCAInfo as paths", 2015-11-23), but "sslkey" and
>> "sslcert" were still left as plain vanilla strings.  I do not know
>> if that was an elaborate omission, or a mere oversight, as it seems
>> that it happened while I was away, so...
>
> It was more of an oversight than a deliberate omission, but more
> accurately I didn't actively consider whether the other http.ssl*
> variables were pathname-like or not.
>
> At the time I was trying to make a config which needed to set
> http.sslCAPath and/or http.sslCAInfo more portable between users and
> these were "obviously" pathname-like to me. Now that I read
> the help for http.sslCert and http.sslKey, I see no reason that they
> shouldn't also use git_config_pathname. If I'd been more thorough I
> would have proposed this at the time.

Thanks.

I've read the function again and I think the attached patch covers
everything that ought to be a filename.

By the way, to credit you, do you prefer your bloomberg or hashpling
address?

-- >8 --
Subject: http.c: http.sslcert and http.sslkey are both pathnames

Back when the modern http_options() codepath was created to parse
various http.* options at 29508e1e ("Isolate shared HTTP request
functionality", 2005-11-18), and then later was corrected for
interaction between the multiple configuration files in 7059cd99
("http_init(): Fix config file parsing", 2009-03-09), we parsed
configuration variables like http.sslkey, http.sslcert as plain
vanilla strings, because git_config_pathname() that understands
"~[username]/" prefix did not exist.  Later, we converted some of
them (namely, http.sslCAPath and http.sslCAInfo) to use the
function, and added variables like http.cookieFile and http.pinnedpubkey
to use the function from the beginning.  Because of that, these
variables all understand "~[username]/" prefix.

Make the remaining two variables, http.sslcert and http.sslkey, also
aware of the convention, as they are both clearly pathnames to
files.

Noticed-by: Victor Toni 
Helped-by: Charles Bailey 
Signed-off-by: Junio C Hamano 
---
 http.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/http.c b/http.c
index c6c010f881..76ff63c14d 100644
--- a/http.c
+++ b/http.c
@@ -272,10 +272,10 @@ static int http_options(const char *var, const char *value, void *cb)
 	if (!strcmp("http.sslversion", var))
 		return git_config_string(&ssl_version, var, value);
 	if (!strcmp("http.sslcert", var))
-		return git_config_string(&ssl_cert, var, value);
+		return git_config_pathname(&ssl_cert, var, value);
 #if LIBCURL_VERSION_NUM >= 0x070903
 	if (!strcmp("http.sslkey", var))
-		return git_config_string(&ssl_key, var, value);
+		return git_config_pathname(&ssl_key, var, value);
 #endif
 #if LIBCURL_VERSION_NUM >= 0x070908
 	if (!strcmp("http.sslcapath", var))




Re: [PATCH] t: lib-gpg: flush gpg agent on startup

2017-07-20 Thread Santiago Torres
> With that "run it but ignore the outcome even if we failed to.", we
> do not have to worry about any of that ;-)

Oh right! thanks for the suggestion! Let me re-roll...

Thanks,
-Santiago.



signature.asc
Description: PGP signature


Re: Handling of paths

2017-07-20 Thread Charles Bailey
On Thu, Jul 20, 2017 at 12:42:40PM -0700, Junio C Hamano wrote:
> Victor Toni  writes:
> 
> > What's unexpected is that paths used for sslKey or sslCert are treated
> > differently insofar as they are expected to be absolute.
> > Relative paths (whether with or without "~") don't work.
> 
> It appears that only two of these among four were made aware of the
> "~[username]/" prefix in bf9acba2 ("http: treat config options
> sslCAPath and sslCAInfo as paths", 2015-11-23), but "sslkey" and
> "sslcert" were still left as plain vanilla strings.  I do not know
> if that was an elaborate omission, or a mere oversight, as it seems
> that it happened while I was away, so...

It was more of an oversight than a deliberate omission, but more
accurately I didn't actively consider whether the other http.ssl*
variables were pathname-like or not.

At the time I was trying to make a config which needed to set
http.sslCAPath and/or http.sslCAInfo more portable between users and
these were "obviously" pathname-like to me. Now that I read
the help for http.sslCert and http.sslKey, I see no reason that they
shouldn't also use git_config_pathname. If I'd been more thorough I
would have proposed this at the time.

Charles.


Re: [PATCH] t: lib-gpg: flush gpg agent on startup

2017-07-20 Thread Junio C Hamano
Santiago Torres  writes:

> This is the patch that stemmed from [1].
>
> I tried to keep it simple and not noisy; although it breaks the &&
> chain, it needs to be run right before the --import command. I also
> decided to drop the switch chain in case that regression was to be
> introduced in the future in other versions (hopefully gpgconf goes
> nowhere by then).

I'm inclined to do

...
export GNUPGHOME &&
( gpgconf --kill gpg-agent 2>&1 >/dev/null || : ) &&
gpg --homedir ... --import ...

Imagine "chmod 0777 ./gpghome" failed and what happens. We skip the
part that exports GNUPGHOME and attempts to kill gpg-agent as if
nothing bad happened, and then we try to "--import".  At that point
we do not know what value GNUPGHOME has---are we clobbering the real
keychain the user who runs the test has?

With that "run it but ignore the outcome even if we failed to.", we
do not have to worry about any of that ;-)

>
> I was able to test this on debian oldstable/stable and arch.
>
> Cheers!
> -Santiago.
>
> [1] https://public-inbox.org/git/xmqqvampmnmv@gitster.mtv.corp.google.com/
>
> On Thu, Jul 20, 2017 at 12:58:14PM -0400, santi...@nyu.edu wrote:
>> From: Santiago Torres 
>> 
>> When running gpg-relevant tests, a gpg-daemon is spawned for each
>> GNUPGHOME used. This daemon may stay running after the test and cache
>> file descriptors for the trash directories, even after the trash
>> directory is removed. This leads to ENOENT errors when attempting to
>> create files if tests are run multiple times.
>> 
>> Add a cleanup script to force flushing the gpg-agent for that GNUPGHOME
>> (if any) before setting up the GPG relevant-environment.
>> 
>> Helped-by: Junio C Hamano 
>> Signed-off-by: Santiago Torres 
>> ---
>>  t/lib-gpg.sh | 1 +
>>  1 file changed, 1 insertion(+)
>> 
>> diff --git a/t/lib-gpg.sh b/t/lib-gpg.sh
>> index ec2aa8f68..7a6c7ee6f 100755
>> --- a/t/lib-gpg.sh
>> +++ b/t/lib-gpg.sh
>> @@ -31,6 +31,7 @@ then
>>  chmod 0700 ./gpghome &&
>>  GNUPGHOME="$(pwd)/gpghome" &&
>>  export GNUPGHOME &&
>> +gpgconf --kill gpg-agent 2>&1 >/dev/null
>>  gpg --homedir "${GNUPGHOME}" 2>/dev/null --import \
>>  "$TEST_DIRECTORY"/lib-gpg/keyring.gpg &&
>>  gpg --homedir "${GNUPGHOME}" 2>/dev/null --import-ownertrust \
>> -- 
>> 2.13.3
>> 


Re: [RFC PATCH v2 2/4] promised-object, fsck: introduce promised objects

2017-07-20 Thread Ben Peart



On 7/19/2017 8:21 PM, Jonathan Tan wrote:

Currently, Git does not support repos with very large numbers of objects
or repos that wish to minimize manipulation of certain blobs (for
example, because they are very large) very well, even if the user
operates mostly on part of the repo, because Git is designed on the
assumption that every referenced object is available somewhere in the
repo storage.



Great to see this idea making progress. Making git able to gracefully 
handle partial clones (beyond the existing shallow clone support) is a 
key piece of dealing with very large objects and repos.



As a first step to reducing this problem, introduce the concept of
promised objects. Each Git repo can contain a list of promised objects
and their sizes (if blobs) at $GIT_DIR/objects/promised. This patch
contains functions to query them; functions for creating and modifying
that file will be introduced in later patches.


If I'm reading this patch correctly, for a repo to successfully pass 
"git fsck" either the object or a promise must exist for everything fsck 
checks.  From the documentation for fsck it says "git fsck defaults to 
using the index file, all SHA-1 references in refs namespace, and all 
reflogs (unless --no-reflogs is given) as heads." Doesn't this then 
imply objects or promises must exist for all objects referenced by any 
of these locations?


We're currently in the hundreds of millions of objects on some of our 
repos so even downloading the promises for all the objects in the index 
is unreasonable as it is gigabytes of data and growing.


I think we should have a flag (off by default) that enables someone to 
say that promised objects are optional. If the flag is set, 
"is_promised_object" will return success and pass the OBJ_ANY type and a 
size of -1.


Nothing today is using the size and in the two places where the object 
type is being checked for consistency (fsck_cache_tree and 
fsck_handle_ref) the test can add a test for OBJ_ANY as well.


This will enable very large numbers of objects to be omitted from the 
clone without triggering a download of the corresponding number of 
promised objects.




A repository that is missing an object but has that object promised is not
considered to be in error, so also teach fsck this. As part of doing
this, object.{h,c} has been modified to generate "struct object" based
on only the information available to promised objects, without requiring
the object itself.


In your work on this, did you investigate if there are other commands 
(ie repack/gc) that will need to learn about promised objects? Have you 
had a chance (or have plans) to hack up the test suite so that it runs 
all tests with promised objects and see what (if anything) breaks?




Signed-off-by: Jonathan Tan 
---
  Documentation/technical/repository-version.txt |   6 ++
  Makefile   |   1 +
  builtin/fsck.c |  18 +++-
  cache.h|   2 +
  environment.c  |   1 +
  fsck.c |   6 +-
  object.c   |  19 
  object.h   |  19 
  promised-object.c  | 130 +
  promised-object.h  |  22 +
  setup.c|   7 +-
  t/t3907-promised-object.sh |  41 
  t/test-lib-functions.sh|   6 ++
  13 files changed, 273 insertions(+), 5 deletions(-)
  create mode 100644 promised-object.c
  create mode 100644 promised-object.h
  create mode 100755 t/t3907-promised-object.sh

diff --git a/Documentation/technical/repository-version.txt b/Documentation/technical/repository-version.txt
index 00ad37986..f8b82c1c7 100644
--- a/Documentation/technical/repository-version.txt
+++ b/Documentation/technical/repository-version.txt
@@ -86,3 +86,9 @@ for testing format-1 compatibility.
  When the config key `extensions.preciousObjects` is set to `true`,
  objects in the repository MUST NOT be deleted (e.g., by `git-prune` or
  `git repack -d`).
+
+`promisedObjects`
+~~~~~~~~~~~~~~~~~
+
+(Explain this - basically a string containing a command to be run
+whenever a missing object needs to be fetched.)
diff --git a/Makefile b/Makefile
index 9c9c42f8f..c1446d5ef 100644
--- a/Makefile
+++ b/Makefile
@@ -828,6 +828,7 @@ LIB_OBJS += preload-index.o
  LIB_OBJS += pretty.o
  LIB_OBJS += prio-queue.o
  LIB_OBJS += progress.o
+LIB_OBJS += promised-object.o
  LIB_OBJS += prompt.o
  LIB_OBJS += quote.o
  LIB_OBJS += reachable.o
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 462b8643b..49e21f361 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -15,6 +15,7 @@
  #include "progress.h"
  #include "streaming.h"
  #include "decorate.h"
+#include "promised-object.h"
  
  #define 

Re: [GSoC][PATCH 5/8] submodule: port submodule subcommand 'sync' from shell to C

2017-07-20 Thread Stefan Beller
On Thu, Jul 20, 2017 at 12:36 PM, Prathamesh Chavan  wrote:
> Firstly, thanks for reviewing my patches. I have also checked out the
> other reviews and revised the other patches according to them as well.
> I had a few doubts about this one though.
>
>>> +   const struct submodule *sub;
>>> +   char *sub_key, *remote_key;
>>> +   char *sub_origin_url, *super_config_url, *displaypath;
>>> +   struct strbuf sb = STRBUF_INIT;
>>> +   struct child_process cp = CHILD_PROCESS_INIT;
>>> +
>>> +   if (!is_submodule_active(the_repository, list_item->name))
>>> +   return;
>>
>> We can use the_repository here, as we also use child processes to
>> recurse, such that we always operate on the_repository as the
>> superproject.
>>
>
> Sorry, but can you explain this a bit more?

Well, that was more thinking out loud, in the sense of explaining why
it is the correct thing to do.

As the recursion is handled via spawning processes, each process
has the_repository pointing at a different repository (and the correct
repository for each process), at least to my understanding.

>
>>
>>> +
>>> +   sub = submodule_from_path(null_sha1, list_item->name);
>>> +
>>> +   if (!sub || !sub->url)
>>> +   die(_("no url found for submodule path '%s' in .gitmodules"),
>>> + list_item->name);
>>
>> We do not die in the shell script when the url is missing in the
>> .gitmodules file.
>>
>
> Then to have a faithful conversion, IMO, deleting the above lines
> would be the correct way?

Well, then we may run into segfaults due to dereferencing a NULL pointer.
So we have to figure out, what the code actually does when there is
no URL set. According to my understanding this would do:

url=$(git config -f .gitmodules --get submodule."$name".url)
# second case, but empty vars:
sub_origin_url="$url"
super_config_url="$url"



The issue with this shell script is that there is no difference between
"" and NULL, so the place where we do

sub_origin_url="$url"
super_config_url="$url"

we would need to translate NULL -> empty string

>
>>> +
>>> +   prepare_submodule_repo_env(&cp.env_array);
>>> +   cp.git_cmd = 1;
>>> +   cp.dir = list_item->name;
>>> +   argv_array_pushl(&cp.args, "submodule--helper",
>>> +"print-default-remote", NULL);
>>> +   if (capture_command(&cp, &sb, 0))
>>> +   die(_("failed to get the default remote for submodule '%s'"),
>>> + list_item->name);
>>> +
>>> +   strbuf_strip_suffix(&sb, "\n");
>>> +   remote_key = xstrfmt("remote.%s.url", sb.buf);
>>> +   strbuf_release(&sb);
>>> +
>>> +   child_process_init(&cp);
>>> +   prepare_submodule_repo_env(&cp.env_array);
>>> +   cp.git_cmd = 1;
>>> +   cp.dir = list_item->name;
>>> +   argv_array_pushl(&cp.args, "config", remote_key, sub_origin_url, NULL);
>>> +   if (run_command(&cp))
>>> +   die(_("failed to update remote for submodule '%s'"),
>>> + list_item->name);
>>
>> While it is a strict conversion from the shell script, we could also
>> try to do this in-process:
>> 1) we'd find out the submodules git dir using submodule_to_gitdir
>> 2) construct the path to the config file as "%s/.gitconfig"
>> 3) using git_config_set_in_file (which presumably takes file name,
>>   key and value) the value can be set
>
> Thanks for pointing that out. That surely reduced a child_process.
> Although the path of the config file for the case of submodules
> would be constructed by "%s/config".

Ah yes, that is correct.


>
> Thanks,
> Prathamesh Chavan


Re: Handling of paths

2017-07-20 Thread Junio C Hamano
Victor Toni  writes:

> What's unexpected is that paths used for sslKey or sslCert are treated
> differently insofar as they are expected to be absolute.
> Relative paths (whether with or without "~") don't work.

Looking at http.c::http_options(), I see that "sslcapath" and
"sslcainfo" do use git_config_pathname() when grabbing their values,
but "sslcert" and "sslkey" treat the value as a plain vanilla string
without expecting "~[username]/" at all.

The modern http.c code structure was introduced at 29508e1e ("Isolate
shared HTTP request functionality", 2005-11-18) and was corrected
for interaction between the multiple configuration files in 7059cd99
("http_init(): Fix config file parsing", 2009-03-09), but back in
these versions, all of them, including "sslcapath" and "sslcainfo",
were treated as plain vanilla strings.

It appears that only two of these among four were made aware of the
"~[username]/" prefix in bf9acba2 ("http: treat config options
sslCAPath and sslCAInfo as paths", 2015-11-23), but "sslkey" and
"sslcert" were still left as plain vanilla strings.  I do not know
if that was an elaborate omission, or a mere oversight, as it seems
that it happened while I was away, so...



Re: [GSoC][PATCH 5/8] submodule: port submodule subcommand 'sync' from shell to C

2017-07-20 Thread Prathamesh Chavan
Firstly, thanks for reviewing my patches. I have also checked out the
other reviews and revised the other patches according to them as well.
I had a few doubts about this one though.

>> +   const struct submodule *sub;
>> +   char *sub_key, *remote_key;
>> +   char *sub_origin_url, *super_config_url, *displaypath;
>> +   struct strbuf sb = STRBUF_INIT;
>> +   struct child_process cp = CHILD_PROCESS_INIT;
>> +
>> +   if (!is_submodule_active(the_repository, list_item->name))
>> +   return;
>
> We can use the_repository here, as we also use child processes to
> recurse, such that we always operate on the_repository as the
> superproject.
>

Sorry, but can you explain this a bit more?

>
>> +
>> +   sub = submodule_from_path(null_sha1, list_item->name);
>> +
>> +   if (!sub || !sub->url)
>> +   die(_("no url found for submodule path '%s' in .gitmodules"),
>> + list_item->name);
>
> We do not die in the shell script when the url is missing in the
> .gitmodules file.
>

Then to have a faithful conversion, IMO, deleting the above lines
would be the correct way?

>> +
>> +   prepare_submodule_repo_env(&cp.env_array);
>> +   cp.git_cmd = 1;
>> +   cp.dir = list_item->name;
>> +   argv_array_pushl(&cp.args, "submodule--helper",
>> +"print-default-remote", NULL);
>> +   if (capture_command(&cp, &sb, 0))
>> +   die(_("failed to get the default remote for submodule '%s'"),
>> + list_item->name);
>> +
>> +   strbuf_strip_suffix(&sb, "\n");
>> +   remote_key = xstrfmt("remote.%s.url", sb.buf);
>> +   strbuf_release(&sb);
>> +
>> +   child_process_init(&cp);
>> +   prepare_submodule_repo_env(&cp.env_array);
>> +   cp.git_cmd = 1;
>> +   cp.dir = list_item->name;
>> +   argv_array_pushl(&cp.args, "config", remote_key, sub_origin_url, NULL);
>> +   if (run_command(&cp))
>> +   die(_("failed to update remote for submodule '%s'"),
>> + list_item->name);
>
> While it is a strict conversion from the shell script, we could also
> try to do this in-process:
> 1) we'd find out the submodules git dir using submodule_to_gitdir
> 2) construct the path to the config file as "%s/.gitconfig"
> 3) using git_config_set_in_file (which presumably takes file name,
>   key and value) the value can be set

Thanks for pointing that out. That surely reduced a child_process.
Although the path of the config file for the case of submodules
would be constructed by "%s/config".

Thanks,
Prathamesh Chavan


Re: [RFC PATCH v2 2/4] promised-object, fsck: introduce promised objects

2017-07-20 Thread Jonathan Tan
On Thu, 20 Jul 2017 11:07:29 -0700
Stefan Beller  wrote:

> > +   if (fsck_promised_objects()) {
> > +   error("Errors found in promised object list");
> > +   errors_found |= ERROR_PROMISED_OBJECT;
> > +   }
> 
> This got me thinking: It is an error if we do not have an object
> and also do not promise it, but what about the other case:
> having and object and promising it, too?
> That is not harmful to the operation, except that the promise
> machinery may be slower due to its size. (Should that be a soft
> warning then? Do we have warnings in fsck?)

Good question - having an object and also having it promised is not an
error condition (and I don't think it's a good idea to make it so,
because objects can appear quite easily from various sources). In the
future, I expect "git gc" to be extended to remove such redundant lines
from the "promised" list.

> >   * The object type is stored in 3 bits.
> >   */
> 
> We may want to remove this comment while we're here as it
> sounds stale despite being technically correct.
> 1974632c66 (Remove TYPE_* constant macros and use
> object_type enums consistently., 2006-07-11)

I agree that the comment is unnecessary, but in this commit I didn't
modify anything to do with the type, so I left it there.

> >  struct object {
> > +   /*
> > +* Set if this object is parsed. If set, "type" is populated and this
> > +* object can be casted to "struct commit" or an equivalent.
> > +*/
> > unsigned parsed : 1;
> > +   /*
> > +* Set if this object is not in the repo but is promised. If set,
> > +* "type" is populated, but this object cannot be casted to "struct
> > +* commit" or an equivalent.
> > +*/
> > +   unsigned promised : 1;
> 
> Would it make sense to have a bit field instead:
> 
> #define STATE_BITS 2
> #define STATE_PARSED (1<<0)
> #define STATE_PROMISED (1<<1)
> 
> unsigned state : STATE_BITS
> 
> This would be similar to the types and flags?

Both type and flag have to be bit fields (for different reasons), but
since we don't need such a combined field for "parsed" and "promised", I
prefer separating them each into their own field.

> > +test_expect_success 'fsck fails on missing objects' '
> > +   test_create_repo repo &&
> > +
> > +   test_commit -C repo 1 &&
> > +   test_commit -C repo 2 &&
> > +   test_commit -C repo 3 &&
> > +   git -C repo tag -a annotated_tag -m "annotated tag" &&
> > +   C=$(git -C repo rev-parse 1) &&
> > +   T=$(git -C repo rev-parse 2^{tree}) &&
> > +   B=$(git hash-object repo/3.t) &&
> > +   AT=$(git -C repo rev-parse annotated_tag) &&
> > +
> > +   # missing commit, tree, blob, and tag
> > +   rm repo/.git/objects/$(echo $C | cut -c1-2)/$(echo $C | cut -c3-40) &&
> > +   rm repo/.git/objects/$(echo $T | cut -c1-2)/$(echo $T | cut -c3-40) &&
> > +   rm repo/.git/objects/$(echo $B | cut -c1-2)/$(echo $B | cut -c3-40) &&
> > +   rm repo/.git/objects/$(echo $AT | cut -c1-2)/$(echo $AT | cut -c3-40) &&
> 
> This is a pretty cool test as it promises all sorts of objects
> from different parts of the graph.

Thanks.


Re: [PATCH 6/6] RelNotes: add more notes about PCRE in 2.14

2017-07-20 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason   writes:

> We were missing any mention that:
>
>  - PCRE is now faster with JIT
>  - That it's now faster than the other regex backends
>  - That therefore you might want to use it by default, but beware of
>the incompatible syntax.

Hmph.  All of that has been more or less deliberate, as I want the
release notes to be more like table of contents, one bullet per item
with short description, not a novelette with one paragraph per item.

These should already be in the documentation when they do want to
decide if they want to use JIT; somebody who downloads 2.15 or later
and wants to decide if they want JIT shouldn't have to dig down to
earlier release notes that introduced the option for the first time.

> Signed-off-by: Ævar Arnfjörð Bjarmason 
> ---
>  Documentation/RelNotes/2.14.0.txt | 11 ++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/RelNotes/2.14.0.txt b/Documentation/RelNotes/2.14.0.txt
> index fb6a3dba31..a6a1cb963b 100644
> --- a/Documentation/RelNotes/2.14.0.txt
> +++ b/Documentation/RelNotes/2.14.0.txt
> @@ -88,7 +88,16 @@ UI, Workflows & Features
> learned to say "it's a pathspec" a bit more often when the syntax
> looks like so.
>  
> - * Update "perl-compatible regular expression" support to enable JIT.
> + * Update "perl-compatible regular expression" support to enable
> +   JIT.
> +
> +   This makes grep.patternType=perl (and -P and --perl-regexp) much
> +   faster for "git grep" and "git log", and is generally faster than
> +   the system's POSIX regular expression implementation. Users
> +   concerned with "git grep" performance or "git log --grep"
> +   performance might want to try setting grep.patternType=perl. Note
> +   that the syntax isn't compatible with git's default of
> +   grep.patternType=basic.
>  
>   * "filter-branch" learned a pseudo filter "--setup" that can be used
> to define common functions/variables that can be used by other


Re: [PATCH] PRItime: wrap PRItime for better l10n compatibility

2017-07-20 Thread Junio C Hamano
> The use of "make pot" from the top-level is already described in
> po/README, so the only thing that we need is something like this
> change.  I'll follow up this message with a sample output from the
> updated process to ask others to sanity check the result (they are
> tiny) in a separate message.

So I am inclined to apply this directly on 'master' before tagging
the first release candidate that includes timestamp_t; I'll wait for
the earth to rotate once for comments, though.

Thanks.

-- >8 --
Subject: [PATCH] Makefile: help gettext tools to cope with our custom PRItime format

We started using our own timestamp_t type and PRItime format
specifier to go along with it, so that we can later change the
underlying type and output format more easily, but this does not
play well with gettext tools.

Because gettext tools need to keep the *.po file portable across
platforms, they have to special-case the format specifiers like
PRIuMAX that are known types in inttypes.h, instead of letting CPP
handle strings like

"%" PRIuMAX " seconds ago"

as an ordinary string concatenation.  They fundamentally cannot do
the same for our own custom type/format.

Given that po/git.pot needs to be generated only once every release
and by only one person, i.e. the l10n coordinator, let's update the
Makefile rule to generate po/git.pot so that gettext tools are run
on a munged set of sources in which all mentions of PRItime are
replaced with PRIuMAX, which is what we happen to use right now.

This way, developers do not have to care that PRItime does not play
well with gettext, and translators do not have to care that we use
our own PRItime.

The credit for the idea to munge the source files goes to Dscho.
Possible bugs are mine.

Helped-by: Jiang Xin 
Helped-by: Johannes Schindelin 
Signed-off-by: Junio C Hamano 
---
 Makefile | 20 
 1 file changed, 20 insertions(+)

diff --git a/Makefile b/Makefile
index ba4359ef8d..527502835f 100644
--- a/Makefile
+++ b/Makefile
@@ -2216,12 +2216,32 @@ LOCALIZED_SH += t/t0200/test.sh
 LOCALIZED_PERL += t/t0200/test.perl
 endif
 
+## Note that this is meant to be run only by the localization coordinator
+## under a very controlled condition, i.e. (1) it is to be run in a
+## Git repository (not a tarball extract), (2) any local modifications
+## will be lost.
+## Gettext tools cannot work with our own custom PRItime type, so
+## we replace PRItime with PRIuMAX.  We need to update this if we
+## switch to a signed type with PRIdMAX.
+
 po/git.pot: $(GENERATED_H) FORCE
+   # All modifications will be reverted at the end, so we do not
+   # want to have any local changes
+   git diff --quiet HEAD && git diff --quiet --cached
+
+   @for s in $(LOCALIZED_C) $(LOCALIZED_SH) $(LOCALIZED_PERL); \
+   do \
+   sed -e 's|PRItime|PRIuMAX|g' <"$$s" >"$$s+" && \
+   cat "$$s+" >"$$s" && rm "$$s+"; \
+   done
+
 	$(QUIET_XGETTEXT)$(XGETTEXT) -o$@+ $(XGETTEXT_FLAGS_C) $(LOCALIZED_C)
 	$(QUIET_XGETTEXT)$(XGETTEXT) -o$@+ --join-existing $(XGETTEXT_FLAGS_SH) \
 		$(LOCALIZED_SH)
 	$(QUIET_XGETTEXT)$(XGETTEXT) -o$@+ --join-existing $(XGETTEXT_FLAGS_PERL) \
 		$(LOCALIZED_PERL)
+
+	git reset --hard
 	mv $@+ $@
 
 .PHONY: pot
-- 
2.14.0-rc0-194-g965e058453



Re: Binary files

2017-07-20 Thread Igor Djordjevic
Hi Volodymyr,

On 20/07/2017 09:41, Volodymyr Sendetskyi wrote:
> It is known that git handles storing binary files in its
> repositories badly.
> This is especially true for large files: even without any changes to
> these files, their copies are snapshotted on each commit. So even
> repositories with a small amount of code can grow very fast in size
> if they contain some large binary files. Alongside this, SVN is
> much better about that, because it makes changes to the server version
> of a file only if some changes were done.

You already got some proposals on what you could try to make large
binary file handling easier, but I just wanted to comment on this
part of your message, as it doesn't seem to be correct.

Even though each repository file is included in each commit (a commit
being a full snapshot of the repository state), big binary files
included, that's just the end-user's perspective.

The actual implementation is smarter than that - if a file hasn't
changed between commits, it won't get copied/written to the Git object
database again.

Under the hood, many different commits can point to the same
(unchanged) file, thus repository size _does not_ grow very fast with
each commit if a large binary file is left unchanged.
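
The claim above can be checked quickly; a small sketch (assuming `git`
is available, repository and file names made up):

```shell
# Two commits both containing the same unchanged file: the blob is
# stored in the object database once, and both commits point at it.
cd "$(mktemp -d)" && git init -q demo && cd demo
git config user.email you@example.com && git config user.name You
printf 'pretend this is a big binary payload' >asset.bin
git add asset.bin && git commit -qm 'add binary'
echo note >README && git add README && git commit -qm 'unrelated change'
# Both commits resolve the path to the same blob hash:
git rev-parse HEAD:asset.bin HEAD~1:asset.bin
```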

Usually, the biggest concern with Git and large files[1], in
comparison to SVN, for example, is something else - the Git model
assumes each repository clone holds the complete repository
history with all the different file versions included, so you can't
get just some of them, or the last snapshot only, keeping your local
repository small in size.

If the repository you're cloning from is a big one, your locally
cloned repository will be as well, even if you may not really be
interested in the big files at all... but you got some suggestions
for handling that already, as pointed out :)

Just note that it's not really Git vs SVN here, but more the distributed
vs centralized approach in general, as you can't both have everything
and yet skip something at the same time. Different systems may have
different workarounds for a specific workflow, though.

[1] Besides taking each file version as a full-sized snapshot (at the
beginning, at least, until delta compression packing occurs).

Regards,
Buga


Re: [PATCH] l10n: de.po: update German translation

2017-07-20 Thread Matthias Rüster

Hi Ralf,

I think the following should be "hinzugefügt":



  #: builtin/add.c:288
-#, fuzzy
  msgid "warn when adding an embedded repository"
-msgstr "ein Bare-Repository erstellen"
+msgstr "warnen wenn eingebettetes Repository hingefügt wird"
  



Everything else looks great! Thanks!


Kind regards,
Matthias


Re: [PATCH] PRItime: wrap PRItime for better l10n compatibility

2017-07-20 Thread Junio C Hamano
Junio C Hamano  writes:

> The use of "make pot" from the top-level is already described in
> po/README, so the only thing that we need is something like this
> change.  I'll follow up this message with a sample output from the
> updated process to ask others to sanity check the result (they are
> tiny) in a separate message.

Without the Makefile patch in the previous message, I ran "make pot"
and saved the resulting po/git.pot to git.pot-old.  And then after
"git reset --hard", I applied the Makefile patch and ran "make pot"
again, which gave me an updated po/git.pot file.  The difference is
shown below.

As expected, these look sensible to me.  All the hits from 

git grep '_(.*PRItime'

are included in the difference.




--- git.pot-old 2017-07-20 11:17:29.608343390 -0700
+++ po/git.pot  2017-07-20 11:18:14.744342564 -0700
@@ -8,7 +8,7 @@
 msgstr ""
 "Project-Id-Version: PACKAGE VERSION\n"
 "Report-Msgid-Bugs-To: Git Mailing List \n"
-"POT-Creation-Date: 2017-07-20 11:17-0700\n"
+"POT-Creation-Date: 2017-07-20 11:18-0700\n"
 "PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
 "Last-Translator: FULL NAME \n"
 "Language-Team: LANGUAGE \n"
@@ -1388,17 +1388,67 @@
 msgid "in the future"
 msgstr ""
 
-#: date.c:122 date.c:129 date.c:136 date.c:143 date.c:149 date.c:156
-#: date.c:167 date.c:175 date.c:180
-msgid "%<PRItime>"
-msgid_plural "%<PRItime>"
+#: date.c:122
+#, c-format
+msgid "%<PRIuMAX> second ago"
+msgid_plural "%<PRIuMAX> seconds ago"
+msgstr[0] ""
+msgstr[1] ""
+
+#: date.c:129
+#, c-format
+msgid "%<PRIuMAX> minute ago"
+msgid_plural "%<PRIuMAX> minutes ago"
+msgstr[0] ""
+msgstr[1] ""
+
+#: date.c:136
+#, c-format
+msgid "%<PRIuMAX> hour ago"
+msgid_plural "%<PRIuMAX> hours ago"
+msgstr[0] ""
+msgstr[1] ""
+
+#: date.c:143
+#, c-format
+msgid "%<PRIuMAX> day ago"
+msgid_plural "%<PRIuMAX> days ago"
+msgstr[0] ""
+msgstr[1] ""
+
+#: date.c:149
+#, c-format
+msgid "%<PRIuMAX> week ago"
+msgid_plural "%<PRIuMAX> weeks ago"
+msgstr[0] ""
+msgstr[1] ""
+
+#: date.c:156
+#, c-format
+msgid "%<PRIuMAX> month ago"
+msgid_plural "%<PRIuMAX> months ago"
+msgstr[0] ""
+msgstr[1] ""
+
+#: date.c:167
+#, c-format
+msgid "%<PRIuMAX> year"
+msgid_plural "%<PRIuMAX> years"
 msgstr[0] ""
 msgstr[1] ""
 
 #. TRANSLATORS: "%s" is "<n> years"
 #: date.c:170
-msgid "%s, %<PRItime>"
-msgid_plural "%s, %<PRItime>"
+#, c-format
+msgid "%s, %<PRIuMAX> month ago"
+msgid_plural "%s, %<PRIuMAX> months ago"
+msgstr[0] ""
+msgstr[1] ""
+
+#: date.c:175 date.c:180
+#, c-format
+msgid "%<PRIuMAX> year ago"
+msgid_plural "%<PRIuMAX> years ago"
 msgstr[0] ""
 msgstr[1] ""
 


Re: [RFC PATCH v2 4/4] sha1_file: support promised object hook

2017-07-20 Thread Stefan Beller
On Wed, Jul 19, 2017 at 5:21 PM, Jonathan Tan  wrote:
> Teach sha1_file to invoke a hook whenever an object is requested and
> unavailable but is promised. The hook is a shell command that can be
> configured through "git config"; this hook takes in a list of hashes and
> writes (if successful) the corresponding objects to the repo's local
> storage.
>
> The usage of the hook can be suppressed through a flag when invoking
> has_object_file_with_flags() and other similar functions.
> parse_or_promise_object() in object.c requires this functionality, and
> has been modified to use it.
>
> This is meant as a temporary measure to ensure that all Git commands
> work in such a situation. Future patches will update some commands to
> either tolerate promised objects (without invoking the hook) or be more
> efficient in invoking the promised objects hook.
>
> In order to determine the code changes in sha1_file.c necessary, I
> investigated the following:
>  (1) functions in sha1_file that take in a hash, without the user
>  regarding how the object is stored (loose or packed)
>  (2) functions in sha1_file that operate on packed objects (because I
>  need to check callers that know about the loose/packed distinction
>  and operate on both differently, and ensure that they can handle
>  the concept of objects that are neither loose nor packed)
>
> (1) is handled by the modification to sha1_object_info_extended().
>
> For (2), I looked at for_each_packed_object and at the packed-related
> functions that take in a hash. For for_each_packed_object, the callers
> either already work or are fixed in this patch:
>  - reachable - only to find recent objects
>  - builtin/fsck - already knows about promised objects
>  - builtin/cat-file - fixed in this commit
>
> Callers of the other functions do not need to be changed:
>  - parse_pack_index
>- http - indirectly from http_get_info_packs
>  - find_pack_entry_one
>- this searches a single pack that is provided as an argument; the
>  caller already knows (through other means) that the sought object
>  is in a specific pack
>  - find_sha1_pack
>- fast-import - appears to be an optimization to not store a
>  file if it is already in a pack
>- http-walker - to search through a struct alt_base
>- http-push - to search through remote packs
>  - has_sha1_pack
>- builtin/fsck - already knows about promised objects
>- builtin/count-objects - informational purposes only (check if loose
>  object is also packed)
>- builtin/prune-packed - check if object to be pruned is packed (if
>  not, don't prune it)
>- revision - used to exclude packed objects if requested by user
>- diff - just for optimization
>
> An alternative design that I considered but rejected:
>
>  - Adding a hook whenever a packed object is requested, not on any
>object.  That is, whenever we attempt to search the packfiles for an
>object, if it is missing (from the packfiles and from the loose
>object storage), to invoke the hook (which must then store it as a
>packfile), open the packfile the hook generated, and report that the
>object is found in that new packfile. This reduces the amount of
>analysis needed (in that we only need to look at how packed objects
>are handled), but requires that the hook generate packfiles (or for
>sha1_file to pack whatever loose objects are generated), creating one
>packfile for each missing object and potentially very many packfiles
>that must be linearly searched. This may be tolerable now for repos
>that only have a few missing objects (for example, repos that only
>want to exclude large blobs), and might be tolerable in the future if
>we have batching support for the most commonly used commands, but is
>not tolerable now for repos that exclude a large amount of objects.
>
> Helped-by: Ben Peart 
> Signed-off-by: Jonathan Tan 
> ---
>  Documentation/config.txt |   8 +
>  Documentation/gitrepository-layout.txt   |   8 +
>  Documentation/technical/read-object-protocol.txt | 102 
>  builtin/cat-file.c   |   9 ++
>  cache.h  |   2 +
>  object.c |   3 +-
>  promised-object.c| 194 
> +++
>  promised-object.h|  12 ++
>  sha1_file.c  |  44 +++--
>  t/t3907-promised-object.sh   |  32 
>  t/t3907/read-object  | 114 +
>  11 files changed, 513 insertions(+), 15 deletions(-)
>  create mode 100644 Documentation/technical/read-object-protocol.txt
>  create mode 100755 t/t3907/read-object
>
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 

Re: [PATCH] PRItime: wrap PRItime for better l10n compatibility

2017-07-20 Thread Junio C Hamano
Junio C Hamano  writes:

> Johannes Schindelin  writes:
>
>> But there may be hope. Since the character sequence "PRItime" is highly
>> unlikely to occur in Git's source code in any context other than the
>> format to print/parse timestamp_t, it should be possible to automate the
>> string replacement
>>
>>  git ls-files -z \*.[ch] |
>>  xargs -0r sed -i 's/PRItime/PRIuMAX/g'
>>
>> (assuming, of course, that you use GNU sed, not BSD sed, for which the
>> `-i` needs to read `-i ''` instead) as part of the update?
>
> I somehow missed this bit.
>
> Given that this needs to be done only once every release by only one
> person (i.e. the l10n coordinator who updates *.pot file), as long
> as the procedure is automated as much as possible to ease the pain
> for the l10n coordinator and clearly described in the "Maintaining
> the po/git.pot file" section of po/README, something along that line
> does sound like a very tempting approach.  If it works well, it is
> certainly much easier for normal developers than the other possible
> alternatives I mentioned in my previous response.

So, I was offline for most of the day yesterday and with this issue
blocking the release candidate, didn't manage to tag -rc1.

The use of "make pot" from the top-level is already described in
po/README, so the only thing that we need is something like this
change.  I'll follow up this message with a sample output from the
updated process to ask others to sanity check the result (they are
tiny) in a separate message.

Thanks.


 Makefile | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/Makefile b/Makefile
index ba4359ef8d..7069a12f75 100644
--- a/Makefile
+++ b/Makefile
@@ -2216,12 +2216,22 @@ LOCALIZED_SH += t/t0200/test.sh
 LOCALIZED_PERL += t/t0200/test.perl
 endif
 
+## Note that this is only meant to be run by the localization coordinator
+## under a very controlled condition, i.e. (1) it is to be run in a
+## Git repository (not a tarball extract), (2) any local modifications
+## will be lost.
 po/git.pot: $(GENERATED_H) FORCE
+   @for s in $(LOCALIZED_C) $(LOCALIZED_SH) $(LOCALIZED_PERL); \
+   do \
+   sed -e 's|PRItime|PRIuMAX|g' <"$$s" >"$$s+" && \
+   cat "$$s+" >"$$s" && rm "$$s+"; \
+   done
 	$(QUIET_XGETTEXT)$(XGETTEXT) -o$@+ $(XGETTEXT_FLAGS_C) $(LOCALIZED_C)
 	$(QUIET_XGETTEXT)$(XGETTEXT) -o$@+ --join-existing $(XGETTEXT_FLAGS_SH) \
 		$(LOCALIZED_SH)
 	$(QUIET_XGETTEXT)$(XGETTEXT) -o$@+ --join-existing $(XGETTEXT_FLAGS_PERL) \
 		$(LOCALIZED_PERL)
+	@git reset --hard
 	mv $@+ $@
 
 .PHONY: pot


Re: [RFC PATCH v2 2/4] promised-object, fsck: introduce promised objects

2017-07-20 Thread Stefan Beller
On Wed, Jul 19, 2017 at 5:21 PM, Jonathan Tan  wrote:
> Currently, Git does not support repos with very large numbers of objects
> or repos that wish to minimize manipulation of certain blobs (for
> example, because they are very large) very well, even if the user
> operates mostly on part of the repo, because Git is designed on the
> assumption that every referenced object is available somewhere in the
> repo storage.
>
> As a first step to reducing this problem, introduce the concept of
> promised objects. Each Git repo can contain a list of promised objects
> and their sizes (if blobs) at $GIT_DIR/objects/promised. This patch
> contains functions to query them; functions for creating and modifying
> that file will be introduced in later patches.
>
> A repository that is missing an object but has that object promised is not
> considered to be in error, so also teach fsck this. As part of doing
> this, object.{h,c} has been modified to generate "struct object" based
> on only the information available to promised objects, without requiring
> the object itself.
>
> Signed-off-by: Jonathan Tan 
> ---
>  Documentation/technical/repository-version.txt |   6 ++
>  Makefile   |   1 +
>  builtin/fsck.c |  18 +++-
>  cache.h|   2 +
>  environment.c  |   1 +
>  fsck.c |   6 +-
>  object.c   |  19 
>  object.h   |  19 
>  promised-object.c  | 130 
> +
>  promised-object.h  |  22 +
>  setup.c|   7 +-
>  t/t3907-promised-object.sh |  41 
>  t/test-lib-functions.sh|   6 ++
>  13 files changed, 273 insertions(+), 5 deletions(-)
>  create mode 100644 promised-object.c
>  create mode 100644 promised-object.h
>  create mode 100755 t/t3907-promised-object.sh
>
> diff --git a/Documentation/technical/repository-version.txt 
> b/Documentation/technical/repository-version.txt
> index 00ad37986..f8b82c1c7 100644
> --- a/Documentation/technical/repository-version.txt
> +++ b/Documentation/technical/repository-version.txt
> @@ -86,3 +86,9 @@ for testing format-1 compatibility.
>  When the config key `extensions.preciousObjects` is set to `true`,
>  objects in the repository MUST NOT be deleted (e.g., by `git-prune` or
>  `git repack -d`).
> +
> +`promisedObjects`
> +~
> +
> +(Explain this - basically a string containing a command to be run
> +whenever a missing object needs to be fetched.)
> diff --git a/Makefile b/Makefile
> index 9c9c42f8f..c1446d5ef 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -828,6 +828,7 @@ LIB_OBJS += preload-index.o
>  LIB_OBJS += pretty.o
>  LIB_OBJS += prio-queue.o
>  LIB_OBJS += progress.o
> +LIB_OBJS += promised-object.o
>  LIB_OBJS += prompt.o
>  LIB_OBJS += quote.o
>  LIB_OBJS += reachable.o
> diff --git a/builtin/fsck.c b/builtin/fsck.c
> index 462b8643b..49e21f361 100644
> --- a/builtin/fsck.c
> +++ b/builtin/fsck.c
> @@ -15,6 +15,7 @@
>  #include "progress.h"
>  #include "streaming.h"
>  #include "decorate.h"
> +#include "promised-object.h"
>
>  #define REACHABLE 0x0001
>  #define SEEN  0x0002
> @@ -44,6 +45,7 @@ static int name_objects;
>  #define ERROR_REACHABLE 02
>  #define ERROR_PACK 04
>  #define ERROR_REFS 010
> +#define ERROR_PROMISED_OBJECT 011
>
>  static const char *describe_object(struct object *obj)
>  {
> @@ -436,7 +438,7 @@ static int fsck_handle_ref(const char *refname, const 
> struct object_id *oid,
>  {
> struct object *obj;
>
> -   obj = parse_object(oid);
> +   obj = parse_or_promise_object(oid);
> if (!obj) {
> 		error("%s: invalid sha1 pointer %s", refname, oid_to_hex(oid));
> errors_found |= ERROR_REACHABLE;
> @@ -592,7 +594,7 @@ static int fsck_cache_tree(struct cache_tree *it)
> fprintf(stderr, "Checking cache tree\n");
>
> if (0 <= it->entry_count) {
> -	struct object *obj = parse_object(&it->oid);
> +	struct object *obj = parse_or_promise_object(&it->oid);
> if (!obj) {
> 			error("%s: invalid sha1 pointer in cache-tree",
> 			      oid_to_hex(&it->oid));
> @@ -635,6 +637,12 @@ static int mark_packed_for_connectivity(const struct 
> object_id *oid,
> return 0;
>  }
>
> +static int mark_have_promised_object(const struct object_id *oid, void *data)
> +{
> +   mark_object_for_connectivity(oid);
> +   return 0;
> +}
> +
>  static char const * const fsck_usage[] = {
> N_("git fsck [] [...]"),
> NULL
> @@ -690,6 +698,11 @@ int cmd_fsck(int argc, const char **argv, const char 
> *prefix)
>
>  

[PATCH v2 1/1] submodule--helper: teach push-check to handle HEAD

2017-07-20 Thread Brandon Williams
In 06bf4ad1d (push: propagate remote and refspec with
--recurse-submodules) push was taught how to propagate a refspec down to
submodules when the '--recurse-submodules' flag is given.  The only refspecs
that are allowed to be propagated are ones which name a ref which exists
in both the superproject and the submodule, with the caveat that 'HEAD'
was disallowed.

This patch teaches push-check (the submodule helper which determines if
a refspec can be propagated to a submodule) to permit propagating 'HEAD'
if and only if the superproject and the submodule both have the same
named branch checked out and the submodule is not in a detached head
state.
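
The branch-vs-detached distinction can be sketched outside the helper
with plumbing commands (a hypothetical standalone check, not the helper
itself; repository setup made up):

```shell
# Resolve HEAD: a symbolic ref names the checked-out branch, while a
# failure to resolve it symbolically means a detached HEAD state.
cd "$(mktemp -d)" && git init -q && git config user.email you@example.com
git config user.name You && git commit -qm init --allow-empty
head=$(git symbolic-ref --quiet HEAD || echo HEAD)
test "$head" = HEAD && echo "detached HEAD" || echo "on branch ${head#refs/heads/}"
git checkout -q --detach HEAD
head=$(git symbolic-ref --quiet HEAD || echo HEAD)
test "$head" = HEAD && echo "detached HEAD" || echo "on branch ${head#refs/heads/}"
```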

Signed-off-by: Brandon Williams 
---

Changes in V2:
 * fixed a few style issues
 * shifted argv/argc to prevent more damage to the code than is necessary.

 builtin/submodule--helper.c| 49 ++
 submodule.c| 18 +---
 t/t5531-deep-submodule-push.sh | 25 -
 3 files changed, 79 insertions(+), 13 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 6abdad329..0939e3912 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -1108,9 +1108,28 @@ static int resolve_remote_submodule_branch(int argc, const char **argv,
 static int push_check(int argc, const char **argv, const char *prefix)
 {
struct remote *remote;
+   const char *superproject_head;
+   char *head;
+   int detached_head = 0;
+   struct object_id head_oid;
 
-	if (argc < 2)
-		die("submodule--helper push-check requires at least 1 argument");
+	if (argc < 3)
+		die("submodule--helper push-check requires at least 2 arguments");
+
+   /*
+* superproject's resolved head ref.
+* if HEAD then the superproject is in a detached head state, otherwise
+* it will be the resolved head ref.
+*/
+   superproject_head = argv[1];
+   argv++;
+   argc--;
+   /* Get the submodule's head ref and determine if it is detached */
+   head = resolve_refdup("HEAD", 0, head_oid.hash, NULL);
+   if (!head)
+   die(_("Failed to resolve HEAD as a valid ref."));
+   if (!strcmp(head, "HEAD"))
+   detached_head = 1;
 
/*
 * The remote must be configured.
@@ -1133,18 +1152,30 @@ static int push_check(int argc, const char **argv, const char *prefix)
if (rs->pattern || rs->matching)
continue;
 
-   /*
-* LHS must match a single ref
-* NEEDSWORK: add logic to special case 'HEAD' once
-* working with submodules in a detached head state
-* ceases to be the norm.
-*/
-   if (count_refspec_match(rs->src, local_refs, NULL) != 1)
+   /* LHS must match a single ref */
+		switch (count_refspec_match(rs->src, local_refs, NULL)) {
+   case 1:
+   break;
+   case 0:
+   /*
+* If LHS matches 'HEAD' then we need to ensure
+* that it matches the same named branch
+* checked out in the superproject.
+*/
+   if (!strcmp(rs->src, "HEAD")) {
+   if (!detached_head &&
+   !strcmp(head, superproject_head))
+   break;
+				die("HEAD does not match the named branch in the superproject");
+   }
+   default:
die("src refspec '%s' must name a ref",
rs->src);
+   }
}
free_refspec(refspec_nr, refspec);
}
+   free(head);
 
return 0;
 }
diff --git a/submodule.c b/submodule.c
index 6531c5d60..36f45f5a5 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1015,7 +1015,8 @@ static int push_submodule(const char *path,
  * Perform a check in the submodule to see if the remote and refspec work.
  * Die if the submodule can't be pushed.
  */
-static void submodule_push_check(const char *path, const struct remote *remote,
+static void submodule_push_check(const char *path, const char *head,
+const struct remote *remote,
 const char **refspec, int refspec_nr)
 {
struct child_process cp = CHILD_PROCESS_INIT;
@@ -1023,6 +1024,7 @@ static void submodule_push_check(const char *path, const 
struct remote *remote,
 
argv_array_push(, 

Re: [RFC PATCH v2 1/4] object: remove "used" field from struct object

2017-07-20 Thread Ben Peart



On 7/19/2017 8:55 PM, Jonathan Tan wrote:

On Wed, 19 Jul 2017 17:36:39 -0700
Stefan Beller  wrote:


On Wed, Jul 19, 2017 at 5:21 PM, Jonathan Tan  wrote:

The "used" field in struct object is only used by builtin/fsck. Remove
that field and modify builtin/fsck to use a flag instead.


The patch looks good to me (I would even claim this could
go in as an independent cleanup, not tied to the RFCish nature
of the later patches), though I have a question:
How did you select 0x0008 for USED, i.e. does it
collide with other flags (theoretically?), and if so
how do we make sure to avoid the collision in
the future?


Thanks. 0x0008 was the next one in the series (as you can see in the
context). As for whether it collides with other flags, that is what the
chart in object.h is for (which I have added to in this patch), I
presume. As far as I can tell, each component must make sure not to
overlap with any other component running concurrently.



This patch seems reasonable to me.  I agree it could go in separately as 
a general cleanup.


Re: Binary files

2017-07-20 Thread Stefan Beller
On Thu, Jul 20, 2017 at 12:41 AM, Volodymyr Sendetskyi
 wrote:
> It is known that git handles storing binary files in its
> repositories badly.
> This is especially true for large files: even without any changes to
> these files, their copies are snapshotted on each commit. So even
> repositories with a small amount of code can grow very fast in size
> if they contain some large binary files. Alongside this, SVN is
> much better about that, because it makes changes to the server version
> of a file only if some changes were done.
>
> So the question is: why not implementing some feature, that would
> somehow handle this problem?

There are 'external' solutions such as git LFS and git-annex, mentioned
in replies nearby.

But note there are also efforts to handle large binary files internally
https://public-inbox.org/git/3420d9ae9ef86b78af1abe721891233e3f5865a2.1500508695.git.jonathanta...@google.com/
https://public-inbox.org/git/20170713173459.3559-1-...@jeffhostetler.com/
https://public-inbox.org/git/20170620075523.26961-1-chrisc...@tuxfamily.org/


Re: [PATCH] t: lib-gpg: flush gpg agent on startup

2017-07-20 Thread Santiago Torres
This is the patch that stemmed from [1].

I tried to keep it simple and not noisy; although it breaks the &&
chain, it needs to be run right before the --import command. I also
decided to drop the switch chain in case the regression is
reintroduced in the future in other versions (hopefully gpgconf goes
nowhere by then).

I was able to test this on debian oldstable/stable and arch.

Cheers!
-Santiago.

[1] https://public-inbox.org/git/xmqqvampmnmv@gitster.mtv.corp.google.com/
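
The cleanup step can be sketched standalone (a hedged sketch; it assumes
nothing about whether gpgconf is installed and is a no-op when it is not):

```shell
# Kill any gpg-agent still serving the (throwaway) GNUPGHOME before the
# keyring import; without this, an agent from a previous run may hold
# file descriptors into an already-removed trash directory.
GNUPGHOME="$(mktemp -d)" && export GNUPGHOME
if command -v gpgconf >/dev/null 2>&1
then
	gpgconf --kill gpg-agent >/dev/null 2>&1 || true
fi
cleaned=yes
```

One detail worth a look in the patch itself: `2>&1 >/dev/null` redirects
stderr to the original stdout and only then silences stdout, so error
output still reaches the terminal; `>/dev/null 2>&1` would silence both.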

On Thu, Jul 20, 2017 at 12:58:14PM -0400, santi...@nyu.edu wrote:
> From: Santiago Torres 
> 
> When running gpg-relevant tests, a gpg-daemon is spawned for each
> GNUPGHOME used. This daemon may stay running after the test and cache
> file descriptors for the trash directories, even after the trash
> directory is removed. This leads to ENOENT errors when attempting to
> create files if tests are run multiple times.
> 
> Add a cleanup script to force flushing the gpg-agent for that GNUPGHOME
> (if any) before setting up the GPG relevant-environment.
> 
> Helped-by: Junio C Hamano 
> Signed-off-by: Santiago Torres 
> ---
>  t/lib-gpg.sh | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/t/lib-gpg.sh b/t/lib-gpg.sh
> index ec2aa8f68..7a6c7ee6f 100755
> --- a/t/lib-gpg.sh
> +++ b/t/lib-gpg.sh
> @@ -31,6 +31,7 @@ then
>   chmod 0700 ./gpghome &&
>   GNUPGHOME="$(pwd)/gpghome" &&
>   export GNUPGHOME &&
> + gpgconf --kill gpg-agent 2>&1 >/dev/null
>   gpg --homedir "${GNUPGHOME}" 2>/dev/null --import \
>   "$TEST_DIRECTORY"/lib-gpg/keyring.gpg &&
>   gpg --homedir "${GNUPGHOME}" 2>/dev/null --import-ownertrust \
> -- 
> 2.13.3
> 


signature.asc
Description: PGP signature


Re: --interactive mode: readline support ⌨⬆

2017-07-20 Thread Leah Neukirchen
Marcel Partap  writes:

> Dear git devs,
> wouldn't it be great to have the power of readline added to the power
> of git interactive commands? Yes, rlwrap will do the job, but still.
> Or am I missing something obvious? Am using debian's 2.11.0-2 ...

Just use "rlwrap git clean -i".

-- 
Leah Neukirchen    http://leah.zone



[PATCH] t: lib-gpg: flush gpg agent on startup

2017-07-20 Thread santiago
From: Santiago Torres 

When running gpg-relevant tests, a gpg-daemon is spawned for each
GNUPGHOME used. This daemon may stay running after the test and cache
file descriptors for the trash directories, even after the trash
directory is removed. This leads to ENOENT errors when attempting to
create files if tests are run multiple times.

Add a cleanup script to force flushing the gpg-agent for that GNUPGHOME
(if any) before setting up the GPG relevant-environment.

Helped-by: Junio C Hamano 
Signed-off-by: Santiago Torres 
---
 t/lib-gpg.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/t/lib-gpg.sh b/t/lib-gpg.sh
index ec2aa8f68..7a6c7ee6f 100755
--- a/t/lib-gpg.sh
+++ b/t/lib-gpg.sh
@@ -31,6 +31,7 @@ then
chmod 0700 ./gpghome &&
GNUPGHOME="$(pwd)/gpghome" &&
export GNUPGHOME &&
+   gpgconf --kill gpg-agent 2>&1 >/dev/null
gpg --homedir "${GNUPGHOME}" 2>/dev/null --import \
"$TEST_DIRECTORY"/lib-gpg/keyring.gpg &&
gpg --homedir "${GNUPGHOME}" 2>/dev/null --import-ownertrust \
-- 
2.13.3



Re: [PATCH 5/6] RelNotes: remove duplicate mention of PCRE v2

2017-07-20 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason   writes:

> That we can link to PCRE v2 is already covered above in "Backward
> compatibility notes and other notable changes", no need to mention it
> twice.

This is actually deliberate, as I'd prefer to have a description
that is written at the same detail-level (i.e. "just a bullet item,
if you want to know more, go read the doc") as the other items in
the list, whether a more detailed description is given elsewhere.



> Signed-off-by: Ævar Arnfjörð Bjarmason 
> ---
>  Documentation/RelNotes/2.14.0.txt | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/Documentation/RelNotes/2.14.0.txt 
> b/Documentation/RelNotes/2.14.0.txt
> index 0e363f2af3..fb6a3dba31 100644
> --- a/Documentation/RelNotes/2.14.0.txt
> +++ b/Documentation/RelNotes/2.14.0.txt
> @@ -88,8 +88,7 @@ UI, Workflows & Features
> learned to say "it's a pathspec" a bit more often when the syntax
> looks like so.
>  
> - * Update "perl-compatible regular expression" support to enable JIT
> -   and also allow linking with the newer PCRE v2 library.
> + * Update "perl-compatible regular expression" support to enable JIT.
>  
>   * "filter-branch" learned a pseudo filter "--setup" that can be used
> to define common functions/variables that can be used by other


Re: [PATCH 4/6] RelNotes: mention that PCRE v2 exposes the same syntax

2017-07-20 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason   writes:

> For someone not familiar with PCRE or who hasn't read its own
> documentation this isn't obvious; let's explicitly mention it so
> package maintainers won't fear upgrading lest they break things for
> their users.

If packagers trust our assessment of an external library's backward
compatibility, our using and recommending it is sufficient for them.
If they don't trust us but rather want to verify it themselves, the new
paragraph is not all that useful to them, and more importantly, we
are not in the position or business of giving an extended warranty on
other people's libraries.

For these reasons, I'd feel a bit hesitant to add something like
this.

>
> Signed-off-by: Ævar Arnfjörð Bjarmason 
> ---
>  Documentation/RelNotes/2.14.0.txt | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/Documentation/RelNotes/2.14.0.txt 
> b/Documentation/RelNotes/2.14.0.txt
> index 7ed93bca37..0e363f2af3 100644
> --- a/Documentation/RelNotes/2.14.0.txt
> +++ b/Documentation/RelNotes/2.14.0.txt
> @@ -28,6 +28,9 @@ Backward compatibility notes and other notable changes.
> upstream PCRE maintainer has abandoned v1 maintenance for all but
> the most critical bug fixes, use of v2 is recommended.
>  
> +   Version v2 of the library is fully backwards compatible with the
> +   Perl-compatible regular expression syntax exposed by git (sans a
> +   few obscure bugfixes).
>  
>  Updates since v2.13
>  ---


Re: Reducing redundant build at Travis?

2017-07-20 Thread Junio C Hamano
Lars Schneider  writes:

>> On 14 Jul 2017, at 17:32, Jeff King  wrote:
>> 
>> I don't know if Travis's cache storage is up to that challenge. We could
>> probably build such a lock on top of third-party storage, but things are
>> rapidly getting more complex.
>
> I think we shouldn't go there because of the complexity. I reached out
> to TravisCI and asked about the "hash build twice" problem [1]. Unfortunately,
I got no response yet. The issue could also be considered a feature as you
> could perform different actions in your TravisCI configuration based on
> the branch name.

Oh, no doubt that it is a feature, and a very useful one at that.
With that, we can change actions depending on the branch name in
such a way that normally we do our build and test, but when we are
on a branch (not testing a tag) and its tip is tagged, we become
no-op to avoid the cost of testing.  That is the feature we exactly
want.

The question I had, and wanted a help from you, was if there was a
way we can write that "are we on a branch (not testing a tag) and is
its tip tagged?" test only once in .travis.yml, even though we have
quite a many items in the matrix.  With the current way .travis.yml
is laid out, without such a facility, we'd need the logic sprinkled
at the beginning of all the "before_install:" sections or something like
that, which is not quite optimal.

> I think Junio's original suggestions for the Windows build makes a lot
> of sense because it saves Dscho's compute resources:
>
> --- a/ci/run-windows-build.sh
> +++ b/ci/run-windows-build.sh
> @@ -12,6 +12,12 @@ test -z "$GFW_CI_TOKEN" && echo "GFW_CI_TOKEN not defined" 
> && exit
> BRANCH=$1
> COMMIT=$2
>
> +if TAG=$(git describe --exact-match "$COMMIT" 2>/dev/null)
> +then
> + echo "Tip of $BRANCH exactly at $TAG"
> + exit 0
> +fi
> +
> gfwci () {
>   local CURL_ERROR_CODE HTTP_CODE
>   CONTENT_FILE=$(mktemp -t "git-windows-ci-XX")
>
> However, I don't think we need to do the same for the builds that
> use TravisCI resources. If they were concerned about that, then
> they wouldn't build the same hash twice in the first place.

But I do care ;-) It would be nice for me not to have to wait and
keep worrying about MacOSX builds.  It also is not nice that
branches for other tests are blocked and have to wait only because
'maint' and 'vX.Y.Z' are both tested even though we know they are
the same tree.  This is where my question earlier comes from---is
there a good way to do the "not test a branch if it's at a tagged
commit, because that tag will be tested anyway" test only at one
place in the .travis.yml we have?---because it's not like we only
care about Windows.



[PATCH 4/6] RelNotes: mention that PCRE v2 exposes the same syntax

2017-07-20 Thread Ævar Arnfjörð Bjarmason
For someone not familiar with PCRE, or who hasn't read its own
documentation, this isn't obvious; let's explicitly mention it so
package maintainers won't fear upgrading lest they break things for
their users.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Documentation/RelNotes/2.14.0.txt | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/Documentation/RelNotes/2.14.0.txt 
b/Documentation/RelNotes/2.14.0.txt
index 7ed93bca37..0e363f2af3 100644
--- a/Documentation/RelNotes/2.14.0.txt
+++ b/Documentation/RelNotes/2.14.0.txt
@@ -28,6 +28,9 @@ Backward compatibility notes and other notable changes.
upstream PCRE maintainer has abandoned v1 maintenance for all but
the most critical bug fixes, use of v2 is recommended.
 
+   Version v2 of the library is fully backwards compatible with the
+   Perl-compatible regular expression syntax exposed by git (sans a
+   few obscure bugfixes).
 
 Updates since v2.13
 ---
-- 
2.13.2.932.g7449e964c



[PATCH 2/6] RelNotes: mention "log: make --regexp-ignore-case work with --perl-regexp"

2017-07-20 Thread Ævar Arnfjörð Bjarmason
To inform users that they can use --regexp-ignore-case now, and that
existing scripts which relied on that + PCRE may be buggy. See
9e3cbc59d5 ("log: make --regexp-ignore-case work with --perl-regexp",
2017-05-20).

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Documentation/RelNotes/2.14.0.txt | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/Documentation/RelNotes/2.14.0.txt 
b/Documentation/RelNotes/2.14.0.txt
index 9a4c2bb649..c125f8fd68 100644
--- a/Documentation/RelNotes/2.14.0.txt
+++ b/Documentation/RelNotes/2.14.0.txt
@@ -120,6 +120,9 @@ UI, Workflows & Features
  * "git log" learned -P as a synonym for --perl-regexp, "git grep"
already had such a synonym.
 
+ * "git log" didn't understand --regexp-ignore-case when combined with
+   --perl-regexp. This has been fixed.
+
 Performance, Internal Implementation, Development Support etc.
 
  * The default packed-git limit value has been raised on larger
-- 
2.13.2.932.g7449e964c



[PATCH 5/6] RelNotes: remove duplicate mention of PCRE v2

2017-07-20 Thread Ævar Arnfjörð Bjarmason
That we can link to PCRE v2 is already covered above in "Backward
compatibility notes and other notable changes", no need to mention it
twice.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Documentation/RelNotes/2.14.0.txt | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/Documentation/RelNotes/2.14.0.txt 
b/Documentation/RelNotes/2.14.0.txt
index 0e363f2af3..fb6a3dba31 100644
--- a/Documentation/RelNotes/2.14.0.txt
+++ b/Documentation/RelNotes/2.14.0.txt
@@ -88,8 +88,7 @@ UI, Workflows & Features
learned to say "it's a pathspec" a bit more often when the syntax
looks like so.
 
- * Update "perl-compatible regular expression" support to enable JIT
-   and also allow linking with the newer PCRE v2 library.
+ * Update "perl-compatible regular expression" support to enable JIT.
 
  * "filter-branch" learned a pseudo filter "--setup" that can be used
to define common functions/variables that can be used by other
-- 
2.13.2.932.g7449e964c



[PATCH 0/6] 2.14 RelNotes improvements

2017-07-20 Thread Ævar Arnfjörð Bjarmason
> Ævar Arnfjörð Bjarmason  writes:
>
>> On Thu, Jul 13 2017, Junio C. Hamano jotted:
>>
>> Proposed improvements for the release notes (is this a good way to
>> propose RelNotes changes?)
>
> Thanks.  You could also throw a patch just like any bugfix/update
> to documentation, I would think.

Here are a few patches to improve the relnotes. I started out just
writing 6/6, since I think we should in some way mention the items
listed in the 6/6 commit message (I don't care about the exact wording).

Along the way I noticed a few more missing things.

Ævar Arnfjörð Bjarmason (6):
  RelNotes: mention "log: add -P as a synonym for --perl-regexp"
  RelNotes: mention "log: make --regexp-ignore-case work with
--perl-regexp"
  RelNotes: mention "sha1dc: optionally use sha1collisiondetection as a
submodule"
  RelNotes: mention that PCRE v2 exposes the same syntax
  RelNotes: remove duplicate mention of PCRE v2
  RelNotes: add more notes about PCRE in 2.14

 Documentation/RelNotes/2.14.0.txt | 25 +++--
 1 file changed, 23 insertions(+), 2 deletions(-)

-- 
2.13.2.932.g7449e964c



[PATCH 1/6] RelNotes: mention "log: add -P as a synonym for --perl-regexp"

2017-07-20 Thread Ævar Arnfjörð Bjarmason
To inform users that they can use the short form now. See
7531a2dd87 ("log: add -P as a synonym for --perl-regexp", 2017-05-25).

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Documentation/RelNotes/2.14.0.txt | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/Documentation/RelNotes/2.14.0.txt 
b/Documentation/RelNotes/2.14.0.txt
index 2f3879fe96..9a4c2bb649 100644
--- a/Documentation/RelNotes/2.14.0.txt
+++ b/Documentation/RelNotes/2.14.0.txt
@@ -117,6 +117,8 @@ UI, Workflows & Features
  * "git pull --rebase --recurse-submodules" learns to rebase the
branch in the submodules to an updated base.
 
+ * "git log" learned -P as a synonym for --perl-regexp, "git grep"
+   already had such a synonym.
 
 Performance, Internal Implementation, Development Support etc.
 
-- 
2.13.2.932.g7449e964c



[PATCH 3/6] RelNotes: mention "sha1dc: optionally use sha1collisiondetection as a submodule"

2017-07-20 Thread Ævar Arnfjörð Bjarmason
To note that merely cloning git.git without --recurse-submodules
doesn't get you a full copy of the code anymore. See
5f6482d642 ("RelNotes: mention "log: make --regexp-ignore-case work
with --perl-regexp"", 2017-07-20).

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Documentation/RelNotes/2.14.0.txt | 5 +
 1 file changed, 5 insertions(+)

diff --git a/Documentation/RelNotes/2.14.0.txt 
b/Documentation/RelNotes/2.14.0.txt
index c125f8fd68..7ed93bca37 100644
--- a/Documentation/RelNotes/2.14.0.txt
+++ b/Documentation/RelNotes/2.14.0.txt
@@ -235,6 +235,11 @@ Performance, Internal Implementation, Development Support 
etc.
behaviour of the comparison function can be specified at the time a
hashmap is initialized.
 
+ * The "collision detecting" SHA-1 implementation shipped with 2.13 is
+   now integrated into git.git as a submodule (the first submodule to
+   ship with git.git). Clone git.git with --recurse-submodules to get
+   it. For now a non-submodule copy of the same code is also shipped
+   as part of the tree.
 
 Also contains various documentation updates and code clean-ups.
 
-- 
2.13.2.932.g7449e964c



[PATCH 6/6] RelNotes: add more notes about PCRE in 2.14

2017-07-20 Thread Ævar Arnfjörð Bjarmason
We were missing any mention that:

 - PCRE is now faster with JIT
 - it's now faster than the other regex backends
 - therefore, you might want to use it by default, but beware of
   the incompatible syntax.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Documentation/RelNotes/2.14.0.txt | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/Documentation/RelNotes/2.14.0.txt 
b/Documentation/RelNotes/2.14.0.txt
index fb6a3dba31..a6a1cb963b 100644
--- a/Documentation/RelNotes/2.14.0.txt
+++ b/Documentation/RelNotes/2.14.0.txt
@@ -88,7 +88,16 @@ UI, Workflows & Features
learned to say "it's a pathspec" a bit more often when the syntax
looks like so.
 
- * Update "perl-compatible regular expression" support to enable JIT.
+ * Update "perl-compatible regular expression" support to enable
+   JIT.
+
+   This makes grep.patternType=perl (and -P and --perl-regexp) much
+   faster for "git grep" and "git log", and is generally faster than
+   the system's POSIX regular expression implementation. Users
+   concerned with "git grep" performance or "git log --grep"
+   performance might want to try setting grep.patternType=perl. Note
+   that the syntax isn't compatible with git's default of
+   grep.patternType=basic.
 
  * "filter-branch" learned a pseudo filter "--setup" that can be used
to define common functions/variables that can be used by other
-- 
2.13.2.932.g7449e964c
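A quick way to try what the note above describes — a sketch only,
requiring a git built with PCRE support (the file name and pattern are
made up for the demo):

```shell
# Demo of the PCRE backend described above (needs git built with PCRE)
cd "$(mktemp -d)"
git init -q
echo 'version 1.25' >notes.txt
git add notes.txt

# Perl-compatible \d is understood with -P / --perl-regexp ...
git grep -P -h --cached '\d+\.\d+'        # prints "version 1.25"

# ... or make PCRE the default engine for this repository
git config grep.patternType perl
git grep -h --cached '\d+\.\d+'           # same match, no -P needed
```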



Re: --interactive mode: readline support ⌨⬆

2017-07-20 Thread Marcel Partap
Haha, totally slipped by me that there exist two kinds of interactive mode. Not 
that I haven't used both... Sorry for overlooking/being too unspecific.

#Regards/Marcel X )


Re: --interactive mode: readline support ⌨⬆

2017-07-20 Thread Martin Ågren
On 20 July 2017 at 11:20, Marcel Partap  wrote:
> So the readline library powers the advanced line editing capabilities behind 
> f.e. the bash or the ipython shell. Besides navigating with the cursor keys, 
> it provides a history function accessible by the up cursor key ⌨⬆ .
> At the moment, git interactive mode seems (?) not to make use of it, so 
> there's no line editing at all. A typo at the beginning of a line must be 
> corrected by reverse deleting up to it, then retyping the rest unchanged. 
> With readline, the home/end keys for jumping to beginning or end work, as do 
> the left/right keys in a familiar way.
> The history function comes in handy when f.e. repeatedly using `git clean -i` 
> and feeding the "filter by pattern" command a string like "*.patch". Like, 
> that's the use case that prompted me to write to this list. : )

Ok, I see. When I saw your first mail, I was thinking about "git
rebase -i" and thought, "how could that possibly help?". :) I have no
idea what it would take to implement this (portably!) in git.

Martin


Expected behavior of "git check-ignore"...

2017-07-20 Thread John Szakmeister
A StackOverflow user posted a question about how to reliably check
whether a file would be ignored by "git add" and expected "git
check-ignore" to return results that matched git add's behavior.  It
turns out that it doesn't.  If there is a negation rule, we end up
returning that exclude and printing it and exiting with 0 (there are
some ignored files) even though the file has been marked to not be
ignored.

Is the expected behavior of "git check-ignore" to return 0 even if the
file is not ignored when a negation is present?


git init .
echo 'foo/*' > .gitignore
echo '!foo/bar' >> .gitignore
mkdir foo
touch foo/bar
git check-ignore foo/bar


I expect the last command to return 1 (no files are ignored), but it
doesn't.  The StackOverflow user had the same expectation, and I imagine
others do as well.  OTOH, it looks like the command is really meant to
be a debugging tool--to show me the line in a .gitignore associated
with this file, if there is one.  In which case, the behavior is
correct but the return code description is a bit misleading (0 means
the file is ignored, which isn't true here).
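Here is the same thing as a self-contained script, with --verbose added
so the matching rule is visible (I'm not asserting anything about the
exit code for the negated path, since that is exactly what's in
question and may differ across git versions):

```shell
# Self-contained reproduction in a scratch repository
cd "$(mktemp -d)"
git init -q .
printf 'foo/*\n!foo/bar\n' >.gitignore
mkdir foo
touch foo/bar foo/baz

# --verbose shows which .gitignore pattern decided each path's fate
git check-ignore -v foo/bar foo/baz || true

# For a path with no negation in play, the exit status is unambiguous:
git check-ignore -q foo/baz && echo 'foo/baz is ignored'
```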

Thoughts?  It seems like this question was asked before several years
ago but didn't get a response.

Thanks!

-John

PS The SO question is here:
https://stackoverflow.com/questions/45210790/how-to-reliably-check-whether-a-file-is-ignored-by-git



Re: --interactive mode: readline support ⌨⬆

2017-07-20 Thread Marcel Partap
Ok very good point Martin ; )
I nefariously hid one obvious use case as trailing emoji™ in the subject, but a 
better way to make a point is to properly explain.
So the readline library powers the advanced line editing capabilities behind 
f.e. the bash or the ipython shell. Besides navigating with the cursor keys, it 
provides a history function accessible by the up cursor key ⌨⬆ .
At the moment, git interactive mode seems (?) not to make use of it, so there's 
no line editing at all. A typo at the beginning of a line must be corrected by 
reverse deleting up to it, then retyping the rest unchanged. With readline, the 
home/end keys for jumping to beginning or end work, as do the left/right keys 
in a familiar way.
The history function comes in handy when f.e. repeatedly using `git clean -i` 
and feeding the "filter by pattern" command a string like "*.patch". Like, 
that's the use case that prompted me to write to this list. : )

#Best Regards/Marcel


Re: --interactive mode: readline support ⌨⬆

2017-07-20 Thread Martin Ågren
On 20 July 2017 at 10:21, Marcel Partap  wrote:
> wouldn't it be great to have the power of readline added to the power of git 
> interactive commands? Yes, rlwrap will do the job, but still.
> Or am I missing something obvious?

Well maybe *I* am missing something obvious. :) Could you be a bit
more specific? What is the use-case? Once this feature were in place,
what would it look like? Could you give an example of what you as a
user would do to solve some particular problem -- and how that differs
from how you would solve it today?

Martin


Re: Binary files

2017-07-20 Thread Lars Schneider

> On 20 Jul 2017, at 09:41, Volodymyr Sendetskyi  wrote:
> 
> It is known that git handles storing binary files in its
> repositories badly.
> This is especially true for large files: even without any changes to
> these files, their copies are snapshotted on each commit. So even
> repositories with a small amount of code can grow very fast in size
> if they contain some large binary files. Alongside this, SVN is
> much better about that, because it makes changes to the server version
> of a file only if some changes were actually made.
> 
> So the question is: why not implement some feature that would
> somehow handle this problem?
> 
> Of course, I don't know the internal git structure and the way it
> works, plus some nuances (likely around how the snapshots are done),
> so handling this may be a hard problem. But the easiest feature for
> me as an end user would be something like '.gitbinary', where I can
> list binary files that should behave like on SVN, or something even
> more optimal if you can implement it. Maybe there will be a need for
> separate kinds of repositories, or even servers. But that would be a
> great change and a logical next step in git's evolution.

GitLFS [1] might be the workaround you want. There are efforts to bring 
large file support natively to Git [2].

I tried to explain GitLFS in more detail here: 
https://www.youtube.com/watch?v=YQzNfb4IwEY

- Lars


[1] https://git-lfs.github.com/
[2] https://public-inbox.org/git/20170620075523.26961-1-chrisc...@tuxfamily.org/
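As a concrete sketch of what the LFS route looks like: running
`git lfs track "*.psd"` (assuming git-lfs is installed; `*.psd` is only
an example pattern) records a line like the following in
`.gitattributes`, after which matching files are stored in the
repository as small pointer files while the real content goes to the
LFS store:

```
*.psd filter=lfs diff=lfs merge=lfs -text
```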



--interactive mode: readline support ⌨⬆

2017-07-20 Thread Marcel Partap
Dear git devs,
wouldn't it be great to have the power of readline added to the power of git 
interactive commands? Yes, rlwrap will do the job, but still.
Or am I missing something obvious? Am using debian's 2.11.0-2 ...

#BestRegards/Marcel Partap
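P.S. For anyone wanting this today, the rlwrap workaround really is a
one-liner (assuming rlwrap is installed):

```shell
# readline-style line editing and up-arrow history for an interactive git command
rlwrap git clean -i
```

Previously typed input, such as a "filter by pattern" string like
"*.patch", can then be recalled with the up cursor key.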


Re: Reducing redundant build at Travis?

2017-07-20 Thread Lars Schneider

> On 14 Jul 2017, at 17:32, Jeff King  wrote:
> 
> On Fri, Jul 14, 2017 at 07:54:16AM -0700, Junio C Hamano wrote:
> 
>>> The "git test" script[1] uses this strategy with git-notes as the
>>> storage, and I've found it quite useful. I don't think we can rely on
>>> git-notes, but I think Travis gives us some storage options. Even just a
>>> best-effort cache directory would probably be sufficient (this is an
>>> optimization, after all).
>> 
>> We do seem to use some persistence to order prove tests already, but
>> I do not think it helps the common case, where my end-of-day push
>> pushes out 'maint' and 'v2.13.3' at the same time, because the push
>> is made with "git push --follow-tags $there maint master next pu"
>> and the new tag happens to be at 'maint'.  It would be nice if
>> Travis runs were sequential, but I often observe that it creates
>> jobs for these multiple branches and tags pushed at the same time,
>> and start running a few of them.
> 
> Ah, right, I didn't think about how these are racing. You'd need storage
> which allows some kind of atomic operation to "claim" the tree as a
> work-in-progress (and anybody who loses the race to get the lock would
> have to spin waiting for the winner to tell them the real status).
> 
> I don't know if Travis's cache storage is up to that challenge. We could
> probably build such a lock on top of third-party storage, but things are
> rapidly getting more complex.

I think we shouldn't go there because of the complexity. I reached out
to TravisCI and asked about the "hash build twice" problem [1]. Unfortunately,
I got no response, yet. The issue could also be considered a feature as you
could perform different actions in your TravisCI configuration based on
the branch name.

I think Junio's original suggestions for the Windows build makes a lot
of sense because it saves Dscho's compute resources:

--- a/ci/run-windows-build.sh
+++ b/ci/run-windows-build.sh
@@ -12,6 +12,12 @@ test -z "$GFW_CI_TOKEN" && echo "GFW_CI_TOKEN not defined" 
&& exit
BRANCH=$1
COMMIT=$2

+if TAG=$(git describe --exact-match "$COMMIT" 2>/dev/null)
+then
+   echo "Tip of $BRANCH exactly at $TAG"
+   exit 0
+fi
+
gfwci () {
local CURL_ERROR_CODE HTTP_CODE
CONTENT_FILE=$(mktemp -t "git-windows-ci-XX")


However, I don't think we need to do the same for the builds that
use TravisCI resources. If they were concerned about that, then
they wouldn't build the same hash twice in the first place.

- Lars


[1] https://twitter.com/kit3bus/status/885902189692112896



Re: Binary files

2017-07-20 Thread Konstantin Khomoutov
On Thu, Jul 20, 2017 at 10:41:48AM +0300, Volodymyr Sendetskyi wrote:

> It is known, that git handles badly storing binary files in its
> repositories at all.
[...]
> So the question is: why not implementing some feature, that would
> somehow handle this problem?
[...]

Have you examined git-lfs and git-annex?
(Actually, there are/were more solutions [1] but these two appear to be
the most used nowadays.)

Such solutions allow one to use Git for what it does best and defer
handling of big files (or files for which lock-modify-unlock works better
than the usual modify-merge) to a specialized solution.

1. http://blog.deveo.com/storing-large-binary-files-in-git-repositories/



Re: Binary files

2017-07-20 Thread Bryan Turner
On Thu, Jul 20, 2017 at 12:41 AM, Volodymyr Sendetskyi
 wrote:
> It is known that git handles storing binary files in its
> repositories badly.
> This is especially true for large files: even without any changes to
> these files, their copies are snapshotted on each commit. So even
> repositories with a small amount of code can grow very fast in size
> if they contain some large binary files. Alongside this, SVN is
> much better about that, because it makes changes to the server version
> of a file only if some changes were actually made.
>
> So the question is: why not implement some feature that would
> somehow handle this problem?

Like Git LFS or git annex? Features have been implemented to better
handle large files; they're just not necessarily part of core Git.
Have you checked whether one of those solutions might work for your
use case?

Best regards,
Bryan Turner


Re: Binary files

2017-07-20 Thread Volodymyr Sendetskyi
It is known that git handles storing binary files in its
repositories badly.
This is especially true for large files: even without any changes to
these files, their copies are snapshotted on each commit. So even
repositories with a small amount of code can grow very fast in size
if they contain some large binary files. Alongside this, SVN is
much better about that, because it makes changes to the server version
of a file only if some changes were actually made.

So the question is: why not implement some feature that would
somehow handle this problem?

Of course, I don't know the internal git structure and the way it
works, plus some nuances (likely around how the snapshots are done),
so handling this may be a hard problem. But the easiest feature for
me as an end user would be something like '.gitbinary', where I can
list binary files that should behave like on SVN, or something even
more optimal if you can implement it. Maybe there will be a need for
separate kinds of repositories, or even servers. But that would be a
great change and a logical next step in git's evolution.