Re: [PATCH] git-clean: remove fifo, devices, socket files

2016-07-14 Thread Johannes Sixt

Am 15.07.2016 um 04:42 schrieb Andrey Vagin:

Currently git-clean removes only links and files, but
there can be special files like fifo, sockets, devices.

I think git-clean has to remove them too.


I think that is not necessary. If you do

  mkfifo fifo && sudo mknod zero c 1 5

then 'git status' will not report them and 'git add' will not add them 
to a repository.


Similarly, if you happen to have a special file under a name in your 
working tree where the repository has regular files, then 'git status' 
will report the names as modified, and 'git reset --hard' will replace 
the special files by the files recorded in the repository.


IOW: These special files are invisible for Git unless it already knows 
the names. The latter case is outside 'git clean's domain, and the 
former case really means that special files in the working tree are left 
at the user's discretion.


-- Hannes

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 3/5] archive-tar: write extended headers for file sizes >= 8GB

2016-07-14 Thread Torsten Bögershausen



On 07/14/2016 06:48 PM, Johannes Sixt wrote:

Am 30.06.2016 um 11:09 schrieb Jeff King:

+/*
+ * This is the max value that a ustar size header can specify, as it
is fixed
+ * at 11 octal digits. POSIX specifies that we switch to extended
headers at
+ * this size.
+ */
+#define USTAR_MAX_SIZE 0777UL


This is too large by one bit for our 32-bit unsigned long on Windows:

archive-tar.c: In function 'write_tar_entry':
archive-tar.c:295: warning: integer constant is too large for 'unsigned
long' type
archive-tar.c: In function 'write_global_extended_header':
archive-tar.c:332: warning: integer constant is too large for 'unsigned
long' type
archive-tar.c:335: warning: integer constant is too large for 'unsigned
long' type
archive-tar.c:335: warning: overflow in implicit constant conversion

-- Hannes

Similar problem on 32 Bit Linux:
 In function ‘write_global_extended_header’:
archive-tar.c:29:25: warning: overflow in implicit constant conversion 
[-Woverflow]

 #define USTAR_MAX_MTIME 0777UL
 ^
archive-tar.c:335:16: note: in expansion of macro ‘USTAR_MAX_MTIME’
   args->time = USTAR_MAX_MTIME;

--
I want to volunteer to do more tests on 32 bit Linux ;-)
Does this fix it and look as a good thing to change ?


--- a/archive-tar.c
+++ b/archive-tar.c
@@ -332,7 +332,7 @@ static void write_global_extended_header(struct 
archiver_args *args)

if (args->time > USTAR_MAX_MTIME) {
strbuf_append_ext_header_uint(_header, "mtime",
  args->time);
-   args->time = USTAR_MAX_MTIME;
+   args->time = (time_t)USTAR_MAX_MTIME;
}
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] git-clean: remove fifo, devices, socket files

2016-07-14 Thread Andrey Vagin
Currently git-clean removes only links and files, but
there can be special files like fifo, sockets, devices.

I think git-clean has to remove them too.

Signed-off-by: Andrey Vagin 
---
 cache.h | 8 
 dir.c   | 4 
 2 files changed, 12 insertions(+)

diff --git a/cache.h b/cache.h
index f1dc289..a2f1258 100644
--- a/cache.h
+++ b/cache.h
@@ -77,10 +77,18 @@ struct object_id {
 #undef DT_DIR
 #undef DT_REG
 #undef DT_LNK
+#undef DT_FIFO
+#undef DT_BLK
+#undef DT_CHR
+#undef DT_SOCK
 #define DT_UNKNOWN 0
 #define DT_DIR 1
 #define DT_REG 2
 #define DT_LNK 3
+#define DT_FIFO4
+#define DT_BLK 5
+#define DT_CHR 6
+#define DT_SOCK7
 #define DTYPE(de)  DT_UNKNOWN
 #endif
 
diff --git a/dir.c b/dir.c
index 6172b34..930dd99 100644
--- a/dir.c
+++ b/dir.c
@@ -1470,8 +1470,12 @@ static enum path_treatment treat_one_path(struct 
dir_struct *dir,
strbuf_addch(path, '/');
return treat_directory(dir, untracked, path->buf, path->len,
   baselen, exclude, simplify);
+   case DT_BLK:
+   case DT_CHR:
+   case DT_FIFO:
case DT_REG:
case DT_LNK:
+   case DT_SOCK:
return exclude ? path_excluded : path_untracked;
}
 }
-- 
2.5.5

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] blame: Allow to blame paths freshly added to the index

2016-07-14 Thread Mike Hommey
When blaming files, changes in the work tree are taken into account
and displayed as being "Not Committed Yet".

However, when blaming a file that is not known to the current HEAD,
git blame fails with `no such path 'foo' in HEAD`, even when the file
was git add'ed.

This would seem uninteresting with the plain `git blame` case, which
it is, but it becomes useful when using copy detection, and the new file
was created from pieces already in HEAD, moved or copied from other
files.
---
 builtin/blame.c   |  4 +++-
 t/t8003-blame-corner-cases.sh | 23 +++
 2 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index 1e214bd..0858b18 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -2240,7 +2240,9 @@ static void verify_working_tree_path(struct commit 
*work_tree, const char *path)
sha1_object_info(blob_sha1, NULL) == OBJ_BLOB)
return;
}
-   die("no such path '%s' in HEAD", path);
+
+   if (cache_name_pos(path, strlen(path)) < 0)
+   die("no such path '%s' in HEAD", path);
 }
 
 static struct commit_list **append_parent(struct commit_list **tail, const 
unsigned char *sha1)
diff --git a/t/t8003-blame-corner-cases.sh b/t/t8003-blame-corner-cases.sh
index a9b266f..a0a09e2 100755
--- a/t/t8003-blame-corner-cases.sh
+++ b/t/t8003-blame-corner-cases.sh
@@ -137,6 +137,29 @@ test_expect_success 'blame wholesale copy and more' '
 
 '
 
+test_expect_success 'blame wholesale copy and more in the index' '
+
+   {
+   echo ABC
+   echo DEF
+   echo 
+   echo 
+   echo GHIJK
+   } >horse &&
+   git add horse &&
+   git blame -f -C -C1 -- horse | sed -e "$pick_fc" >current &&
+   {
+   echo mouse-Initial
+   echo mouse-Second
+   echo cow-Fifth
+   echo horse-Not
+   echo mouse-Third
+   } >expected &&
+   test_cmp expected current &&
+   git rm -f horse
+
+'
+
 test_expect_success 'blame path that used to be a directory' '
mkdir path &&
echo A A A A A >path/file &&
-- 
2.9.1.276.geea30e8

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 32-bit Travis, was Re: [ANNOUNCE] Git v2.9.1

2016-07-14 Thread Mike Hommey
On Thu, Jul 14, 2016 at 12:58:47PM +0200, Johannes Schindelin wrote:
> Hi Mike,
> 
> On Thu, 14 Jul 2016, Mike Hommey wrote:
> 
> > On Thu, Jul 14, 2016 at 09:58:45AM +0200, Johannes Schindelin wrote:
> > > Hi Junio,
> > > 
> > > On Wed, 13 Jul 2016, Junio C Hamano wrote:
> > > 
> > > > Does Travis CI testing have an option to run our tests on some
> > > > 32-bit platforms?
> > > 
> > > AFAIR Docker does not support 32-bit, and IIRC that's what Travis uses.
> > > 
> > > However, it is possible to install a 32-bit toolchain and use that to
> > > compile Git.
> > 
> > You just need to install gcc-multilib on travis, and you can use -m32. I
> > did that for jemalloc recently.
> > See https://github.com/jemalloc/jemalloc/blob/dev/.travis.yml
> 
> Would we not also need
> 
>   apt:
>   packages:
>   lib32z1
> 
> (or ia32libs in case of an older Ubuntu)?

And probably some libcurl-something-dev:i386 package too. 

Mike
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v14 00/21] index-helper/watchman

2016-07-14 Thread Ben Peart
Duy Nguyen  gmail.com> writes:

> 
> On Wed, Jul 13, 2016 at 11:59 PM, David Turner  novalis.org> 
wrote:
> > On 07/12/2016 02:24 PM, Duy Nguyen wrote:
> >>
> >> Just thinking out loud. I've been thinking about this more about this.
> >> After the move from signal-based to unix socket for communication, we
> >> probably are better off with a simpler design than the shm-alike one
> >> we have now.
> >>
> >> What if we send everything over a socket or a pipe? Sending 500MB over
> >> a unix socket takes 253ms, that's insignificant when operations on an
> >> index that size usually take seconds. If we send everything over
> >> socket/pipe, we can trust data integrity and don't have to verify,
> >> even the trailing SHA-1 in shm file.
> >
> >
> > I think it would be good to make index operations not take seconds.
> >
> > In general, we should not need to verify the trailing SHA-1 for shm data.
> > So the index-helper verifies it when it loads it, but the git (e.g.) status
> > should not need to verify.
> >
> > Also, if we have two git commands running at the same time, the index-helper
> > can only serve one at a time; with shm, both can run at full speed.
> 
> We still have an option to send a (shm, possibly) path to git to pick
> up and skip verification. If we can exchange capabilities then sending
> the index some way else is always possible.
> 
> >> So, what I have in mind is this, at read index time, instead of open a
> >> socket, we run a separate program and communicate via pipes. We can
> >> exchange capabilities if needed, then the program sends the entire
> >> current index, the list of updated files back (and/or the list of dirs
> >> to invalidate). The design looks very much like a smudge/clean filter.
> >
> >
> > This seems very complicated.  Now git status talks to the separate program,
> > which talks to the index-helper, which talks to watchman.  That is a lot of
> > steps!
> 
> I was suggesting this because I think it would simplify things, not
> complicate stuff further. Yes the separate program plays the role of
> our unix client, if we keep the index-helper. But we don't have to.
> 
> Do you remember Junio once suggested to put the index on tmpfs? That's
> what I imagine in common, medium scale setups. We don't need an extra
> daemon:
> 
> 1) when git needs the index, the script looks at its tmpfs mount, if
> found, pass the path back
> 2) when git announces the index has been updated, the script reads the
> index and saves it in tmpfs
> 3) when git refreshes and asks for watchman support, the script simply
> runs "watchman" command, post processes the output a bit and send the
> file list to git
> 
> Because there is no separate daemon in this case, we don't need
> --kill, we don't need --autorun. We still need WAMA extension but it
> can contain just an arbitrary clock string, this is completely opaque
> to git. If we can get rid of the index-helper (with an example script
> probably landed in contrib folder), that's a lot of less headache down
> the road.
> 
> For giant-scale repos, you probably want something more efficient than
> a script like this. And the good thing is you have freedom to do
> whatever you want. You can run one daemon per repo, you can run one
> daemon per system... In some previous mail exchange with Dscho, it was
> mentioned that something other than watchman may be desired. This
> opens up that door without much headache from outside.
> 
> > I think the daemon also has the advantage that it can reload the index as
> > soon as it changes.  This is not quite implemented, but it would be pretty
> > easy to do.  That would save a lot of time in the typical workflow.
> 
> A script has the same advantage, that is if git notifies it (like we
> do now). You can also do it using watchman trigger, which does not
> need any special support from git.


Taking a step back, git needs (at least) 3 things to update the index quickly, 
the old index, the list of potentially modified files and the list of 
directories with changes.  When looked at this way, the question becomes how do 
we provide this data to git in the fastest, simplest, most compatible way.

It makes sense to separate the code that git can call to abstract away some of 
the potentially platform dependent aspects of how we store and retrieve this 
data quickly.  For example, we would like to use something other than Watchman 
to track changes as it isn’t well supported on Windows.

It would be nice to simplify this for git as much as possible by eliminating 
the 
need for it to manage a daemon.  No need for --autorun, --kill, poking sockets, 
etc - just run the program and get back the data.

It also makes sense to allow git to negotiate for the data it needs.  It may 
only need the index, it may need the list of updated files and if 
untrackedcache 
is turned on, it would need the list of directories to invalidate.  The 
separate 
code should also be able to say which parts it can provide quickly and have git 

Re: [PATCH 1/4] push options: {pre,post}-receive hook learns about push options

2016-07-14 Thread Stefan Beller
On Thu, Jul 14, 2016 at 3:46 PM, Jeff King  wrote:
> On Thu, Jul 14, 2016 at 02:49:45PM -0700, Stefan Beller wrote:
>
>> We limit the push options for now
>> * to not exceed an arbitrary count, and
>> * to not exceed an arbitrary size.
>>
>> This serves two purposes:
>> * DoS protection (i.e. one connection can add no more than 32kB
>>   now)
>> * We need to figure out how to handle large (>64kB). Jeff wrote:
>>   > Yes, but people are also happy when they can use a flexible and
>>   > standardized tool to do a thing. I'd be more frustrated when I found out
>>   > that Git's data-pushing protocol has arbitrary limitations (like, say, I
>>   > can't push a data item larger than a single 64K pkt-line), which would
>>   > easily just work with something like HTTP POSTs.
>>   So to keep a way open in the future to deal with large pay loads,
>>   the size is restricted for now.
>
> Should this bit get dropped from the commit message?
>
> -Peff

Right. :/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/4] push options: {pre,post}-receive hook learns about push options

2016-07-14 Thread Jeff King
On Thu, Jul 14, 2016 at 02:49:45PM -0700, Stefan Beller wrote:

> We limit the push options for now
> * to not exceed an arbitrary count, and
> * to not exceed an arbitrary size.
> 
> This serves two purposes:
> * DoS protection (i.e. one connection can add no more than 32kB
>   now)
> * We need to figure out how to handle large (>64kB). Jeff wrote:
>   > Yes, but people are also happy when they can use a flexible and
>   > standardized tool to do a thing. I'd be more frustrated when I found out
>   > that Git's data-pushing protocol has arbitrary limitations (like, say, I
>   > can't push a data item larger than a single 64K pkt-line), which would
>   > easily just work with something like HTTP POSTs.
>   So to keep a way open in the future to deal with large pay loads,
>   the size is restricted for now.

Should this bit get dropped from the commit message?

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 2/5] t5000: test tar files that overflow ustar headers

2016-07-14 Thread Jeff King
On Thu, Jul 14, 2016 at 03:30:58PM -0700, Junio C Hamano wrote:

> > If we move to time_t everywhere, I think we'll need an extra
> > TIME_T_IS_64BIT, but we can cross that bridge when we come to it.
> >
> > Likewise I think we'll need SIZE_T_IS_64BIT eventually (for real 32-bit
> > systems; LLP64 systems like Windows will then be able to run the test).
> 
> I guess I wrote essentially the same thing before refreshing my
> Inbox.
> 
> I am a bit fuzzy between off_t and size_t; the former is for the
> size of things you see on the filesystem, while the latter is for
> you to give malloc(3).  I would have thought that off_t is the type
> we would want at the end of the raw object header, denoting the size
> of a blob object when deflated, which could be larger than the size
> of a region of memory we can get from malloc(3), in which case we
> would use the streaming interface.

Yeah, your understanding is right (s/deflated/inflated/). I agree that
off_t is probably a better size for blobs. Traditionally git assumed any
object could fit in memory. The streaming interface helps that somewhat,
but I think there are cases where we still must load a blob (e.g., if it
is stored as a delta). In theory that never happens because of
core.bigfilethreshold, but you may get a packfile from somebody with a
higher threshold than you.

I wouldn't be surprised if there are other cases that aren't smart
enough to use the streaming interface yet, but the solution there is to
make them smarter. :)

So off_t is probably better. We do need to be careful, though, when
allocating objects. E.g., this:

  off_t size;
  struct git_istream *stream;
  void *buf;

  stream = open_istream(sha1, , , NULL);
  buf = xmalloc(size);
  while (1) {
/* read stream into buf */
  }

is a security hole when size_t is less than off_t (it gets truncated in
the call to xmalloc, which allocates too few bytes). This is a toy
example, obviously, but it's something to watch out for.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv7 0/4] Push options

2016-07-14 Thread Junio C Hamano
Stefan Beller  writes:

> Jeff,
>
> here is the more idiomatic way.
>
> Thanks,
> Stefan

Looks good to me.  Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 2/2] archive-tar: huge offset and future timestamps would not work on 32-bit

2016-07-14 Thread Junio C Hamano
Jeff King  writes:

> On Thu, Jul 14, 2016 at 01:43:57PM -0700, Junio C Hamano wrote:
>
>> As we are not yet moving everything to size_t but still using ulong
>> internally when talking about the size of object, platforms with
>> 32-bit long will not be able to produce tar archive with 4GB+ file,
>> and cannot grok 0777UL as a constant.  Disable the extended
>> header feature and do not test it on them.
>
> I tried testing this in a VM with 32-bit Debian. It fixes the build
> problems, but t5000 still fails.

Thanks for testing.  I need to find some time hunting for (or
building) an environment to do that myself.

> I think you need to add the prereq to one more test:
>
> diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
> index 699355b..80b2387 100755
> --- a/t/t5000-tar-tree.sh
> +++ b/t/t5000-tar-tree.sh
> @@ -347,7 +347,7 @@ test_lazy_prereq TAR_HUGE '
>   test_cmp expect actual
>  '
>  
> -test_expect_success 'set up repository with huge blob' '
> +test_expect_success LONG_IS_64BIT 'set up repository with huge blob' '
>   obj_d=19 &&
>   obj_f=f9c8273ec45a8938e6999cb59b3ff66739902a &&
>   obj=${obj_d}${obj_f} &&
>
> We shouldn't be accessing the blob in update-index, but I think "git
> commit" does so for the diff (and then after seeing the size says
> "whoops, that's binary", but even the size check fails on 32-bit
> systems).
>
> So another solution would be to use "commit -q" at the end of that test.
> I don't think there's much point, though; it's just setting up a state
> for other tests that need LONG_IS_64BIT.
>
> As an aside, it is inadvertently testing that our diff code does not
> bother to read the whole blob in such a case. Which maybe argues for
> using "commit -q", just because that is not a thing we are intending to
> test here.

Thanks.  Let's add a prereq there.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 2/5] t5000: test tar files that overflow ustar headers

2016-07-14 Thread Junio C Hamano
Jeff King  writes:

> If we move to time_t everywhere, I think we'll need an extra
> TIME_T_IS_64BIT, but we can cross that bridge when we come to it.
>
> Likewise I think we'll need SIZE_T_IS_64BIT eventually (for real 32-bit
> systems; LLP64 systems like Windows will then be able to run the test).

I guess I wrote essentially the same thing before refreshing my
Inbox.

I am a bit fuzzy between off_t and size_t; the former is for the
size of things you see on the filesystem, while the latter is for
you to give malloc(3).  I would have thought that off_t is the type
we would want at the end of the raw object header, denoting the size
of a blob object when deflated, which could be larger than the size
of a region of memory we can get from malloc(3), in which case we
would use the streaming interface.


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 2/5] t5000: test tar files that overflow ustar headers

2016-07-14 Thread Junio C Hamano
Johannes Sixt  writes:

> My first thought was that this is not warranted because t0006 is about
> commit time stamps, but the huge-tar breakage is file sizes, and the
> cases should be treated differently.
>
> But on second thought, under the hood, both boil down to the size of
> unsigned long in our implementation. It may make sense to tie both
> cases to the same prerequisite.
>
> On third thought, however, I think the two requirements could diverge
> in the future. The file size case should depend on the size of
> size_t. The timestamp case may become dependent on the size of time_t
> if we decide to move timestamp handling away from unsigned long: in
> modern(!) Microsoft SDKs, time_t is 64 bits, but unsigned long is 32
> bits, in both the 32-bit and 64-bit environments!

I had the same three toughts, but this being a 'maint' material
stopped me going too deep into them.  Right now, "long being 32-bit"
is the source of all of these issues, and we would solve them on the
development track (not necessarily during this cycle) by deciding on
more appropriate types.  Timestamps may become time_t, and object
sizes may become off_t, such changes will come separately, and each
of them would need to lift "unless long is 64-bit, skip this test"
limitation and swap it with something else.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 2/2] archive-tar: huge offset and future timestamps would not work on 32-bit

2016-07-14 Thread Jeff King
On Thu, Jul 14, 2016 at 01:43:57PM -0700, Junio C Hamano wrote:

> As we are not yet moving everything to size_t but still using ulong
> internally when talking about the size of object, platforms with
> 32-bit long will not be able to produce tar archive with 4GB+ file,
> and cannot grok 0777UL as a constant.  Disable the extended
> header feature and do not test it on them.

I tried testing this in a VM with 32-bit Debian. It fixes the build
problems, but t5000 still fails.

I think you need to add the prereq to one more test:

diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
index 699355b..80b2387 100755
--- a/t/t5000-tar-tree.sh
+++ b/t/t5000-tar-tree.sh
@@ -347,7 +347,7 @@ test_lazy_prereq TAR_HUGE '
test_cmp expect actual
 '
 
-test_expect_success 'set up repository with huge blob' '
+test_expect_success LONG_IS_64BIT 'set up repository with huge blob' '
obj_d=19 &&
obj_f=f9c8273ec45a8938e6999cb59b3ff66739902a &&
obj=${obj_d}${obj_f} &&

We shouldn't be accessing the blob in update-index, but I think "git
commit" does so for the diff (and then after seeing the size says
"whoops, that's binary", but even the size check fails on 32-bit
systems).

So another solution would be to use "commit -q" at the end of that test.
I don't think there's much point, though; it's just setting up a state
for other tests that need LONG_IS_64BIT.

As an aside, it is inadvertently testing that our diff code does not
bother to read the whole blob in such a case. Which maybe argues for
using "commit -q", just because that is not a thing we are intending to
test here.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Use path comparison for patch ids before the file content

2016-07-14 Thread Junio C Hamano
Kevin Willford  writes:

> When limiting the list in a revision walk using cherry pick, patch ids are
> calculated by producing the diff of the content of the files.  This would
> be more efficent by using a patch id looking at the paths that were
> changed in the commits and only if all the file changed are the same fall
> back to getting the content of the files in the commits to determine if
> the commits are the same.

The basic idea of this change makes sense.  When we have many
commits, but if we can tell that no other commit changes the same
set of paths as this commit does, we can immediately know that this
commit cannot have an equivalent other commit among the rest.  By
first computing a lot cheaper "hash of touched paths" for commits,
and throwing them into separate bins keyed by the "hash of touched
paths", you can narrow the commits whose patch IDs must be compared,
and if a bin happens to be a singleton, you do not even need to
produce any patch ID by running a textual diff.  I like it.

Explaining this as "hash of touched paths" is somewhat misleading.
Your "use_path_only" mode actually hashes a lot more than just
paths.  Because the "use_path_only" mode actually hashes the entire
basic diff header and not just paths, it can differentiate a commit
that adds a file and another commit that modifies the same file, for
example.

> ...  This will speed up a rebase where the
> upstream has many changes but none of them have been pulled into the
> current branch.
> ---

Missing sign-off.

>  diff.c  |  16 +
>  diff.h  |   2 +-

The changes in the above two files looked OK to me.
I didn't read the changes to the other three files carefully.

>  patch-ids.c | 114 
> +---
>  patch-ids.h |   7 ++--
>  revision.c  |  19 ++
>  5 files changed, 73 insertions(+), 85 deletions(-)

>
> diff --git a/patch-ids.c b/patch-ids.c
> index a4d0016..f0262ce 100644
> --- a/patch-ids.c
> +++ b/patch-ids.c
> @@ -4,8 +4,9 @@
> ...
> +}
> \ No newline at end of file

No newline at end of file.


Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/4] receive-pack: implement advertising and receiving push options

2016-07-14 Thread Stefan Beller
The pre/post receive hook may be interested in more information from the
user. This information can be transmitted when both client and server
support the "push-options" capability, which when used is a phase directly
after update commands ended by a flush pkt.

Similar to the atomic option, the server capability can be disabled via
the `receive.advertisePushOptions` config variable. While documenting
this, fix a nit in the `receive.advertiseAtomic` wording.

Signed-off-by: Stefan Beller 
---
 Documentation/config.txt  |  9 +--
 Documentation/technical/pack-protocol.txt | 10 +---
 Documentation/technical/protocol-capabilities.txt |  9 +++
 builtin/receive-pack.c| 30 +++
 4 files changed, 52 insertions(+), 6 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 16dc22d..0bb6daa 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2417,8 +2417,13 @@ rebase.instructionFormat
 
 receive.advertiseAtomic::
By default, git-receive-pack will advertise the atomic push
-   capability to its clients. If you don't want to this capability
-   to be advertised, set this variable to false.
+   capability to its clients. If you don't want to advertise this
+   capability, set this variable to false.
+
+receive.advertisePushOptions::
+   By default, git-receive-pack will advertise the push options
+   capability to its clients. If you don't want to advertise this
+   capability, set this variable to false.
 
 receive.autogc::
By default, git-receive-pack will run "git-gc --auto" after
diff --git a/Documentation/technical/pack-protocol.txt 
b/Documentation/technical/pack-protocol.txt
index 8b36343..7a2ed30 100644
--- a/Documentation/technical/pack-protocol.txt
+++ b/Documentation/technical/pack-protocol.txt
@@ -454,7 +454,8 @@ The reference discovery phase is done nearly the same way 
as it is in the
 fetching protocol. Each reference obj-id and name on the server is sent
 in packet-line format to the client, followed by a flush-pkt.  The only
 real difference is that the capability listing is different - the only
-possible values are 'report-status', 'delete-refs' and 'ofs-delta'.
+possible values are 'report-status', 'delete-refs', 'ofs-delta' and
+'push-options'.
 
 Reference Update Request and Packfile Transfer
 --
@@ -465,9 +466,10 @@ that it wants to update, it sends a line listing the 
obj-id currently on
 the server, the obj-id the client would like to update it to and the name
 of the reference.
 
-This list is followed by a flush-pkt and then the packfile that should
-contain all the objects that the server will need to complete the new
-references.
+This list is followed by a flush-pkt. Then the push options are transmitted
+one per packet followed by another flush-pkt. After that the packfile that
+should contain all the objects that the server will need to complete the new
+references will be sent.
 
 
   update-request=  *shallow ( command-list | push-cert ) [packfile]
diff --git a/Documentation/technical/protocol-capabilities.txt 
b/Documentation/technical/protocol-capabilities.txt
index eaab6b4..4c28d3a 100644
--- a/Documentation/technical/protocol-capabilities.txt
+++ b/Documentation/technical/protocol-capabilities.txt
@@ -253,6 +253,15 @@ atomic pushes. If the pushing client requests this 
capability, the server
 will update the refs in one atomic transaction. Either all refs are
 updated or none.
 
+push-options
+
+
+If the server sends the 'push-options' capability it is able to accept
+push options after the update commands have been sent, but before the
+packfile is streamed. If the pushing client requests this capability,
+the server will pass the options to the pre- and post- receive hooks
+that process this push request.
+
 allow-tip-sha1-in-want
 --
 
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 6cdd2c6..3c9360a 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -44,10 +44,12 @@ static struct strbuf fsck_msg_types = STRBUF_INIT;
 static int receive_unpack_limit = -1;
 static int transfer_unpack_limit = -1;
 static int advertise_atomic_push = 1;
+static int advertise_push_options;
 static int unpack_limit = 100;
 static int report_status;
 static int use_sideband;
 static int use_atomic;
+static int use_push_options;
 static int quiet;
 static int prefer_ofs_delta = 1;
 static int auto_update_server_info;
@@ -193,6 +195,11 @@ static int receive_pack_config(const char *var, const char 
*value, void *cb)
return 0;
}
 
+   if (strcmp(var, "receive.advertisepushoptions") == 0) {
+   advertise_push_options = git_config_bool(var, value);
+   return 0;
+   }
+
return git_default_config(var, value, cb);
 }
 
@@ -211,6 

[PATCH 3/4] push: accept push options

2016-07-14 Thread Stefan Beller
This implements everything that is required on the client side to make use
of push options from the porcelain push command.

Signed-off-by: Stefan Beller 
---
 Documentation/git-push.txt |  8 +++-
 builtin/push.c | 21 ++---
 send-pack.c| 27 +++
 send-pack.h|  3 +++
 transport.c|  1 +
 transport.h|  7 +++
 6 files changed, 63 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-push.txt b/Documentation/git-push.txt
index 93c3527..ec514f6 100644
--- a/Documentation/git-push.txt
+++ b/Documentation/git-push.txt
@@ -11,7 +11,7 @@ SYNOPSIS
 [verse]
 'git push' [--all | --mirror | --tags] [--follow-tags] [--atomic] [-n | 
--dry-run] [--receive-pack=]
   [--repo=] [-f | --force] [-d | --delete] [--prune] [-v | 
--verbose]
-  [-u | --set-upstream]
+  [-u | --set-upstream] [--push-option=]
   [--[no-]signed|--sign=(true|false|if-asked)]
   [--force-with-lease[=[:]]]
   [--no-verify] [ [...]]
@@ -156,6 +156,12 @@ already exists on the remote side.
Either all refs are updated, or on error, no refs are updated.
If the server does not support atomic pushes the push will fail.
 
+-o::
+--push-option::
+   Transmit the given string to the server, which passes them to
+   the pre-receive as well as the post-receive hook. The given string
+   must not contain a NUL or LF character.
+
 --receive-pack=::
 --exec=::
Path to the 'git-receive-pack' program on the remote
diff --git a/builtin/push.c b/builtin/push.c
index 4e9e4db..3bb9d6b 100644
--- a/builtin/push.c
+++ b/builtin/push.c
@@ -353,7 +353,8 @@ static int push_with_options(struct transport *transport, 
int flags)
return 1;
 }
 
-static int do_push(const char *repo, int flags)
+static int do_push(const char *repo, int flags,
+  const struct string_list *push_options)
 {
int i, errs;
struct remote *remote = pushremote_get(repo);
@@ -376,6 +377,9 @@ static int do_push(const char *repo, int flags)
if (remote->mirror)
flags |= (TRANSPORT_PUSH_MIRROR|TRANSPORT_PUSH_FORCE);
 
+   if (push_options->nr)
+   flags |= TRANSPORT_PUSH_OPTIONS;
+
if ((flags & TRANSPORT_PUSH_ALL) && refspec) {
if (!strcmp(*refspec, "refs/tags/*"))
return error(_("--all and --tags are incompatible"));
@@ -406,13 +410,16 @@ static int do_push(const char *repo, int flags)
for (i = 0; i < url_nr; i++) {
struct transport *transport =
transport_get(remote, url[i]);
+   if (flags & TRANSPORT_PUSH_OPTIONS)
+   transport->push_options = push_options;
if (push_with_options(transport, flags))
errs++;
}
} else {
struct transport *transport =
transport_get(remote, NULL);
-
+   if (flags & TRANSPORT_PUSH_OPTIONS)
+   transport->push_options = push_options;
if (push_with_options(transport, flags))
errs++;
}
@@ -500,6 +507,9 @@ int cmd_push(int argc, const char **argv, const char 
*prefix)
int push_cert = -1;
int rc;
const char *repo = NULL;/* default repository */
+   static struct string_list push_options = STRING_LIST_INIT_DUP;
+   static struct string_list_item *item;
+
struct option options[] = {
OPT__VERBOSITY(),
OPT_STRING( 0 , "repo", , N_("repository"), 
N_("repository")),
@@ -533,6 +543,7 @@ int cmd_push(int argc, const char **argv, const char 
*prefix)
  0, "signed", _cert, "yes|no|if-asked", N_("GPG sign the 
push"),
  PARSE_OPT_OPTARG, option_parse_push_signed },
OPT_BIT(0, "atomic", , N_("request atomic transaction on 
remote side"), TRANSPORT_PUSH_ATOMIC),
+   OPT_STRING_LIST('o', "push-option", _options, 
N_("server-specific"), N_("option to transmit")),
OPT_SET_INT('4', "ipv4", , N_("use IPv4 addresses only"),
TRANSPORT_FAMILY_IPV4),
OPT_SET_INT('6', "ipv6", , N_("use IPv6 addresses only"),
@@ -563,7 +574,11 @@ int cmd_push(int argc, const char **argv, const char 
*prefix)
set_refspecs(argv + 1, argc - 1, repo);
}
 
-   rc = do_push(repo, flags);
+   for_each_string_list_item(item, _options)
+   if (strchr(item->string, '\n'))
+   die(_("push options must not have new line 
characters"));
+
+   rc = do_push(repo, flags, _options);
if (rc == -1)
usage_with_options(push_usage, options);
else
diff --git a/send-pack.c b/send-pack.c

[PATCH 1/4] push options: {pre,post}-receive hook learns about push options

2016-07-14 Thread Stefan Beller
The environment variable GIT_PUSH_OPTION_COUNT is set to the number of
push options sent, and GIT_PUSH_OPTION_{0,1,..} is set to the transmitted
option.

The code is not executed as the push options are set to NULL, nor is the
new capability advertised.

There was some discussion back and forth how to present these push options
to the user as there are some ways to do it:

Keep all options in one environment variable

+ easiest way to implement in Git
- This would make things hard to parse correctly in the hook.

Put the options in files instead,
filenames are in GIT_PUSH_OPTION_FILES
==
+ After a discussion about environment variables and shells, we may not
  want to put user data into an environment variable (see [1] for example).
+ We could transmit binaries, i.e. we're not bound to C strings as
  we are when using environment variables to the user.
+ Maybe easier to parse than constructing environment variable names
  GIT_PUSH_OPTION_{0,1,..} yourself
- cleanup of the temporary files is hard to do reliably
- we have race conditions with multiple clients pushing, hence we'd need
  to use mkstemp. That's not too bad, but still.

Use environment variables, but restrict to key/value pairs
==
(When the user pushes a push option `foo=bar`, we'd
GIT_PUSH_OPTION_foo=bar)
+ very easy to parse for a simple model of push options
- it's not sufficient for more elaborate models, e.g.
  it doesn't allow doubles (e.g. cc=reviewer@email)

Present the options in different environment variables
==
(This is implemented)
* harder to parse as a user, but we have a sample hook for that.
- doesn't allow binary files
+ allows the same option twice, i.e. is not restrictive about
  options, except for binary files.
+ doesn't clutter a remote directory with (possibly stale)
  temporary files

As we first want to focus on getting simple strings to work
reliably, we go with the last option for now. If we want to
do transmission of binaries later, we can just attach a
'side-channel', e.g. "any push option that contains a '\0' is
put into a file instead of the environment variable and we'd
have new GIT_PUSH_OPTION_FILES, GIT_PUSH_OPTION_FILENAME_{0,1,..}
environment variables".

We limit the push options for now
* to not exceed an arbitrary count, and
* to not exceed an arbitrary size.

This serves two purposes:
* DoS protection (i.e. one connection can add no more than 32kB
  now)
* We need to figure out how to handle large (>64kB). Jeff wrote:
  > Yes, but people are also happy when they can use a flexible and
  > standardized tool to do a thing. I'd be more frustrated when I found out
  > that Git's data-pushing protocol has arbitrary limitations (like, say, I
  > can't push a data item larger than a single 64K pkt-line), which would
  > easily just work with something like HTTP POSTs.
  So to keep a way open in the future to deal with large pay loads,
  the size is restricted for now.

[1] 'Shellshock' https://lwn.net/Articles/614218/

Signed-off-by: Stefan Beller 
---
 Documentation/githooks.txt  | 18 ++
 builtin/receive-pack.c  | 47 +++--
 templates/hooks--pre-receive.sample | 24 +++
 3 files changed, 76 insertions(+), 13 deletions(-)
 create mode 100644 templates/hooks--pre-receive.sample

diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
index d82e912..9565dc3 100644
--- a/Documentation/githooks.txt
+++ b/Documentation/githooks.txt
@@ -247,6 +247,15 @@ Both standard output and standard error output are 
forwarded to
 'git send-pack' on the other end, so you can simply `echo` messages
 for the user.
 
+The number of push options given on the command line of
+`git push --push-option=...` can be read from the environment
+variable `GIT_PUSH_OPTION_COUNT`, and the options themselves are
+found in `GIT_PUSH_OPTION_0`, `GIT_PUSH_OPTION_1`,...
+If it is negotiated to not use the push options phase, the
+environment variables will not be set. If the client selects
+to use push options, but doesn't transmit any, the count variable
+will be set to zero, `GIT_PUSH_OPTION_COUNT=0`.
+
 [[update]]
 update
 ~~
@@ -322,6 +331,15 @@ a sample script `post-receive-email` provided in the 
`contrib/hooks`
 directory in Git distribution, which implements sending commit
 emails.
 
+The number of push options given on the command line of
+`git push --push-option=...` can be read from the environment
+variable `GIT_PUSH_OPTION_COUNT`, and the options themselves are
+found in `GIT_PUSH_OPTION_0`, `GIT_PUSH_OPTION_1`,...
+If it is negotiated to not use the push options phase, the
+environment variables will not be set. If the client selects
+to use push options, but doesn't transmit any, the count variable
+will be set to zero, 

[PATCH 4/4] add a test for push options

2016-07-14 Thread Stefan Beller
The functions `mk_repo_pair` as well as `test_refs` are borrowed from
t5543-atomic-push, with additional hooks installed.

Signed-off-by: Stefan Beller 
---
 t/t5545-push-options.sh | 103 
 1 file changed, 103 insertions(+)
 create mode 100755 t/t5545-push-options.sh

diff --git a/t/t5545-push-options.sh b/t/t5545-push-options.sh
new file mode 100755
index 000..ea813b9
--- /dev/null
+++ b/t/t5545-push-options.sh
@@ -0,0 +1,103 @@
+#!/bin/sh
+
+test_description='pushing to a repository using push options'
+
+. ./test-lib.sh
+
+mk_repo_pair () {
+   rm -rf workbench upstream &&
+   test_create_repo upstream &&
+   test_create_repo workbench &&
+   (
+   cd upstream &&
+   git config receive.denyCurrentBranch warn &&
+   mkdir -p .git/hooks &&
+   cat >.git/hooks/pre-receive <<-'EOF' &&
+   #!/bin/sh
+   if test -n "$GIT_PUSH_OPTION_COUNT"; then
+   i=0
+   >hooks/pre-receive.push_options
+   while test "$i" -lt "$GIT_PUSH_OPTION_COUNT"; do
+   eval "value=\$GIT_PUSH_OPTION_$i"
+   echo $value >>hooks/pre-receive.push_options
+   i=$((i + 1))
+   done
+   fi
+   EOF
+   chmod u+x .git/hooks/pre-receive
+
+   cat >.git/hooks/post-receive <<-'EOF' &&
+   #!/bin/sh
+   if test -n "$GIT_PUSH_OPTION_COUNT"; then
+   i=0
+   >hooks/post-receive.push_options
+   while test "$i" -lt "$GIT_PUSH_OPTION_COUNT"; do
+   eval "value=\$GIT_PUSH_OPTION_$i"
+   echo $value >>hooks/post-receive.push_options
+   i=$((i + 1))
+   done
+   fi
+   EOF
+   chmod u+x .git/hooks/post-receive
+   ) &&
+   (
+   cd workbench &&
+   git remote add up ../upstream
+   )
+}
+
+# Compare the ref ($1) in upstream with a ref value from workbench ($2)
+# i.e. test_refs second HEAD@{2}
+test_refs () {
+   test $# = 2 &&
+   git -C upstream rev-parse --verify "$1" >expect &&
+   git -C workbench rev-parse --verify "$2" >actual &&
+   test_cmp expect actual
+}
+
+test_expect_success 'one push option works for a single branch' '
+   mk_repo_pair &&
+   git -C upstream config receive.advertisePushOptions true &&
+   (
+   cd workbench &&
+   test_commit one &&
+   git push --mirror up &&
+   test_commit two &&
+   git push --push-option=asdf up master
+   ) &&
+   test_refs master master &&
+   echo "asdf" >expect &&
+   test_cmp expect upstream/.git/hooks/pre-receive.push_options &&
+   test_cmp expect upstream/.git/hooks/post-receive.push_options
+'
+
+test_expect_success 'push option denied by remote' '
+   mk_repo_pair &&
+   git -C upstream config receive.advertisePushOptions false &&
+   (
+   cd workbench &&
+   test_commit one &&
+   git push --mirror up &&
+   test_commit two &&
+   test_must_fail git push --push-option=asdf up master
+   ) &&
+   test_refs master HEAD@{1}
+'
+
+test_expect_success 'two push options work' '
+   mk_repo_pair &&
+   git -C upstream config receive.advertisePushOptions true &&
+   (
+   cd workbench &&
+   test_commit one &&
+   git push --mirror up &&
+   test_commit two &&
+   git push --push-option=asdf --push-option="more structured 
text" up master
+   ) &&
+   test_refs master master &&
+   printf "asdf\nmore structured text\n" >expect &&
+   test_cmp expect upstream/.git/hooks/pre-receive.push_options &&
+   test_cmp expect upstream/.git/hooks/post-receive.push_options
+'
+
+test_done
-- 
2.9.0.247.gf748855.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv7 0/4] Push options

2016-07-14 Thread Stefan Beller
Jeff,

here is the more idiomatic way.

Thanks,
Stefan

diff to v6:
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 9bb9afc..3c9360a 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1499,11 +1499,8 @@ static struct command *read_head_info(struct sha1_array 
*shallow)
return commands;
 }
 
-static struct string_list *read_push_options(void)
+static void read_push_options(struct string_list *options)
 {
-   struct string_list *ret = xmalloc(sizeof(*ret));
-   string_list_init(ret, 1);
-
while (1) {
char *line;
int len;
@@ -1513,10 +1510,8 @@ static struct string_list *read_push_options(void)
if (!line)
break;
 
-   string_list_append(ret, line);
+   string_list_append(options, line);
}
-
-   return ret;
 }
 
 static const char *parse_pack_header(struct pack_header *hdr)
@@ -1804,10 +1799,10 @@ int cmd_receive_pack(int argc, const char **argv, const 
char *prefix)
 
if ((commands = read_head_info()) != NULL) {
const char *unpack_status = NULL;
-   struct string_list *push_options = NULL;
+   struct string_list push_options = STRING_LIST_INIT_DUP;
 
if (use_push_options)
-   push_options = read_push_options();
+   read_push_options(_options);
 
prepare_shallow_info(, );
if (!si.nr_ours && !si.nr_theirs)
@@ -1817,18 +1812,16 @@ int cmd_receive_pack(int argc, const char **argv, const 
char *prefix)
update_shallow_info(commands, , );
}
execute_commands(commands, unpack_status, ,
-push_options);
+_options);
if (pack_lockfile)
unlink_or_warn(pack_lockfile);
if (report_status)
report(commands, unpack_status);
run_receive_hook(commands, "post-receive", 1,
-push_options);
+_options);
run_update_post_hook(commands);
-   if (push_options) {
-   string_list_clear(push_options, 0);
-   free(push_options);
-   }
+   if (push_options.nr)
+   string_list_clear(_options, 0);
if (auto_gc) {
const char *argv_gc_auto[] = {
"gc", "--auto", "--quiet", NULL,

v6 consisted of 2/4 only:

 Junio,
 please replace v5 2/4 with this patch (I only resend this single patch
 as the other 3 remain as is).

 This only changes read_push_options to not care at all about any
 limitations.
 
 Thanks,
 Stefan

# interdiff to v5:
# diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
# index 917ac18..9bb9afc 100644
# --- a/builtin/receive-pack.c
# +++ b/builtin/receive-pack.c
# @@ -1505,7 +1505,7 @@ static struct string_list *read_push_options(void)
#   string_list_init(ret, 1);
#  
#   while (1) {
# - char *line, *lf;
# + char *line;
#   int len;
#  
#   line = packet_read_line(0, );
# @@ -1513,30 +1513,6 @@ static struct string_list *read_push_options(void)
#   if (!line)
#   break;
#  
# - /*
# - * NEEDSWORK: expose the limitations to be configurable;
# - * Once the limit can be lifted, include a way for payloads
# - * larger than one pkt, e.g use last byte to indicate if
# - * the push option continues in the next packet or implement
# - * larger packets.
# - */
# - if (len > LARGE_PACKET_MAX - 1) {
# - /*
# -  * NEEDSWORK: The error message in die(..) is not
# -  * transmitted in call cases, so ideally all die(..)
# -  * calls are prefixed with rp_error and then we can
# -  * combine rp_error && die into one helper function.
# -  */
# - rp_error("protocol error: server configuration allows 
push "
# -  "options of size up to %d bytes",
# -  LARGE_PACKET_MAX - 1);
# - die("protocol error: push options too large");
# - }
# -
# - lf = strchr(line, '\n');
# - if (lf)
# - *lf = '\0';
# -
#   string_list_append(ret, line);
#   }

v5:

Jeff wrote:
> Junio wrote:
>> I think those extra knobs can come later.  If we are not going to
>> limit with max_options in the end, however, wouldn't it be more
>> natural for the initial iteration without any configuration not to
>> have hard-coded max_options at all?
>
> Yeah, I am OK 

Re: [PATCHv6 2/4] receive-pack: implement advertising and receiving push options

2016-07-14 Thread Stefan Beller
On Thu, Jul 14, 2016 at 2:35 PM, Jeff King  wrote:
> On Thu, Jul 14, 2016 at 02:24:54PM -0700, Stefan Beller wrote:
>
>> # interdiff to v5:
>> [...giant deletion...]
>
> Much nicer. :)
>
>> +static struct string_list *read_push_options(void)
>> +{
>> + struct string_list *ret = xmalloc(sizeof(*ret));
>
> This struck me as a little non-idiomatic for our code base. The usual
> technique is to take a pointer to a stack-allocated struct, and write
> into that.

Oh, right! :(

That's what you get when you grow up with object orientation all along.
read_push_options() mentally mapped to "create a push options object if any"
such that I can see if it is NULL or not.

I'll reroll a more idiomatic thing. Thanks,
Stefan

>
> I guess that here:
>
>> @@ -1774,6 +1806,9 @@ int cmd_receive_pack(int argc, const char **argv, 
>> const char *prefix)
>>   const char *unpack_status = NULL;
>>   struct string_list *push_options = NULL;
>>
>> + if (use_push_options)
>> + push_options = read_push_options();
>> +
>
> You will want to later check whether push_options is NULL. But you can
> also just check push_options.nr.
>
> -Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv6 2/4] receive-pack: implement advertising and receiving push options

2016-07-14 Thread Jeff King
On Thu, Jul 14, 2016 at 02:24:54PM -0700, Stefan Beller wrote:

> # interdiff to v5:
> [...giant deletion...]

Much nicer. :)

> +static struct string_list *read_push_options(void)
> +{
> + struct string_list *ret = xmalloc(sizeof(*ret));

This struck me as a little non-idiomatic for our code base. The usual
technique is to take a pointer to a stack-allocated struct, and write
into that.

I guess that here:

> @@ -1774,6 +1806,9 @@ int cmd_receive_pack(int argc, const char **argv, const 
> char *prefix)
>   const char *unpack_status = NULL;
>   struct string_list *push_options = NULL;
>  
> + if (use_push_options)
> + push_options = read_push_options();
> +

You will want to later check whether push_options is NULL. But you can
also just check push_options.nr.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 2/5] t5000: test tar files that overflow ustar headers

2016-07-14 Thread Jeff King
On Thu, Jul 14, 2016 at 10:52:55PM +0200, Johannes Sixt wrote:

> > Dscho?  I'll revert the merge of 'js/t0006-for-v2.9.2' out of
> > 'next' so that we can replace it with your newer version, but it
> > needs to be massaged to lose the strong linkage with "time", as
> > it is no longer "is our time big enough", I would think.
> 
> My first thought was that this is not warranted because t0006 is about
> commit time stamps, but the huge-tar breakage is file sizes, and the cases
> should be treated differently.
> 
> But on second thought, under the hood, both boil down to the size of
> unsigned long in our implementation. It may make sense to tie both cases to
> the same prerequisite.
> 
> On third thought, however, I think the two requirements could diverge in the
> future. The file size case should depend on the size of size_t. The
> timestamp case may become dependent on the size of time_t if we decide to
> move timestamp handling away from unsigned long: in modern(!) Microsoft
> SDKs, time_t is 64 bits, but unsigned long is 32 bits, in both the 32-bit
> and 64-bit environments!

The tar tests actually cover both: big files and big timestamps.

I think as long as the prereq is labeled LONG_IS_64BIT, we can then
adjust the appropriate tests if and when they become runnable on more
systems.

If we move to time_t everywhere, I think we'll need an extra
TIME_T_IS_64BIT, but we can cross that bridge when we come to it.

Likewise I think we'll need SIZE_T_IS_64BIT eventually (for real 32-bit
systems; LLP64 systems like Windows will then be able to run the test).

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv6 2/4] receive-pack: implement advertising and receiving push options

2016-07-14 Thread Stefan Beller
The pre/post receive hook may be interested in more information from the
user. This information can be transmitted when both client and server
support the "push-options" capability, which when used is a phase directly
after update commands ended by a flush pkt.

Similar to the atomic option, the server capability can be disabled via
the `receive.advertisePushOptions` config variable. While documenting
this, fix a nit in the `receive.advertiseAtomic` wording.

Signed-off-by: Stefan Beller 
---

 Junio,
 please replace v5 2/4 with this patch (I only resend this single patch
 as the other 3 remain as is).

 This only changes read_push_options to not care at all about any
 limitations.
 
 Thanks,
 Stefan

# interdiff to v5:
# diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
# index 917ac18..9bb9afc 100644
# --- a/builtin/receive-pack.c
# +++ b/builtin/receive-pack.c
# @@ -1505,7 +1505,7 @@ static struct string_list *read_push_options(void)
#   string_list_init(ret, 1);
#  
#   while (1) {
# - char *line, *lf;
# + char *line;
#   int len;
#  
#   line = packet_read_line(0, );
# @@ -1513,30 +1513,6 @@ static struct string_list *read_push_options(void)
#   if (!line)
#   break;
#  
# - /*
# - * NEEDSWORK: expose the limitations to be configurable;
# - * Once the limit can be lifted, include a way for payloads
# - * larger than one pkt, e.g use last byte to indicate if
# - * the push option continues in the next packet or implement
# - * larger packets.
# - */
# - if (len > LARGE_PACKET_MAX - 1) {
# - /*
# -  * NEEDSWORK: The error message in die(..) is not
# -  * transmitted in call cases, so ideally all die(..)
# -  * calls are prefixed with rp_error and then we can
# -  * combine rp_error && die into one helper function.
# -  */
# - rp_error("protocol error: server configuration allows 
push "
# -  "options of size up to %d bytes",
# -  LARGE_PACKET_MAX - 1);
# - die("protocol error: push options too large");
# - }
# -
# - lf = strchr(line, '\n');
# - if (lf)
# - *lf = '\0';
# -
#   string_list_append(ret, line);
#   }


 Documentation/config.txt  |  9 --
 Documentation/technical/pack-protocol.txt | 10 ---
 Documentation/technical/protocol-capabilities.txt |  9 ++
 builtin/receive-pack.c| 35 +++
 4 files changed, 57 insertions(+), 6 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 16dc22d..0bb6daa 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2417,8 +2417,13 @@ rebase.instructionFormat
 
 receive.advertiseAtomic::
By default, git-receive-pack will advertise the atomic push
-   capability to its clients. If you don't want to this capability
-   to be advertised, set this variable to false.
+   capability to its clients. If you don't want to advertise this
+   capability, set this variable to false.
+
+receive.advertisePushOptions::
+   By default, git-receive-pack will advertise the push options
+   capability to its clients. If you don't want to advertise this
+   capability, set this variable to false.
 
 receive.autogc::
By default, git-receive-pack will run "git-gc --auto" after
diff --git a/Documentation/technical/pack-protocol.txt 
b/Documentation/technical/pack-protocol.txt
index 8b36343..7a2ed30 100644
--- a/Documentation/technical/pack-protocol.txt
+++ b/Documentation/technical/pack-protocol.txt
@@ -454,7 +454,8 @@ The reference discovery phase is done nearly the same way 
as it is in the
 fetching protocol. Each reference obj-id and name on the server is sent
 in packet-line format to the client, followed by a flush-pkt.  The only
 real difference is that the capability listing is different - the only
-possible values are 'report-status', 'delete-refs' and 'ofs-delta'.
+possible values are 'report-status', 'delete-refs', 'ofs-delta' and
+'push-options'.
 
 Reference Update Request and Packfile Transfer
 --
@@ -465,9 +466,10 @@ that it wants to update, it sends a line listing the 
obj-id currently on
 the server, the obj-id the client would like to update it to and the name
 of the reference.
 
-This list is followed by a flush-pkt and then the packfile that should
-contain all the objects that the server will need to complete the new
-references.
+This list is followed by a flush-pkt. Then the push options are transmitted
+one per packet followed by another flush-pkt. 

Re: [PATCH v4 2/5] t5000: test tar files that overflow ustar headers

2016-07-14 Thread Johannes Sixt

Am 14.07.2016 um 19:08 schrieb Junio C Hamano:

Johannes Sixt  writes:


Am 14.07.2016 um 17:47 schrieb Johannes Schindelin:

On Thu, 30 Jun 2016, Jeff King wrote:

The ustar format only has room for 11 (or 12, depending on
some implementations) octal digits for the size and mtime of
each file. For values larger than this, we have to add pax
extended headers to specify the real data, and git does not
yet know how to do so.

[...]
 t/t5000/19f9c8273ec45a8938e6999cb59b3ff66739902a | Bin 0 -> 2048 bytes


It appears that this blob cannot be read when sizeof(unsigned long) == 4.
This happens to break the t5000 test on Windows, where that comparison
holds true.


The problem occurs in parse_sha1_header_extended(), where the
calculation of the size in the object header overflows our 32-bit
unsigned long.


OK, then we'd want something that measures how big "unsigned long"
is, and use it as a lazy prerequisite 64BIT_LONG, tweaking the other
patch to t0006 the other Johannes sent yesterday.

Dscho?  I'll revert the merge of 'js/t0006-for-v2.9.2' out of
'next' so that we can replace it with your newer version, but it
needs to be massaged to lose the strong linkage with "time", as
it is no longer "is our time big enough", I would think.


My first thought was that this is not warranted because t0006 is about 
commit time stamps, but the huge-tar breakage is file sizes, and the 
cases should be treated differently.


But on second thought, under the hood, both boil down to the size of 
unsigned long in our implementation. It may make sense to tie both cases 
to the same prerequisite.


On third thought, however, I think the two requirements could diverge in 
the future. The file size case should depend on the size of size_t. The 
timestamp case may become dependent on the size of time_t if we decide 
to move timestamp handling away from unsigned long: in modern(!) 
Microsoft SDKs, time_t is 64 bits, but unsigned long is 32 bits, in both 
the 32-bit and 64-bit environments!


-- Hannes

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 0/2] ulong may only be 32-bit wide

2016-07-14 Thread Junio C Hamano
So here is the final reroll from me for now that targets 'maint'
(eventually).

Jeff King (1):
  t0006: skip "far in the future" test when unsigned long is not long
enough

Junio C Hamano (1):
  archive-tar: huge offset and future timestamps would not work on
32-bit

 archive-tar.c   |  5 +
 help.c  |  6 ++
 t/t0006-date.sh |  6 +++---
 t/t5000-tar-tree.sh | 10 +-
 t/test-lib.sh   |  9 +
 5 files changed, 28 insertions(+), 8 deletions(-)

-- 
2.9.1-545-g8c0a069

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 1/2] t0006: skip "far in the future" test when unsigned long is not long enough

2016-07-14 Thread Junio C Hamano
From: Jeff King 

Git's source code refers to timestamps as unsigned longs.  On 32-bit
platforms, as well as on Windows, unsigned long is not large enough
to capture dates that are "absurdly far in the future".

While we can fix this issue properly by replacing unsigned long with
a larger type, we want to be a bit more conservative and just skip
those tests on the maint track.

Signed-off-by: Jeff King 
Helped-by: Johannes Schindelin 
Signed-off-by: Junio C Hamano 
---
 help.c  | 6 ++
 t/t0006-date.sh | 6 +++---
 t/test-lib.sh   | 9 +
 3 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/help.c b/help.c
index 19328ea..2ff3b5a 100644
--- a/help.c
+++ b/help.c
@@ -419,6 +419,12 @@ int cmd_version(int argc, const char **argv, const char 
*prefix)
 * with external projects that rely on the output of "git version".
 */
printf("git version %s\n", git_version_string);
+   while (*++argv) {
+   if (!strcmp(*argv, "--build-options")) {
+   printf("sizeof-long: %d\n", (int)sizeof(long));
+   /* NEEDSWORK: also save and output GIT-BUILD_OPTIONS? */
+   }
+   }
return 0;
 }
 
diff --git a/t/t0006-date.sh b/t/t0006-date.sh
index 04ce535..4c8cf58 100755
--- a/t/t0006-date.sh
+++ b/t/t0006-date.sh
@@ -31,7 +31,7 @@ check_show () {
format=$1
time=$2
expect=$3
-   test_expect_${4:-success} "show date ($format:$time)" '
+   test_expect_success $4 "show date ($format:$time)" '
echo "$time -> $expect" >expect &&
test-date show:$format "$time" >actual &&
test_cmp expect actual
@@ -50,8 +50,8 @@ check_show iso-local "$TIME" '2016-06-15 14:13:20 +'
 
 # arbitrary time absurdly far in the future
 FUTURE="5758122296 -0400"
-check_show iso   "$FUTURE" "2152-06-19 18:24:56 -0400"
-check_show iso-local "$FUTURE" "2152-06-19 22:24:56 +"
+check_show iso   "$FUTURE" "2152-06-19 18:24:56 -0400" LONG_IS_64BIT
+check_show iso-local "$FUTURE" "2152-06-19 22:24:56 +" LONG_IS_64BIT
 
 check_parse() {
echo "$1 -> $2" >expect
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 0055ebb..11201e9 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -,3 +,12 @@ run_with_limited_cmdline () {
 }
 
 test_lazy_prereq CMDLINE_LIMIT 'run_with_limited_cmdline true'
+
+build_option () {
+   git version --build-options |
+   sed -ne "s/^$1: //p"
+}
+
+test_lazy_prereq LONG_IS_64BIT '
+   test 8 -le "$(build_option sizeof-long)"
+'
-- 
2.9.1-545-g8c0a069

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 2/2] archive-tar: huge offset and future timestamps would not work on 32-bit

2016-07-14 Thread Junio C Hamano
As we are not yet moving everything to size_t but still using ulong
internally when talking about the size of object, platforms with
32-bit long will not be able to produce tar archive with 4GB+ file,
and cannot grok 0777UL as a constant.  Disable the extended
header feature and do not test it on them.

Signed-off-by: Junio C Hamano 
---
 archive-tar.c   |  5 +
 t/t5000-tar-tree.sh | 10 +-
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/archive-tar.c b/archive-tar.c
index 7ea4e90..5568240 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -25,8 +25,13 @@ static int write_tar_filter_archive(const struct archiver 
*ar,
  *
  * Likewise for the mtime (which happens to use a buffer of the same size).
  */
+#if ULONG_MAX == 0x
+#define USTAR_MAX_SIZE ULONG_MAX
+#define USTAR_MAX_MTIME ULONG_MAX
+#else
 #define USTAR_MAX_SIZE 0777UL
 #define USTAR_MAX_MTIME 0777UL
+#endif
 
 /* writes out the whole block, but only if it is full */
 static void write_if_needed(void)
diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
index 96d208d..699355b 100755
--- a/t/t5000-tar-tree.sh
+++ b/t/t5000-tar-tree.sh
@@ -360,7 +360,7 @@ test_expect_success 'set up repository with huge blob' '
 
 # We expect git to die with SIGPIPE here (otherwise we
 # would generate the whole 64GB).
-test_expect_success 'generate tar with huge size' '
+test_expect_success LONG_IS_64BIT 'generate tar with huge size' '
{
git archive HEAD
echo $? >exit-code
@@ -369,13 +369,13 @@ test_expect_success 'generate tar with huge size' '
test_cmp expect exit-code
 '
 
-test_expect_success TAR_HUGE 'system tar can read our huge size' '
+test_expect_success TAR_HUGE,LONG_IS_64BIT 'system tar can read our huge size' 
'
echo 68719476737 >expect &&
tar_info huge.tar | cut -d" " -f1 >actual &&
test_cmp expect actual
 '
 
-test_expect_success 'set up repository with far-future commit' '
+test_expect_success LONG_IS_64BIT 'set up repository with far-future commit' '
rm -f .git/index &&
echo content >file &&
git add file &&
@@ -383,11 +383,11 @@ test_expect_success 'set up repository with far-future 
commit' '
git commit -m "tempori parendum"
 '
 
-test_expect_success 'generate tar with future mtime' '
+test_expect_success LONG_IS_64BIT 'generate tar with future mtime' '
git archive HEAD >future.tar
 '
 
-test_expect_success TAR_HUGE 'system tar can read our future mtime' '
+test_expect_success TAR_HUGE,LONG_IS_64BIT 'system tar can read our future 
mtime' '
echo 4147 >expect &&
tar_info future.tar | cut -d" " -f2 >actual &&
test_cmp expect actual
-- 
2.9.1-545-g8c0a069

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 2/5] t5000: test tar files that overflow ustar headers

2016-07-14 Thread Junio C Hamano
Jeff King  writes:

> Yeah, that is what I was trying to get at, but you stated it much more
> clearly. LONG_IS_64BIT is good. I wonder if the "git version
> --build-options" should be "sizeof-long", too. It's shorter, and
> indicates our assumption that we are talking about all longs, not just
> unsigned ones.

OK, I'll do a final reroll and then wait for Europeans ;-)
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 2/5] t5000: test tar files that overflow ustar headers

2016-07-14 Thread Jeff King
On Thu, Jul 14, 2016 at 01:22:01PM -0700, Junio C Hamano wrote:

> > Also, shouldn't it be checking against 0x?
> 
> Correct.  Somehow I thought I was checking with LONG_MAX.  Will correct.
> 
> > An easier check would be "sizeof()", but I guess we can't use that in a
> > preprocessor directive.
> 
> Yes, I tried it and failed ;-)

I also found it funny that you used "==" instead "<=", but I cannot
really think of a case where it would matter. If it is "<= 0x",
that basically covers 16-bit platforms. I really hope nobody is trying
to run git on such a platform. Doing "< 0x" to check
for "less than 64-bit" would make more sense, but would probably choke
on a 32-bit preprocessor.

So that everybody is either 32- or 64-bit these days, I think it doesn't
matter in practice.

> >> -test_expect_success TAR_HUGE 'system tar can read our huge size' '
> >> +test_expect_success TAR_HUGE,64BIT 'system tar can read our huge size' '
> >
> > The 64BIT prereq is really "unsigned long is 64-bit". I wonder if we
> > should call it UL64 or something like that to make it more clear.
> >
> > That makes it unnecessarily tied-in with the implementation, but it does
> > make it more clear what we care about; the distinction matters for
> > things like LP64 vs LLP64.
> 
> I do not think any platform is weird enough to have different sizes
> for long and ulong, so I am not sure you need UL64.
> 
> But pointer size can legitimately be different, so it has a value to
> differentiate between LP64 and LLP64, if we start doing things like
> "does this platform have large virtual address space?"
> 
> LONG_IS_64BIT perhaps to be more readable?

Yeah, that is what I was trying to get at, but you stated it much more
clearly. LONG_IS_64BIT is good. I wonder if the "git version
--build-options" should be "sizeof-long", too. It's shorter, and
indicates our assumption that we are talking about all longs, not just
unsigned ones.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 2/5] t5000: test tar files that overflow ustar headers

2016-07-14 Thread Junio C Hamano
Jeff King  writes:

>> +#if ULONG_MAX == 0x7FFF
>> +#define USTAR_MAX_SIZE ULONG_MAX
>> +#define USTAR_MAX_MTIME ULONG_MAX
>> +#else
>>  #define USTAR_MAX_SIZE 0777UL
>>  #define USTAR_MAX_MTIME 0777UL
>> +#endif
>>  
>>  /* writes out the whole block, but only if it is full */
>>  static void write_if_needed(void)
>
> If for some reason we are wrong that objects cannot be larger than
> ULONG_MAX (e.g., later on we convert everything to size_t, and 64-bit
> LLP platforms handle large objects just fine), then we would prematurely
> switch to extended headers on those platforms.
>
> I think that's OK. This would just need cleaned up as part of the
> conversion from "unsigned long" to "size_t" (the correct check would
> then be against the max size_t).

> Also, shouldn't it be checking against 0x?

Correct.  Somehow I thought I was checking with LONG_MAX.  Will correct.

> An easier check would be "sizeof()", but I guess we can't use that in a
> preprocessor directive.

Yes, I tried it and failed ;-)

>> -test_expect_success TAR_HUGE 'system tar can read our huge size' '
>> +test_expect_success TAR_HUGE,64BIT 'system tar can read our huge size' '
>
> The 64BIT prereq is really "unsigned long is 64-bit". I wonder if we
> should call it UL64 or something like that to make it more clear.
>
> That makes it unnecessarily tied-in with the implementation, but it does
> make it more clear what we care about; the distinction matters for
> things like LP64 vs LLP64.

I do not think any platform is weird enough to have different sizes
for long and ulong, so I am not sure you need UL64.

But pointer size can legitimately be different, so it has a value to
differentiate between LP64 and LLP64, if we start doing things like
"does this platform have large virtual address space?"

LONG_IS_64BIT perhaps to be more readable?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Use path comparison for patch ids before the file content

2016-07-14 Thread Kevin Willford
When limiting the list in a revision walk using cherry pick, patch ids are
calculated by producing the diff of the content of the files.  This would
be more efficent by using a patch id looking at the paths that were
changed in the commits and only if all the file changed are the same fall
back to getting the content of the files in the commits to determine if
the commits are the same.

This change uses a hashmap to store entries with a hash of the patch id
calculated by just using the file paths.  The entries constist of the
commit and the patch id calculated using file contents which is initially
empty.  If there are commits that are found in the hashmap it means that
the same files were changed in the commits and the file contents need to
be checked in order to determine if the commits are truly the same.  The
patch id that is calcuated based on the file contents is then stored in the
hashmap entry if needed in later comparisons.  If the commits are determined to 
be
the same the cherry_flag is set on the commit being checked as well as the
commit in the hashmap entry saving running though the original list of
commits checking a seen flag.  This will speed up a rebase where the
upstream has many changes but none of them have been pulled into the
current branch.
---
 diff.c  |  16 +
 diff.h  |   2 +-
 patch-ids.c | 114 +---
 patch-ids.h |   7 ++--
 revision.c  |  19 ++
 5 files changed, 73 insertions(+), 85 deletions(-)

diff --git a/diff.c b/diff.c
index fa78fc1..f38b663 100644
--- a/diff.c
+++ b/diff.c
@@ -4449,7 +4449,7 @@ static void patch_id_consume(void *priv, char *line, 
unsigned long len)
 }
 
 /* returns 0 upon success, and writes result into sha1 */
-static int diff_get_patch_id(struct diff_options *options, unsigned char *sha1)
+static int diff_get_patch_id(struct diff_options *options, unsigned char 
*sha1, int use_path_only)
 {
struct diff_queue_struct *q = _queued_diff;
int i;
@@ -4484,9 +4484,6 @@ static int diff_get_patch_id(struct diff_options 
*options, unsigned char *sha1)
 
diff_fill_sha1_info(p->one);
diff_fill_sha1_info(p->two);
-   if (fill_mmfile(, p->one) < 0 ||
-   fill_mmfile(, p->two) < 0)
-   return error("unable to read files to diff");
 
len1 = remove_space(p->one->path, strlen(p->one->path));
len2 = remove_space(p->two->path, strlen(p->two->path));
@@ -4521,6 +4518,13 @@ static int diff_get_patch_id(struct diff_options 
*options, unsigned char *sha1)
len2, p->two->path);
git_SHA1_Update(, buffer, len1);
 
+   if (use_path_only)
+   continue;
+
+   if (fill_mmfile(, p->one) < 0 ||
+   fill_mmfile(, p->two) < 0)
+   return error("unable to read files to diff");
+
if (diff_filespec_is_binary(p->one) ||
diff_filespec_is_binary(p->two)) {
git_SHA1_Update(, sha1_to_hex(p->one->sha1), 40);
@@ -4541,11 +4545,11 @@ static int diff_get_patch_id(struct diff_options 
*options, unsigned char *sha1)
return 0;
 }
 
-int diff_flush_patch_id(struct diff_options *options, unsigned char *sha1)
+int diff_flush_patch_id(struct diff_options *options, unsigned char *sha1, int 
use_path_only)
 {
struct diff_queue_struct *q = _queued_diff;
int i;
-   int result = diff_get_patch_id(options, sha1);
+   int result = diff_get_patch_id(options, sha1, use_path_only);
 
for (i = 0; i < q->nr; i++)
diff_free_filepair(q->queue[i]);
diff --git a/diff.h b/diff.h
index 125447b..7883729 100644
--- a/diff.h
+++ b/diff.h
@@ -342,7 +342,7 @@ extern int run_diff_files(struct rev_info *revs, unsigned 
int option);
 extern int run_diff_index(struct rev_info *revs, int cached);
 
 extern int do_diff_cache(const unsigned char *, struct diff_options *);
-extern int diff_flush_patch_id(struct diff_options *, unsigned char *);
+extern int diff_flush_patch_id(struct diff_options *, unsigned char *, int);
 
 extern int diff_result_code(struct diff_options *, int);
 
diff --git a/patch-ids.c b/patch-ids.c
index a4d0016..f0262ce 100644
--- a/patch-ids.c
+++ b/patch-ids.c
@@ -4,8 +4,9 @@
 #include "sha1-lookup.h"
 #include "patch-ids.h"
 
-int commit_patch_id(struct commit *commit, struct diff_options *options,
-   unsigned char *sha1)
+
+static int ll_commit_patch_id(struct commit *commit, struct diff_options 
*options,
+   unsigned char *sha1, int use_path_only)
 {
if (commit->parents)
diff_tree_sha1(commit->parents->item->object.oid.hash,
@@ -13,93 +14,90 @@ int commit_patch_id(struct commit *commit, struct 
diff_options *options,
else
diff_root_tree_sha1(commit->object.oid.hash, "", 

Re: [PATCH v4 2/5] t5000: test tar files that overflow ustar headers

2016-07-14 Thread Jeff King
On Thu, Jul 14, 2016 at 01:03:35PM -0700, Junio C Hamano wrote:

> Junio C Hamano  writes:
> 
> > OK, how about this on top of a replacement for js/t0006-for-v2.9.2
> > that I'll send out as a reply to this message?
> 
> which is this one, which is largely from your $gmane/299310.

Mostly OK, but two minor nits:

> diff --git a/help.c b/help.c
> index 19328ea..0cea240 100644
> --- a/help.c
> +++ b/help.c
> @@ -419,6 +419,13 @@ int cmd_version(int argc, const char **argv, const char 
> *prefix)
>* with external projects that rely on the output of "git version".
>*/
>   printf("git version %s\n", git_version_string);
> + while (*++argv) {
> + if (!strcmp(*argv, "--build-options")) {
> + printf("sizeof-unsigned-long: %d",
> +(int)sizeof(unsigned long));
> + /* maybe also save and output GIT-BUILD_OPTIONS? */
> + }

That comment was mostly for explaining my mindset in the discussion. If
we're going to keep it, maybe mark it with a TODO or XXX or something?

> +test_lazy_prereq 64BIT '
> + test 8 -le "$(build_option sizeof-unsigned-long)"
> +'

As mentioned in my previous email, I wonder if this should be UL64 or
something.

As noted earlier in the thread, t0006 actually cares most directly about
LONG_MAX, but I think it's probably OK to assume it is directly related
to "unsigned long". Also, even if we got past that initial "strtol", I
suspect "unsigned long" _would_ come into play anyway, as that is our
internal representation.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 2/5] t5000: test tar files that overflow ustar headers

2016-07-14 Thread Jeff King
On Thu, Jul 14, 2016 at 01:00:08PM -0700, Junio C Hamano wrote:

> > There's tons of discussion in:
> >
> >   http://thread.gmane.org/gmane.comp.version-control.git/297409
> >
> > but frankly it is not worth your time to read it. These tests are about
> > overflowing the tar limits, which can only happen with times and sizes
> > greater than 32-bits. The right thing to do is to skip the tests
> > entirely on systems where sizeof(unsigned long) is less than 8 (the
> > actual value is 64GB+1, so technically a 37-bit system would work, but I
> > think it is OK for the test-skipping to be less specific).
> 
> OK, how about this on top of a replacement for js/t0006-for-v2.9.2
> that I'll send out as a reply to this message?

Yeah, I think the patch here mostly makes sense. I tried to think what
could go wrong in this hunk:

> diff --git a/archive-tar.c b/archive-tar.c
> index 7ea4e90..4d2832c 100644
> --- a/archive-tar.c
> +++ b/archive-tar.c
> @@ -25,8 +25,13 @@ static int write_tar_filter_archive(const struct archiver 
> *ar,
>   *
>   * Likewise for the mtime (which happens to use a buffer of the same size).
>   */
> +#if ULONG_MAX == 0x7FFF
> +#define USTAR_MAX_SIZE ULONG_MAX
> +#define USTAR_MAX_MTIME ULONG_MAX
> +#else
>  #define USTAR_MAX_SIZE 0777UL
>  #define USTAR_MAX_MTIME 0777UL
> +#endif
>  
>  /* writes out the whole block, but only if it is full */
>  static void write_if_needed(void)

If for some reason we are wrong that objects cannot be larger than
ULONG_MAX (e.g., later on we convert everything to size_t, and 64-bit
LLP platforms handle large objects just fine), then we would prematurely
switch to extended headers on those platforms.

I think that's OK. This would just need cleaned up as part of the
conversion from "unsigned long" to "size_t" (the correct check would
then be against the max size_t).

Also, shouldn't it be checking against 0x?

An easier check would be "sizeof()", but I guess we can't use that in a
preprocessor directive.

> -test_expect_success TAR_HUGE 'system tar can read our huge size' '
> +test_expect_success TAR_HUGE,64BIT 'system tar can read our huge size' '

The 64BIT prereq is really "unsigned long is 64-bit". I wonder if we
should call it UL64 or something like that to make it more clear.

That makes it unnecessarily tied-in with the implementation, but it does
make it more clear what we care about; the distinction matters for
things like LP64 vs LLP64.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 2/5] t5000: test tar files that overflow ustar headers

2016-07-14 Thread Junio C Hamano
Junio C Hamano  writes:

> OK, how about this on top of a replacement for js/t0006-for-v2.9.2
> that I'll send out as a reply to this message?

I won't repeat the patch body, but I had to write a log message, so 
here is what I tentatively committed (not queued yet).

Subject: archive-tar: huge offset and future timestamps would not work on 32-bit

As we are not yet moving everything to size_t but still using ulong
internally when talking about the size of object, platforms with
32-bit long will not be able to produce tar archive with 4GB+ file,
and cannot grok 0777UL as a constant.  Disable the extended
header feature and do not test it on them.

Signed-off-by: Junio C Hamano 
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options

2016-07-14 Thread Junio C Hamano
Jeff King  writes:

> packet_read() does NUL-terminate for you. It gets the extra bytes
> because it doesn't store the 4-byte size in the output (whereas the
> client does not ever send anything over LARGE_PACKET_MAX, _including_
> those bytes, so we always have room to store its result in our
> LARGE_PACKET_MAX buffer, plus the NUL, with 3 bytes to spare).

Good; then the loop will further be simplified ;-)

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 2/5] t5000: test tar files that overflow ustar headers

2016-07-14 Thread Junio C Hamano
Junio C Hamano  writes:

> OK, how about this on top of a replacement for js/t0006-for-v2.9.2
> that I'll send out as a reply to this message?

which is this one, which is largely from your $gmane/299310.

-- >8 --
From: Jeff King 
Date: Mon, 11 Jul 2016 19:54:18 -0400
Subject: [PATCH] t0006: skip "far in the future" test when unsigned long is
 not long enough

Git's source code refers to timestamps as unsigned longs.  On 32-bit
platforms, as well as on Windows, unsigned long is not large enough
to capture dates that are "absurdly far in the future".

While we can fix this issue properly by replacing unsigned long with
a larger type, we want to be a bit more conservative and just skip
those tests on the maint track.

Signed-off-by: Jeff King 
Helped-by: Johannes Schindelin 
Signed-off-by: Junio C Hamano 
---
 help.c  | 7 +++
 t/t0006-date.sh | 6 +++---
 t/test-lib.sh   | 9 +
 3 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/help.c b/help.c
index 19328ea..0cea240 100644
--- a/help.c
+++ b/help.c
@@ -419,6 +419,13 @@ int cmd_version(int argc, const char **argv, const char 
*prefix)
 * with external projects that rely on the output of "git version".
 */
printf("git version %s\n", git_version_string);
+   while (*++argv) {
+   if (!strcmp(*argv, "--build-options")) {
+   printf("sizeof-unsigned-long: %d",
+  (int)sizeof(unsigned long));
+   /* maybe also save and output GIT-BUILD_OPTIONS? */
+   }
+   }
return 0;
 }
 
diff --git a/t/t0006-date.sh b/t/t0006-date.sh
index 04ce535..a0b8497 100755
--- a/t/t0006-date.sh
+++ b/t/t0006-date.sh
@@ -31,7 +31,7 @@ check_show () {
format=$1
time=$2
expect=$3
-   test_expect_${4:-success} "show date ($format:$time)" '
+   test_expect_success $4 "show date ($format:$time)" '
echo "$time -> $expect" >expect &&
test-date show:$format "$time" >actual &&
test_cmp expect actual
@@ -50,8 +50,8 @@ check_show iso-local "$TIME" '2016-06-15 14:13:20 +'
 
 # arbitrary time absurdly far in the future
 FUTURE="5758122296 -0400"
-check_show iso   "$FUTURE" "2152-06-19 18:24:56 -0400"
-check_show iso-local "$FUTURE" "2152-06-19 22:24:56 +"
+check_show iso   "$FUTURE" "2152-06-19 18:24:56 -0400" 64BIT
+check_show iso-local "$FUTURE" "2152-06-19 22:24:56 +" 64BIT
 
 check_parse() {
echo "$1 -> $2" >expect
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 0055ebb..211f1a8 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -,3 +,12 @@ run_with_limited_cmdline () {
 }
 
 test_lazy_prereq CMDLINE_LIMIT 'run_with_limited_cmdline true'
+
+build_option () {
+   git version --build-options |
+   sed -ne "s/^$1: //p"
+}
+
+test_lazy_prereq 64BIT '
+   test 8 -le "$(build_option sizeof-unsigned-long)"
+'
-- 
2.9.1-545-g8c0a069

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 2/5] t5000: test tar files that overflow ustar headers

2016-07-14 Thread Junio C Hamano
Jeff King  writes:

> On Thu, Jul 14, 2016 at 05:47:41PM +0200, Johannes Schindelin wrote:
>
>> On Thu, 30 Jun 2016, Jeff King wrote:
>> 
>> > The ustar format only has room for 11 (or 12, depending on
>> > some implementations) octal digits for the size and mtime of
>> > each file. For values larger than this, we have to add pax
>> > extended headers to specify the real data, and git does not
>> > yet know how to do so.
>> >
>> > [...]
>> >  t/t5000/19f9c8273ec45a8938e6999cb59b3ff66739902a | Bin 0 -> 2048 bytes
>> 
>> It appears that this blob cannot be read when sizeof(unsigned long) == 4.
>> This happens to break the t5000 test on Windows, where that comparison
>> holds true.
>> 
>> I am sure that I missed some other discussion about this issue... could
>> you point me to it?
>
> There's tons of discussion in:
>
>   http://thread.gmane.org/gmane.comp.version-control.git/297409
>
> but frankly it is not worth your time to read it. These tests are about
> overflowing the tar limits, which can only happen with times and sizes
> greater than 32-bits. The right thing to do is to skip the tests
> entirely on systems where sizeof(unsigned long) is less than 8 (the
> actual value is 64GB+1, so technically a 37-bit system would work, but I
> think it is OK for the test-skipping to be less specific).

OK, how about this on top of a replacement for js/t0006-for-v2.9.2
that I'll send out as a reply to this message?




 archive-tar.c   |  5 +
 t/t5000-tar-tree.sh | 10 +-
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/archive-tar.c b/archive-tar.c
index 7ea4e90..4d2832c 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -25,8 +25,13 @@ static int write_tar_filter_archive(const struct archiver 
*ar,
  *
  * Likewise for the mtime (which happens to use a buffer of the same size).
  */
+#if ULONG_MAX == 0x7FFF
+#define USTAR_MAX_SIZE ULONG_MAX
+#define USTAR_MAX_MTIME ULONG_MAX
+#else
 #define USTAR_MAX_SIZE 0777UL
 #define USTAR_MAX_MTIME 0777UL
+#endif
 
 /* writes out the whole block, but only if it is full */
 static void write_if_needed(void)
diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
index 96d208d..9c97789 100755
--- a/t/t5000-tar-tree.sh
+++ b/t/t5000-tar-tree.sh
@@ -360,7 +360,7 @@ test_expect_success 'set up repository with huge blob' '
 
 # We expect git to die with SIGPIPE here (otherwise we
 # would generate the whole 64GB).
-test_expect_success 'generate tar with huge size' '
+test_expect_success 64BIT 'generate tar with huge size' '
{
git archive HEAD
echo $? >exit-code
@@ -369,13 +369,13 @@ test_expect_success 'generate tar with huge size' '
test_cmp expect exit-code
 '
 
-test_expect_success TAR_HUGE 'system tar can read our huge size' '
+test_expect_success TAR_HUGE,64BIT 'system tar can read our huge size' '
echo 68719476737 >expect &&
tar_info huge.tar | cut -d" " -f1 >actual &&
test_cmp expect actual
 '
 
-test_expect_success 'set up repository with far-future commit' '
+test_expect_success 64BIT 'set up repository with far-future commit' '
rm -f .git/index &&
echo content >file &&
git add file &&
@@ -383,11 +383,11 @@ test_expect_success 'set up repository with far-future 
commit' '
git commit -m "tempori parendum"
 '
 
-test_expect_success 'generate tar with future mtime' '
+test_expect_success 64BIT 'generate tar with future mtime' '
git archive HEAD >future.tar
 '
 
-test_expect_success TAR_HUGE 'system tar can read our future mtime' '
+test_expect_success TAR_HUGE,64BIT 'system tar can read our future mtime' '
echo 4147 >expect &&
tar_info future.tar | cut -d" " -f2 >actual &&
test_cmp expect actual
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options

2016-07-14 Thread Jeff King
On Thu, Jul 14, 2016 at 12:07:15PM -0700, Junio C Hamano wrote:

> > Although as you remarked in another email, this would not pose a problem for
> > the shell variable, so we could also drop it to allow multi line
> > options. will do.
> 
> One thing to note is that I do not think there is a guarantee that
> packet_buf[] is NUL-terminated, and when len == LAGE_PACKET_MAX, you
> do not have room to NUL-terminate it yourself.

packet_read() does NUL-terminate for you. It gets the extra bytes
because it doesn't store the 4-byte size in the output (whereas the
client does not ever send anything over LARGE_PACKET_MAX, _including_
those bytes, so we always have room to store its result in our
LARGE_PACKET_MAX buffer, plus the NUL, with 3 bytes to spare).

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 00/16] Use merge_recursive() directly in the builtin am

2016-07-14 Thread Junio C Hamano
Johannes Schindelin  writes:

> The two topics that are in 'pu' and conflict with this series are
> 'jh/clean-smudge-annex' and 'bc/cocci'.
>
> It also conflicted with 'va/i18n-even-more', but that one was merged to
> 'master'.
>
> Now, I think it would be okay to wait for 'bc/cocci' to go to 'master'
> before integrating the 'am-3-merge-recursive-direct' branch, but I would
> want to avoid waiting for 'jh/clean-smudge-annex'.
>
> Do you concur? If so, I will rebase onto 'master' as soon as 'bc/cocci'
> lands in there.

I do not have a strong preference myself.  As you are not proposing
to eject jh/clean-smudge-annex from my tree, I'd have to resolve the
conflicts when the second topic is merged after one topic, whichever
happens to be more ready.  IOW, such a rebase adds to my workload.

Like it or not, it is normal for different topics to want to touch
the overlapping area or one topic to invalidate an assumption the
other topic is based on, and working well together does not have to
mean leaving the conflict resolution to the sole responsibility of
the integrator.  A clean way forward may be for everybody involved
(and I see you forgot to add Joey to this thread) to agree that this
topic is more ready than jh/clean-smudge-annex and you and Joey to
work together to rebase jh/clean-smudge-annex on top of this topic
(or possibly the other way around), making him wait for you.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] diff: document diff-filter exclusion

2016-07-14 Thread Stefan Beller
On Thu, Jul 14, 2016 at 12:20 PM, Junio C Hamano  wrote:
> In v1.8.5 days, 7f2ea5f0 (diff: allow lowercase letter to specify
> what change class to exclude, 2013-07-17) taught the "--diff-filter"
> mechanism to take lowercase letters as exclusion, but we forgot to
> document it.
>
> Signed-off-by: Junio C Hamano 
> ---
>  Documentation/diff-options.txt | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
> index 3ad6404..073c7e5 100644
> --- a/Documentation/diff-options.txt
> +++ b/Documentation/diff-options.txt
> @@ -399,6 +399,9 @@ ifndef::git-format-patch[]
> paths are selected if there is any file that matches
> other criteria in the comparison; if there is no file
> that matches other criteria, nothing is selected.
> ++
> +Also, these upper-case letters can be downcased to exclude.  E.g.
> +`--diff-filter=ad` excludes added and deleted paths.
>

This looks good for the neighboring thread. :)
http://thread.gmane.org/gmane.comp.version-control.git/299512
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] diff: document diff-filter exclusion

2016-07-14 Thread Junio C Hamano
In v1.8.5 days, 7f2ea5f0 (diff: allow lowercase letter to specify
what change class to exclude, 2013-07-17) taught the "--diff-filter"
mechanism to take lowercase letters as exclusion, but we forgot to
document it.

Signed-off-by: Junio C Hamano 
---
 Documentation/diff-options.txt | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index 3ad6404..073c7e5 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -399,6 +399,9 @@ ifndef::git-format-patch[]
paths are selected if there is any file that matches
other criteria in the comparison; if there is no file
that matches other criteria, nothing is selected.
++
+Also, these upper-case letters can be downcased to exclude.  E.g.
+`--diff-filter=ad` excludes added and deleted paths.
 
 -S::
Look for differences that change the number of occurrences of
-- 
2.9.1-545-g8c0a069

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Server-side preventing some files from being overwritten

2016-07-14 Thread Junio C Hamano
Thorsten Glaser  writes:

> On Thu, 14 Jul 2016, Junio C Hamano wrote:
>
>> Can't this become simpler, e.g.
>> 
>>  if ! git diff-tree --quiet "$old" "$new" -- "$subdir"
>
> Thought about diff-tree, but additions are permitted,

diff-tree -r --diff-filter=a "$old" "$new" -- "$subdir" 

or something like that?

> and diffing the actual file content has overhead too.

It is why exactly "diff-tree" is desired here, as you do not have to
diff actual file at all.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options

2016-07-14 Thread Junio C Hamano
Stefan Beller  writes:

>>> + lf = strchr(line, '\n');
>>> + if (lf)
>>> + *lf = '\0';
>>
>> packet_read_line() -> packet_read_line_generic() calls packet_read()
>> with PACKET_READ_CHOMP_NEWLINE flag bit; do we need this check?
>
> This check was not about "option with lf at end\n", but rather we want to chop
> off "option\nover\nmultiple\nlines" ?

Ahh, I did misread the check.

> Although as you remarked in another email, this would not pose a problem for
> the shell variable, so we could also drop it to allow multi line
> options. will do.

One thing to note is that I do not think there is a guarantee that
packet_buf[] is NUL-terminated, and when len == LAGE_PACKET_MAX, you
do not have room to NUL-terminate it yourself.

string_list_append(ret, line) that assumes the "string" is NUL
terminated may become an issue that you need to solve by appending
the result of xmemdupz() into a non-duping string list.


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options

2016-07-14 Thread Stefan Beller
> packet_read_line() calls packet_read_line_generic() that calls
> packet_read() with the fixed packet_buffer[] that is sized to be
> LARGE_PACKET_MAX.

>
> Can this check even trigger?

I thought when len == LARGE_PACKET_MAX,
this could trigger. Though on inspection of packet_read,
we already reject packets that have size LARGE_PACKET_MAX,
and the largest size allowed is LARGE_PACKET_MAX - 1.

I guess we can remove it altogether and still be future proof.
If we ever want to allow larger push options we either need to
have larger (variable sized) packets or a capability push-options-v2,
so I'll rip this check out.

>
>> + lf = strchr(line, '\n');
>> + if (lf)
>> + *lf = '\0';
>
> packet_read_line() -> packet_read_line_generic() calls packet_read()
> with PACKET_READ_CHOMP_NEWLINE flag bit; do we need this check?

This check was not about "option with lf at end\n", but rather we want to chop
off "option\nover\nmultiple\nlines" ?

Although as you remarked in another email, this would not pose a problem for
the shell variable, so we could also drop it to allow multi line
options. will do.

>
>> +
>> + string_list_append(ret, line);
>> + }
>> +
>> + return ret;
>> +}
>
> Other than that, looks good to me.
>
> Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Server-side preventing some files from being overwritten

2016-07-14 Thread Stefan Beller
On Thu, Jul 14, 2016 at 11:44 AM, Junio C Hamano  wrote:
> On Thu, Jul 14, 2016 at 11:27 AM, Junio C Hamano  wrote:
>> Thorsten Glaser  writes:
>>
>>>   if test x"0" != x"$(comm -23z \
>>>   <(git ls-tree -r -z "$old" "$subdir" | sort -z) \
>>>   <(git ls-tree -r -z "$new" "$subdir" | sort -z) | wc -c)"; then
>>>   echo >&2 'Untouchable files touched, commit rejected!'
>>>   exit 1
>>>   fi
>>
>> Can't this become simpler, e.g.
>>
>> if ! git diff-tree --quiet "$old" "$new" -- "$subdir"
>> then
>> echo >&2 "Ooh, $subdir is touched"

No need to go for >&2 here, as it makes no difference to
the client.

>> exit 1
>> fi
>
> Ehh, you need to tell diff-tree to recurse, i.e. "diff-tree -r".
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Server-side preventing some files from being overwritten

2016-07-14 Thread Thorsten Glaser
On Thu, 14 Jul 2016, Junio C Hamano wrote:

> Can't this become simpler, e.g.
> 
>   if ! git diff-tree --quiet "$old" "$new" -- "$subdir"

Thought about diff-tree, but additions are permitted,
and diffing the actual file content has overhead too.

Just counting the number of object hashes removed from
the old tree (recursed) works out just fine.

bye,
//mirabilos
-- 
tarent solutions GmbH
Rochusstraße 2-4, D-53123 Bonn • http://www.tarent.de/
Tel: +49 228 54881-393 • Fax: +49 228 54881-235
HRB 5168 (AG Bonn) • USt-ID (VAT): DE122264941
Geschäftsführer: Dr. Stefan Barth, Kai Ebenrett, Boris Esser, Alexander Steeg
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv5 0/4] Push options

2016-07-14 Thread Jeff King
On Thu, Jul 14, 2016 at 11:48:58AM -0700, Stefan Beller wrote:

> > Hmm. So that is a downside for people who have implemented separate
> > DoS protection only in that upgrading git will open a new "hole"
> > that they need to plug. But that is their problem, not upstream
> > git's.
> >
> > -Peff
> 
> But this is not specific to DoS protection, but like most features.
> Just look at the vendors using linux that think carrying patches out
> of tree for their special snowflake hardware is cheaper than getting
> it upstream (This is not to be read as a snarky comment, but
> rather as a pointer to the similarity of the mechanisms involved).
> Although I cannot tell offhand another feature that would easily break
> downstream customization.

Right. I was serious when I said "their problem, not git's".

Most features are a little better off in that they are not a possible
negative for somebody downstream to receive them. But again, I don't
think that needs to restrict what git releases.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv5 0/4] Push options

2016-07-14 Thread Stefan Beller
On Thu, Jul 14, 2016 at 11:41 AM, Jeff King  wrote:
> On Thu, Jul 14, 2016 at 10:39:16AM -0700, Stefan Beller wrote:
>
>> Jeff wrote:
>> > Junio wrote:
>> >> I think those extra knobs can come later.  If we are not going to
>> >> limit with max_options in the end, however, wouldn't it be more
>> >> natural for the initial iteration without any configuration not to
>> >> have hard-coded max_options at all?
>> >
>> > Yeah, I am OK with adding restrictive knobs later as a separate topic.
>> > As Stefan notes, upstream does not have the other knobs anyway, and IIRC
>> > the push-options feature is not even enabled by default.
>>
>> * now it actually is not a default. ;)
>
> Hmm. So that is a downside for people who have implemented separate DoS
> protection only in that upgrading git will open a new "hole" that they
> need to plug. But that is their problem, not upstream git's.
>
> -Peff

But this is not specific to DoS protection, but like most features.
Just look at the vendors
using linux that think carrying patches out of tree for their special
snowflake hardware
is cheaper than getting it upstream (This is not to be read as a
snarky comment, but
rather as a pointer to the similarity of the mechanisms involved).
Although I cannot tell
offhand another feature that would easily break downstream customization.

Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Server-side preventing some files from being overwritten

2016-07-14 Thread Junio C Hamano
On Thu, Jul 14, 2016 at 11:27 AM, Junio C Hamano  wrote:
> Thorsten Glaser  writes:
>
>>   if test x"0" != x"$(comm -23z \
>>   <(git ls-tree -r -z "$old" "$subdir" | sort -z) \
>>   <(git ls-tree -r -z "$new" "$subdir" | sort -z) | wc -c)"; then
>>   echo >&2 'Untouchable files touched, commit rejected!'
>>   exit 1
>>   fi
>
> Can't this become simpler, e.g.
>
> if ! git diff-tree --quiet "$old" "$new" -- "$subdir"
> then
> echo >&2 "Ooh, $subdir is touched"
> exit 1
> fi

Ehh, you need to tell diff-tree to recurse, i.e. "diff-tree -r".
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv5 0/4] Push options

2016-07-14 Thread Jeff King
On Thu, Jul 14, 2016 at 10:39:16AM -0700, Stefan Beller wrote:

> Jeff wrote:
> > Junio wrote:
> >> I think those extra knobs can come later.  If we are not going to
> >> limit with max_options in the end, however, wouldn't it be more
> >> natural for the initial iteration without any configuration not to
> >> have hard-coded max_options at all?
> >
> > Yeah, I am OK with adding restrictive knobs later as a separate topic.
> > As Stefan notes, upstream does not have the other knobs anyway, and IIRC
> > the push-options feature is not even enabled by default.
> 
> * now it actually is not a default. ;)

Hmm. So that is a downside for people who have implemented separate DoS
protection only in that upgrading git will open a new "hole" that they
need to plug. But that is their problem, not upstream git's.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] mingw: fix regression in t1308-config-set

2016-07-14 Thread Jeff King
On Thu, Jul 14, 2016 at 09:18:18AM -0700, Junio C Hamano wrote:

> I added a few missing Cc: and quoted the whole patch here to those
> who were involved; I think this update is correct, but just trying
> to make sure people know.
> 
> Not limited to this particular topic, there probably are some things
> we can and should add to the procedure to prevent further episodes
> like this, but I am not seeing anything immediately obvious offhand.
> There already is a way to prominently mark a topic to be not-ready
> with an outstanding issue called "What's cooking" report, but it is
> maintained manually and it can be leaky without extra set of eyes
> constantly monitoring.

Thanks, this fix looks good.

I'm open to to suggestions to make life easier for Windows folks.
Usually when dealing with paths, the suggestion is to use $(pwd), which
I did in the original, but as 58461bd noted, that broke other cases.

So code-wise, maybe this technique could be more general. I.e., could
$TRASH_DIRECTORY or $HOME already be in whatever format is likely to be
produced by git on the platform? Or does that just screw things up more
because Windows sometimes needs one form and sometimes the other?

Process-wise, I'm not sure. I seem to recall a few times when Windows
issues have come up in the past that somebody (JSixt?) seemed content to
let topics graduate with potential portability problems, and then have
the Git for Windows project fix them up separately. These days GfW seems
to track upstream more closely (which is wonderful!), but it means
portability problems cause a more immediate headache and slow down that
process.

> > Side note: it was not at all clear to me how 58461bd fixed the
> > problem by replacing $(pwd) with $HOME, given that HOME is set to
> > $TRASH_DIRECTORY which is set to $TEST_OUTPUT_DIRECTORY/... after
> > TEST_OUTPUT_DIRECTORY was set to TEST_DIRECTORY which in turn was
> > set to $(pwd).
> >
> > I guess the reason is that -P in `cd -P "$TRASH DIRECTORY"`, but
> > then I *really* do not understand how $(pwd) and $PWD could
> > disagree.

I don't think they did disagree. The issue is that "cd -P" caused _both_
of them to print the physical directory. But git is not using either of
them. It is blindly using "$HOME". So basically:

  HOME=/path/with/symlinks
  cd -P "$HOME"

will cause "$PWD" and "$HOME" to disagree. Likewise with "$(pwd)" and
"$HOME".

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options

2016-07-14 Thread Junio C Hamano
Stefan Beller  writes:

> +static struct string_list *read_push_options(void)
> +{
> + struct string_list *ret = xmalloc(sizeof(*ret));
> + string_list_init(ret, 1);
> +
> + while (1) {
> + char *line, *lf;
> + int len;
> +
> + line = packet_read_line(0, );
> +
> + if (!line)
> + break;
> +
> + /*
> + * NEEDSWORK: expose the limitations to be configurable;
> + * Once the limit can be lifted, include a way for payloads
> + * larger than one pkt, e.g use last byte to indicate if
> + * the push option continues in the next packet or implement
> + * larger packets.
> + */
> + if (len > LARGE_PACKET_MAX - 1) {

packet_read_line() calls packet_read_line_generic() that calls
packet_read() with the fixed packet_buffer[] that is sized to be
LARGE_PACKET_MAX.

Can this check even trigger?

> + /*
> +  * NEEDSWORK: The error message in die(..) is not
> +  * transmitted in call cases, so ideally all die(..)
> +  * calls are prefixed with rp_error and then we can
> +  * combine rp_error && die into one helper function.
> +  */
> + rp_error("protocol error: server configuration allows 
> push "
> +  "options of size up to %d bytes",
> +  LARGE_PACKET_MAX - 1);
> + die("protocol error: push options too large");
> + }

> + lf = strchr(line, '\n');
> + if (lf)
> + *lf = '\0';

packet_read_line() -> packet_read_line_generic() calls packet_read()
with PACKET_READ_CHOMP_NEWLINE flag bit; do we need this check?

> +
> + string_list_append(ret, line);
> + }
> +
> + return ret;
> +}

Other than that, looks good to me.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Server-side preventing some files from being overwritten

2016-07-14 Thread Junio C Hamano
Thorsten Glaser  writes:

> Although I’m ordinarily loath to write GNU bash scripts, this
> helps avoiding temporary files. This works:
>
> -cutting here may damage your screen surface-
> #!/bin/bash
> export LC_ALL=C
> subdir=x/y
> while IFS=' ' read -r old new name; do
>   test x"$name" = x"refs/heads/master" || continue
>   if test x"0" != x"$(comm -23z \
>   <(git ls-tree -r -z "$old" "$subdir" | sort -z) \
>   <(git ls-tree -r -z "$new" "$subdir" | sort -z) | wc -c)"; then
>   echo >&2 'Untouchable files touched, commit rejected!'
>   exit 1
>   fi

Can't this become simpler, e.g.

if ! git diff-tree --quiet "$old" "$new" -- "$subdir"
then
echo >&2 "Ooh, $subdir is touched"
exit 1
fi

> done
> exit 0
> -cutting here may damage your screen surface-
>
> Of course, set “subdir” in line 3 correctly, and GNU coreutils
> are required for the NUL line termination, which is not an issue
> here. (BSD has “-R ''” for sort(1), for example.)
>
> bye,
> //mirabilos
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 2/5] t5000: test tar files that overflow ustar headers

2016-07-14 Thread Jeff King
On Thu, Jul 14, 2016 at 06:45:47PM +0200, Johannes Sixt wrote:

> Am 14.07.2016 um 17:47 schrieb Johannes Schindelin:
> > On Thu, 30 Jun 2016, Jeff King wrote:
> > > The ustar format only has room for 11 (or 12, depending on
> > > some implementations) octal digits for the size and mtime of
> > > each file. For values larger than this, we have to add pax
> > > extended headers to specify the real data, and git does not
> > > yet know how to do so.
> > > 
> > > [...]
> > >  t/t5000/19f9c8273ec45a8938e6999cb59b3ff66739902a | Bin 0 -> 2048 bytes
> > 
> > It appears that this blob cannot be read when sizeof(unsigned long) == 4.
> > This happens to break the t5000 test on Windows, where that comparison
> > holds true.
> 
> The problem occurs in parse_sha1_header_extended(), where the calculation of
> the size in the object header overflows our 32-bit unsigned long.

Yep, unsurprising.

I do think git is wrong to use "unsigned long" there. It really ought to
be "size_t". My understanding is that 64-bit Windows is LLP64, which
means that you cannot currently have blobs greater than 4GB there, even
though it would work correctly with size_t.

Switching all of the things that look at blob sizes to size_t will be a
big job, though.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 2/5] t5000: test tar files that overflow ustar headers

2016-07-14 Thread Jeff King
On Thu, Jul 14, 2016 at 05:47:41PM +0200, Johannes Schindelin wrote:

> On Thu, 30 Jun 2016, Jeff King wrote:
> 
> > The ustar format only has room for 11 (or 12, depending on
> > some implementations) octal digits for the size and mtime of
> > each file. For values larger than this, we have to add pax
> > extended headers to specify the real data, and git does not
> > yet know how to do so.
> >
> > [...]
> >  t/t5000/19f9c8273ec45a8938e6999cb59b3ff66739902a | Bin 0 -> 2048 bytes
> 
> It appears that this blob cannot be read when sizeof(unsigned long) == 4.
> This happens to break the t5000 test on Windows, where that comparison
> holds true.
> 
> I am sure that I missed some other discussion about this issue... could
> you point me to it?

There's tons of discussion in:

  http://thread.gmane.org/gmane.comp.version-control.git/297409

but frankly it is not worth your time to read it. These tests are about
overflowing the tar limits, which can only happen with times and sizes
greater than 32-bits. The right thing to do is to skip the tests
entirely on systems where sizeof(unsigned long) is less than 8 (the
actual value is 64GB+1, so technically a 37-bit system would work, but I
think it is OK for the test-skipping to be less specific).

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What's cooking in git.git (Jun 2016, #05; Thu, 16)

2016-07-14 Thread Junio C Hamano
Torsten Bögershausen  writes:

>> There's less redundant work going on than at first seems, because
>> .gitattribute files are only read once and cached. Verified by strace.
>>
> OK, I think I missed that work (not enough time for Git at the moment)
> Junio, please help me out, do we have a cache here now?

The call convert_attrs() makes to the attribute subsystem is:

git_check_attr(path, NUM_CONV_ATTRS, ccheck)

Conceptually, this call has to do the following, and for the very
first call that is actually what it does:

 1. Read all the relevant attrubutes files.  If you are asking for
"a/x/1", we need to read $GIT_DIR/info/attributes,
".gitattributes", "a/.gitattributes" and "a/x/.gitattributes".

 2. Find matching patterns that cover "a/x/1", and pick up the
attribute definition from the above.

If you have asked for "a/x/1", it is very likely that you would next
ask for "a/x/2" (think: "git checkout a/x"), and we can realize that
exactly the same set of attributes files apply to that path.  So an
obvious optimization is to cache the result of the first step.

In addition, it is also likely that you would later ask for "a/y/3"
before asking for "b/z/4" (think: "git add .").  A part of the step
1. that was done when you asked for "a/x/1" and then was reused when
you asked for "a/x/2" can further be reused for this request, by
discarding only what was read from "a/x/.gitattributes" and reading
only from "a/y/.gitattributes".

The above two optimizations are done in prepare_attr_stack().

Unfortunately this is one of the three reasons why the attribute
subsystem is not thread-ready.  I.e. there is only one global cache,
so if you spawn two threads to speed up "git add ." by handing paths
[a-m]* to one and [n-z]* to the other, they would thrash the cache
and making it ineffective (even if we protect the cache with mutex,
which obviously has not been done).

I earlier started looking into this, but the effort stalled.  But
for a single-thread use, the attributes read from the filesystem are
cached, and the cache is designed to perform well as long as you
make requests in-order.

To make the attribute look-up thread-ready, the attribute cache
needs to become per-thread.

Orthogonal of the threading issue, there is another posssible
optimization that is not there yet.  The cache can be tied to what
is in ccheck[] to further reduce the size of the cache, making step
2. a lot cheaper.  Currently in step 1. we read and keep everything,
but if we tie the cache to the contents of ccheck[], we can read and
ignore entries we read in step 1. that does not talk about the
attributes ccheck[] is interested in.  My plan is to either

 (1) make the cache per-thread, limit the reading done in 1. to
 ccheck[], but invalidate the cache when a different set of
 attributes are asked; or

 (2) make the cache per .
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 3/5] archive-tar: write extended headers for file sizes >= 8GB

2016-07-14 Thread Jeff King
On Thu, Jul 14, 2016 at 10:11:18AM -0700, Junio C Hamano wrote:

> Johannes Sixt  writes:
> 
> > Am 30.06.2016 um 11:09 schrieb Jeff King:
> >> +/*
> >> + * This is the max value that a ustar size header can specify, as it is 
> >> fixed
> >> + * at 11 octal digits. POSIX specifies that we switch to extended headers 
> >> at
> >> + * this size.
> >> + */
> >> +#define USTAR_MAX_SIZE 0777UL
> >
> > This is too large by one bit for our 32-bit unsigned long on Windows:
> >
> > archive-tar.c: In function 'write_tar_entry':
> > archive-tar.c:295: warning: integer constant is too large for
> > unsigned long' type
> > archive-tar.c: In function 'write_global_extended_header':
> > archive-tar.c:332: warning: integer constant is too large for
> > unsigned long' type
> > archive-tar.c:335: warning: integer constant is too large for
> > unsigned long' type
> > archive-tar.c:335: warning: overflow in implicit constant conversion
> 
> Yikes.  I guess we need ULL here, and it cascade to all the
> variables used to interact with this constant, but not everybody has
> "long long", so

On 32-bit systems, I suspect the tar limits are a non-issue. For real
filesystem operations, tar on such a system would use off_t. But we use
"unsigned long" internally for object sizes, so I imagine that objects
larger than 4G are simply impossible on such systems.

So one option would be to simply "#if"-out these checks on 32-bit
systems.

I think it would also be OK to just set USTAR_MAX_SIZE to ULONG_MAX on
such a system, too (which effectively eliminates the check, but keeps
the conditional bits contained).

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Server-side preventing some files from being overwritten

2016-07-14 Thread Thorsten Glaser
On Thu, 14 Jul 2016, Stefan Beller wrote:

> go roughly like

Thanks, that did the trick!

Although I’m ordinarily loath to write GNU bash scripts, this
helps avoiding temporary files. This works:

-cutting here may damage your screen surface-
#!/bin/bash
export LC_ALL=C
subdir=x/y
while IFS=' ' read -r old new name; do
test x"$name" = x"refs/heads/master" || continue
if test x"0" != x"$(comm -23z \
<(git ls-tree -r -z "$old" "$subdir" | sort -z) \
<(git ls-tree -r -z "$new" "$subdir" | sort -z) | wc -c)"; then
echo >&2 'Untouchable files touched, commit rejected!'
exit 1
fi
done
exit 0
-cutting here may damage your screen surface-

Of course, set “subdir” in line 3 correctly, and GNU coreutils
are required for the NUL line termination, which is not an issue
here. (BSD has “-R ''” for sort(1), for example.)

bye,
//mirabilos
-- 
tarent solutions GmbH
Rochusstraße 2-4, D-53123 Bonn • http://www.tarent.de/
Tel: +49 228 54881-393 • Fax: +49 228 54881-235
HRB 5168 (AG Bonn) • USt-ID (VAT): DE122264941
Geschäftsführer: Dr. Stefan Barth, Kai Ebenrett, Boris Esser, Alexander Steeg
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/4] push: accept push options

2016-07-14 Thread Stefan Beller
This implements everything that is required on the client side to make use
of push options from the porcelain push command.

Signed-off-by: Stefan Beller 
---
 Documentation/git-push.txt |  8 +++-
 builtin/push.c | 21 ++---
 send-pack.c| 27 +++
 send-pack.h|  3 +++
 transport.c|  1 +
 transport.h|  7 +++
 6 files changed, 63 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-push.txt b/Documentation/git-push.txt
index 19f46b6..e960258 100644
--- a/Documentation/git-push.txt
+++ b/Documentation/git-push.txt
@@ -11,7 +11,7 @@ SYNOPSIS
 [verse]
 'git push' [--all | --mirror | --tags] [--follow-tags] [--atomic] [-n | 
--dry-run] [--receive-pack=]
   [--repo=] [-f | --force] [-d | --delete] [--prune] [-v | 
--verbose]
-  [-u | --set-upstream]
+  [-u | --set-upstream] [--push-option=]
   [--[no-]signed|--sign=(true|false|if-asked)]
   [--force-with-lease[=[:]]]
   [--no-verify] [ [...]]
@@ -156,6 +156,12 @@ already exists on the remote side.
Either all refs are updated, or on error, no refs are updated.
If the server does not support atomic pushes the push will fail.
 
+-o::
+--push-option::
+   Transmit the given string to the server, which passes them to
+   the pre-receive as well as the post-receive hook. The given string
+   must not contain a NUL or LF character.
+
 --receive-pack=::
 --exec=::
Path to the 'git-receive-pack' program on the remote
diff --git a/builtin/push.c b/builtin/push.c
index 4e9e4db..3bb9d6b 100644
--- a/builtin/push.c
+++ b/builtin/push.c
@@ -353,7 +353,8 @@ static int push_with_options(struct transport *transport, 
int flags)
return 1;
 }
 
-static int do_push(const char *repo, int flags)
+static int do_push(const char *repo, int flags,
+  const struct string_list *push_options)
 {
int i, errs;
struct remote *remote = pushremote_get(repo);
@@ -376,6 +377,9 @@ static int do_push(const char *repo, int flags)
if (remote->mirror)
flags |= (TRANSPORT_PUSH_MIRROR|TRANSPORT_PUSH_FORCE);
 
+   if (push_options->nr)
+   flags |= TRANSPORT_PUSH_OPTIONS;
+
if ((flags & TRANSPORT_PUSH_ALL) && refspec) {
if (!strcmp(*refspec, "refs/tags/*"))
return error(_("--all and --tags are incompatible"));
@@ -406,13 +410,16 @@ static int do_push(const char *repo, int flags)
for (i = 0; i < url_nr; i++) {
struct transport *transport =
transport_get(remote, url[i]);
+   if (flags & TRANSPORT_PUSH_OPTIONS)
+   transport->push_options = push_options;
if (push_with_options(transport, flags))
errs++;
}
} else {
struct transport *transport =
transport_get(remote, NULL);
-
+   if (flags & TRANSPORT_PUSH_OPTIONS)
+   transport->push_options = push_options;
if (push_with_options(transport, flags))
errs++;
}
@@ -500,6 +507,9 @@ int cmd_push(int argc, const char **argv, const char 
*prefix)
int push_cert = -1;
int rc;
const char *repo = NULL;/* default repository */
+   static struct string_list push_options = STRING_LIST_INIT_DUP;
+   static struct string_list_item *item;
+
struct option options[] = {
OPT__VERBOSITY(),
OPT_STRING( 0 , "repo", , N_("repository"), 
N_("repository")),
@@ -533,6 +543,7 @@ int cmd_push(int argc, const char **argv, const char 
*prefix)
  0, "signed", _cert, "yes|no|if-asked", N_("GPG sign the 
push"),
  PARSE_OPT_OPTARG, option_parse_push_signed },
OPT_BIT(0, "atomic", , N_("request atomic transaction on 
remote side"), TRANSPORT_PUSH_ATOMIC),
+   OPT_STRING_LIST('o', "push-option", _options, 
N_("server-specific"), N_("option to transmit")),
OPT_SET_INT('4', "ipv4", , N_("use IPv4 addresses only"),
TRANSPORT_FAMILY_IPV4),
OPT_SET_INT('6', "ipv6", , N_("use IPv6 addresses only"),
@@ -563,7 +574,11 @@ int cmd_push(int argc, const char **argv, const char 
*prefix)
set_refspecs(argv + 1, argc - 1, repo);
}
 
-   rc = do_push(repo, flags);
+   for_each_string_list_item(item, _options)
+   if (strchr(item->string, '\n'))
+   die(_("push options must not have new line 
characters"));
+
+   rc = do_push(repo, flags, _options);
if (rc == -1)
usage_with_options(push_usage, options);
else
diff --git a/send-pack.c b/send-pack.c

[PATCH 4/4] add a test for push options

2016-07-14 Thread Stefan Beller
The functions `mk_repo_pair` as well as `test_refs` are borrowed from
t5543-atomic-push, with additional hooks installed.

Signed-off-by: Stefan Beller 
---
 t/t5545-push-options.sh | 103 
 1 file changed, 103 insertions(+)
 create mode 100755 t/t5545-push-options.sh

diff --git a/t/t5545-push-options.sh b/t/t5545-push-options.sh
new file mode 100755
index 000..ea813b9
--- /dev/null
+++ b/t/t5545-push-options.sh
@@ -0,0 +1,103 @@
+#!/bin/sh
+
+test_description='pushing to a repository using push options'
+
+. ./test-lib.sh
+
+mk_repo_pair () {
+   rm -rf workbench upstream &&
+   test_create_repo upstream &&
+   test_create_repo workbench &&
+   (
+   cd upstream &&
+   git config receive.denyCurrentBranch warn &&
+   mkdir -p .git/hooks &&
+   cat >.git/hooks/pre-receive <<-'EOF' &&
+   #!/bin/sh
+   if test -n "$GIT_PUSH_OPTION_COUNT"; then
+   i=0
+   >hooks/pre-receive.push_options
+   while test "$i" -lt "$GIT_PUSH_OPTION_COUNT"; do
+   eval "value=\$GIT_PUSH_OPTION_$i"
+   echo $value >>hooks/pre-receive.push_options
+   i=$((i + 1))
+   done
+   fi
+   EOF
+   chmod u+x .git/hooks/pre-receive
+
+   cat >.git/hooks/post-receive <<-'EOF' &&
+   #!/bin/sh
+   if test -n "$GIT_PUSH_OPTION_COUNT"; then
+   i=0
+   >hooks/post-receive.push_options
+   while test "$i" -lt "$GIT_PUSH_OPTION_COUNT"; do
+   eval "value=\$GIT_PUSH_OPTION_$i"
+   echo $value >>hooks/post-receive.push_options
+   i=$((i + 1))
+   done
+   fi
+   EOF
+   chmod u+x .git/hooks/post-receive
+   ) &&
+   (
+   cd workbench &&
+   git remote add up ../upstream
+   )
+}
+
+# Compare the ref ($1) in upstream with a ref value from workbench ($2)
+# i.e. test_refs second HEAD@{2}
+test_refs () {
+   test $# = 2 &&
+   git -C upstream rev-parse --verify "$1" >expect &&
+   git -C workbench rev-parse --verify "$2" >actual &&
+   test_cmp expect actual
+}
+
+test_expect_success 'one push option works for a single branch' '
+   mk_repo_pair &&
+   git -C upstream config receive.advertisePushOptions true &&
+   (
+   cd workbench &&
+   test_commit one &&
+   git push --mirror up &&
+   test_commit two &&
+   git push --push-option=asdf up master
+   ) &&
+   test_refs master master &&
+   echo "asdf" >expect &&
+   test_cmp expect upstream/.git/hooks/pre-receive.push_options &&
+   test_cmp expect upstream/.git/hooks/post-receive.push_options
+'
+
+test_expect_success 'push option denied by remote' '
+   mk_repo_pair &&
+   git -C upstream config receive.advertisePushOptions false &&
+   (
+   cd workbench &&
+   test_commit one &&
+   git push --mirror up &&
+   test_commit two &&
+   test_must_fail git push --push-option=asdf up master
+   ) &&
+   test_refs master HEAD@{1}
+'
+
+test_expect_success 'two push options work' '
+   mk_repo_pair &&
+   git -C upstream config receive.advertisePushOptions true &&
+   (
+   cd workbench &&
+   test_commit one &&
+   git push --mirror up &&
+   test_commit two &&
+   git push --push-option=asdf --push-option="more structured 
text" up master
+   ) &&
+   test_refs master master &&
+   printf "asdf\nmore structured text\n" >expect &&
+   test_cmp expect upstream/.git/hooks/pre-receive.push_options &&
+   test_cmp expect upstream/.git/hooks/post-receive.push_options
+'
+
+test_done
-- 
2.9.0.247.gf748855.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/4] push options: {pre,post}-receive hook learns about push options

2016-07-14 Thread Stefan Beller
The environment variable GIT_PUSH_OPTION_COUNT is set to the number of
push options sent, and GIT_PUSH_OPTION_{0,1,..} is set to the transmitted
option.

The code is not executed as the push options are set to NULL, nor is the
new capability advertised.

There was some discussion back and forth how to present these push options
to the user as there are some ways to do it:

Keep all options in one environment variable

+ easiest way to implement in Git
- This would make things hard to parse correctly in the hook.

Put the options in files instead,
filenames are in GIT_PUSH_OPTION_FILES
==
+ After a discussion about environment variables and shells, we may not
  want to put user data into an environment variable (see [1] for example).
+ We could transmit binaries, i.e. we're not bound to C strings as
  we are when using environment variables to the user.
+ Maybe easier to parse than constructing environment variable names
  GIT_PUSH_OPTION_{0,1,..} yourself
- cleanup of the temporary files is hard to do reliably
- we have race conditions with multiple clients pushing, hence we'd need
  to use mkstemp. That's not too bad, but still.

Use environment variables, but restrict to key/value pairs
==
(When the user pushes a push option `foo=bar`, we'd
GIT_PUSH_OPTION_foo=bar)
+ very easy to parse for a simple model of push options
- it's not sufficient for more elaborate models, e.g.
  it doesn't allow doubles (e.g. cc=reviewer@email)

Present the options in different environment variables
==
(This is implemented)
* harder to parse as a user, but we have a sample hook for that.
- doesn't allow binary files
+ allows the same option twice, i.e. is not restrictive about
  options, except for binary files.
+ doesn't clutter a remote directory with (possibly stale)
  temporary files

As we first want to focus on getting simple strings to work
reliably, we go with the last option for now. If we want to
do transmission of binaries later, we can just attach a
'side-channel', e.g. "any push option that contains a '\0' is
put into a file instead of the environment variable and we'd
have new GIT_PUSH_OPTION_FILES, GIT_PUSH_OPTION_FILENAME_{0,1,..}
environment variables".

We limit the push options for now
* to not exceed an arbitrary count, and
* to not exceed an arbitrary size.

This serves two purposes:
* DoS protection (i.e. one connection can add no more than 32kB
  now)
* We need to figure out how to handle large (>64kB). Jeff wrote:
  > Yes, but people are also happy when they can use a flexible and
  > standardized tool to do a thing. I'd be more frustrated when I found out
  > that Git's data-pushing protocol has arbitrary limitations (like, say, I
  > can't push a data item larger than a single 64K pkt-line), which would
  > easily just work with something like HTTP POSTs.
  So to keep a way open in the future to deal with large pay loads,
  the size is restricted for now.

[1] 'Shellshock' https://lwn.net/Articles/614218/

Signed-off-by: Stefan Beller 
---
 Documentation/githooks.txt  | 18 ++
 builtin/receive-pack.c  | 49 +++--
 templates/hooks--pre-receive.sample | 24 ++
 3 files changed, 78 insertions(+), 13 deletions(-)
 create mode 100644 templates/hooks--pre-receive.sample

diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
index d82e912..9565dc3 100644
--- a/Documentation/githooks.txt
+++ b/Documentation/githooks.txt
@@ -247,6 +247,15 @@ Both standard output and standard error output are 
forwarded to
 'git send-pack' on the other end, so you can simply `echo` messages
 for the user.
 
+The number of push options given on the command line of
+`git push --push-option=...` can be read from the environment
+variable `GIT_PUSH_OPTION_COUNT`, and the options themselves are
+found in `GIT_PUSH_OPTION_0`, `GIT_PUSH_OPTION_1`,...
+If it is negotiated to not use the push options phase, the
+environment variables will not be set. If the client selects
+to use push options, but doesn't transmit any, the count variable
+will be set to zero, `GIT_PUSH_OPTION_COUNT=0`.
+
 [[update]]
 update
 ~~
@@ -322,6 +331,15 @@ a sample script `post-receive-email` provided in the 
`contrib/hooks`
 directory in Git distribution, which implements sending commit
 emails.
 
+The number of push options given on the command line of
+`git push --push-option=...` can be read from the environment
+variable `GIT_PUSH_OPTION_COUNT`, and the options themselves are
+found in `GIT_PUSH_OPTION_0`, `GIT_PUSH_OPTION_1`,...
+If it is negotiated to not use the push options phase, the
+environment variables will not be set. If the client selects
+to use push options, but doesn't transmit any, the count variable
+will be set to zero, 

[PATCHv5 0/4] Push options

2016-07-14 Thread Stefan Beller
Jeff wrote:
> Junio wrote:
>> I think those extra knobs can come later.  If we are not going to
>> limit with max_options in the end, however, wouldn't it be more
>> natural for the initial iteration without any configuration not to
>> have hard-coded max_options at all?
>
> Yeah, I am OK with adding restrictive knobs later as a separate topic.
> As Stefan notes, upstream does not have the other knobs anyway, and IIRC
> the push-options feature is not even enabled by default.
>
> -Peff

* now it actually is not a default. ;)
* removed knobs, but instead we only reject at > LARGE_PACKET_MAX - 1,

Thanks,
Stefan

v5:
git diff origin/sb/push-options:
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 4d8041a..917ac18 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -44,7 +44,7 @@ static struct strbuf fsck_msg_types = STRBUF_INIT;
 static int receive_unpack_limit = -1;
 static int transfer_unpack_limit = -1;
 static int advertise_atomic_push = 1;
-static int advertise_push_options = 1;
+static int advertise_push_options;
 static int unpack_limit = 100;
 static int report_status;
 static int use_sideband;
@@ -1501,24 +1501,11 @@ static struct command *read_head_info(struct sha1_array 
*shallow)
 
 static struct string_list *read_push_options(void)
 {
-   int i;
struct string_list *ret = xmalloc(sizeof(*ret));
-   /* NEEDSWORK: expose the limitations to be configurable. */
-   int max_options = 32;
-
-   /*
-* NEEDSWORK: expose the limitations to be configurable;
-* Once the limit can be lifted, include a way for payloads
-* larger than one pkt, e.g allow a payload of up to
-* LARGE_PACKET_MAX - 1 only, and reserve the last byte
-* to indicate whether the next pkt continues with this
-* push option.
-*/
-   int max_size = 1024;
-
string_list_init(ret, 1);
-   for (i = 0; i < max_options; i++) {
-   char *line;
+
+   while (1) {
+   char *line, *lf;
int len;
 
line = packet_read_line(0, );
@@ -1526,7 +1513,14 @@ static struct string_list *read_push_options(void)
if (!line)
break;
 
-   if (len > max_size) {
+   /*
+   * NEEDSWORK: expose the limitations to be configurable;
+   * Once the limit can be lifted, include a way for payloads
+   * larger than one pkt, e.g use last byte to indicate if
+   * the push option continues in the next packet or implement
+   * larger packets.
+   */
+   if (len > LARGE_PACKET_MAX - 1) {
/*
 * NEEDSWORK: The error message in die(..) is not
 * transmitted in call cases, so ideally all die(..)
@@ -1534,20 +1528,17 @@ static struct string_list *read_push_options(void)
 * combine rp_error && die into one helper function.
 */
rp_error("protocol error: server configuration allows 
push "
-"options of size up to %d bytes", max_size);
+"options of size up to %d bytes",
+LARGE_PACKET_MAX - 1);
die("protocol error: push options too large");
}
 
-   len = strcspn(line, "\n");
-   line[len] = '\0';
+   lf = strchr(line, '\n');
+   if (lf)
+   *lf = '\0';
 
string_list_append(ret, line);
}
-   if (i == max_options) {
-   rp_error("protocol error: server configuration only allows up "
-   "to %d push options", max_options);
-   die("protocol error: push options too large");
-   }
 
return ret;
 }
diff --git a/t/t5545-push-options.sh b/t/t5545-push-options.sh
index 8dd3c8e..ea813b9 100755
--- a/t/t5545-push-options.sh
+++ b/t/t5545-push-options.sh
@@ -57,6 +57,7 @@ test_refs () {
 
 test_expect_success 'one push option works for a single branch' '
mk_repo_pair &&
+   git -C upstream config receive.advertisePushOptions true &&
(
cd workbench &&
test_commit one &&
@@ -85,6 +86,7 @@ test_expect_success 'push option denied by remote' '
 
 test_expect_success 'two push options work' '
mk_repo_pair &&
+   git -C upstream config receive.advertisePushOptions true &&
(
cd workbench &&
test_commit one &&



cover letter v4:

Thanks Junio, Jeff, Jonathan for discussion and feedback!

I went over the emails again and we seem to agree that the initial design (in 
v3)
was sane and the error messages and reporting for corner cases were to be
dismissed as "it happens as often as 'BUG:' messages appear, so let's not care
too deeply now".

Thanks,
Stefan


[PATCH 2/4] receive-pack: implement advertising and receiving push options

2016-07-14 Thread Stefan Beller
The pre/post receive hook may be interested in more information from the
user. This information can be transmitted when both client and server
support the "push-options" capability, which when used is a phase directly
after update commands ended by a flush pkt.

Similar to the atomic option, the server capability can be disabled via
the `receive.advertisePushOptions` config variable. While documenting
this, fix a nit in the `receive.advertiseAtomic` wording.

Signed-off-by: Stefan Beller 
---
 Documentation/config.txt  |  9 +++-
 Documentation/technical/pack-protocol.txt | 10 ++--
 Documentation/technical/protocol-capabilities.txt |  9 
 builtin/receive-pack.c| 59 +++
 4 files changed, 81 insertions(+), 6 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index e208af1..25b5db1 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2410,8 +2410,13 @@ rebase.instructionFormat
 
 receive.advertiseAtomic::
By default, git-receive-pack will advertise the atomic push
-   capability to its clients. If you don't want to this capability
-   to be advertised, set this variable to false.
+   capability to its clients. If you don't want to advertise this
+   capability, set this variable to false.
+
+receive.advertisePushOptions::
+   By default, git-receive-pack will advertise the push options
+   capability to its clients. If you don't want to advertise this
+   capability, set this variable to false.
 
 receive.autogc::
By default, git-receive-pack will run "git-gc --auto" after
diff --git a/Documentation/technical/pack-protocol.txt 
b/Documentation/technical/pack-protocol.txt
index 8b36343..7a2ed30 100644
--- a/Documentation/technical/pack-protocol.txt
+++ b/Documentation/technical/pack-protocol.txt
@@ -454,7 +454,8 @@ The reference discovery phase is done nearly the same way 
as it is in the
 fetching protocol. Each reference obj-id and name on the server is sent
 in packet-line format to the client, followed by a flush-pkt.  The only
 real difference is that the capability listing is different - the only
-possible values are 'report-status', 'delete-refs' and 'ofs-delta'.
+possible values are 'report-status', 'delete-refs', 'ofs-delta' and
+'push-options'.
 
 Reference Update Request and Packfile Transfer
 --
@@ -465,9 +466,10 @@ that it wants to update, it sends a line listing the 
obj-id currently on
 the server, the obj-id the client would like to update it to and the name
 of the reference.
 
-This list is followed by a flush-pkt and then the packfile that should
-contain all the objects that the server will need to complete the new
-references.
+This list is followed by a flush-pkt. Then the push options are transmitted
+one per packet followed by another flush-pkt. After that the packfile that
+should contain all the objects that the server will need to complete the new
+references will be sent.
 
 
   update-request=  *shallow ( command-list | push-cert ) [packfile]
diff --git a/Documentation/technical/protocol-capabilities.txt 
b/Documentation/technical/protocol-capabilities.txt
index eaab6b4..4c28d3a 100644
--- a/Documentation/technical/protocol-capabilities.txt
+++ b/Documentation/technical/protocol-capabilities.txt
@@ -253,6 +253,15 @@ atomic pushes. If the pushing client requests this 
capability, the server
 will update the refs in one atomic transaction. Either all refs are
 updated or none.
 
+push-options
+
+
+If the server sends the 'push-options' capability it is able to accept
+push options after the update commands have been sent, but before the
+packfile is streamed. If the pushing client requests this capability,
+the server will pass the options to the pre- and post- receive hooks
+that process this push request.
+
 allow-tip-sha1-in-want
 --
 
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 236417e..917ac18 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -44,10 +44,12 @@ static struct strbuf fsck_msg_types = STRBUF_INIT;
 static int receive_unpack_limit = -1;
 static int transfer_unpack_limit = -1;
 static int advertise_atomic_push = 1;
+static int advertise_push_options;
 static int unpack_limit = 100;
 static int report_status;
 static int use_sideband;
 static int use_atomic;
+static int use_push_options;
 static int quiet;
 static int prefer_ofs_delta = 1;
 static int auto_update_server_info;
@@ -193,6 +195,11 @@ static int receive_pack_config(const char *var, const char 
*value, void *cb)
return 0;
}
 
+   if (strcmp(var, "receive.advertisepushoptions") == 0) {
+   advertise_push_options = git_config_bool(var, value);
+   return 0;
+   }
+
return git_default_config(var, value, cb);
 }
 
@@ -211,6 +218,8 @@ 

Re: [PATCH v4 3/5] archive-tar: write extended headers for file sizes >= 8GB

2016-07-14 Thread Junio C Hamano
Johannes Sixt  writes:

> Am 30.06.2016 um 11:09 schrieb Jeff King:
>> +/*
>> + * This is the max value that a ustar size header can specify, as it is 
>> fixed
>> + * at 11 octal digits. POSIX specifies that we switch to extended headers at
>> + * this size.
>> + */
>> +#define USTAR_MAX_SIZE 0777UL
>
> This is too large by one bit for our 32-bit unsigned long on Windows:
>
> archive-tar.c: In function 'write_tar_entry':
> archive-tar.c:295: warning: integer constant is too large for
> unsigned long' type
> archive-tar.c: In function 'write_global_extended_header':
> archive-tar.c:332: warning: integer constant is too large for
> unsigned long' type
> archive-tar.c:335: warning: integer constant is too large for
> unsigned long' type
> archive-tar.c:335: warning: overflow in implicit constant conversion

Yikes.  I guess we need ULL here, and it cascade to all the
variables used to interact with this constant, but not everybody has
"long long", so

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 2/5] t5000: test tar files that overflow ustar headers

2016-07-14 Thread Junio C Hamano
Johannes Sixt  writes:

> Am 14.07.2016 um 17:47 schrieb Johannes Schindelin:
>> On Thu, 30 Jun 2016, Jeff King wrote:
>>> The ustar format only has room for 11 (or 12, depending on
>>> some implementations) octal digits for the size and mtime of
>>> each file. For values larger than this, we have to add pax
>>> extended headers to specify the real data, and git does not
>>> yet know how to do so.
>>>
>>> [...]
>>>  t/t5000/19f9c8273ec45a8938e6999cb59b3ff66739902a | Bin 0 -> 2048 bytes
>>
>> It appears that this blob cannot be read when sizeof(unsigned long) == 4.
>> This happens to break the t5000 test on Windows, where that comparison
>> holds true.
>
> The problem occurs in parse_sha1_header_extended(), where the
> calculation of the size in the object header overflows our 32-bit
> unsigned long.

OK, then we'd want something that measures how big "unsigned long"
is, and use it as a lazy prerequisite 64BIT_LONG, tweaking the other
patch to t0006 the other Johannes sent yesterday.

Dscho?  I'll revert the merge of 'js/t0006-for-v2.9.2' out of
'next' so that we can replace it with your newer version, but it
needs to be massaged to lose the strong linkage with "time", as
it is no longer "is our time big enough", I would think.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] fsck_walk(): optionally name objects on the go

2016-07-14 Thread Junio C Hamano
Johannes Schindelin  writes:

> Note that this patch opts for decorating the objects with plain strings
> instead of full-blown structs (à la `struct rev_name` in the code of
> the `git name-rev` command), for several reasons:
>
> - the code is much simpler than if it had to work with structs that
>   describe arbitrarily long names such as "master~14^2~5:builtin/am.c",
>
> - the string processing is actually quite light-weight compared to the
>   rest of fsck's operation,
>
> - the caller of fsck_walk() is expected to provide names for the
>   starting points, and using plain and simple strings is just the
>   easiest way to do that.

Simpler is good; we can always optimize something well-isolated like
this later if it proves necessary.

> +static char *get_object_name(struct fsck_options *options, struct object 
> *obj)
> +{
> + return options->object_names ?
> + lookup_decoration(options->object_names, obj) : NULL;
> +}
> +
> +static void put_object_name(struct fsck_options *options, struct object *obj,
> + const char *fmt, ...)
> +{
> + va_list ap;
> + char *existing = lookup_decoration(options->object_names, obj);
> + struct strbuf buf = STRBUF_INIT;

When reading a few early calling sites, it wasn't quite obvious how
the code avoids the "naming" when .object_names decoration is not
initialized (which is tied to the --name-objects option to decide if
the feature needs to be triggered).  The current "if get_object_name
for the containing object gives us NULL, then we refrain from
calling put_object_name()" may be good enough, but having an early
return similar to get_object_name() would make it easier to grok,
i.e.

get_object_name(...)
{
if (!options->object_names)
return NULL;
return lookup_decoration(...);
}

put_object_name(...)
{
... decls ...

if (!options->object_names)
return NULL;
existing = lookup_decoration(...);
if (existing)
return existing;
...
}

It is a minor point, as the caller needs to check "name" that is the
value returned from get_object_name() for the containing object to
avoid wasting cycles to compute the parameters to pass to
put_object_name() anyway.

>   while (tree_entry(, )) {
>   int result;
>  
> + if (name) {
> + struct object *obj = parse_object(entry.oid->hash);

This worries me somewhat.  IIRC, "git fsck" uses object->parsed to
tell between objects that are unreachable or not and act differently
so I would fear that parsing the object here would screw up that
logic, when the call comes from fsck_dir() -> fsck_sha1_list() ->
fsck_sha1() -> fsck_obj() -> fsck_walk() -> fsck_walk_tree()
codepath.  Is it no longer the case, I wonder?

I see in the same loop there is a call to lookup_tree()->object, which
probably is how the existing code avoids that issue?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Server-side preventing some files from being overwritten

2016-07-14 Thread Stefan Beller
On Thu, Jul 14, 2016 at 8:31 AM, Thorsten Glaser  wrote:
> Hi *,
>
> is there a way, for example with some sort of pre-receive hook,
> to prevent some files from being overwritten by a push?

pre-receive hooks are a thing!

 pre-receive
   This hook is invoked by git-receive-pack on the remote
repository, which happens when a git push is done on a local
repository. Just before starting to update refs on the remote
repository,
   the pre-receive hook is invoked. Its exit status determines the
success or failure of the update.

   This hook executes once for the receive operation. It takes no
arguments, but for each ref to be updated it receives on standard
input a line of the format:

SP  SP  LF

   where  is the old object name stored in the ref,
 is the new object name to be stored in the ref and
 is the full name of the ref. When creating a new ref,
is 40 0.

   If the hook exits with non-zero status, none of the refs will
be updated. If the hook exits with zero, updating of individual refs
can still be prevented by the update hook.

   Both standard output and standard error output are forwarded to
git send-pack on the other end, so you can simply echo messages for
the user.


As you have access to the old and new value, just check them out to
/tmp and inspect the files?
(That is slower than optimal I'd assume, so to get it faster you could
go roughly like

git ls-tree -r  | grep   >old_interest
git ls-tree -r  | grep  >new_interest
if diff old_interest new_interest
then
echo "There is a difference in  
exit 1
fi



>
> Use case:
>
> In some project, we use Flyway to manage the database schema,
> and Jenkins to automatically build master’s HEAD after each
> push (and install it, thus activating the schema files almost
> immediately). Now, I wish to prevent coworkers from changing
> anything in the SQL subdirectory that has ever been pushed,
> forcing them to push new SQL files with ALTER statements instead.
> (Yes, I am aware this will likely make me even less liked. No,
> this is not an issue.)
>
> As for the comparison, only the changes between the previous
> HEAD of master and the new HEAD of master after the push would
> have been accepted need to be taken into account; any intermediate
> commits, merges, etc. are no problem (because Jenkins does not
> build them, and because, once a push fails, the developer will
> have to add a commit reverting their change and moving it to
> another file on top, I’m no friend of rewriting).
>
> File matching would be “any files under a certain subdirectory”,
> nothing fancier than that.
>
> I’ve tried a web search (with two different search engines) for
> “git prevent pushed files from modification”, but this seems to
> only show people who want to ignore local changes or somesuch…
>
> I’ve asked in IRC, but with no answer for hours I thought that
> maybe this was the better place to ask for it.
>
> Thanks in advance,
> //mirabilos
> --
> tarent solutions GmbH
> Rochusstraße 2-4, D-53123 Bonn • http://www.tarent.de/
> Tel: +49 228 54881-393 • Fax: +49 228 54881-235
> HRB 5168 (AG Bonn) • USt-ID (VAT): DE122264941
> Geschäftsführer: Dr. Stefan Barth, Kai Ebenrett, Boris Esser, Alexander Steeg
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 3/5] archive-tar: write extended headers for file sizes >= 8GB

2016-07-14 Thread Johannes Sixt

Am 30.06.2016 um 11:09 schrieb Jeff King:

+/*
+ * This is the max value that a ustar size header can specify, as it is fixed
+ * at 11 octal digits. POSIX specifies that we switch to extended headers at
+ * this size.
+ */
+#define USTAR_MAX_SIZE 0777UL


This is too large by one bit for our 32-bit unsigned long on Windows:

archive-tar.c: In function 'write_tar_entry':
archive-tar.c:295: warning: integer constant is too large for 'unsigned 
long' type

archive-tar.c: In function 'write_global_extended_header':
archive-tar.c:332: warning: integer constant is too large for 'unsigned 
long' type
archive-tar.c:335: warning: integer constant is too large for 'unsigned 
long' type

archive-tar.c:335: warning: overflow in implicit constant conversion

-- Hannes

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 2/5] t5000: test tar files that overflow ustar headers

2016-07-14 Thread Johannes Sixt

Am 14.07.2016 um 17:47 schrieb Johannes Schindelin:

On Thu, 30 Jun 2016, Jeff King wrote:

The ustar format only has room for 11 (or 12, depending on
some implementations) octal digits for the size and mtime of
each file. For values larger than this, we have to add pax
extended headers to specify the real data, and git does not
yet know how to do so.

[...]
 t/t5000/19f9c8273ec45a8938e6999cb59b3ff66739902a | Bin 0 -> 2048 bytes


It appears that this blob cannot be read when sizeof(unsigned long) == 4.
This happens to break the t5000 test on Windows, where that comparison
holds true.


The problem occurs in parse_sha1_header_extended(), where the 
calculation of the size in the object header overflows our 32-bit 
unsigned long.


-- Hannes

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] fsck: refactor how to describe objects

2016-07-14 Thread Junio C Hamano
Johannes Schindelin  writes:

> In many places, we refer to objects via their SHA-1s. Let's abstract
> that into a function.
>
> For the moment, it does nothing else than what we did previously: print
> out the 40-digit hex string. But that will change over the course of the
> next patches.
>
> Signed-off-by: Johannes Schindelin 
> ---

Makes sense.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/3] Teach `git fsck` a new option: `--name-objects`

2016-07-14 Thread Junio C Hamano
Johannes Schindelin  writes:

> Example output:
>
>   ...
>   broken link fromtree b5eb6ff...  (refs/stash@{}~37:)
> toblob ec5cf80...

The objective makes sense, and their progression is very nicely
structured.  I can "smell" that these are going in the right
direction only with a cursory scan of the three patches.

> Originally, I intended to teach name-rev a new mode where it would also
> name objects other than commits and tags,...

As to having it in name-rev, it is still a "good to have" for an
object that does exist.  It would be "super nice" if it also worked
for a missing object.  It makes tons of sense from the end-user UI
point of view to have this feature there.

I however agree with you that it is sensible to do this in "fsck"
first and leave the "good to have" to later, because (1) naming an
arbitrary blob like this needs full object-store scan like "fsck"
does anyway, and (2) the primary occasion users would want to use
the "super nice" part of the feature is when they discover an object
is "missing", and the first thing they would want to run in such a
case anyway is "fsck".

So, in short, I very much like them.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] mingw: fix regression in t1308-config-set

2016-07-14 Thread Junio C Hamano
Johannes Schindelin  writes:

> When we tried to fix in 58461bd (t1308: do not get fooled by symbolic
> links to the source tree, 2016-06-02) an obscure case where the user
> cd's into Git's source code via a symbolic link, a regression was
> introduced that affects all test runs on Windows.

Thanks for producing a fix quickly after the topic hit 'master'.

The original came from

  http://thread.gmane.org/gmane.comp.version-control.git/296021/focus=296199

which was merged at e5e5bb67 (Merge branch 'jk/upload-pack-hook'
into next, 2016-06-28) to 'next'.  

I see J6t raised C:/foo vs /c/foo issue in the thread later, but
unfortunately I didn't notice it.

I added a few missing Cc: and quoted the whole patch here to those
who were involved; I think this update is correct, but just trying
to make sure people know.

Not limited to this particular topic, there probably are some things
we can and should add to the procedure to prevent further episodes
like this, but I am not seeing anything immediately obvious offhand.
There already is a way to prominently mark a topic to be not-ready
with an outstanding issue called "What's cooking" report, but it is
maintained manually and it can be leaky without extra set of eyes
constantly monitoring.

> The original patch introducing the test case in question was careful to
> use `$(pwd)` instead of `$PWD`.
>
> This was done to account for the fact that Git's test suite uses shell
> scripting even on Windows, where the shell's Unix-y paths are
> incompatible with the main Git executable's idea of paths: it only
> accepts Windows paths.
>
> It is an awkward but necessary thing, then, to use `$(pwd)` (which gives
> us a Windows path) when interacting with the Git executable and `$PWD`
> (which gives the shell's idea of the current working directory in Unix-y
> form) for shell scripts, including the test suite itself.
>
> Obviously this broke the use case of the Git maintainer when changing
> the working directory into Git's source code directory via a symlink,
> i.e. when `$(pwd)` does not agree with `$PWD`.
>
> However, we must not fix that use case at the expense of regressing
> another use case.
>
> Let's special-case Windows here, even if it is ugly, for lack of a more
> elegant solution.
>
> Signed-off-by: Johannes Schindelin 
> ---
> Published-As: https://github.com/dscho/git/releases/tag/t1308-on-windows-v1
>
>   Side note: it was not at all clear to me how 58461bd fixed the
>   problem by replacing $(pwd) with $HOME, given that HOME is set to
>   $TRASH_DIRECTORY which is set to $TEST_OUTPUT_DIRECTORY/... after
>   TEST_OUTPUT_DIRECTORY was set to TEST_DIRECTORY which in turn was
>   set to $(pwd).
>
>   I guess the reason is that -P in `cd -P "$TRASH DIRECTORY"`, but
>   then I *really* do not understand how $(pwd) and $PWD could
>   disagree.
>   Oh well. I have to move on to the next fire.
>
>  t/t1308-config-set.sh | 10 +-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/t/t1308-config-set.sh b/t/t1308-config-set.sh
> index a06e71c..7655c94 100755
> --- a/t/t1308-config-set.sh
> +++ b/t/t1308-config-set.sh
> @@ -233,11 +233,19 @@ cmdline_config="'foo.bar=from-cmdline'"
>  test_expect_success 'iteration shows correct origins' '
>   echo "[foo]bar = from-repo" >.git/config &&
>   echo "[foo]bar = from-home" >.gitconfig &&
> + if test_have_prereq MINGW
> + then
> + # Use Windows path (i.e. *not* $HOME)
> + HOME_GITCONFIG=$(pwd)/.gitconfig
> + else
> + # Do not get fooled by symbolic links, i.e. $HOME != $(pwd)
> + HOME_GITCONFIG=$HOME/.gitconfig
> + fi &&
>   cat >expect <<-EOF &&
>   key=foo.bar
>   value=from-home
>   origin=file
> - name=$HOME/.gitconfig
> + name=$HOME_GITCONFIG
>   scope=global
>  
>   key=foo.bar
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4] config: add conditional include

2016-07-14 Thread Duy Nguyen
On Thu, Jul 14, 2016 at 5:53 PM, Johannes Schindelin
 wrote:
> Hi Duy,
>
> On Thu, 14 Jul 2016, Nguyễn Thái Ngọc Duy wrote:
>
>> Helped-by: Jeff King 
>> Signed-off-by: Nguyễn Thái Ngọc Duy 
>
> This commit message is quite a bit short on details.

Because it's fully documented in config.txt.

> I still fail to see what use case this would benefit,

It allows you to group repos together. The first mail that started all
this [1] is one of the use cases: you may want to use one identity in
a group of repos, and another identity in some other. I want some
more, to use a private key one some repos and another private key on
some other. Some more settings may be shareable this way, like coding
style-related (trailing spaces or not...)

[1] http://thread.gmane.org/gmane.comp.version-control.git/273811

> and I already start to suspect that the change would open a can of worms that 
> might not be desired.

You can choose not to use it, I guess. From what I see it's nothing
super tricky. The config hierarchy we have now is: system -> per-user
-> repo. This could change this to: system -> per-user ->
per-directory -> repo. You could create a different hierarchy, but
with git-config now showing where a config var comes from, it's not
hard to troubleshoot.

>> + ; include if $GIT_DIR is /path/to/foo/.git
>> + [include "gitdir:/path/to/foo/.git"]
>> + path = /path/to/foo.inc
>
> I find this way to specify a conditional unintuitive. Reading
> "gitdir:/path/to/foo/.git" suggests to me that the Git dir is *re-set*,
> not that this is a condition.

Well.. to me it's no different than [remote "foo"] to apply stuff to "foo".

> I would have expected something like
>
> [include "if-gitdir-is:/path/to/foo/.git"]
>
> instead.

If we do this, should we change gitdir/i to
if-git-dir-case-insensitively-is ? I think we are supposed to read
documents before using any feature, not just guessing then trial and
error.
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ANNOUNCE] Git v2.9.1

2016-07-14 Thread Johannes Schindelin
Hi Junio,

On Wed, 13 Jul 2016, Junio C Hamano wrote:

> Moving of [64BIT_TIME] lazy_prereq to test-lib might be an upside if we
> were planning to add a test that depends on the system having or not
> having 64-bit timestamp elsewhere, but I do not think of a reason why
> such a new test cannot go to t0006-date, which has the perfect name for
> such a test and is not overly long right now (114 lines).

Turns out we *already* have another test in `master` that needs this lazy
prereq:

https://github.com/git/git/blob/79ed43c28f/t/t5000-tar-tree.sh#L378-L384

Ciao,
Dscho
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v14 00/21] index-helper/watchman

2016-07-14 Thread Duy Nguyen
Big typo..

On Thu, Jul 14, 2016 at 5:56 PM, Duy Nguyen  wrote:
> For giant-scale repos, you probably want something more efficient than
> a script like this. And the good thing is you have freedom to do
> whatever you want. You can run one daemon per repo, you can run one
> daemon per system... In some previous mail exchange with Dscho, it was
> mentioned that something other than watchman may be desired. This
> opens up that door without much headache from outside.

s/outside/our side/
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v14 00/21] index-helper/watchman

2016-07-14 Thread Duy Nguyen
On Wed, Jul 13, 2016 at 11:59 PM, David Turner  wrote:
> On 07/12/2016 02:24 PM, Duy Nguyen wrote:
>>
>> Just thinking out loud. I've been thinking about this more about this.
>> After the move from signal-based to unix socket for communication, we
>> probably are better off with a simpler design than the shm-alike one
>> we have now.
>>
>> What if we send everything over a socket or a pipe? Sending 500MB over
>> a unix socket takes 253ms, that's insignificant when operations on an
>> index that size usually take seconds. If we send everything over
>> socket/pipe, we can trust data integrity and don't have to verify,
>> even the trailing SHA-1 in shm file.
>
>
> I think it would be good to make index operations not take seconds.
>
> In general, we should not need to verify the trailing SHA-1 for shm data.
> So the index-helper verifies it when it loads it, but the git (e.g.) status
> should not need to verify.
>
> Also, if we have two git commands running at the same time, the index-helper
> can only serve one at a time; with shm, both can run at full speed.

We still have an option to send a (shm, possibly) path to git to pick
up and skip verification. If we can exchange capabilities then sending
the index some way else is always possible.

>> So, what I have in mind is this, at read index time, instead of open a
>> socket, we run a separate program and communicate via pipes. We can
>> exchange capabilities if needed, then the program sends the entire
>> current index, the list of updated files back (and/or the list of dirs
>> to invalidate). The design looks very much like a smudge/clean filter.
>
>
> This seems very complicated.  Now git status talks to the separate program,
> which talks to the index-helper, which talks to watchman.  That is a lot of
> steps!

I was suggesting this because I think it would simplify things, not
complicate stuff further. Yes the separate program plays the role of
our unix client, if we keep the index-helper. But we don't have to.

Do you remember Junio once suggested to put the index on tmpfs? That's
what I imagine in common, medium scale setups. We don't need an extra
daemon:

1) when git needs the index, the script looks at its tmpfs mount, if
found, pass the path back
2) when git announces the index has been updated, the script reads the
index and saves it in tmpfs
3) when git refreshes and asks for watchman support, the script simply
runs "watchman" command, post processes the output a bit and send the
file list to git

Because there is no separate daemon in this case, we don't need
--kill, we don't need --autorun. We still need WAMA extension but it
can contain just an arbitrary clock string, this is completely opaque
to git. If we can get rid of the index-helper (with an example script
probably landed in contrib folder), that's a lot of less headache down
the road.

For giant-scale repos, you probably want something more efficient than
a script like this. And the good thing is you have freedom to do
whatever you want. You can run one daemon per repo, you can run one
daemon per system... In some previous mail exchange with Dscho, it was
mentioned that something other than watchman may be desired. This
opens up that door without much headache from outside.

> I think the daemon also has the advantage that it can reload the index as
> soon as it changes.  This is not quite implemented, but it would be pretty
> easy to do.  That would save a lot of time in the typical workflow.

A script has the same advantage, that is if git notifies it (like we
do now). You can also do it using watchman trigger, which does not
need any special support from git.
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4] config: add conditional include

2016-07-14 Thread Johannes Schindelin
Hi Duy,

On Thu, 14 Jul 2016, Nguyễn Thái Ngọc Duy wrote:

> Helped-by: Jeff King 
> Signed-off-by: Nguyễn Thái Ngọc Duy 

This commit message is quite a bit short on details. I still fail to see
what use case this would benefit, and I already start to suspect that the
change would open a can of worms that might not be desired.

> + ; include if $GIT_DIR is /path/to/foo/.git
> + [include "gitdir:/path/to/foo/.git"]
> + path = /path/to/foo.inc

I find this way to specify a conditional unintuitive. Reading
"gitdir:/path/to/foo/.git" suggests to me that the Git dir is *re-set*,
not that this is a condition.

I would have expected something like

[include "if-gitdir-is:/path/to/foo/.git"]

instead.

Ciao,
Dscho

Re: What's cooking in git.git (Jul 2016, #05; Wed, 13)

2016-07-14 Thread Junio C Hamano
Johannes Schindelin  writes:

> Oh, and v14 has a bug that I reported already:
> http://article.gmane.org/gmane.comp.version-control.git/298949

These two lines are the most helpful kind of response to "What's
cooking" report.  Highly appreciated.

The value of the report to me primarily is to make sure other people
can _stop_ me from merging things prematurely.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 2/5] t5000: test tar files that overflow ustar headers

2016-07-14 Thread Johannes Schindelin
Hi Peff,

sorry for the very, very late reply.

On Thu, 30 Jun 2016, Jeff King wrote:

> The ustar format only has room for 11 (or 12, depending on
> some implementations) octal digits for the size and mtime of
> each file. For values larger than this, we have to add pax
> extended headers to specify the real data, and git does not
> yet know how to do so.
>
> [...]
>  t/t5000/19f9c8273ec45a8938e6999cb59b3ff66739902a | Bin 0 -> 2048 bytes

It appears that this blob cannot be read when sizeof(unsigned long) == 4.
This happens to break the t5000 test on Windows, where that comparison
holds true.

I am sure that I missed some other discussion about this issue... could
you point me to it?

Ciao,
Dscho
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What's cooking in git.git (Jul 2016, #05; Wed, 13)

2016-07-14 Thread Junio C Hamano
Johannes Schindelin  writes:

> +   ret = add_cache_entry(ce, options);
> +   if (refresh) {
>
> Should we really refresh, even if ret < 0?

As we stopped calling make_cache_entry() with REFRESH flag on, we
can change this not to refresh if we want to, and I think we can
skip refresh without compromising correctness.  But I'd prefer to
see that change as a separate "optimization" step---after all, the
original before this change unconditionally refreshed before even
calling add_cache_entry() and knowing the result of it.

> +   struct cache_entry *nce;
> +
> +   nce = refresh_cache_entry(ce, CE_MATCH_REFRESH | 
> CE_MATCH_IGNORE_MISSING);
>
> This line is overly long, but there is a *lot* of precedent for that in
> merge-recursive.c, unfortunately. So this is just a remark, not an
> objection.

Yes, I had the same objection to the original codebase while I was
touching it.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Server-side preventing some files from being overwritten

2016-07-14 Thread Thorsten Glaser
Hi *,

is there a way, for example with some sort of pre-receive hook,
to prevent some files from being overwritten by a push?

Use case:

In some project, we use Flyway to manage the database schema,
and Jenkins to automatically build master’s HEAD after each
push (and install it, thus activating the schema files almost
immediately). Now, I wish to prevent coworkers from changing
anything in the SQL subdirectory that has ever been pushed,
forcing them to push new SQL files with ALTER statements instead.
(Yes, I am aware this will likely make me even less liked. No,
this is not an issue.)

As for the comparison, only the changes between the previous
HEAD of master and the new HEAD of master after the push would
have been accepted need to be taken into account; any intermediate
commits, merges, etc. are no problem (because Jenkins does not
build them, and because, once a push fails, the developer will
have to add a commit reverting their change and moving it to
another file on top, I’m no friend of rewriting).

File matching would be “any files under a certain subdirectory”,
nothing fancier than that.

I’ve tried a web search (with two different search engines) for
“git prevent pushed files from modification”, but this seems to
only show people who want to ignore local changes or somesuch…

I’ve asked in IRC, but with no answer for hours I thought that
maybe this was the better place to ask for it.

Thanks in advance,
//mirabilos
-- 
tarent solutions GmbH
Rochusstraße 2-4, D-53123 Bonn • http://www.tarent.de/
Tel: +49 228 54881-393 • Fax: +49 228 54881-235
HRB 5168 (AG Bonn) • USt-ID (VAT): DE122264941
Geschäftsführer: Dr. Stefan Barth, Kai Ebenrett, Boris Esser, Alexander Steeg
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4] config: add conditional include

2016-07-14 Thread Nguyễn Thái Ngọc Duy
Helped-by: Jeff King 
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 The diff from v3 is mostly clarification in code and document.

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 18623ee..d971334 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -99,10 +99,9 @@ Supported keywords are:
 
 `gitdir`::
The environment variable `GIT_DIR` must match the following
-   pattern for files to be included. The pattern can contain
-   standard globbing wildcards and two additional ones, `**/` and
-   `/**`, that can match multiple path components. Please refer
-   to linkgit:gitignore[5] for details. For convenience:
+   pattern for files to be included. The pattern shares the same
+   syntax as patterns in link:gitignore[5] with a few exceptions
+   below:
 
  * If the pattern starts with `~/`, `~` will be substituted with the
content of the environment variable `HOME`.
diff --git a/config.c b/config.c
index ff44e00..690f3d5 100644
--- a/config.c
+++ b/config.c
@@ -183,6 +183,11 @@ static int prepare_include_condition_pattern(struct 
strbuf *pat)
 
strbuf_add_absolute_path(, home);
strbuf_splice(pat, 0, 1, path.buf, path.len);
+   /*
+* This part, path.buf[0..len], should be considered
+* a literal string even if it has wildcards in it,
+* because those wildcards are not wanted by the user.
+*/
prefix = path.len + 1 /*slash*/;
strbuf_release();
} else if (pat->buf[0] == '.' && is_dir_sep(pat->buf[1])) {
@@ -198,6 +203,11 @@ static int prepare_include_condition_pattern(struct 
strbuf *pat)
if (!slash)
die("BUG: how is this possible?");
strbuf_splice(pat, 0, 1, path.buf, slash - path.buf);
+   /*
+* This part, path.buf[0..slash], should be consider
+* a literal string even if it has wildcards in it,
+* because those wildcards are not wanted by the user.
+*/
prefix = slash - path.buf + 1 /* slash */;
strbuf_release();
} else if (!is_absolute_path(pat->buf))
@@ -224,8 +234,9 @@ static int include_by_gitdir(const char *cond, size_t 
cond_len, int icase)
 
if (prefix > 0) {
/*
-* perform literal matching on the prefix part so that
-* any wildcard character in it can't create side effects.
+* perform literal matching on the expanded prefix
+* part so that any wildcard character in it (e.g in
+* the expansion of ~) can't create side effects.
 */
if (text.len < prefix)
goto done;
 Documentation/config.txt  |  47 +++
 config.c  | 113 +-
 t/t1305-config-include.sh |  56 +++
 3 files changed, 214 insertions(+), 2 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index db05dec..d971334 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -91,6 +91,42 @@ found at the location of the include directive. If the value 
of the
 relative to the configuration file in which the include directive was
 found.  See below for examples.
 
+Included files can be grouped into subsections where the subsection
+name is the condition that must be met for the files to be included.
+The condition starts with a keyword, followed by a colon and a
+pattern. The interpretation of the pattern depends on the keyword.
+Supported keywords are:
+
+`gitdir`::
+   The environment variable `GIT_DIR` must match the following
+   pattern for files to be included. The pattern shares the same
+   syntax as patterns in link:gitignore[5] with a few exceptions
+   below:
+
+ * If the pattern starts with `~/`, `~` will be substituted with the
+   content of the environment variable `HOME`.
+
+ * If the pattern starts with `./`, it is replaced with the directory
+   containing the current config file.
+
+ * If the pattern does not start with either `~/`, `./` or `/`, `**/`
+   will be automatically prepended. For example, the pattern `foo/bar`
+   becomes `**/foo/bar` and would match `/any/path/to/foo/bar`.
+
+ * If the pattern ends with `/`, `**` will be automatically added. For
+   example, the pattern `foo/` becomes `foo/**`. In other words, it
+   matches "foo" and everything inside, recursively.
+
+`gitdir/i`::
+   This is the same as `gitdir` except that matching is done
+   case-insensitively (e.g. on case-insensitive file sytems)
+
+A few more notes on matching:
+
+ * Symlinks in `$GIT_DIR` are not resolved before matching.
+
+ * Note that 

[PATCH 3/3] fsck: optionally show more helpful info for broken links

2016-07-14 Thread Johannes Schindelin
When reporting broken links between commits/trees/blobs, it would be
quite helpful at times if the user would be told how the object is
supposed to be reachable.

With the new --name-objects option, git-fsck will try to do exactly
that: name the objects in a way that shows how they are reachable.

For example, when some reflog got corrupted and a blob is missing that
should not be, the user might want to remove the corresponding reflog
entry. This option helps them find that entry: `git fsck` will now
report something like this:

broken link fromtree b5eb6ff...  (refs/stash@{}~37:)
  toblob ec5cf80...

Signed-off-by: Johannes Schindelin 
---
 Documentation/git-fsck.txt |  9 -
 builtin/fsck.c | 42 ++
 t/t1450-fsck.sh| 22 ++
 3 files changed, 68 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-fsck.txt b/Documentation/git-fsck.txt
index 7fc68eb..b9f060e 100644
--- a/Documentation/git-fsck.txt
+++ b/Documentation/git-fsck.txt
@@ -11,7 +11,8 @@ SYNOPSIS
 [verse]
 'git fsck' [--tags] [--root] [--unreachable] [--cache] [--no-reflogs]
 [--[no-]full] [--strict] [--verbose] [--lost-found]
-[--[no-]dangling] [--[no-]progress] [--connectivity-only] [*]
+[--[no-]dangling] [--[no-]progress] [--connectivity-only]
+[--[no-]name-objects] [*]
 
 DESCRIPTION
 ---
@@ -82,6 +83,12 @@ index file, all SHA-1 references in `refs` namespace, and 
all reflogs
a blob, the contents are written into the file, rather than
its object name.
 
+--name-objects::
+   When displaying names of reachable objects, in addition to the
+   SHA-1 also display a name that describes *how* they are reachable,
+   compatible with linkgit:git-rev-parse[1], e.g.
+   `HEAD@{1234567890}~25^2:src/`.
+
 --[no-]progress::
Progress status is reported on the standard error stream by
default when it is attached to a terminal, unless
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 87df191..e2173b6 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -13,6 +13,7 @@
 #include "dir.h"
 #include "progress.h"
 #include "streaming.h"
+#include "decorate.h"
 
 #define REACHABLE 0x0001
 #define SEEN  0x0002
@@ -35,6 +36,7 @@ static int write_lost_and_found;
 static int verbose;
 static int show_progress = -1;
 static int show_dangling = 1;
+static int name_objects = 0;
 #define ERROR_OBJECT 01
 #define ERROR_REACHABLE 02
 #define ERROR_PACK 04
@@ -42,7 +44,16 @@ static int show_dangling = 1;
 
 static const char *describe_object(struct object *obj)
 {
-   return oid_to_hex(>oid);
+   static struct strbuf buf = STRBUF_INIT;
+   char *name = name_objects ?
+   lookup_decoration(fsck_walk_options.object_names, obj) : NULL;
+
+   strbuf_reset();
+   strbuf_addstr(, oid_to_hex(>oid));
+   if (name)
+   strbuf_addf(, " (%s)", name);
+
+   return buf.buf;
 }
 
 static int fsck_config(const char *var, const char *value, void *cb)
@@ -377,13 +388,18 @@ static int fsck_obj_buffer(const unsigned char *sha1, 
enum object_type type,
 
 static int default_refs;
 
-static void fsck_handle_reflog_sha1(const char *refname, unsigned char *sha1)
+static void fsck_handle_reflog_sha1(const char *refname, unsigned char *sha1,
+   unsigned long timestamp)
 {
struct object *obj;
 
if (!is_null_sha1(sha1)) {
obj = lookup_object(sha1);
if (obj) {
+   if (timestamp && name_objects)
+   add_decoration(fsck_walk_options.object_names,
+   obj,
+   xstrfmt("%s@{%ld}", refname, 
timestamp));
obj->used = 1;
mark_object_reachable(obj);
} else {
@@ -403,8 +419,8 @@ static int fsck_handle_reflog_ent(unsigned char *osha1, 
unsigned char *nsha1,
fprintf(stderr, "Checking reflog %s->%s\n",
sha1_to_hex(osha1), sha1_to_hex(nsha1));
 
-   fsck_handle_reflog_sha1(refname, osha1);
-   fsck_handle_reflog_sha1(refname, nsha1);
+   fsck_handle_reflog_sha1(refname, osha1, 0);
+   fsck_handle_reflog_sha1(refname, nsha1, timestamp);
return 0;
 }
 
@@ -433,6 +449,9 @@ static int fsck_handle_ref(const char *refname, const 
struct object_id *oid,
}
default_refs++;
obj->used = 1;
+   if (name_objects)
+   add_decoration(fsck_walk_options.object_names,
+   obj, xstrdup(refname));
mark_object_reachable(obj);
 
return 0;
@@ -548,6 +567,9 @@ static int fsck_cache_tree(struct cache_tree *it)
return 1;
}
obj->used = 1;
+   if (name_objects)
+   

[PATCH 2/3] fsck_walk(): optionally name objects on the go

2016-07-14 Thread Johannes Schindelin
If fsck_options->name_objects is initialized, and if it already has
name(s) for the object(s) that are to be the starting point(s) for
fsck_walk(), then that function will now add names for the objects
that were walked.

This will be highly useful for teaching git-fsck to identify root causes
for broken links, which is the task for the next patch in this series.

Note that this patch opts for decorating the objects with plain strings
instead of full-blown structs (à la `struct rev_name` in the code of
the `git name-rev` command), for several reasons:

- the code is much simpler than if it had to work with structs that
  describe arbitrarily long names such as "master~14^2~5:builtin/am.c",

- the string processing is actually quite light-weight compared to the
  rest of fsck's operation,

- the caller of fsck_walk() is expected to provide names for the
  starting points, and using plain and simple strings is just the
  easiest way to do that.

Signed-off-by: Johannes Schindelin 
---
 fsck.c | 72 ++
 fsck.h |  1 +
 2 files changed, 73 insertions(+)

diff --git a/fsck.c b/fsck.c
index 0531545..fe6a28a 100644
--- a/fsck.c
+++ b/fsck.c
@@ -9,6 +9,7 @@
 #include "refs.h"
 #include "utf8.h"
 #include "sha1-array.h"
+#include "decorate.h"
 
 #define FSCK_FATAL -1
 #define FSCK_INFO -2
@@ -297,19 +298,51 @@ static int report(struct fsck_options *options, struct 
object *object,
return result;
 }
 
+static char *get_object_name(struct fsck_options *options, struct object *obj)
+{
+   return options->object_names ?
+   lookup_decoration(options->object_names, obj) : NULL;
+}
+
+static void put_object_name(struct fsck_options *options, struct object *obj,
+   const char *fmt, ...)
+{
+   va_list ap;
+   char *existing = lookup_decoration(options->object_names, obj);
+   struct strbuf buf = STRBUF_INIT;
+
+   if (existing)
+   return;
+   va_start(ap, fmt);
+   strbuf_vaddf(, fmt, ap);
+   add_decoration(options->object_names, obj, strbuf_detach(, NULL));
+   va_end(ap);
+}
+
 static int fsck_walk_tree(struct tree *tree, void *data, struct fsck_options 
*options)
 {
struct tree_desc desc;
struct name_entry entry;
int res = 0;
+   const char *name;
 
if (parse_tree(tree))
return -1;
 
+   name = get_object_name(options, >object);
init_tree_desc(, tree->buffer, tree->size);
while (tree_entry(, )) {
int result;
 
+   if (name) {
+   struct object *obj = parse_object(entry.oid->hash);
+
+   if (obj)
+   put_object_name(options, obj, "%s%s%s", name,
+   entry.path,
+   S_ISDIR(entry.mode) ? "/" : "");
+   }
+
if (S_ISGITLINK(entry.mode))
continue;
if (S_ISDIR(entry.mode))
@@ -330,20 +363,55 @@ static int fsck_walk_tree(struct tree *tree, void *data, 
struct fsck_options *op
 
 static int fsck_walk_commit(struct commit *commit, void *data, struct 
fsck_options *options)
 {
+   int counter = 0, generation = 0, name_prefix_len = 0;
struct commit_list *parents;
int res;
int result;
+   const char *name;
 
if (parse_commit(commit))
return -1;
 
+   name = get_object_name(options, >object);
+   if (name)
+   put_object_name(options, >tree->object, "%s:", name);
+
result = options->walk((struct object *)commit->tree, OBJ_TREE, data, 
options);
if (result < 0)
return result;
res = result;
 
parents = commit->parents;
+   if (name && parents) {
+   int len = strlen(name), power;
+
+   if (len && name[len - 1] == '^') {
+   generation = 1;
+   name_prefix_len = len - 1;
+   }
+   else { /* parse ~ suffix */
+   for (generation = 0, power = 1;
+len && isdigit(name[len - 1]);
+power *= 10)
+   generation += power * (name[--len] - '0');
+   if (power > 1 && len && name[len - 1] == '~')
+   name_prefix_len = len - 1;
+   }
+   }
+
while (parents) {
+   if (name) {
+   struct object *obj = >item->object;
+
+   if (++counter > 1)
+   put_object_name(options, obj, "%s^%d",
+   name, counter);
+   else if (generation > 0)
+   put_object_name(options, obj, "%.*s~%d",
+   name_prefix_len, name, 

[PATCH 1/3] fsck: refactor how to describe objects

2016-07-14 Thread Johannes Schindelin
In many places, we refer to objects via their SHA-1s. Let's abstract
that into a function.

For the moment, it does nothing else than what we did previously: print
out the 40-digit hex string. But that will change over the course of the
next patches.

Signed-off-by: Johannes Schindelin 
---
 builtin/fsck.c | 37 +++--
 1 file changed, 23 insertions(+), 14 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 3f27456..87df191 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -40,6 +40,11 @@ static int show_dangling = 1;
 #define ERROR_PACK 04
 #define ERROR_REFS 010
 
+static const char *describe_object(struct object *obj)
+{
+   return oid_to_hex(>oid);
+}
+
 static int fsck_config(const char *var, const char *value, void *cb)
 {
if (strcmp(var, "fsck.skiplist") == 0) {
@@ -67,7 +72,7 @@ static void objreport(struct object *obj, const char 
*msg_type,
const char *err)
 {
fprintf(stderr, "%s in %s %s: %s\n",
-   msg_type, typename(obj->type), oid_to_hex(>oid), err);
+   msg_type, typename(obj->type), describe_object(obj), err);
 }
 
 static int objerror(struct object *obj, const char *err)
@@ -97,7 +102,7 @@ static int mark_object(struct object *obj, int type, void 
*data, struct fsck_opt
if (!obj) {
/* ... these references to parent->fld are safe here */
printf("broken link from %7s %s\n",
-  typename(parent->type), oid_to_hex(>oid));
+  typename(parent->type), describe_object(parent));
printf("broken link from %7s %s\n",
   (type == OBJ_ANY ? "unknown" : typename(type)), 
"unknown");
errors_found |= ERROR_REACHABLE;
@@ -114,9 +119,9 @@ static int mark_object(struct object *obj, int type, void 
*data, struct fsck_opt
if (!(obj->flags & HAS_OBJ)) {
if (parent && !has_object_file(>oid)) {
printf("broken link from %7s %s\n",
-typename(parent->type), 
oid_to_hex(>oid));
+typename(parent->type), 
describe_object(parent));
printf("  to %7s %s\n",
-typename(obj->type), oid_to_hex(>oid));
+typename(obj->type), describe_object(obj));
errors_found |= ERROR_REACHABLE;
}
return 1;
@@ -190,7 +195,8 @@ static void check_reachable_object(struct object *obj)
return; /* it is in pack - forget about it */
if (connectivity_only && has_object_file(>oid))
return;
-   printf("missing %s %s\n", typename(obj->type), 
oid_to_hex(>oid));
+   printf("missing %s %s\n", typename(obj->type),
+   describe_object(obj));
errors_found |= ERROR_REACHABLE;
return;
}
@@ -215,7 +221,8 @@ static void check_unreachable_object(struct object *obj)
 * since this is something that is prunable.
 */
if (show_unreachable) {
-   printf("unreachable %s %s\n", typename(obj->type), 
oid_to_hex(>oid));
+   printf("unreachable %s %s\n", typename(obj->type),
+   describe_object(obj));
return;
}
 
@@ -234,11 +241,11 @@ static void check_unreachable_object(struct object *obj)
if (!obj->used) {
if (show_dangling)
printf("dangling %s %s\n", typename(obj->type),
-  oid_to_hex(>oid));
+  describe_object(obj));
if (write_lost_and_found) {
char *filename = git_pathdup("lost-found/%s/%s",
obj->type == OBJ_COMMIT ? "commit" : "other",
-   oid_to_hex(>oid));
+   describe_object(obj));
FILE *f;
 
if (safe_create_leading_directories_const(filename)) {
@@ -252,7 +259,7 @@ static void check_unreachable_object(struct object *obj)
if (stream_blob_to_fd(fileno(f), obj->oid.hash, 
NULL, 1))
die_errno("Could not write '%s'", 
filename);
} else
-   fprintf(f, "%s\n", oid_to_hex(>oid));
+   fprintf(f, "%s\n", describe_object(obj));
if (fclose(f))
die_errno("Could not finish '%s'",
  filename);
@@ -271,7 +278,7 @@ static void check_unreachable_object(struct object *obj)
 static void check_object(struct object *obj)
 {
if (verbose)
-   fprintf(stderr, "Checking %s\n", 

[PATCH 0/3] Teach `git fsck` a new option: `--name-objects`

2016-07-14 Thread Johannes Schindelin
When using experimental features (such as worktrees), it is quite
possible to end up with a repository that is a little bit corrupted. In
this developer's case, the auto gc run during interactive rebases in
worktrees completely messed up the reflogs.

The symptoms are broken links between commits/trees/blobs.

Trying to work around such problems can be a real challenge: while
several tools will report when objects are missing, all of them simply
state the SHA-1. This is not useful when the user has to kiss the
offending reflog good-bye, but does not even know which one.

This patch series introduces a new option to `git fsck`: --name-objects.
With this option, the fsck command will report not only the SHA-1 of
missing objects, but also a name by which this object is supposed to be
reachable.

Example output:

...
broken link fromtree b5eb6ff...  (refs/stash@{}~37:)
  toblob ec5cf80...

Originally, I intended to teach name-rev a new mode where it would also
name objects other than commits and tags, but since the objects in
question were lost to a garbage collection, and therefore there would
not have been any objects to call names to begin with, I had to abandon
said quest.


Johannes Schindelin (3):
  fsck: refactor how to describe objects
  fsck_walk(): optionally name objects on the go
  fsck: optionally show more helpful info for broken links

 Documentation/git-fsck.txt |  9 +-
 builtin/fsck.c | 77 --
 fsck.c | 72 +++
 fsck.h |  1 +
 t/t1450-fsck.sh| 22 +
 5 files changed, 163 insertions(+), 18 deletions(-)

Published-As: https://github.com/dscho/git/releases/tag/fsck-name-objects-v1
-- 
2.9.0.278.g1caae67

base-commit: 79ed43c28f626a4e805f350a77c54968b59be6e9
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Bug in index-helper/watchman?

2016-07-14 Thread Duy Nguyen
...and somehow I forgot to CC git@vger, g

On Thu, Jul 14, 2016 at 5:14 PM, Duy Nguyen  wrote:
> Bug reports should always be in public (so people are aware of it)
> unless it's security exploit of course.
>
> I have not read this mail yet, but I will in a couple of hours, hopefully.
>
> On Thu, Jul 14, 2016 at 12:11 AM, Ben Peart  wrote:
>> I’ve been chasing down an issue where it looks like the untracked cache
>> logic doesn’t work correctly in the index-helper/watchman patch series.
>> It’s also entirely possible that I’m just missing something so feel free to
>> correct my misconceptions.
>>
>>
>>
>> Ultimately, it appears that none of the untracked cache directories are
>> getting flagged as invalid from the data in the watchman extension.  I
>> believe this is happening because untracked->root doesn’t get initialized
>> until validate_untracked_cache is called from read_directory.  This causes
>> all calls to lookup_untracked to return NULL so the dir->valid flag is never
>> set to zero in mark_untracked_invalid.  See the call stacks and sequence
>> below for details:
>>
>>
>>
>>
>>
>> cmd_status at builtin/commit.c:1362
>>
>>   status_init_config (s=0x6667a0 , fn=0x432790
>> ) at builtin/commit.c:187
>>
>>  gitmodules_config () at submodule.c:196
>>
>>read_index (istate=0x693860
>> ) at read-cache.c:1442
>>
>>   read_index_from
>> (istate=0x693860 , path=0x2ea3c58 ".git/index") at
>> read-cache.c:1849
>>
>>
>> do_read_index
>>
>>
>> read_index_extension
>>
>>
>> read_watchman_ext
>>
>>
>> mark_untracked_invalid
>>
>>
>> find_untracked_cache_dir
>>
>>
>> lookup_untracked
>>
>>
>> if (!dir)
>>
>>
>> return NULL;
>>
>>
>>
>>   wt_status_collect (s=0x6667a0 ) at wt-status.c:627
>>
>>  wt_status_collect_untracked (s=0x6667a0 ) at
>> wt-status.c:593
>>
>>fill_directory (dir=0xbafab0,
>> pathspec=0x6667b8 ) at dir.c:191
>>
>>   read_directory
>> (dir=0xbafab0, path=0x61cac3  "", len=0, pathspec=0x6667b8 )
>> at dir.c:2009
>>
>>
>> validate_untracked_cache at dir.c:1903
>>
>>
>> if (!dir->untracked->root)
>>
>>
>> read_directory_recursive
>>
>>
>> open_cached_dir
>>
>>
>> valid_cached_dir
>>
>>
>> read_directory_recursive
>>
>>
>> open_cached_dir
>>
>>
>> valid_cached_dir
>>
>>
>> if (dir->untracked->use_watchman)
>>
>>
>>
>>
>>
>> If I’m reading this correctly, one potential fix is to move the logic that
>> loops through the directories calling mark_untracked_invalid to between the
>> call to validate_untracked_cache and the call to read_directory_recursive.
>> I wonder if there is another simpler/better fix.
>>
>>
>>
>> Thoughts?
>>
>>
>>
>> Ben
>
>
>
> --
> Duy



-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Bug report: 'git rebase' confusing files with initially equal content

2016-07-14 Thread Bernhard Graf
With the following steps content from one file ends up in another file 
by using 'git rebase' (and clobbers the original content).
It is rather straight forward and I discovered this behaviour when 
trying to replay http://gitforteams.com/resources/rebasing.html .

The special thing is, that both files have the same content at start.

  $ mkdir rebase
  $ cd rebase
  git --version
  git version 2.9.0
  $ git init
  Initialized empty Git repository in /home/graf/projects/git/rebase/.git/
  $ echo stub >file1
  $ echo stub >file2
  $ git add .
  $ git commit -m 'add stub'
  [master (root-commit) 0cb53e9] Add stubs
   2 files changed, 2 insertions(+)
   create mode 100644 file1
   create mode 100644 file2
  $ git checkout -b feature/1
  Switched to a new branch 'feature/1'
  $ echo line1 >file1
  $ git commit -am 'Add content'
  [feature/1 2f3dcc7] Add content
   1 file changed, 1 insertion(+), 1 deletion(-)
  $ git checkout master
  Switched to branch 'master'
  $ git rm file1
  rm 'file1'
  $ git commit -m 'Delete file1'
  [master 1b18590] Delete file1
   1 file changed, 1 deletion(-)
   delete mode 100644 file1
  $ git checkout feature/1
  Switched to branch 'feature/1'
  $ echo line2 >>file1
  $ git commit -am 'Add more content'
  [feature/1 36be376] Add more content
   1 file changed, 1 insertion(+)
  $ cat file1
  line1
  line2
  $ cat file2
  stub
  $ git rebase master
  First, rewinding head to replay your work on top of it...
  Applying  Add content
  Using index info to reconstruct a base tree...
  A file1
  Falling back to patching base and 3-way merge...
  Auto-merging file2
  Applying  Add more content
  Using index info to reconstruct a base tree...
  A file1
  Falling back to patching base and 3-way merge...
  Auto-merging file2
  $ ls -l
  total 8
  -rw-r--r--  1 graf  users  12 14 Jul 13:16 file2
  $ cat file2
  line1
  line2

If instead file1 and file2 initially had different contents, 'git 
rebase' runs into a merge conflict, so the user can fix it appropriatly.


I suppose this is related to the fact that Git manages the repo by file 
contents only. Still this is nothing what I would expect (and neither 
accept).


--
Bernhard Graf
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] mingw: fix regression in t1308-config-set

2016-07-14 Thread Johannes Schindelin
When we tried to fix in 58461bd (t1308: do not get fooled by symbolic
links to the source tree, 2016-06-02) an obscure case where the user
cd's into Git's source code via a symbolic link, a regression was
introduced that affects all test runs on Windows.

The original patch introducing the test case in question was careful to
use `$(pwd)` instead of `$PWD`.

This was done to account for the fact that Git's test suite uses shell
scripting even on Windows, where the shell's Unix-y paths are
incompatible with the main Git executable's idea of paths: it only
accepts Windows paths.

It is an awkward but necessary thing, then, to use `$(pwd)` (which gives
us a Windows path) when interacting with the Git executable and `$PWD`
(which gives the shell's idea of the current working directory in Unix-y
form) for shell scripts, including the test suite itself.

Obviously this broke the use case of the Git maintainer when changing
the working directory into Git's source code directory via a symlink,
i.e. when `$(pwd)` does not agree with `$PWD`.

However, we must not fix that use case at the expense of regressing
another use case.

Let's special-case Windows here, even if it is ugly, for lack of a more
elegant solution.

Signed-off-by: Johannes Schindelin 
---
Published-As: https://github.com/dscho/git/releases/tag/t1308-on-windows-v1

Side note: it was not at all clear to me how 58461bd fixed the
problem by replacing $(pwd) with $HOME, given that HOME is set to
$TRASH_DIRECTORY which is set to $TEST_OUTPUT_DIRECTORY/... after
TEST_OUTPUT_DIRECTORY was set to TEST_DIRECTORY which in turn was
set to $(pwd).

I guess the reason is that -P in `cd -P "$TRASH DIRECTORY"`, but
then I *really* do not understand how $(pwd) and $PWD could
disagree.

Oh well. I have to move on to the next fire.

 t/t1308-config-set.sh | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/t/t1308-config-set.sh b/t/t1308-config-set.sh
index a06e71c..7655c94 100755
--- a/t/t1308-config-set.sh
+++ b/t/t1308-config-set.sh
@@ -233,11 +233,19 @@ cmdline_config="'foo.bar=from-cmdline'"
 test_expect_success 'iteration shows correct origins' '
echo "[foo]bar = from-repo" >.git/config &&
echo "[foo]bar = from-home" >.gitconfig &&
+   if test_have_prereq MINGW
+   then
+   # Use Windows path (i.e. *not* $HOME)
+   HOME_GITCONFIG=$(pwd)/.gitconfig
+   else
+   # Do not get fooled by symbolic links, i.e. $HOME != $(pwd)
+   HOME_GITCONFIG=$HOME/.gitconfig
+   fi &&
cat >expect <<-EOF &&
key=foo.bar
value=from-home
origin=file
-   name=$HOME/.gitconfig
+   name=$HOME_GITCONFIG
scope=global
 
key=foo.bar
-- 
2.9.0.278.g1caae67

base-commit: 79ed43c28f626a4e805f350a77c54968b59be6e9
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 00/16] Use merge_recursive() directly in the builtin am

2016-07-14 Thread Johannes Schindelin
Hi Junio,

On Tue, 12 Jul 2016, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > This is the second iteration of the long-awaited re-roll of the attempt to
> > avoid spawning merge-recursive from the builtin am and use merge_recursive()
> > directly instead.
> 
> This is actually the third iteration.

It is.

> I am trying to tease dependencies apart and apply this on a more
> reasonable base than a commit that happened to be at 'pu' on one
> day, but this would probably take some time, and I may give up
> merging it anywhere for today's integration cycle.  We'll see.

The two topics that are in 'pu' and conflict with this series are
'jh/clean-smudge-annex' and 'bc/cocci'.

It also conflicted with 'va/i18n-even-more', but that one was merged to
'master'.

Now, I think it would be okay to wait for 'bc/cocci' to go to 'master'
before integrating the 'am-3-merge-recursive-direct' branch, but I would
want to avoid waiting for 'jh/clean-smudge-annex'.

Do you concur? If so, I will rebase onto 'master' as soon as 'bc/cocci'
lands in there.

Ciao,
Dscho
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What's cooking in git.git (Jul 2016, #05; Wed, 13)

2016-07-14 Thread Johannes Schindelin
Hi Junio,

On Wed, 13 Jul 2016, Junio C Hamano wrote:

> * dt/index-helper (2016-07-06) 21 commits
>  - index-helper: indexhelper.exitAfter config
>  - trace: measure where the time is spent in the index-heavy operations
>  - index-helper: optionally automatically run
>  - index-helper: autorun mode
>  - index-helper: don't run if already running
>  - index-helper: kill mode
>  - watchman: add a config option to enable the extension
>  - unpack-trees: preserve index extensions
>  - update-index: enable/disable watchman support
>  - index-helper: use watchman to avoid refreshing index with lstat()
>  - watchman: support watchman to reduce index refresh cost
>  - read-cache: add watchman 'WAMA' extension
>  - index-helper: log warnings
>  - index-helper: add --detach
>  - daemonize(): set a flag before exiting the main process
>  - index-helper: add --strict
>  - index-helper: new daemon for caching index and related stuff
>  - unix-socket.c: add stub implementation when unix sockets are not supported
>  - pkt-line: add gentle version of packet_write
>  - read-cache: allow to keep mmap'd memory after reading
>  - read-cache.c: fix constness of verify_hdr()
> 
>  A new "index-helper" daemon has been introduced to give newly
>  spawned Git process a quicker access to the data in the index, and
>  optionally interface with the watchman daemon to further reduce the
>  refresh cost.
> 
>  Will merge to 'next'.
> 
>  Is everybody happy with this version?
>  At v14.

I am trying to get back to working on the Windows support for this. I had
something that compiled but did not quite manage to do the IPC on Monday
evening, and have been scrambling with too much other stuff preventing me
from getting back to that.

The reason I mention this is: I wanted to get it working first just to be
able to determine whether there are possible improvements to the design
that would make things much easier to port to non-Unix-sockets-capable
systems.

Oh, and v14 has a bug that I reported already:
http://article.gmane.org/gmane.comp.version-control.git/298949

Ciao,
Dscho
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What's cooking in git.git (Jul 2016, #05; Wed, 13)

2016-07-14 Thread Johannes Schindelin
Hi Junio,

On Wed, 13 Jul 2016, Junio C Hamano wrote:

> * jc/renormalize-merge-kill-safer-crlf (2016-07-12) 2 commits
>  - merge: avoid "safer crlf" during recording of merge results

I like the verbose commit message, it really makes it easier to understand
the issue as well as the solution. As to the diff:

+   ret = add_cache_entry(ce, options);
+   if (refresh) {

Should we really refresh, even if ret < 0?

+   struct cache_entry *nce;
+
+   nce = refresh_cache_entry(ce, CE_MATCH_REFRESH | 
CE_MATCH_IGNORE_MISSING);

This line is overly long, but there is a *lot* of precedent for that in
merge-recursive.c, unfortunately. So this is just a remark, not an
objection.

>  - convert: unify the "auto" handling of CRLF

I *think* this is a good thing to do. And I also have to admit that the
only thing I can say about that test matrix in t0027 (both pre- and
post-image) is that it is visually mesmerizing. So I cannot even say
whether those changes make sense to me because I cannot wrap my head
around the changes in the tests.

Ciao,
Dscho
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What's cooking in git.git (Jul 2016, #05; Wed, 13)

2016-07-14 Thread Johannes Schindelin
Hi Junio,

On Wed, 13 Jul 2016, Junio C Hamano wrote:

> * js/t0006-for-v2.9.2 (2016-07-12) 1 commit
>  - t0006: skip "far in the future" test when unsigned long is not long enough
> 
>  A test merged to v2.9.1 forgot that the feature it was testing
>  would not work on a platform with 32-bit time_t/unsigned long and
>  reported breakage.  Skip the tests that cannot run correctly on
>  such platforms.
> 
>  Waiting for an Ack; will fast-track down to 'maint' to cut v2.9.2.

If you are waiting for my okay, then go ahead and wait no more.

I still do not like the current form of the patch, even if it was
essentially my work, but I have bigger fish to fry: I am chasing down two
other regressions on `master`, I have to move that `git status -vvp` topic
along, and I have to get the index-helper to run on Windows. Before even
coming back to the "unsigned long -> time_t" patch series.

So in this case, "it works" has to be good enough for me, I guess.

Ciao,
Dscho
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] t1308: do not get fooled by symbolic links to the source tree

2016-07-14 Thread Johannes Schindelin
Hi,

On Fri, 3 Jun 2016, Torsten Bögershausen wrote:

> On 06/03/2016 08:53 AM, Johannes Sixt wrote:
> > Am 03.06.2016 um 08:10 schrieb Jeff King:
> > > On Fri, Jun 03, 2016 at 08:05:56AM +0200, Johannes Sixt wrote:
> > >
> > > > > -name=$(pwd)/.gitconfig
> > > > > +name=$HOME/.gitconfig
> > > >
> > > > I haven't tested this, yet, but my guess is that this breaks on Windows:
> > > > test-config will produce C:/foo style path, but the updated test would
> > > > expect /c/foo style path. Dscho, do you have an idea how to fix this?
> > >
> > > Hmm. This should come directly from expand_user_path("~/.gitconfig")
> > > which prepends the literal contents of the $HOME variable. It does go
> > > through convert_slashes() afterwards, but I don't see any other
> > > massaging (but I won't be surprised when you tell me there is some that
> > > happens behind the scenes).
> >
> > Yes, it happens behind the scenes: /c/foo absolute paths are a convention
> > used by the POSIX emulation layer (MSYS). When bash (an MSYS program) runs a
> > non-MSYS program such as git or test-config, it converts the /c/foo paths in
> > the environment (and argument list) to c:/foo style because the non-MSYS
> > programs do not understand the MSYS convention.
> >
> > -- Hannes
> Compiling pu didn't succed:
> unix_stream_connect is missing in read_cache.c
> 
> (And many more in index-heloer.c)
> 
> (I thought that the index-helper is only compiled on systems,
>   which are known to have unix-sockets and other stuff ?)
> 
> After patching that out,  t1308 fails:
> -name=/c/
> +name=c:/

This issue was not resolved, and now this commit is on `master`...

Ciao,
Dscho

Svare

2016-07-14 Thread STD
Dr. Alex Morgan
Drift / regionsjef
Santander Bank Plc,
47-48 Piccadilly
PICCADILLY
W1J0DT
London, Storbritannia

Hei venn,

Jeg er Dr. Alex Morgan, fra Harlesden North West London, leder for regnskap 
Revisjonen og formell senior programmerer sjef hos Deutsche bank, her i England 
(Santander Bank Plc). Jeg har også jobbet for Nyeste ministrene Bank Plc.

Jeg skriver du om en bedrift forslag som vil være av en enorm fordel for oss 
begge. I min avdeling, som de eneste Operations / regionsjef avdelingsleder 
Santander Bank Plc. Jeg oppdaget en sum (£ 21,500,000,00) britiske pund (Tjue 
millioner fem hundre tusen britiske pund) i en konto som tilhører en av våre 
utenlandske kunder (Late David McDowell Brown) en amerikansk statsborger av 
Arlington Virginia, som dessverre mistet sitt liv på første februar 2003 
gjennom det sørlige USA romfergen Columbia, døde han en enkelt mann.

Valget av kontakter du vekket fra den geografiske natur hvor du bor, særlig på 
grunn av sensitiviteten av denne transaksjonen og fortielse her inn. Vår bank 
har ventet på noen av de pårørende til å komme opp for kravet, men ingen har 
gjort det, og jeg personlig har vært mislykket i finne slektninger, jeg nå 
søker ditt samtykke til å presentere deg som pårørende (WILL MOTTAKER) til den 
avdøde, slik at utbyttet av denne kontoen verdi på (£ 21,500,000,00) britiske 
pund kan bli utbetalt til deg.

Likevel vil dette bli utbetalt eller delt i disse prosenter, 50% til meg og 40% 
til deg mens 10% er for eventuelle utgifter i løpet av transaksjonen. Jeg har 
sikret alle nødvendige juridiske dokumenter som kan brukes til å 
sikkerhetskopiere dette kravet vi gjør. Alt jeg trenger er å fylle inn ditt 
navn (e) til dokumenter og legalisere det i rettssalen her for å bevise deg som 
legitim betalingsmottakeren. Alt jeg trenger nå er ærlig samarbeid, 
taushetsplikt og tillit til at vi ser denne transaksjonen gjennom. Jeg kan 
garantere deg at dette vil bli behandlet under en lovlig ordning som vil 
beskytte deg mot eventuelle brudd på loven hvis du vil stå som arving til denne 
påstanden, jeg skal behandle kravet i ditt navn. Jeg bare krever at du sender 
dine gyldige identifikasjonsopplysninger og jeg vil godkjenne kravet i ditt 
navn.

Gi meg følgende under, som vi har 7 dager for å kjøre den gjennom. Dette er 
veldig HASTER.

1. Fullt navn:
2. Din Direct Mobilnummer:
3. Din Kontakt Adresse:
4. Yrke:
5. Alder:
6. Sex:
7. Nasjonalitet:

Etter å ha gått gjennom en metodisk søk, bestemte jeg meg for å kontakte deg 
håper at du vil finne dette forslaget interessant. Vær på bekreftelsessiden av 
denne meldingen og viser din interesse vil jeg gi deg mer informasjon.
Vennligst forsøke å la meg vite din beslutning heller enn å holde meg venter.

Takker du i påvente av gode svar.

Med vennlig hilsen,
Dr. Alex Morgan

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 32-bit Travis, was Re: [ANNOUNCE] Git v2.9.1

2016-07-14 Thread Johannes Schindelin
Hi Mike,

On Thu, 14 Jul 2016, Mike Hommey wrote:

> On Thu, Jul 14, 2016 at 09:58:45AM +0200, Johannes Schindelin wrote:
> > Hi Junio,
> > 
> > On Wed, 13 Jul 2016, Junio C Hamano wrote:
> > 
> > > Does Travis CI testing have an option to run our tests on some
> > > 32-bit platforms?
> > 
> > AFAIR Docker does not support 32-bit, and IIRC that's what Travis uses.
> > 
> > However, it is possible to install a 32-bit toolchain and use that to
> > compile Git.
> 
> You just need to install gcc-multilib on travis, and you can use -m32. I
> did that for jemalloc recently.
> See https://github.com/jemalloc/jemalloc/blob/dev/.travis.yml

Would we not also need

apt:
packages:
lib32z1

(or ia32libs in case of an older Ubuntu)?

Ciao,
Dscho
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 32-bit Travis, was Re: [ANNOUNCE] Git v2.9.1

2016-07-14 Thread Mike Hommey
On Thu, Jul 14, 2016 at 09:58:45AM +0200, Johannes Schindelin wrote:
> Hi Junio,
> 
> On Wed, 13 Jul 2016, Junio C Hamano wrote:
> 
> > Does Travis CI testing have an option to run our tests on some
> > 32-bit platforms?
> 
> AFAIR Docker does not support 32-bit, and IIRC that's what Travis uses.
> 
> However, it is possible to install a 32-bit toolchain and use that to
> compile Git.

You just need to install gcc-multilib on travis, and you can use -m32. I
did that for jemalloc recently.
See https://github.com/jemalloc/jemalloc/blob/dev/.travis.yml

Mike
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ANNOUNCE] Git v2.9.1

2016-07-14 Thread Jeff King
tl;dr I don't really care which fix goes in. They are both fine with me,
and in practice I cannot imagine either causing a big problem. But here
are my thoughts because I know you want them.

On Thu, Jul 14, 2016 at 09:45:12AM +0200, Johannes Schindelin wrote:

> > Hmm, sorry, I do not see much upside here (iow, the 2038 test you
> > came up with is as robust).
> 
> Unless you, or Peff, performed a thorough analysis whether the dates are
> always cut off at 2038 holds true, I am highly doubtful that the previous
> tes is robust at all. I certainly only tested on Windows and never
> investigated how that 2038 came about. For what I know, it might be a
> platform-dependent behavior of strtoul().

I think that when a long is 32-bit signed, you will always get 2038 from
strtol.  There could be systems where that is the case, though, and
time_t is of a different size. I'm not sure how much it would be worth
caring about them.

One nice thing about looking for "if we got 2038, we know we can skip"
as opposed to "did we correctly format this to 2286" is that we err on
the side of failing the test. So if we did ever find such an oddball
platform, the test would fail and we could address it then.

> > When the internal time representation is updated from "unsigned long" to
> > a signed and wider type [*1*], test-date has to stop reading the
> > second-from-epoch input with strtol(),
> 
> It's strtoul(), actually.

I think both you and Junio are mistaken in the quoted text. :)

The code in question is in t/helper/test-date.c:show_dates(), and _does_
call the signed strtol(). However, it is storing it not in an "unsigned
long" (which would be utterly silly), but in a time_t.

And the value is clamped to LONG_MAX there, so the representation
elsewhere does not matter at all, as long as it big enough to store
LONG_MAX. By definition, "unsigned long" should be. In practice, I'd
guess time_t is, though perhaps one could come up with a case of
compiling a 64-bit program against a 32-bit ABI? I don't know if that's
possible.

That also explains why we get 2038 here, and not our usual sentinel
value of "(time_t)0". We _do_ have overflow checks when formatting
pretty-printed dates from commits (see show_ident_date), but the test
helper isn't using them.

> Please also note that ULONG_MAX is not required to be either 2^32-1 or
> 2^64-1. Which means that the 2038 test is really not robust.

Of course not; but as I mentioned above, I think the test can be written
to complain in the unlikely case that it is not one of those, and we can
deal with it then.

> Also note that the 640bit test is very explicit, and hence robust. As a
> consequence it would skip the absurd dates on systems switching to
> int128_t for time_t.

Actually, I think that is a bad thing. The case that the test in
question was added for was not about overflowing "unsigned long", but
about having a far-future date that tm_to_time_t() could not handle. And
that maxes out at 2100. Testing it on a 128-bit system would be
completely appropriate.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Multiple Keys in ssh-agent, fail to clone

2016-07-14 Thread Johannes Schindelin
Hi Benjamin,

On Wed, 13 Jul 2016, Benjamin Fritsch wrote:

> I have two keys.
> - KeyA (my company that has access to the repository I want to clone)
> - KeyB (just my personal key with access to my personal stuff)
> 
> Having both keys in loaded and listed in `ssh-add -L` fails to clone the
> repository. I tried to change the order of the key in the agent but
> neither KeyA, KeyB nor KeyB, KeyA will work. The only case that works if
> I have KeyA loaded an no other key is added to the ssh-agent.

I once had the same problem, due to a server newly closing the connection
directly after the first key failed. Note: the problem was the *server*
(that was updated at the same time as my local Debian packages).

The only remedy was to specify the exact key in ~/.ssh/config via
IdentityFile.

Hth,
Johannes
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


  1   2   >