Re: [PATCH] compat: Allow static initializer for pthreads on Windows

2016-10-27 Thread Johannes Sixt

Am 27.10.2016 um 23:49 schrieb Jacob Keller:

Ok, so I've been reading this thread. I don't understand your
objections to emulating in this way.. Could you clearly spell out why
you believe this solution isn't acceptable? So far all I've understood
was "it's not critical sections" and "it penalizes Windows too much"
but... If Windows cannot statically initialize a pthread mutex, then
we *have* to dynamically initialize it somewhere. This solution adds a
single check before each lock and is safe due to use of memory
barriers. Yes, this will cost a tiny bit extra overhead for each use
of "pthread_mutex_lock" but I fail to see how that is a huge
penalty...


One point is that the DCLP idiom must be implemented correctly. There 
are solutions, of course, and when the initialization is over, we have a 
miniscule overhead at each pthread_mutex_lock call.


The main point is that the initialization has to solve a chicken-and-egg 
problem: After we have found an uninitialized critical section, we have 
to have mutual exclusion for the initialization. We need another 
critical section for this, but we cannot have one that is initialized. 
For this reason, the solution uses a different kind of mutual exclusion 
primitive, which is more akin to POSIX semaphores and works across 
processes. In the patch proposed by Stefan, a *session-wide* mutex is 
used. That means that all concurrent git invocations in a user's session 
synchronize their initialization of critical section objects.


That's just ridiculous. It's like waiting for a ... no, *the* ... battle 
ship just to get our bouncers in their position. We are talking 
milliseconds here, not nanoseconds.


-- Hannes



Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC

2016-10-27 Thread Eric Wong
Junio C Hamano  wrote:
> Junio C Hamano  writes:
> 
> > Linus Torvalds  writes:
> >
> >> On Thu, Oct 27, 2016 at 4:36 PM, Junio C Hamano  wrote:
> >>>
> >>> Would the best endgame shape for this function be to open with
> >>> O_NOATIME (and retry without), and then add CLOEXEC with fcntl(2)
> >>> but ignoring an error from it, I guess?  That would be the closest
> >>> to what we historically had, I would think.
> >>
> >> I think that's the best model.

Actually, I would flip the order of flags.  O_CLOEXEC is more
important from a correctness standpoint.

> > OK, so perhaps like this.
> 
> Hmph.  This may not fly well in practice, though.  
> 
> To Unix folks, CLOEXEC is not a huge correctness issue.  A child
> process may hold onto an open file descriptor a bit longer than the
> lifetime of the parent but as long as the child eventually exits,

I'm not too familiar with C internals of git; but I know we use
threads in some places, and fork+execve in others.

If our usage of threads and execve intersects, and we run
untrusted code in an execve-ed child, then only having cloexec
on open() will save us time when auditing for leaking FDs.

fcntl(fd, F_SETFD, O_CLOEXEC) is racy in if there are other
threads doing execve; so I wouldn't rely on it as a first
choice.

So I suppose something like this:

static int noatime = 1;
int fd = open(... | O_CLOEXEC);
...error checking and retrying...

if (fd >= 0 && noatime && fcntl(fd, F_SETFL, O_NOATIME) != 0)
noatime = 0;

return fd;


Re: [RFC PATCH 0/5] recursively grep across submodules

2016-10-27 Thread Stefan Beller
On Thu, Oct 27, 2016 at 7:50 PM, Junio C Hamano  wrote:


>
> Unless you are imagining "git grep" to initialize and checkout a
> submodule that is not checked out on-demand, I do not think you have
> any reason to even look at ".gitmodules" for the purpose of "I want
> to grep both in superproject and submodules that are checked out."

In tree-ish mode you may have this example:

git -C superproject rm path/to/submodule
git -C superproject commit -a -m "delete submodule"
...  time passes ...
git -C superproject grep --recurse-submodule -e  \
HEAD~42 path/to/submodule

In the last command you need to map the path to submodule to
the name of the submodule to find out the place of the object store
for that submodule and see if it exists.

> If it is working with a tree-ish, again, go look at the object store
> in that submodule repository.

and to find out the object store for that submodule you need the
path -> name mapping at that point in time, i.e. you want to look
at the .gitmodules file at the given tree-ish.


Loan

2016-10-27 Thread Direzione Generale


-- 
Are you in need of easy qualifying business or personal loans? We offer all 
kinds of loans at 3% contact us today at. Carmona.gomez402@gmail.comFull 
Name:Need Loan Amount:Loan Duration:Phone Number:Applied before :?State:Monthly 
Income:Country:contact us today at. carmona.gomez402@gmail.comYours 
faithfullyGloria Gomez CarmonaMD / CEO


"git push" says "src refspec XYZ matches more than one" even without explicit XYZ argument.

2016-10-27 Thread Kannan Goundan
1. My repo has a branch named 'v1' that is tracking 'origin/v1'.
2. My repo has a tag named 'v1'.
3. I have "push.default" set to "upstream".

I made a commit on branch 'v1' and tried doing a push:

# git push
error: src refspec v1 matches more than one.
error: failed to push some refs to 'g...@github.com:whatever/ns1-go.git'

If I rename the branch to 'v1-dev', then the push goes through.

I understand why the command "git push origin/v1 v1" is ambiguous.
But if I do a plain "git push", I thought Git would know to push my
current branch.

[Git version 2.10.1 from Homebrew on Mac OS 10.11.6.]


Re: Expanding Includes in .gitignore

2016-10-27 Thread Junio C Hamano
Jeff King  writes:

> However, as I said elsewhere, I'm not convinced this feature is all that
> helpful for in-repository .gitignore files, and I think it does
> introduce compatibility complications. People with older git will not
> respect your .gitignore.d files. Whereas $GIT_DIR/info is purely a local
> matter.

As I do not see the point of making in-tree .gitignore to a forest
of .gitignore.d/ at all, compatibility complications is not worth
even thinking about, I would have to say.

Thanks.





Re: [RFC PATCH 0/5] recursively grep across submodules

2016-10-27 Thread Junio C Hamano
Stefan Beller  writes:

>> Just a few brief comments, before reading the patches carefully.
>>
>>  * It is somewhat surprising that [1/5] is even needed (in other
>>words, I would have expected something like this to be already
>>there, and my knee-jerk reaction was "Heh, how does 'git status'
>>know how to show submodules that are and are not initialized
>>differently without this?"
>
> The issue with much of the existing code is that it is submodule centric,
> i.e. it is written to not care about the rest.
>
> git status for example just calls "git submodule summary" to
> parse and display the submodule information additionally.
> It doesn't integrate submodules and treats them "just like files".

Oh, I know all that after/while writing the above "it is somewhat
surprising" and reading what wt-status.c does.  It was just that it
was somewhat surprising ;-)

> My reaction to 1/5 was that the implementation is sound,
> but the design may need rethinking.
>
> Instead of asking all these question, "Is a submodule
> * initialized
> * checked out (== have a working dir)
> * have a .git dir (think of deleted submodules that keep the
>   historical git dir around)
> (* have commit X)
> we would want to either extend the submodule-config API
> to also carry these informations just like
> name/path/sha1/url/shallow clone recommendation.

I think you are going in a wrong direction with all the above.

Unless you are imagining "git grep" to initialize and checkout a
submodule that is not checked out on-demand, I do not think you have
any reason to even look at ".gitmodules" for the purpose of "I want
to grep both in superproject and submodules that are checked out."

You only need to detect .gitlink that exists in the index of the
superproject, and then there would be only two cases:

 * $path has an empty directory (not even .git in there).  The user
   is not interested in that submodule.

 * $path has ".git", either a directory (old layout or we are
   dealing with the repository that originated the submodule) or a
   "gitdir:" file that points into .git/modules of the repository of
   the superproject.  The user is interested in the submodule.

If $path has ".git" and nothing else, the only explanation is that
the user removed the working tree files in the submodule.  If your
grep is looking at working tree files, it is correct not to find
anything in there.  If it is working with "--cached", go look at the
index of the submodule repository (either the ".git" directory, or
the stashed-away repository in .git/modules/ in the superproject).
If it is working with a tree-ish, again, go look at the object store
in that submodule repository.

>>  * It is somewhat surprising that [4/5] does not even use the
>>previous ls-files to find out the paths.  Also it is a bit
>>disappointing to see that the way processes are spawned and
>>managed does not share much with Stefan's earlier work, i.e.
>>run_processes_parallel().  I was somehow hoping that it can be
>>extended to support this use case, but apparently there aren't
>>much to be shared.
>
> I think there are 2 issues here:

There is no issue here.  I was just giving my impressions (i.e.
"somewhat surprising").

> * git-grep already has its own thread pool

I know.  I was expecting that the previous "ls-files" that recurses
will be used to feed into that thread pool, but I didn't find that
in my cursory look at the patch, hence "somewhat surprising".

I hate it when people become overly defensive and start making
excuses when given harmless observations.





Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC

2016-10-27 Thread Junio C Hamano
Junio C Hamano  writes:

> Linus Torvalds  writes:
>
>> On Thu, Oct 27, 2016 at 4:36 PM, Junio C Hamano  wrote:
>>>
>>> Would the best endgame shape for this function be to open with
>>> O_NOATIME (and retry without), and then add CLOEXEC with fcntl(2)
>>> but ignoring an error from it, I guess?  That would be the closest
>>> to what we historically had, I would think.
>>
>> I think that's the best model.
>
> OK, so perhaps like this.

Hmph.  This may not fly well in practice, though.  

To Unix folks, CLOEXEC is not a huge correctness issue.  A child
process may hold onto an open file descriptor a bit longer than the
lifetime of the parent but as long as the child eventually exits,
nothing is affected.  Over there, things are different.  The parent
cannot even rename(2) or unlink(2) a file it created and closed
while the child is still holding the file descriptor open and the
lack of CLOEXEC will make the parent fail.  I do not know how well
fcntl(2) emulation works on Windows, but I would not be surprised
if J6t or Dscho comes back and says that FD_CLOEXEC given to F_SETFD
would not work while O_CLOEXEC given to open(2) does.


Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC

2016-10-27 Thread Junio C Hamano
Linus Torvalds  writes:

> On Thu, Oct 27, 2016 at 4:36 PM, Junio C Hamano  wrote:
>>
>> Would the best endgame shape for this function be to open with
>> O_NOATIME (and retry without), and then add CLOEXEC with fcntl(2)
>> but ignoring an error from it, I guess?  That would be the closest
>> to what we historically had, I would think.
>
> I think that's the best model.

OK, so perhaps like this.

-- >8 --
Subject: git_open(): untangle possible NOATIME and CLOEXEC interactions

The way we structured the fallback-retry for opening with O_NOATIME
and O_CLOEXEC meant that if we failed due to lack of support to open
the file with O_NOATIME option (i.e. EINVAL), we would still try to
drop O_CLOEXEC first and retry, and then drop O_NOATIME.  A platform
on which O_NOATIME is defined in the header without support from the
kernel wouldn't have a chance to open with O_CLOEXEC option due to
this code structure.

Arguably, O_CLOEXEC is more important than O_NOATIME, as the latter
is mostly about performance, while the former can affect correctness.
Let's revert the recent changes to the way git_open() attempts to
open a file with O_NOATIME and retries without to the original
sequence, and then use a separate fcntl(fd, F_SETFD, FD_CLOEXEC) on
the resulting file descriptor.  The helper to do the latter can be
usable in the codepath in ce_compare_data() that was recently added
to open a file descriptor with O_CLOEXEC, so let's refactor that
codepath with the helper while we are at it.

Signed-off-by: Junio C Hamano 
---
 git-compat-util.h |  5 +++--
 read-cache.c  | 12 
 sha1_file.c   | 49 ++---
 3 files changed, 37 insertions(+), 29 deletions(-)

diff --git a/git-compat-util.h b/git-compat-util.h
index 43718dabae..a751630db5 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -679,9 +679,10 @@ char *gitstrdup(const char *s);
 #define getpagesize() sysconf(_SC_PAGESIZE)
 #endif
 
-#ifndef O_CLOEXEC
-#define O_CLOEXEC 0
+#ifndef FD_CLOEXEC
+#define FD_CLOEXEC 0
 #endif
+extern int git_set_cloexec(int);
 
 #ifdef FREAD_READS_DIRECTORIES
 #ifdef fopen
diff --git a/read-cache.c b/read-cache.c
index db5d910642..fb91514885 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -156,17 +156,13 @@ void fill_stat_cache_info(struct cache_entry *ce, struct 
stat *st)
 static int ce_compare_data(const struct cache_entry *ce, struct stat *st)
 {
int match = -1;
-   static int cloexec = O_CLOEXEC;
-   int fd = open(ce->name, O_RDONLY | cloexec);
-
-   if ((cloexec & O_CLOEXEC) && fd < 0 && errno == EINVAL) {
-   /* Try again w/o O_CLOEXEC: the kernel might not support it */
-   cloexec &= ~O_CLOEXEC;
-   fd = open(ce->name, O_RDONLY | cloexec);
-   }
+   int fd = open(ce->name, O_RDONLY);
 
if (fd >= 0) {
unsigned char sha1[20];
+
+   /* do not let child processes to hold onto the open fd */
+   git_set_cloexec(fd);
if (!index_fd(sha1, fd, st, OBJ_BLOB, ce->name, 0))
match = hashcmp(sha1, ce->oid.hash);
/* index_fd() closed the file descriptor already */
diff --git a/sha1_file.c b/sha1_file.c
index 09045df1dc..41383a6c20 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1559,31 +1559,42 @@ int check_sha1_signature(const unsigned char *sha1, 
void *map,
return hashcmp(sha1, real_sha1) ? -1 : 0;
 }
 
-int git_open(const char *name)
+int git_set_cloexec(int fd)
 {
-   static int sha1_file_open_flag = O_NOATIME | O_CLOEXEC;
+   static int cloexec = FD_CLOEXEC;
 
-   for (;;) {
-   int fd;
+   if (cloexec) {
+   if (fcntl(fd, F_SETFD, FD_CLOEXEC) < 0)
+   cloexec = 0;
+   /*
+* We might want to diagnose and complain upon seeing
+* an error from this call, but let's keep the same
+* behaviour as before for now.
+*/
+   }
+   return 0;
+}
 
-   errno = 0;
-   fd = open(name, O_RDONLY | sha1_file_open_flag);
-   if (fd >= 0)
-   return fd;
+int git_open(const char *name)
+{
+   static int noatime = O_NOATIME;
+   int fd;
 
-   /* Try again w/o O_CLOEXEC: the kernel might not support it */
-   if ((sha1_file_open_flag & O_CLOEXEC) && errno == EINVAL) {
-   sha1_file_open_flag &= ~O_CLOEXEC;
-   continue;
-   }
+   errno = 0;
+   fd = open(name, O_RDONLY | noatime);
 
-   /* Might the failure be due to O_NOATIME? */
-   if (errno != ENOENT && (sha1_file_open_flag & O_NOATIME)) {
-   sha1_file_open_flag &= ~O_NOATIME;
-   continue;
-   }
-   return -1;
+   /* Might the 

Re: [RFC PATCH 0/5] recursively grep across submodules

2016-10-27 Thread Stefan Beller
On Thu, Oct 27, 2016 at 4:26 PM, Junio C Hamano  wrote:
> Brandon Williams  writes:
>
>> As for the rest of the series, it should be ready for review or comments.
>
> Just a few brief comments, before reading the patches carefully.
>
>  * It is somewhat surprising that [1/5] is even needed (in other
>words, I would have expected something like this to be already
>there, and my knee-jerk reaction was "Heh, how does 'git status'
>know how to show submodules that are and are not initialized
>differently without this?"

The issue with much of the existing code is that it is submodule centric,
i.e. it is written to not care about the rest.

git status for example just calls "git submodule summary" to
parse and display the submodule information additionally.
It doesn't integrate submodules and treats them "just like files".

git submodule summary then proceeds to use "submodule--helper list"
that lists submodules *only* ignoring all files.

>
>The implementation that reads from the config of the current
>repository may be OK, but I actually would have expected that a
>check would be "given a $path, check to see if $path/.git is
>there and is a valid repository".  In a repository where the
>submodules originate, there may not even be submodule.$name.url
>entries there yet.

My reaction to 1/5 was that the implementation is sound,
but the design may need rethinking.

Instead of asking all these question, "Is a submodule
* initialized
* checked out (== have a working dir)
* have a .git dir (think of deleted submodules that keep the
  historical git dir around)
(* have commit X)
we would want to either extend the submodule-config API
to also carry these informations just like
name/path/sha1/url/shallow clone recommendation.

Obtaining the information above is however not as cheap,
because we'd need to do extra work additionally to parsing
the .gitmodules file. So the submodule-config would need to learn
an input that will tell the submodule-config what informations should
be evaluated and which can be omitted.

>
>  * It is somewhat surprising that [4/5] does not even use the
>previous ls-files to find out the paths.  Also it is a bit
>disappointing to see that the way processes are spawned and
>managed does not share much with Stefan's earlier work, i.e.
>run_processes_parallel().  I was somehow hoping that it can be
>extended to support this use case, but apparently there aren't
>much to be shared.

I think there are 2 issues here:
* The API I designed runs processes in parallel and the order or
  output is non-deterministic. git-grep uses threads and output is
  alphabetically sorted. The order is fixable though (by e.g. adding
  a flag that indicates which parallel processing output the caller
  wants).

* git-grep already has its own thread pool; integrating/combining
  2 worker pools doesn't sound trivial even to someone who wrote
  one of them.
  Maybe we could extend/rewrite the run_processes_parallel
  API to not just run processes, but instead you could also provide
  a function pointer that is used in a thread instead.
  Then we'd have one machinery that e.g. keeps track of the
  number of parallel processes/threads.

Thanks,
Stefan


Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC

2016-10-27 Thread Linus Torvalds
On Thu, Oct 27, 2016 at 4:36 PM, Junio C Hamano  wrote:
>
> Would the best endgame shape for this function be to open with
> O_NOATIME (and retry without), and then add CLOEXEC with fcntl(2)
> but ignoring an error from it, I guess?  That would be the closest
> to what we historically had, I would think.

I think that's the best model.

Note that the O_NOATIME code is very much designed to try O_NOATIME
only _once_. Because even when the kernel supports O_NOATIME, if you
have a shared object tree where you may not be the owner of all the
files, a O_NOATIME open can fail with NOPERM ("You are not allowed to
hide your accesses to this file").

This is why it uses that

static unsigned int sha1_file_open_flag = O_NOATIME;

and if the O_NOATIME open ever fails (and the no-O_NOATIME open
succeeds), it clears that flag. Exactly so that it will *not* end up
in some kind of "let's open and fail and re-open" loop. It's designed
to fail once.

Or at least that's how it used to be originally. This code has
obviously changed since that early design. Now it seems to clear it
for any non-ENOENT error. Which looks fine too.

  Linus


Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC

2016-10-27 Thread Junio C Hamano
Linus Torvalds  writes:

> But the basic issue still remains - I'd really prefer to have NOATIME
> stay around for all those poor misguided souls that for some reason
> don't like "relatime" or run old kernels. But whether it is with
> O_NOATIME at open time or with F_SETFL, I don't care.

Understood.  

Would the best endgame shape for this function be to open with
O_NOATIME (and retry without), and then add CLOEXEC with fcntl(2)
but ignoring an error from it, I guess?  That would be the closest
to what we historically had, I would think.





Re: feature request

2016-10-27 Thread David Lang

On Thu, 27 Oct 2016, John Rood wrote:


Thanks, I think changing the default for windows is a good idea.


notepad doesn't work well with unix line endings, wordpad handles the files much 
more cleanly.


David Lang


Re: [RFC PATCH 0/5] recursively grep across submodules

2016-10-27 Thread Junio C Hamano
Brandon Williams  writes:

> As for the rest of the series, it should be ready for review or comments.

Just a few brief comments, before reading the patches carefully.

 * It is somewhat surprising that [1/5] is even needed (in other
   words, I would have expected something like this to be already
   there, and my knee-jerk reaction was "Heh, how does 'git status'
   know how to show submodules that are and are not initialized
   differently without this?"  

   The implementation that reads from the config of the current
   repository may be OK, but I actually would have expected that a
   check would be "given a $path, check to see if $path/.git is
   there and is a valid repository".  In a repository where the
   submodules originate, there may not even be submodule.$name.url
   entries there yet.

 * It is somewhat surprising that [4/5] does not even use the
   previous ls-files to find out the paths.  Also it is a bit
   disappointing to see that the way processes are spawned and
   managed does not share much with Stefan's earlier work, i.e.
   run_processes_parallel().  I was somehow hoping that it can be
   extended to support this use case, but apparently there aren't
   much to be shared.

Thanks.


Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC

2016-10-27 Thread Linus Torvalds
On Thu, Oct 27, 2016 at 4:09 PM, Linus Torvalds
 wrote:
>
> That said, now that I think about it, I should double-check: maybe
> open() doesn't actually set atime at all, and we *could* do NOATIME
> with SETFL after all.

Checked. Yup. O_NOATIME could easily be done with SETFL:, because as
with O_CLOEXEC, it only affects operations _after_ the open. The open
itself doesn't set the access time.

So I was full of it.

But the basic issue still remains - I'd really prefer to have NOATIME
stay around for all those poor misguided souls that for some reason
don't like "relatime" or run old kernels. But whether it is with
O_NOATIME at open time or with F_SETFL, I don't care.

Linus


Re: feature request

2016-10-27 Thread John Rood
Thanks. I wasn't aware of --no-edit, but that is indeed exactly what I
was looking for.

I think your point about encouraging users to make good use of commit
messages is good.
My concern though is that vim isn't encouraging users to leave good
messages as much as it is scaring them away from leaving messages at
all.


On Thu, Oct 27, 2016 at 5:51 PM, Junio C Hamano  wrote:
> John Rood  writes:
>
> [administrivia: do not top post]
>
>> What I'm really seeking is not a make-shift solution for myself, but
>> an intuitive solution for the novice user-base at large.
>
> Well, there are -m and --no-edit.  Recording commits with useless
> single liner is a bad habit to get into, and change to encourage
> novice user-base at large to do so is not a good idea.


Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC

2016-10-27 Thread Linus Torvalds
On Thu, Oct 27, 2016 at 3:56 PM, Junio C Hamano  wrote:
>
> I also thought O_NOATIME shouldn't be an effective fcntl(2) thing,
> for the reasons you stated above, when I was studying the area
> because of the discussion on these patches.  I was surprised to see
> that http://man7.org/linux/man-pages/man2/fcntl.2.html contradicts
> by saying F_SETFL can take O_NOATIME.
>
> Perhaps it deserves a bugreport?

It's not a bug per se.

You can indeed change O_NOATIME with F_SETFL. And it actually does
matter, since atime is normally changed by mmap(), read() and write()
calls.

So F_SETFL O_NOATIME actually does make sense, but it doesn't cover
the atime update done by open() itself.

That said, now that I think about it, I should double-check: maybe
open() doesn't actually set atime at all, and we *could* do NOATIME
with SETFL after all.

The exact [acm]time semantics are damn subtle.

 Linus


Re: Expanding Includes in .gitignore

2016-10-27 Thread Aaron Pelly
On 28/10/16 10:07, Jeff King wrote:
> I think it does
> introduce compatibility complications.

If this is not a show stopper, I am happy to knock out a short
functional spec. I'll give it some hours before I begin.



Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC

2016-10-27 Thread Junio C Hamano
Linus Torvalds  writes:

> Note that you can *not* do the same thing with O_NOATIME, since the
> whole point of O_NOATIME is that it changes the behavior of the open
> itself (unlike O_CLOEXEC which changes _later_ behavior, and can
> always be replaced by FD_CLOEXEC fnclt modulo races that are
> immaterial for git)

A tangent.  

I also thought O_NOATIME shouldn't be an effective fcntl(2) thing,
for the reasons you stated above, when I was studying the area
because of the discussion on these patches.  I was surprised to see
that http://man7.org/linux/man-pages/man2/fcntl.2.html contradicts
by saying F_SETFL can take O_NOATIME.  

Perhaps it deserves a bugreport?




Re: feature request

2016-10-27 Thread Junio C Hamano
John Rood  writes:

[administrivia: do not top post]

> What I'm really seeking is not a make-shift solution for myself, but
> an intuitive solution for the novice user-base at large.

Well, there are -m and --no-edit.  Recording commits with useless
single liner is a bad habit to get into, and change to encourage
novice user-base at large to do so is not a good idea.


Re: feature request

2016-10-27 Thread John Rood
What I'm really seeking is not a make-shift solution for myself, but
an intuitive solution for the novice user-base at large.

On Thu, Oct 27, 2016 at 5:27 PM, Junio C Hamano  wrote:
> John Rood  writes:
>
>> I suppose I can do git config --global core.editor notepad
>> However, this really only addresses my second concern.
>>
>> My first concern is that using a text editor at all seems like
>> overkill in many scenarios.
>
> Nobody stops you from writing a "type whatever you want; I won't let
> you edit any mistakes as I am not even a text editor; just hit
> RETURN when you are done, as you can only write a single line"
> program and set it as your GIT_EDITOR.
>
> I do not know what would happen when you need "git commit --amend",
> though.


Re: feature request

2016-10-27 Thread Junio C Hamano
John Rood  writes:

> On Thu, Oct 27, 2016 at 5:30 PM, Stefan Beller  wrote:
>> On Thu, Oct 27, 2016 at 2:55 PM, John Rood  wrote:
>>> Users should be able to configure Git to not send them into a Vim editor.
>>>
>>> When users pull commits, and a new commit needs to be created for a
>>> merge, Git's current way of determining a commit message is to send
>>> the user into a Vim window so that they can write a message. There are
>>> 2 reasons why this might not be the ideal way to prompt for a commit
>>> message.
>>>
>>> 1. Many users are used to writing concise one-line commit messages and
>>> would not expect to save a commit message in a multi-line file. Some
>>> users will wonder why they are in a text editor or which file they are
>>> editing. Others may not, in fact, realize at all that a text editor is
>>> what they are in.
>>
>> Look at the -m option of git commit,
>>

[administrivia: do not top post]

> Thanks, I think changing the default for windows is a good idea.
>
> The -m indeed accomplishes one-line messages when you are voluntarily
> doing a commit. However, the scenario I mentioned is "When users pull
> commits, and a new commit needs to be created for the merge"  In this
> situation, the user isn't issuing the "git commit" command, and so
> he/she doesn't have the opportunity to use the -m flag.

There is --no-edit there.




Re: feature request

2016-10-27 Thread John Rood
Thanks, I think changing the default for windows is a good idea.

The -m indeed accomplishes one-line messages when you are voluntarily
doing a commit. However, the scenario I mentioned is "When users pull
commits, and a new commit needs to be created for the merge"  In this
situation, the user isn't issuing the "git commit" command, and so
he/she doesn't have the opportunity to use the -m flag.

On Thu, Oct 27, 2016 at 5:30 PM, Stefan Beller  wrote:
> On Thu, Oct 27, 2016 at 2:55 PM, John Rood  wrote:
>> Users should be able to configure Git to not send them into a Vim editor.
>>
>> When users pull commits, and a new commit needs to be created for a
>> merge, Git's current way of determining a commit message is to send
>> the user into a Vim window so that they can write a message. There are
>> 2 reasons why this might not be the ideal way to prompt for a commit
>> message.
>>
>> 1. Many users are used to writing concise one-line commit messages and
>> would not expect to save a commit message in a multi-line file. Some
>> users will wonder why they are in a text editor or which file they are
>> editing. Others may not, in fact, realize at all that a text editor is
>> what they are in.
>
> Look at the -m option of git commit,
>
> git commit -a -m "look a commit with no editor, and a precise one line 
> message"
>
> I do not advocate this use though, as I think commit messages should be
> more wordy.
>
>>
>> 2. Many users are not familiar with Vim, and do not understand how to
>> modify, save, and exit. It is not very considerate to require a user
>> to learn Vim in order to finish a commit that they are in the middle
>> of.
>
> That is true, but vi is like the most available editor as a relict
> from ancient times;
> as you are on Windows, maybe notepad is the best on that platform.
>
> Maybe file a bug/issue at https://github.com/git-for-windows to change
> the default?
>
>>
>> The existing behavior should be optional, and there should be two new 
>> options:
>>
>> 1. Use a simple inline prompt for a commit message (in the same way
>> Git might prompt for a username).
>>
>> 2. Automatically assign names for commits in the form of "Merged x into y".


[PATCH 4/5] grep: optionally recurse into submodules

2016-10-27 Thread Brandon Williams
Allow grep to recognize submodules and recursively search for patterns in
each submodule.  This is done by forking off a process to recursively
call grep on each submodule.  The top level --super-prefix option is
used to pass a path to the submodule which can in turn be used to
prepend to output or in pathspec matching logic.

Recursion only occurs for submodules which have been initialized and
checked out by the parent project.  If a submodule hasn't been
initialized and checked out it is simply skipped.

In order to support the existing multi-threading infrastructure in grep,
output from each child process is captured in a strbuf so that it can be
later printed to the console in an ordered fashion.

To limit the number of theads that are created, each child process has
half the number of threads as its parents (minimum of 1), otherwise we
potentailly have a fork-bomb.

Signed-off-by: Brandon Williams 
---
 Documentation/git-grep.txt |   5 +
 builtin/grep.c | 301 ++---
 git.c  |   2 +-
 t/t7814-grep-recurse-submodules.sh |  99 
 4 files changed, 386 insertions(+), 21 deletions(-)
 create mode 100755 t/t7814-grep-recurse-submodules.sh

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 0ecea6e49..17aa1ba70 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -26,6 +26,7 @@ SYNOPSIS
   [--threads ]
   [-f ] [-e] 
   [--and|--or|--not|(|)|-e ...]
+  [--recurse-submodules]
   [ [--[no-]exclude-standard] [--cached | --no-index | --untracked] | 
...]
   [--] [...]
 
@@ -88,6 +89,10 @@ OPTIONS
mechanism.  Only useful when searching files in the current directory
with `--no-index`.
 
+--recurse-submodules::
+   Recursively search in each submodule that has been initialized and
+   checked out in the repository.
+
 -a::
 --text::
Process binary files as if they were text.
diff --git a/builtin/grep.c b/builtin/grep.c
index 8887b6add..f34f16df9 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -18,12 +18,20 @@
 #include "quote.h"
 #include "dir.h"
 #include "pathspec.h"
+#include "submodule.h"
 
 static char const * const grep_usage[] = {
N_("git grep [] [-e]  [...] [[--] ...]"),
NULL
 };
 
+static const char *super_prefix;
+static int recurse_submodules;
+static struct argv_array submodule_options = ARGV_ARRAY_INIT;
+
+static int grep_submodule_launch(struct grep_opt *opt,
+const struct grep_source *gs);
+
 #define GREP_NUM_THREADS_DEFAULT 8
 static int num_threads;
 
@@ -174,7 +182,10 @@ static void *run(void *arg)
break;
 
opt->output_priv = w;
-   hit |= grep_source(opt, >source);
+   if (w->source.type == GREP_SOURCE_SUBMODULE)
+   hit |= grep_submodule_launch(opt, >source);
+   else
+   hit |= grep_source(opt, >source);
grep_source_clear_data(>source);
work_done(w);
}
@@ -300,6 +311,10 @@ static int grep_sha1(struct grep_opt *opt, const unsigned 
char *sha1,
if (opt->relative && opt->prefix_length) {
quote_path_relative(filename + tree_name_len, opt->prefix, 
);
strbuf_insert(, 0, filename, tree_name_len);
+   } else if (super_prefix) {
+   strbuf_add(, filename, tree_name_len);
+   strbuf_addstr(, super_prefix);
+   strbuf_addstr(, filename + tree_name_len);
} else {
strbuf_addstr(, filename);
}
@@ -328,10 +343,13 @@ static int grep_file(struct grep_opt *opt, const char 
*filename)
 {
struct strbuf buf = STRBUF_INIT;
 
-   if (opt->relative && opt->prefix_length)
+   if (opt->relative && opt->prefix_length) {
quote_path_relative(filename, opt->prefix, );
-   else
+   } else {
+   if (super_prefix)
+   strbuf_addstr(, super_prefix);
strbuf_addstr(, filename);
+   }
 
 #ifndef NO_PTHREADS
if (num_threads) {
@@ -378,31 +396,259 @@ static void run_pager(struct grep_opt *opt, const char 
*prefix)
exit(status);
 }
 
-static int grep_cache(struct grep_opt *opt, const struct pathspec *pathspec, 
int cached)
+static void compile_submodule_options(const struct grep_opt *opt,
+ const struct pathspec *pathspec,
+ int cached, int untracked,
+ int opt_exclude, int use_index,
+ int pattern_type_arg)
+{
+   struct grep_pat *pattern;
+   int i;
+
+   if (recurse_submodules)
+   argv_array_push(_options, "--recurse-submodules");
+
+   if (cached)
+   argv_array_push(_options, 

[PATCH 5/5] grep: enable recurse-submodules to work on objects

2016-10-27 Thread Brandon Williams
Teach grep to recursively search in submodules when provided with a
 object. This allows grep to search a submodule based on the state
of the submodule that is present in a commit of the super project.

When grep is provided with a  object, the name of the object is
prefixed to all output.  In order to provide uniformity of output
between the parent and child processes the option `--parent-basename`
has been added so that the child can preface all of it's output with the
name of the parent's object instead of the name of the commit SHA1 of
the submodule. This changes output from the command
`git grep -e. -l --recurse-submodules HEAD` from:
HEAD:file
:sub/file

to:
HEAD:file
HEAD:sub/file

Signed-off-by: Brandon Williams 
---
 Documentation/git-grep.txt | 13 ++--
 builtin/grep.c | 67 +++---
 t/t7814-grep-recurse-submodules.sh | 44 -
 tree-walk.c| 17 +-
 4 files changed, 125 insertions(+), 16 deletions(-)

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 17aa1ba70..386a868c6 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -26,7 +26,7 @@ SYNOPSIS
   [--threads ]
   [-f ] [-e] 
   [--and|--or|--not|(|)|-e ...]
-  [--recurse-submodules]
+  [--recurse-submodules] [--parent-basename]
   [ [--[no-]exclude-standard] [--cached | --no-index | --untracked] | 
...]
   [--] [...]
 
@@ -91,7 +91,16 @@ OPTIONS
 
 --recurse-submodules::
Recursively search in each submodule that has been initialized and
-   checked out in the repository.
+   checked out in the repository.  When used in combination with the
+option the prefix of all submodule output will be the name of
+   the parent project's  object.
+
+--parent-basename::
+   For internal use only.  In order to produce uniform output with the
+   --recurse-submodules option, this option can be used to provide the
+   basename of a parent's  object to a submodule so the submodule
+   can prefix its output with the parent's name rather than the SHA1 of
+   the submodule.
 
 -a::
 --text::
diff --git a/builtin/grep.c b/builtin/grep.c
index f34f16df9..bdf1b9089 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -19,6 +19,7 @@
 #include "dir.h"
 #include "pathspec.h"
 #include "submodule.h"
+#include "submodule-config.h"
 
 static char const * const grep_usage[] = {
N_("git grep [] [-e]  [...] [[--] ...]"),
@@ -28,6 +29,7 @@ static char const * const grep_usage[] = {
 static const char *super_prefix;
 static int recurse_submodules;
 static struct argv_array submodule_options = ARGV_ARRAY_INIT;
+static const char *parent_basename;
 
 static int grep_submodule_launch(struct grep_opt *opt,
 const struct grep_source *gs);
@@ -535,6 +537,7 @@ static int grep_submodule_launch(struct grep_opt *opt,
 {
struct child_process cp = CHILD_PROCESS_INIT;
int status, i;
+   const char *end_of_base;
struct work_item *w = opt->output_priv;
 
prepare_submodule_repo_env(_array);
@@ -545,9 +548,36 @@ static int grep_submodule_launch(struct grep_opt *opt,
 gs->path);
argv_array_push(, "grep");
 
+   /*
+* Add basename of parent project
+* When performing grep on a  object the filename is prefixed
+* with the object's name: ':filename'.  In order to
+* provide uniformity of output we want to pass the name of the
+* parent project's object name to the submodule so the submodule can
+* prefix its output with the parent's name and not its own SHA1.
+*/
+   end_of_base = strchr(gs->name, ':');
+   if (end_of_base)
+   argv_array_pushf(, "--parent-basename=%.*s",
+(int) (end_of_base - gs->name),
+gs->name);
+
/* Add options */
-   for (i = 0; i < submodule_options.argc; i++)
+   for (i = 0; i < submodule_options.argc; i++) {
+   /*
+* If there is a  identifier for the submodule, add the
+* rev after adding the submodule options but before the
+* pathspecs.  To do this we listen for the '--' and insert the
+* sha1 before pushing the '--' onto the child process argv
+* array.
+*/
+   if (gs->identifier &&
+   !strcmp("--", submodule_options.argv[i])) {
+   argv_array_push(, sha1_to_hex(gs->identifier));
+   }
+
argv_array_push(, submodule_options.argv[i]);
+   }
 
cp.git_cmd = 1;
cp.dir = gs->path;
@@ -672,12 +702,21 @@ static int grep_tree(struct grep_opt *opt, const struct 
pathspec *pathspec,
enum interesting match = entry_not_interesting;

[PATCH 1/5] submodules: add helper functions to determine presence of submodules

2016-10-27 Thread Brandon Williams
Add two helper functions to submodules.c.
`is_submodule_initialized()` checks if a submodule has been initialized
at a given path and `is_submodule_checked_out()` check if a submodule
has been checked out at a given path.

Signed-off-by: Brandon Williams 
---
 submodule.c | 39 +++
 submodule.h |  2 ++
 2 files changed, 41 insertions(+)

diff --git a/submodule.c b/submodule.c
index 6f7d883de..029b24440 100644
--- a/submodule.c
+++ b/submodule.c
@@ -198,6 +198,45 @@ void gitmodules_config(void)
}
 }
 
+/*
+ * Determine if a submodule has been initialized at a given 'path'
+ */
+int is_submodule_initialized(const char *path)
+{
+   int ret = 0;
+   const struct submodule *module = NULL;
+
+   module = submodule_from_path(null_sha1, path);
+
+   if (module) {
+   struct strbuf buf = STRBUF_INIT;
+   char *submodule_url = NULL;
+
+   strbuf_addf(, "submodule.%s.url",module->name);
+   ret = !git_config_get_string(buf.buf, _url);
+
+   free(submodule_url);
+   strbuf_release();
+   }
+
+   return ret;
+}
+
+/*
+ * Determine if a submodule has been checked out at a given 'path'
+ */
+int is_submodule_checked_out(const char *path)
+{
+   int ret = 0;
+   struct strbuf buf = STRBUF_INIT;
+
+   strbuf_addf(, "%s/.git", path);
+   ret = file_exists(buf.buf);
+
+   strbuf_release();
+   return ret;
+}
+
 int parse_submodule_update_strategy(const char *value,
struct submodule_update_strategy *dst)
 {
diff --git a/submodule.h b/submodule.h
index d9e197a94..bd039ca98 100644
--- a/submodule.h
+++ b/submodule.h
@@ -37,6 +37,8 @@ void set_diffopt_flags_from_submodule_config(struct 
diff_options *diffopt,
const char *path);
 int submodule_config(const char *var, const char *value, void *cb);
 void gitmodules_config(void);
+extern int is_submodule_initialized(const char *path);
+extern int is_submodule_checked_out(const char *path);
 int parse_submodule_update_strategy(const char *value,
struct submodule_update_strategy *dst);
 const char *submodule_strategy_to_string(const struct 
submodule_update_strategy *s);
-- 
2.10.1.613.g6021889



[PATCH 3/5] grep: add submodules as a grep source type

2016-10-27 Thread Brandon Williams
Add `GREP_SOURCE_SUBMODULE` as a grep_source type and cases for this new
type in the various switch statements in grep.c.

When initializing a grep_source with type `GREP_SOURCE_SUBMODULE` the
identifier can either be NULL (to indicate that the working tree will be
used) or a SHA1 (the REV of the submodule to be grep'd).  If the
identifier is a SHA1 then we want to fall through to the
`GREP_SOURCE_SHA1` case to handle the copying of the SHA1.

Signed-off-by: Brandon Williams 
---
 grep.c | 16 +++-
 grep.h |  1 +
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/grep.c b/grep.c
index 1194d35b5..0dbdc1d00 100644
--- a/grep.c
+++ b/grep.c
@@ -1735,12 +1735,23 @@ void grep_source_init(struct grep_source *gs, enum 
grep_source_type type,
case GREP_SOURCE_FILE:
gs->identifier = xstrdup(identifier);
break;
+   case GREP_SOURCE_SUBMODULE:
+   if (!identifier) {
+   gs->identifier = NULL;
+   break;
+   }
+   /*
+* FALL THROUGH
+* If the identifier is non-NULL (in the submodule case) it
+* will be a SHA1 that needs to be copied.
+*/
case GREP_SOURCE_SHA1:
gs->identifier = xmalloc(20);
hashcpy(gs->identifier, identifier);
break;
case GREP_SOURCE_BUF:
gs->identifier = NULL;
+   break;
}
 }
 
@@ -1760,6 +1771,7 @@ void grep_source_clear_data(struct grep_source *gs)
switch (gs->type) {
case GREP_SOURCE_FILE:
case GREP_SOURCE_SHA1:
+   case GREP_SOURCE_SUBMODULE:
free(gs->buf);
gs->buf = NULL;
gs->size = 0;
@@ -1831,8 +1843,10 @@ static int grep_source_load(struct grep_source *gs)
return grep_source_load_sha1(gs);
case GREP_SOURCE_BUF:
return gs->buf ? 0 : -1;
+   case GREP_SOURCE_SUBMODULE:
+   break;
}
-   die("BUG: invalid grep_source type");
+   die("BUG: invalid grep_source type to load");
 }
 
 void grep_source_load_driver(struct grep_source *gs)
diff --git a/grep.h b/grep.h
index 5856a23e4..267534ca2 100644
--- a/grep.h
+++ b/grep.h
@@ -161,6 +161,7 @@ struct grep_source {
GREP_SOURCE_SHA1,
GREP_SOURCE_FILE,
GREP_SOURCE_BUF,
+   GREP_SOURCE_SUBMODULE,
} type;
void *identifier;
 
-- 
2.10.1.613.g6021889



[PATCH 2/5] submodules: load gitmodules file from commit sha1

2016-10-27 Thread Brandon Williams
teach submodules to load a '.gitmodules' file from a commit sha1.  This
enables the population of the submodule_cache to be based on the state
of the '.gitmodules' file from a particular commit.

Signed-off-by: Brandon Williams 
---
 cache.h|  2 ++
 config.c   |  8 
 submodule-config.c |  6 +++---
 submodule-config.h |  3 +++
 submodule.c| 12 
 submodule.h|  1 +
 6 files changed, 25 insertions(+), 7 deletions(-)

diff --git a/cache.h b/cache.h
index f7ee41456..74b0c3cba 100644
--- a/cache.h
+++ b/cache.h
@@ -1681,6 +1681,8 @@ extern int git_default_config(const char *, const char *, 
void *);
 extern int git_config_from_file(config_fn_t fn, const char *, void *);
 extern int git_config_from_mem(config_fn_t fn, const enum config_origin_type,
const char *name, const char *buf, 
size_t len, void *data);
+extern int git_config_from_blob_sha1(config_fn_t fn, const char *name,
+const unsigned char *sha1, void *data);
 extern void git_config_push_parameter(const char *text);
 extern int git_config_from_parameters(config_fn_t fn, void *data);
 extern void git_config(config_fn_t fn, void *);
diff --git a/config.c b/config.c
index 83fdecb1b..4d78e7227 100644
--- a/config.c
+++ b/config.c
@@ -1214,10 +1214,10 @@ int git_config_from_mem(config_fn_t fn, const enum 
config_origin_type origin_typ
return do_config_from(, fn, data);
 }
 
-static int git_config_from_blob_sha1(config_fn_t fn,
-const char *name,
-const unsigned char *sha1,
-void *data)
+int git_config_from_blob_sha1(config_fn_t fn,
+ const char *name,
+ const unsigned char *sha1,
+ void *data)
 {
enum object_type type;
char *buf;
diff --git a/submodule-config.c b/submodule-config.c
index 098085be6..8b9a2ef28 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -379,9 +379,9 @@ static int parse_config(const char *var, const char *value, 
void *data)
return ret;
 }
 
-static int gitmodule_sha1_from_commit(const unsigned char *commit_sha1,
- unsigned char *gitmodules_sha1,
- struct strbuf *rev)
+int gitmodule_sha1_from_commit(const unsigned char *commit_sha1,
+  unsigned char *gitmodules_sha1,
+  struct strbuf *rev)
 {
int ret = 0;
 
diff --git a/submodule-config.h b/submodule-config.h
index d05c542d2..78584ba6a 100644
--- a/submodule-config.h
+++ b/submodule-config.h
@@ -29,6 +29,9 @@ const struct submodule *submodule_from_name(const unsigned 
char *commit_sha1,
const char *name);
 const struct submodule *submodule_from_path(const unsigned char *commit_sha1,
const char *path);
+extern int gitmodule_sha1_from_commit(const unsigned char *commit_sha1,
+ unsigned char *gitmodules_sha1,
+ struct strbuf *rev);
 void submodule_free(void);
 
 #endif /* SUBMODULE_CONFIG_H */
diff --git a/submodule.c b/submodule.c
index 029b24440..f2a56689f 100644
--- a/submodule.c
+++ b/submodule.c
@@ -198,6 +198,18 @@ void gitmodules_config(void)
}
 }
 
+void gitmodules_config_sha1(const unsigned char *commit_sha1)
+{
+   struct strbuf rev = STRBUF_INIT;
+   unsigned char sha1[20];
+
+   if (gitmodule_sha1_from_commit(commit_sha1, sha1, )) {
+   git_config_from_blob_sha1(submodule_config, rev.buf,
+ sha1, NULL);
+   }
+   strbuf_release();
+}
+
 /*
  * Determine if a submodule has been initialized at a given 'path'
  */
diff --git a/submodule.h b/submodule.h
index bd039ca98..9a24ac82e 100644
--- a/submodule.h
+++ b/submodule.h
@@ -37,6 +37,7 @@ void set_diffopt_flags_from_submodule_config(struct 
diff_options *diffopt,
const char *path);
 int submodule_config(const char *var, const char *value, void *cb);
 void gitmodules_config(void);
+extern void gitmodules_config_sha1(const unsigned char *commit_sha1);
 extern int is_submodule_initialized(const char *path);
 extern int is_submodule_checked_out(const char *path);
 int parse_submodule_update_strategy(const char *value,
-- 
2.10.1.613.g6021889



Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC

2016-10-27 Thread Linus Torvalds
On Thu, Oct 27, 2016 at 3:24 AM, Jeff King  wrote:
>
> +cc Linus as the original author of 144bde78e9 in case there is
> something subtle I'm missing, but this really just seems like it's
> an outdated optimization.

I'd *really* like to keep O_NOATIME if at all possible. It made a huge
difference on older kernels, and I'm not convinced that relatime
really fixes it as well as O_NOATIME.

There are people who don't like relatime. And even if you do have
relatime enabled, it will update atime once every day, so then this
makes your filesystem have a storm of nasty inode writebacks if you
haven't touched that git repo in a while.

If this is purely about mixing things up with O_CLOEXEC, then that is
*trivially* fixable by just using

 fcntl(fd, F_SETFD, FD_CLOEXEC);

after the open.

Which is what you have to do anyway if you want to be portable, so
its' not like you could avoid that complexity in the first place.

Note that you can *not* do the same thing with O_NOATIME, since the
whole point of O_NOATIME is that it changes the behavior of the open
itself (unlike O_CLOEXEC which changes _later_ behavior, and can
always be replaced by FD_CLOEXEC fnclt modulo races that are
immaterial for git)

 Linus


[RFC PATCH 0/5] recursively grep across submodules

2016-10-27 Thread Brandon Williams
This patch series adds some basic api functions to the submodule interface as
well as teaching grep to recursively search in submodules.

The additions to the submodule interface allow grep to verify that a submodule
has been initialized and checked out prior to launching a child process.  One
issue that still needs to be worked out is when greppig history, you could be
in a state where the submodule doesn't have a working tree (or the path you had
in the past doesn't match what currently exists) so instead of changing
directory into the submdoule you need to look for the .git directory for the
submodule in the parents .git/modules directory.  If it exists we would need to
change directory to .git/modules/ and then run the child process
from there.  This currently doesn't work due to commit <10f5c52656> since the 
GIT_DIR env variable is explicitly set to be '.git'.  I'm going to spend some
more time thinking about this problem and will address it as an additional 
patch in
the series at a later time.

As for the rest of the series, it should be ready for review or comments.


Brandon Williams (5):
  submodules: add helper functions to determine presence of submodules
  submodules: load gitmodules file from commit sha1
  grep: add submodules as a grep source type
  grep: optionally recurse into submodules
  grep: enable recurse-submodules to work on  objects

 Documentation/git-grep.txt |  14 ++
 builtin/grep.c | 364 ++---
 cache.h|   2 +
 config.c   |   8 +-
 git.c  |   2 +-
 grep.c |  16 +-
 grep.h |   1 +
 submodule-config.c |   6 +-
 submodule-config.h |   3 +
 submodule.c|  51 ++
 submodule.h|   3 +
 t/t7814-grep-recurse-submodules.sh | 141 ++
 tree-walk.c|  17 +-
 13 files changed, 588 insertions(+), 40 deletions(-)
 create mode 100755 t/t7814-grep-recurse-submodules.sh

-- 
2.10.1.613.g6021889



Re: Expanding Includes in .gitignore

2016-10-27 Thread Aaron Pelly
On 28/10/16 10:07, Jeff King wrote:
> I think it does
> introduce compatibility complications. People with older git will not
> respect your .gitignore.d files. Whereas $GIT_DIR/info is purely a local
> matter.

I know. I don't think it's a serious compatibility issue, but more
important people may disagree, in which case I'll have to review my
stance. I think the pros of it working everywhere outweigh the cons.

I will say that I am a proponent of consistency and obviousness. I would
rather this worked the same everywhere or it was a completely novel
component in $GIT_DIR.

I don't think I like the idea much, but it would be possible to
magically generate an ignore file from a directory.

> But perhaps there is a use case I'm missing.

A new repo where the dev tools haven't been standardised yet. New tool,
new file. Simple, and somewhat self documenting as a table of contents
in .gitignore.d



Re: feature request

2016-10-27 Thread Stefan Beller
On Thu, Oct 27, 2016 at 2:55 PM, John Rood  wrote:
> Users should be able to configure Git to not send them into a Vim editor.
>
> When users pull commits, and a new commit needs to be created for a
> merge, Git's current way of determining a commit message is to send
> the user into a Vim window so that they can write a message. There are
> 2 reasons why this might not be the ideal way to prompt for a commit
> message.
>
> 1. Many users are used to writing concise one-line commit messages and
> would not expect to save a commit message in a multi-line file. Some
> users will wonder why they are in a text editor or which file they are
> editing. Others may not, in fact, realize at all that a text editor is
> what they are in.

Look at the -m option of git commit,

git commit -a -m "look a commit with no editor, and a precise one line message"

I do not advocate this use though, as I think commit messages should be
more wordy.

>
> 2. Many users are not familiar with Vim, and do not understand how to
> modify, save, and exit. It is not very considerate to require a user
> to learn Vim in order to finish a commit that they are in the middle
> of.

That is true, but vi is like the most available editor as a relict
from ancient times;
as you are on Windows, maybe notepad is the best on that platform.

Maybe file a bug/issue at https://github.com/git-for-windows to change
the default?

>
> The existing behavior should be optional, and there should be two new options:
>
> 1. Use a simple inline prompt for a commit message (in the same way
> Git might prompt for a username).
>
> 2. Automatically assign names for commits in the form of "Merged x into y".


Re: feature request

2016-10-27 Thread Junio C Hamano
John Rood  writes:

> I suppose I can do git config --global core.editor notepad
> However, this really only addresses my second concern.
>
> My first concern is that using a text editor at all seems like
> overkill in many scenarios.

Nobody stops you from writing a "type whatever you want; I won't let
you edit any mistakes as I am not even a text editor; just hit
RETURN when you are done, as you can only write a single line"
program and set it as your GIT_EDITOR.

I do not know what would happen when you need "git commit --amend",
though.


Re: feature request

2016-10-27 Thread John Rood
I suppose I can do git config --global core.editor notepad
However, this really only addresses my second concern.

My first concern is that using a text editor at all seems like
overkill in many scenarios.
For that reason, I still think the other two options I mentioned would
be beneficial.

On Thu, Oct 27, 2016 at 5:05 PM, John Rood  wrote:
> Unfortunately, in my case I'm on windows (my company's choice, not mine).
>
> On Thu, Oct 27, 2016 at 5:01 PM, Stefan Beller  wrote:
>> On Thu, Oct 27, 2016 at 2:55 PM, John Rood  wrote:
>>> Users should be able to configure Git to not send them into a Vim editor.
>>
>> See https://git-scm.com/docs/git-var
>>
>> GIT_EDITOR
>>
>> Text editor for use by Git commands. The value is meant to be interpreted
>> by the shell when it is used. Examples: ~/bin/vi, $SOME_ENVIRONMENT_VARIABLE,
>> "C:\Program Files\Vim\gvim.exe" --nofork. The order of preference is the
>> $GIT_EDITOR environment variable, then core.editor configuration, then
>> $VISUAL, then $EDITOR, and then the default chosen at compile time,
>> which is usually vi.
>>
>>
>> So maybe
>>
>> git config --global core.editor "nano"
>>
>> helps in your case?


[PATCH] attr: convert to new threadsafe API

2016-10-27 Thread Stefan Beller
This revamps the API of the attr subsystem to be thread safe.
Before we had the question and its results in one struct type.
The typical usage of the API was

static struct git_attr_check *check;

if (!check)
check = git_attr_check_initl("text", NULL);

git_check_attr(path, check);
act_on(check->value[0]);

This has a couple of issues when it comes to thread safety:

* the initialization is racy in this implementation. To make it
  thread safe, we need to acquire a mutex, such that only one
  thread is executing the code in git_attr_check_initl.
  As we do not want to introduce a mutex at each call site,
  this is best done in the attr code. However to do so, we need
  to have access to the `check` variable, i.e. the API has to
  look like
git_attr_check_initl(struct git_attr_check**, ...);
  Then one of the threads calling git_attr_check_initl will
  acquire the mutex and init the `check`, while all other threads
  will wait on the mutex just to realize they're late to the
  party and they'll return with no work done.

* While the check for attributes to be questioned only need to
  be initalized once as that part will be read only after its
  initialisation, the answer may be different for each path.
  Because of that we need to decouple the check and the answer,
  such that each thread can obtain an answer for the path it
  is currently processing.

This commit changes the API and adds locking in
git_attr_check_initl that provides the thread safety for constructing
`struct git_attr_check`.

The usage of the new API will be:

/*
 * The initl call will thread-safely check whether the
 * struct git_attr_check has been initialized. We only
 * want to do the initialization work once, hence we do
 * that work inside a thread safe environment.
 */
static struct git_attr_check *check;
git_attr_check_initl(, "text", NULL);

/* We're just asking for one attribute "text". */
git_attr_result myresult[1];

/* Perform the check and act on it: */
git_check_attr(path, check, myresult);
act_on(myresult[0].value);

/*
 * No need to free the check as it is static, hence doesn't leak
 * memory. The result is also static, so no need to free there either.
 */

Signed-off-by: Stefan Beller 
---

* use attr_start on Windows to dynamically initialize the Single Big Attr Mutex
* rewrote the documentation by Junios guiding comments
* interned struct git_attr, i.e. you hand "const char *" to the attr API, and
  internally it will make sense of it. When asking git_all_attrs, however you
  still need to know about git_attrs as you need to find out the name of the
  attribute via `git_attr_name(struct git_attr *)`.
  
If there are no other huge design discussions, I'll reroll the whole series.

Thanks,
Stefan

 Documentation/technical/api-gitattributes.txt | 104 +++---
 archive.c |  11 +-
 attr.c| 150 ++
 attr.h|  76 +++--
 builtin/check-attr.c  |  47 
 builtin/pack-objects.c|  16 +--
 compat/mingw.c|   3 +
 convert.c |  40 +++
 ll-merge.c|  24 +++--
 userdiff.c|  16 +--
 ws.c  |   8 +-
 11 files changed, 305 insertions(+), 190 deletions(-)

diff --git a/Documentation/technical/api-gitattributes.txt 
b/Documentation/technical/api-gitattributes.txt
index 92fc32a..ba04c30 100644
--- a/Documentation/technical/api-gitattributes.txt
+++ b/Documentation/technical/api-gitattributes.txt
@@ -16,15 +16,19 @@ Data Structure
of no interest to the calling programs.  The name of the
attribute can be retrieved by calling `git_attr_name()`.
 
-`struct git_attr_check_elem`::
-
-   This structure represents one attribute and its value.
-
 `struct git_attr_check`::
 
-   This structure represents a collection of `git_attr_check_elem`.
+   This structure represents a collection of `struct git_attrs`.
It is passed to `git_check_attr()` function, specifying the
-   attributes to check, and receives their values.
+   attributes to check, and receives their values into a corresponding
+   `struct git_attr_result`.
+
+`struct git_attr_result`::
+
+   This structure represents one results for a check, such that an
+   array of `struct git_attr_results` corresponds to a
+   `struct git_attr_check`. The answers given in that array are in
+   the the same order as the check struct.
 
 
 Attribute Values
@@ -32,7 +36,7 @@ Attribute Values
 
 An attribute for a path can be in one of four states: Set, Unset,
 Unspecified or set to a string, and `.value` member of `struct
-git_attr_check` records 

Re: Expanding Includes in .gitignore

2016-10-27 Thread Aaron Pelly
On 28/10/16 10:55, Aaron Pelly wrote:
> 2) I fetch a repo with a hostile ignore file. It includes files from
> $GIT_DIR/test-data/ssl/private or some such. Change. Don't pay
> attention. Commit. Push. Problems if my test data comes from production.
> 
> Is this mitigated currently?
> 
> Not that git should be an enabler, but surely it falls on the user of
> untrusted software to ensure their own security?

Balls, I meant $GIT_WORK_TREE not $GIT_DIR



[PATCH] valgrind: support test helpers

2016-10-27 Thread René Scharfe
Tests run with --valgrind call git commands through a wrapper script
that invokes valgrind on them.  This script (valgrind.sh) is in turn
invoked through symlinks created for each command in t/valgrind/bin/.

Since e6e7530d (test helpers: move test-* to t/helper/ subdirectory)
these symlinks have been broken for test helpers -- they point to the
old locations in the root of the build directory.  Fix that by teaching
the code for creating the links about the new location of the binaries,
and do the same in the wrapper script to allow it to find its payload.

Signed-off-by: Rene Scharfe 
---
 t/test-lib.sh  |  9 -
 t/valgrind/valgrind.sh | 12 ++--
 2 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index b859db6..a724181 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -809,7 +809,14 @@ then
return;
 
base=$(basename "$1")
-   symlink_target=$GIT_BUILD_DIR/$base
+   case "$base" in
+   test-*)
+   symlink_target="$GIT_BUILD_DIR/t/helper/$base"
+   ;;
+   *)
+   symlink_target="$GIT_BUILD_DIR/$base"
+   ;;
+   esac
# do not override scripts
if test -x "$symlink_target" &&
test ! -d "$symlink_target" &&
diff --git a/t/valgrind/valgrind.sh b/t/valgrind/valgrind.sh
index 4215303..669ebaf 100755
--- a/t/valgrind/valgrind.sh
+++ b/t/valgrind/valgrind.sh
@@ -1,11 +1,19 @@
 #!/bin/sh
 
 base=$(basename "$0")
+case "$base" in
+test-*)
+   program="$GIT_VALGRIND/../../t/helper/$base"
+   ;;
+*)
+   program="$GIT_VALGRIND/../../$base"
+   ;;
+esac
 
 TOOL_OPTIONS='--leak-check=no'
 
 test -z "$GIT_VALGRIND_ENABLED" &&
-exec "$GIT_VALGRIND"/../../"$base" "$@"
+exec "$program" "$@"
 
 case "$GIT_VALGRIND_MODE" in
 memcheck-fast)
@@ -29,4 +37,4 @@ exec valgrind -q --error-exitcode=126 \
--log-fd=4 \
--input-fd=4 \
$GIT_VALGRIND_OPTIONS \
-   "$GIT_VALGRIND"/../../"$base" "$@"
+   "$program" "$@"
-- 
2.10.1


Re: feature request

2016-10-27 Thread John Rood
Unfortunately, in my case I'm on windows (my company's choice, not mine).

On Thu, Oct 27, 2016 at 5:01 PM, Stefan Beller  wrote:
> On Thu, Oct 27, 2016 at 2:55 PM, John Rood  wrote:
>> Users should be able to configure Git to not send them into a Vim editor.
>
> See https://git-scm.com/docs/git-var
>
> GIT_EDITOR
>
> Text editor for use by Git commands. The value is meant to be interpreted
> by the shell when it is used. Examples: ~/bin/vi, $SOME_ENVIRONMENT_VARIABLE,
> "C:\Program Files\Vim\gvim.exe" --nofork. The order of preference is the
> $GIT_EDITOR environment variable, then core.editor configuration, then
> $VISUAL, then $EDITOR, and then the default chosen at compile time,
> which is usually vi.
>
>
> So maybe
>
> git config --global core.editor "nano"
>
> helps in your case?


Re: feature request

2016-10-27 Thread Stefan Beller
On Thu, Oct 27, 2016 at 2:55 PM, John Rood  wrote:
> Users should be able to configure Git to not send them into a Vim editor.

See https://git-scm.com/docs/git-var

GIT_EDITOR

Text editor for use by Git commands. The value is meant to be interpreted
by the shell when it is used. Examples: ~/bin/vi, $SOME_ENVIRONMENT_VARIABLE,
"C:\Program Files\Vim\gvim.exe" --nofork. The order of preference is the
$GIT_EDITOR environment variable, then core.editor configuration, then
$VISUAL, then $EDITOR, and then the default chosen at compile time,
which is usually vi.


So maybe

git config --global core.editor "nano"

helps in your case?


Re: [PATCH] compat: Allow static initializer for pthreads on Windows

2016-10-27 Thread Stefan Beller
> there isn't really a great place to put a dynamic initialization.

Answering this question specifically (Where to put it?),
I am about to send out a patch that puts it in compat/mingw.c:2232:

diff --git a/compat/mingw.c b/compat/mingw.c
index 3fbfda5..9881c3d 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -2232,6 +2232,9 @@ void mingw_startup(void)
 /* initialize critical section for waitpid pinfo_t list */
 InitializeCriticalSection(_cs);

+/* initialize critical sections in the attr code */
+start_attr();
+
 /* set up default file mode and file modes for stdin/out/err */
 _fmode = _O_BINARY;
 _setmode(_fileno(stdin), _O_BINARY);

If I understood Johannes correctly this is the place to put it.
Junio seems to be ok with static mutex init for non windows platforms,
so I put it into the mingw specifc startup code.

Thanks,
Stefan


Re: Expanding Includes in .gitignore

2016-10-27 Thread Aaron Pelly
On 28/10/16 09:55, Jeff King wrote:
> On Fri, Oct 28, 2016 at 09:28:23AM +1300, Aaron Pelly wrote:
> 
>>>   - we parse possibly-hostile .gitignore files from cloned repositories.
>>> What happens when I include ask to include /etc/passwd? Probably
>>> nothing, but there are setups where it might matter (e.g., something
>>> like Travis that auto-builds untrusted repositories, and you could
>>> potentially leak the contents of files via error messages). It's
>>> nice to avoid the issue entirely.
>>
>> I understand the issue.
>>
>> It's not obvious to me how using a .d solves this problem though.
> 
> It doesn't by itself. But we are worried only about tracked .gitignore
> files (recall that even repo-level files in $GIT_DIR/info are generated
> fresh by the clone process, and don't come from the remote). If we apply
> the feature only to core.excludeFile and $GIT_DIR/info/exclude, those
> are already under the user's control.

The things you say make sense from this perspective.

I was hoping to employ this mechanism throughout the git ecosystem.

Thinking out loud for a minute:

1) I clone a repo with a hostile ignore file. It includes files from
/etc/ssl/private or some such. Change. Don't pay attention. Commit.
Push. Problems.

What is the use case for reaching out of the repo in the first place?

2) I fetch a repo with a hostile ignore file. It includes files from
$GIT_DIR/test-data/ssl/private or some such. Change. Don't pay
attention. Commit. Push. Problems if my test data comes from production.

Is this mitigated currently?

Not that git should be an enabler, but surely it falls on the user of
untrusted software to ensure their own security?

> It's true that we could make a similar exception for an "include"
> feature, and respect include directives only in those "safe" files.
> Somehow that seems more confusing to me, though, than doing adding the
> feature at the file level, as it introduces slightly varying syntax
> between the locations.

I'm quickly getting over the include file idea. But yes, that would be
non obvious.

>>> Whereas letting any of the user- or repo-level exclude files be a
>>> directory, and simply reading all of the files inside, seems simple and
>>> obvious.
>>
>> Apart from backwards compatibility, unless there's something I'm missing.
> 
> I'm not sure what you mean. If we make:
> 
>   mkdir .git/info/exclude
>   echo whatever >.git/info/exclude/local
> 
> work, I don't think we have to care about backwards compatibility. That
> was nonsensical before, and never did anything useful (so somebody might
> have done it, but we can assume anybody relying on it not to read the
> contents is crazy).

Seeing your perspective, now, I can see why you didn't understand me. In
your context this makes perfect sense.



feature request

2016-10-27 Thread John Rood
Users should be able to configure Git to not send them into a Vim editor.

When users pull commits, and a new commit needs to be created for a
merge, Git's current way of determining a commit message is to send
the user into a Vim window so that they can write a message. There are
2 reasons why this might not be the ideal way to prompt for a commit
message.

1. Many users are used to writing concise one-line commit messages and
would not expect to save a commit message in a multi-line file. Some
users will wonder why they are in a text editor or which file they are
editing. Others may not, in fact, realize at all that a text editor is
what they are in.

2. Many users are not familiar with Vim, and do not understand how to
modify, save, and exit. It is not very considerate to require a user
to learn Vim in order to finish a commit that they are in the middle
of.

The existing behavior should be optional, and there should be two new options:

1. Use a simple inline prompt for a commit message (in the same way
Git might prompt for a username).

2. Automatically assign names for commits in the form of "Merged x into y".


Re: [PATCH] compat: Allow static initializer for pthreads on Windows

2016-10-27 Thread Jacob Keller
On Thu, Oct 27, 2016 at 1:05 PM, Johannes Sixt  wrote:
>> The implementation under discussion (well we did not discuss the
>> implementation a
>> whole lot yet) ...
>
>
> There's not a whole lot to discuss: it must be rewritten from scratch (it's
> not just the memory barriers, it is everything else, too). But time is much
> better spent on an attr_start() solution.
>
> -- Hannes
>

Ok, so I've been reading this thread. I don't understand your
objections to emulating in this way.. Could you clearly spell out why
you believe this solution isn't acceptable? So far all I've understood
was "it's not critical sections" and "it penalizes Windows too much"
but... If Windows cannot statically initialize a pthread mutex, then
we *have* to dynamically initialize it somewhere. This solution adds a
single check before each lock and is safe due to use of memory
barriers. Yes, this will cost a tiny bit extra overhead for each use
of "pthread_mutex_lock" but I fail to see how that is a huge
penalty...

So I'm trying to understand if that really is your only concern,
because I don't actually buy that it is a huge overhead to pay.

I agree with Stefan that there isn't really a great place to put a
dynamic initialization.

To be clear I am trying to understand the objections since so far all
I have read is "this is bad, and I object" without clearly explaining
reasoning.

Regards,
Jake


Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC

2016-10-27 Thread Junio C Hamano
Jeff King  writes:

> +cc Linus as the original author of 144bde78e9 in case there is
> something subtle I'm missing, but this really just seems like it's
> an outdated optimization.
>
> -- >8 --
> Subject: [PATCH] sha1_file: stop opening files with O_NOATIME
>
> When we open object files, we try to do so with O_NOATIME.
> This dates back to 144bde78e9 (Use O_NOATIME when opening
> the sha1 files., 2005-04-23), which is an optimization to
> avoid creating a bunch of dirty inodes when we're accessing
> many objects.  But a few things have changed since then:
>
>   1. In June 2005, git learned about packfiles, which means
>  we would do a lot fewer atime updates (rather than one
>  per object access, we'd generally get one per packfile).
>
>   2. In late 2006, Linux learned about "relatime", which is
>  generally the default on modern installs. So
>  performance around atimes updates is a non-issue there
>  these days.
>
>  All the world isn't Linux, but as it turns out, Linux
>  is the only platform to implement O_NOATIME in the
>  first place.
>
> So it's very unlikely that this code is helping anybody
> these days.
>
> It's not a particularly large amount of code, but the
> fallback-retry creates complexity. E.g., we do a similar
> fallback for CLOEXEC; which one should take precedence, or
> should we try all possible combinations? Dropping O_NOATIME
> makes those questions go away.
>
> Signed-off-by: Jeff King 
> ---

We may want to lose the surrounding for (;;) loop as there is only
one flag to retry without, which was the original code structure
back when 144bde78e9 ("Use O_NOATIME when opening the sha1 files.",
2005-04-23) was written and refactored by 44d1c19ee8 ("Make loose
object file reading more careful", 2008-06-14).

IOW, this on top.  The update to ce_compare_data() Lars has in
a0a6cb9662 ("read-cache: make sure file handles are not inherited by
child processes", 2016-10-24) could then made into a call to
git_open().

 sha1_file.c | 23 ---
 1 file changed, 8 insertions(+), 15 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 6f02a57d8b..e18ea053e6 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1553,24 +1553,17 @@ int check_sha1_signature(const unsigned char *sha1, 
void *map,
 
 int git_open(const char *name)
 {
-   static int sha1_file_open_flag = O_CLOEXEC;
-
-   for (;;) {
-   int fd;
-
-   errno = 0;
-   fd = open(name, O_RDONLY | sha1_file_open_flag);
-   if (fd >= 0)
-   return fd;
+   static int cloexec = O_CLOEXEC;
+   int fd;
 
+   errno = 0;
+   fd = open(name, O_RDONLY | cloexec);
+   if ((cloexec & O_CLOEXEC) && fd < 0 && errno == EINVAL) {
/* Try again w/o O_CLOEXEC: the kernel might not support it */
-   if ((sha1_file_open_flag & O_CLOEXEC) && errno == EINVAL) {
-   sha1_file_open_flag &= ~O_CLOEXEC;
-   continue;
-   }
-
-   return -1;
+   cloexec &= ~O_CLOEXEC;
+   fd = open(name, O_RDONLY | cloexec);
}
+   return fd;
 }
 
 static int stat_sha1_file(const unsigned char *sha1, struct stat *st)


Re: Expanding Includes in .gitignore

2016-10-27 Thread Jacob Keller
On Thu, Oct 27, 2016 at 2:04 PM, Jeff King  wrote:
> I'm not convinced this is needed for in-repo .gitignore files. The point
> is that you are pulling together separate files that may be administered
> independently. But git repositories inherently have a whole-project
> view. I'm not sure that separate files buy you a lot there. And the
> compatibility issues are more complicated.
>

I had just assumed it would be used everywhere, so I hadn't really
considered whether that was any reason not to. I personally like the
idea of splitting my git ignores into separate files just so each file
can be about one thing, but I guess it's not really a problem either
way.

> I do agree that:
>
>   cd .git/info
>   git clone /my/exclude/repo exclude ;# or exclude.d
>
> should work; ignoring dotfiles when reading the directory solves that,
> and is a pretty standard solution.
>
> -Peff


Re: [PATCH 32/36] pathspec: allow querying for attributes

2016-10-27 Thread Stefan Beller
On Wed, Oct 26, 2016 at 6:33 AM, Duy Nguyen  wrote:
> (sorry if this should have been answered if I went through the series
> patch by patch, I wanted to do a proper review but finally have to
> admit to myself I won't, so I just skim through a single giant diff
> instead)
>
> On Sun, Oct 23, 2016 at 6:32 AM, Stefan Beller  wrote:
>> +attr;;
>> +After `attr:` comes a space separated list of "attribute
>> +requirements", all of which must be met in order for the
>> +path to be considered a match;
>
> What about (attr=abc def,attr=ghi lkj)? Does it mean (abc && def) ||
> (ghi && lkj), or abc && def && ghi && lkj? Or is it forbidden to have
> multiple 'attr' attribute in the same pathspec?

Good point. I'll add a test for that.

Remembering the original discussion, multiple attrs
are forbidden for now as it is unclear what you want to see
as a user.

To model  (abc && def) || (ghi && lkj), you would need to give
multiple pathspec items as these are naturally ORed:

git ls-files :(attr:abc def) :(attr:ghi lkj) .

(compare "git ls-files Makefile README" which gives
2 files to you, that are named respectively.)

To get "abc && def && ghi && lkj" you go with

git ls-files :(attr:abc def ghi lkj) .

as then all things included into this one attr are
ANDed. I hope the documentation is clear for one
attr.

Thanks,
Stefan


Re: Expanding Includes in .gitignore

2016-10-27 Thread Aaron Pelly
On 28/10/16 10:04, Jeff King wrote:
> On Thu, Oct 27, 2016 at 12:48:34PM -0700, Jacob Keller wrote:
> 
>>> I think the normal behavior in such "foo.d" directory is to just sort
>>> the contents lexically and read them in order, as if they were all
>>> concatenated together, and with no recursion. I.e., behave "as if" the
>>> user had run "cat $dir/*".
>>
>> Yea, this is the normal behavior, and the user is expected to order
>> their files lexically such as "00-name", "50-name" and so on. Pretty
>> traditional for a lot of newer configurations.
> 
> One thing I will say about this approach is that you can implement it
> without any changes in git by doing:
> 
>   path=.git/info/exclude
>   cat $path.d/* >$path
> 
> and I have seen several config mechanisms basically do that (e.g.,
> Debian packaging for a program that doesn't have its own ".d" mechanism,
> but needs to grab config provided by several separate packages).
> 
> The reason to teach that trick to git is convenience; you don't have to
> remember to build the master file from its parts because it's done
> dynamically whenever git needs to look at it.

Precisely.

>> One thing to keep in mind would be that we should make sure we can
>> handle the .gitignore being a submodule or a git repository, so that
>> users could just do something like
> 
> I'm not convinced this is needed for in-repo .gitignore files. The point
> is that you are pulling together separate files that may be administered
> independently. But git repositories inherently have a whole-project
> view. I'm not sure that separate files buy you a lot there. And the
> compatibility issues are more complicated.
> 
> I do agree that:
> 
>   cd .git/info
>   git clone /my/exclude/repo exclude ;# or exclude.d
> 
> should work; ignoring dotfiles when reading the directory solves that,
> and is a pretty standard solution.

You raise a good point about the requirement. I was about to make an
argument in favour of it, but I argued myself out of an argument. I will
say; why have it work in one place and not others?



Re: [PATCH v2 2/2] convert.c: stream and fast search for binary

2016-10-27 Thread Junio C Hamano
tbo...@web.de writes:

> From: Torsten Bögershausen 
>
> When statistics are done for the autocrlf handling, the search in
> the content can be stopped, if e.g
> - a search for binary is done, and a NUL character is found
> - a search for CRLF is done, and the first CRLF is found.
>
> Similar when statistics for binary vs non-binary are gathered:
> Whenever a lone CR or NUL is found, the search can be aborted.
>
> When checking out files in "auto" mode, any file that has a "lone CR"
> or a CRLF will not be converted, so the search can be aborted early.
>
> Add the new bit, CONVERT_STAT_BITS_ANY_CR,
> which is set for either lone CR or CRLF.
>
> Many binary files have a NUL very early and it is often not necessary
> to load the whole content of a file or blob into memory.
>
> Split gather_stats() into gather_all_stats() and gather_stats_partly()
> to do a streaming handling for blobs and files in the worktree.
>
> Signed-off-by: Torsten Bögershausen 
> ---

I'll try to review in reverse order, as this seems to be doing too
many things at once and cannot get my head around it without going
top down.

> @@ -338,11 +401,15 @@ static int crlf_to_worktree(const char *path, const 
> char *src, size_t len,
>  {
>   char *to_free = NULL;
>   struct text_stat stats;
> + unsigned search_only = 0;
>  
>   if (!len || output_eol(crlf_action) != EOL_CRLF)
>   return 0;
>  
> - gather_stats(src, len, );
> + if (crlf_action == CRLF_AUTO || crlf_action == CRLF_AUTO_CRLF)
> + search_only = CONVERT_STAT_BITS_ANY_CR | CONVERT_STAT_BITS_BIN;
> +
> + gather_all_stats(src, len, , search_only);
>   if (!will_convert_lf_to_crlf(len, , crlf_action))
>   return 0;

This special case to decide whether we would limit the search_only
flag has too much intimate knowledge of what happens inside
will_convert_lf_to_crlf().  It knows that output_eol(crlf_action)
not being EOL_CRLF is the very first thing the function checks, too.

Makes one wonder if the check for output_eol(crlf_action) can be
removed from will_convert_lf_to_crlf(), no?  It is not apparent if
that is a good idea for the other caller in crlf_to_git().

gather_all_stats() will give up immediately when it sees either
ANY_CR or BIN.  If CR appears before we see any BIN, stat_bits would
not have BITS_BIN even if the buffer may have BIN byte later.  It is
OK because either lonecr or crlf would be non-zero, and
will_convert_lf_to_crlf() would return 0.  If BIN apepars before we
see any CR, neither lonecr nor crlf will become non-zero even if the
buffer may have CR byte later, but again it is OK because
will_convert_lf_to_crlf() will return 0 in that case.

This looks too brittle, even though it is correct.

> @@ -253,7 +300,8 @@ static int crlf_to_git(const char *path, const char *src, 
> size_t len,
>  {
>   struct text_stat stats;
>   char *dst;
> - int convert_crlf_into_lf;
> + int has_crlf_to_convert;
> + unsigned search_only = 0;
>  
>   if (crlf_action == CRLF_BINARY ||
>   (src && !len))
> @@ -266,12 +314,16 @@ static int crlf_to_git(const char *path, const char 
> *src, size_t len,
>   if (!buf && !src)
>   return 1;
>  
> - gather_stats(src, len, );
> + if (crlf_action == CRLF_AUTO || crlf_action == CRLF_AUTO_INPUT || 
> crlf_action == CRLF_AUTO_CRLF)
> + search_only = CONVERT_STAT_BITS_BIN;
> +
> + gather_all_stats(src, len, , search_only);
> +
>   /* Optimization: No CRLF? Nothing to convert, regardless. */
> - convert_crlf_into_lf = !!stats.crlf;
> + has_crlf_to_convert = !!stats.crlf;

The comment here may need to say a lot more, now we do not even
count .crlf in some cases because of "search_only" setting.

>   if (crlf_action == CRLF_AUTO || crlf_action == CRLF_AUTO_INPUT || 
> crlf_action == CRLF_AUTO_CRLF) {

The new "search_only" criteria above was added to match this if
block; it is not as bad as the previous one in crlf_to_worktree()
that knows, and must be kept in sync with, what a separate function
will_convert_lf_to_crlf() does, but still it is horrible for both
maintainability and readability.  Can you devise some mechanism to
ensure that these two if statements will stay in sync?

> - if (convert_is_binary(len, ))
> + if (stats.stat_bits & CONVERT_STAT_BITS_BIN)
>   return 0;

We no longer need the helper function convert_is_binary() and
instead need only STATS_BITS_BIN bit, so the control flow that
reaches this point is obviously correct (assuming that
gather_all_stats() that is limited with search_only option counts
things correctly, that is).

But what happens when we don't return here?  We didn't get the full
stats out of gather_all_stats() and continue.  Let's see what
happens in that case...

>   /*
>* If the file in the index has any CR in it, do not convert.
> @@ -280,24 +332,35 @@ static int 

Re: Expanding Includes in .gitignore

2016-10-27 Thread Jeff King
On Thu, Oct 27, 2016 at 04:55:08PM -0400, Jeff King wrote:

> On Fri, Oct 28, 2016 at 09:28:23AM +1300, Aaron Pelly wrote:
> 
> > >   - we parse possibly-hostile .gitignore files from cloned repositories.
> > > What happens when I include ask to include /etc/passwd? Probably
> > > nothing, but there are setups where it might matter (e.g., something
> > > like Travis that auto-builds untrusted repositories, and you could
> > > potentially leak the contents of files via error messages). It's
> > > nice to avoid the issue entirely.
> > 
> > I understand the issue.
> > 
> > It's not obvious to me how using a .d solves this problem though.
> 
> It doesn't by itself. But we are worried only about tracked .gitignore
> files (recall that even repo-level files in $GIT_DIR/info are generated
> fresh by the clone process, and don't come from the remote). If we apply
> the feature only to core.excludeFile and $GIT_DIR/info/exclude, those
> are already under the user's control.
> 
> It's true that we could make a similar exception for an "include"
> feature, and respect include directives only in those "safe" files.
> Somehow that seems more confusing to me, though, than doing adding the
> feature at the file level, as it introduces slightly varying syntax
> between the locations.

Actually, I suppose even a ".gitignore.d" inside the repo solves that
problem, too, because the repository is not specifying paths, only
content (it can specify symlinks outside the repository, but like other
parts of git, we should be careful not to follow them in that case).

However, as I said elsewhere, I'm not convinced this feature is all that
helpful for in-repository .gitignore files, and I think it does
introduce compatibility complications. People with older git will not
respect your .gitignore.d files. Whereas $GIT_DIR/info is purely a local
matter.

But perhaps there is a use case I'm missing.

-Peff


Re: Expanding Includes in .gitignore

2016-10-27 Thread Jeff King
On Thu, Oct 27, 2016 at 12:48:34PM -0700, Jacob Keller wrote:

> > I think the normal behavior in such "foo.d" directory is to just sort
> > the contents lexically and read them in order, as if they were all
> > concatenated together, and with no recursion. I.e., behave "as if" the
> > user had run "cat $dir/*".
> 
> Yea, this is the normal behavior, and the user is expected to order
> their files lexically such as "00-name", "50-name" and so on. Pretty
> traditional for a lot of newer configurations.

One thing I will say about this approach is that you can implement it
without any changes in git by doing:

  path=.git/info/exclude
  cat $path.d/* >$path

and I have seen several config mechanisms basically do that (e.g.,
Debian packaging for a program that doesn't have its own ".d" mechanism,
but needs to grab config provided by several separate packages).

The reason to teach that trick to git is convenience; you don't have to
remember to build the master file from its parts because it's done
dynamically whenever git needs to look at it.

> One thing to keep in mind would be that we should make sure we can
> handle the .gitignore being a submodule or a git repository, so that
> users could just do something like

I'm not convinced this is needed for in-repo .gitignore files. The point
is that you are pulling together separate files that may be administered
independently. But git repositories inherently have a whole-project
view. I'm not sure that separate files buy you a lot there. And the
compatibility issues are more complicated.

I do agree that:

  cd .git/info
  git clone /my/exclude/repo exclude ;# or exclude.d

should work; ignoring dotfiles when reading the directory solves that,
and is a pretty standard solution.

-Peff


Re: Expanding Includes in .gitignore

2016-10-27 Thread Aaron Pelly
On 28/10/16 08:48, Jacob Keller wrote:
> I would strongly prefer rc.d style directories either with a "if the
> .gitignore is a directory treat it like rc.d" or even "add support for
> .gitignore.d as well as .gitignore"

I think adding .gitignore.d shouldn't break existing systems, is
intuitive, and solves my issue.

Does git know when it is in a repo that is too new to comprehend?

My current thinking is that anywhere a .gitignore can go, so can a
.gitignore.d (named appropriately of course.) Any existing .gitignore
should take precedence to the result of parsing the directory.

I haven't looked at the implementation of precedence yet, but I'd be
surprised if the existing mechanism can't be employed.

> One thing to keep in mind would be that we should make sure we can
> handle the .gitignore being a submodule or a git repository, so that
> users could just do something like
> 
> "git submodule add  .gitignore and then track git ignore
> contents from a repository in a nice way.
> 
> By this I mean that the reading of files in .gitignore directory
> should exclude reading .git or other hidden files in some documented
> manor so as to avoid problems when linking to a git directory for its
> contents.

Nice! I like this a lot.



Re: [PATCH 17/36] attr: expose validity check for attribute names

2016-10-27 Thread Stefan Beller
On Mon, Oct 24, 2016 at 2:07 PM, Stefan Beller  wrote:
> On Sun, Oct 23, 2016 at 8:07 AM, Ramsay Jones
>  wrote:
>>
>>
>> On 23/10/16 00:32, Stefan Beller wrote:
>>> From: Junio C Hamano 
>>>
>>> Export attr_name_valid() function, and a helper function that
>>> returns the message to be given when a given  pair
>>> is not a good name for an attribute.
>>>
>>> We could later update the message to exactly spell out what the
>>> rules for a good attribute name are, etc.
>>>
>>> Signed-off-by: Junio C Hamano 
>>> Signed-off-by: Stefan Beller 
>>> ---
>>
>> [snip]
>>
>>> +extern int attr_name_valid(const char *name, size_t namelen);
>>> +extern void invalid_attr_name_message(struct strbuf *, const char *, int);
>>> +
>>
>> The symbol 'attr_name_valid()' is not used outside of attr.c, even
>> by the end of this series. Do you expect this function to be used
>> in any future series? (The export is deliberate and it certainly
>> seems like it should be part of the public interface, but ...)
>>

refactoring this series again, I will make use of attr_name_valid
outside of attr.c, so I'll keep this patch.


Re: Expanding Includes in .gitignore

2016-10-27 Thread Jeff King
On Fri, Oct 28, 2016 at 09:28:23AM +1300, Aaron Pelly wrote:

> >   - we parse possibly-hostile .gitignore files from cloned repositories.
> > What happens when I include ask to include /etc/passwd? Probably
> > nothing, but there are setups where it might matter (e.g., something
> > like Travis that auto-builds untrusted repositories, and you could
> > potentially leak the contents of files via error messages). It's
> > nice to avoid the issue entirely.
> 
> I understand the issue.
> 
> It's not obvious to me how using a .d solves this problem though.

It doesn't by itself. But we are worried only about tracked .gitignore
files (recall that even repo-level files in $GIT_DIR/info are generated
fresh by the clone process, and don't come from the remote). If we apply
the feature only to core.excludeFile and $GIT_DIR/info/exclude, those
are already under the user's control.

It's true that we could make a similar exception for an "include"
feature, and respect include directives only in those "safe" files.
Somehow that seems more confusing to me, though, than doing adding the
feature at the file level, as it introduces slightly varying syntax
between the locations.

> > Whereas letting any of the user- or repo-level exclude files be a
> > directory, and simply reading all of the files inside, seems simple and
> > obvious.
> 
> Apart from backwards compatibility, unless there's something I'm missing.

I'm not sure what you mean. If we make:

  mkdir .git/info/exclude
  echo whatever >.git/info/exclude/local

work, I don't think we have to care about backwards compatibility. That
was nonsensical before, and never did anything useful (so somebody might
have done it, but we can assume anybody relying on it not to read the
contents is crazy).

> > It
> > can't handle all cases (some items in "00foo" want precedence over "01bar"
> > and vice versa), but I don't think there's an easy solution. That's a
> > good sign that one or more of the files should be broken up.
> 
> I've been burned by this myself by packages interfering with each other
> in /etc/sysctl.d
> 
> Could we put this down to caveat emptor? I think this sorting should be
> intuitive to most people these days, and simple to document and comprehend.

Yeah, I think caveat emptor is fine there. Sorry if I implied otherwise.

-Peff


Re: [PATCH] git-svn: do not reuse caches memoized for a different architecture

2016-10-27 Thread Eric Wong
Junio C Hamano  wrote:
> Just peeking from the sideline, but the your squash looks like an
> improvement to me.

Thanks.

> Hopefully the final version after your interaction with Dscho can
> come to me via another "pull this now"?

Not sure if I'll be online the next few days,
but I've preeptively pushed the patch + squash to my repo:

The following changes since commit 2cc2e70264e0fcba04f9ef791d144bbc8b501206:

  Eleventh batch for 2.11 (2016-10-26 13:28:47 -0700)

are available in the git repository at:

  git://bogomips.org/git-svn.git svn-cache

for you to fetch changes up to a2c761ce5b7a5fd8b505b036f3509a9e6617dee8:

  git-svn: do not reuse caches memoized for a different architecture 
(2016-10-27 20:17:36 +)


Gavin Lambert (1):
  git-svn: do not reuse caches memoized for a different architecture

 perl/Git/SVN.pm | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)


Re: Drastic jump in the time required for the test suite

2016-10-27 Thread Eric Wong
Johannes Schindelin  wrote:
> I know you are a fan of testing things thoroughly in the test suite, but I
> have to say that it is getting out of hand, in particular due to our
> over-use of shell script idioms (which really only run fast on Linux, not
> a good idea for a portable software).

How much effort would it take to optimize a /bin/sh?

Would replacing uses of fork+execve posix_spawn be fast and
portable enough?

Even on Linux, performance sucks for me.  I've been hoping dash
can use posix_spawn (or using vfork directly) to see if that can
help things.

That won't help with subshells, though...

(I'm back to using a Centrino laptop from 2005)


Re: Expanding Includes in .gitignore

2016-10-27 Thread Aaron Pelly
On 27/10/16 23:50, Jeff King wrote:
> I'd shy away from an actual include directive, as it raises a lot of
> complications:

I'm leaning that way now too.

>   - we parse possibly-hostile .gitignore files from cloned repositories.
> What happens when I include ask to include /etc/passwd? Probably
> nothing, but there are setups where it might matter (e.g., something
> like Travis that auto-builds untrusted repositories, and you could
> potentially leak the contents of files via error messages). It's
> nice to avoid the issue entirely.

I understand the issue.

It's not obvious to me how using a .d solves this problem though.

>   - finding a backwards-compatible syntax

using .d directories solves this nicely in my opinion

> Whereas letting any of the user- or repo-level exclude files be a
> directory, and simply reading all of the files inside, seems simple and
> obvious.

Apart from backwards compatibility, unless there's something I'm missing.

> If you go that route, it probably makes sense to teach
> gitattributes the same trick.

Understood. I'll keep that in mind.

>> In the case of a directory the plan would be to add links to files
>> stored/sourced elsewhere. This does pose a precedence question which I
>> haven't thought about yet, but probably makes it too hard for the
>> limited value it brings.
> 
> I think the normal behavior in such "foo.d" directory is to just sort
> the contents lexically and read them in order, as if they were all
> concatenated together, and with no recursion. I.e., behave "as if" the
> user had run "cat $dir/*".
> 
> That lets you handle precedence via the filenames (or symlink names).

That was my thinking at first, but I didn't want to bias the discussion.

> It
> can't handle all cases (some items in "00foo" want precedence over "01bar"
> and vice versa), but I don't think there's an easy solution. That's a
> good sign that one or more of the files should be broken up.

I've been burned by this myself by packages interfering with each other
in /etc/sysctl.d

Could we put this down to caveat emptor? I think this sorting should be
intuitive to most people these days, and simple to document and comprehend.





Re: [PATCH v15 01/27] bisect--helper: use OPT_CMDMODE instead of OPT_BOOL

2016-10-27 Thread Christian Couder
On Thu, Oct 27, 2016 at 6:59 PM, Junio C Hamano  wrote:
> Cc'ed those who touched either "git-bisect.sh" or "builtin/bisect-helper.c"
> in our relatively recent past.
>
> Does any of you (and others on the list) have time and inclination
> to review this series?

As part of my mentoring Pranit for the GSoC I already took a look at
those patches some months ago on GitHub.

I could take another look at them but my eyes will not be fresh
anymore, so I don know if it will be valuable.


Re: [PATCH] compat: Allow static initializer for pthreads on Windows

2016-10-27 Thread Johannes Sixt

Am 27.10.2016 um 21:08 schrieb Stefan Beller:

On Thu, Oct 27, 2016 at 11:49 AM, Junio C Hamano  wrote:

Johannes Sixt  writes:


Am 27.10.2016 um 19:01 schrieb Stefan Beller:
...
It is not possible to mark a mutex uninitialized on Windows without an
extra piece of data. A solution would become quite complicated quite
quickly, and at the cost of additional operations that are in the same
ballpark as an uncontended mutex. I'm not enthused.


The positive aspect of this way the patch proposes would be that any
future contributor not knowing the details of how to do mutexes right
on Windows, would not totally break the whole system, i.e. this seems
to be more maintainable in the future as it reduces the friction between
pthreads mutexes and the way we can do things in Git in a platform
independent way


This is a non-argument. Coders have to know their tools.


Windows is not my tool.


The tool I meant is pthreads. For example, you can't have a 
pthread_mutex_t variable and not initialize it with either 
PTHREAD_MUTEX_INITIALIZER or pthread_mutex_init.



The codebase should strive to give coders a coherent abstraction
that can be implemented efficiently on platforms, so that coders do
not have to care too deeply about quirks that exist on individual
platforms.


Currently working as a coder I care about "submodules, that work on
linux." I do not care about Windows in the big picture.


I don't know what you meant to say with this sentence, but taken 
literally, are you on the right ship here, I have to ask?



I am however
willing to go an extra step to not break Windows.


"Not break Windows" is equivalent to "make it work on Windows", mind 
you. We can't have a new feature only on Linux when there is no reason 
not to have it on Windows as well. Sorry, "Git is for Unix only" is long 
over.



However that requires
a little bit of effort from both me and you:
* I need to be aware of what I cannot do with "not-my-tools". (So please
  somebody tell me, also in the future when I break obscure platforms. Mind
  that I don't equate obscure with not widely used or in any other way negative.
  It's just that people working on linux find some quirks on Windows
not "obvious")


That goes without saying.


* the workaround should not be too time consuming in the bigger picture,


That, however, is wishful thinking. If you want to have a feature 
dearly, you have to make it work, and that may take its time. I'm not 
saying that *you* have to make it work (there are platform experts 
around to lend a hand), but just an extra step to "not break" an 
important platform is not enough.



  which is why I propose to make the API a bit clearer by emulating posix
  mutexes harder. (From a Windows POV this might sound like making it
  more obscure because posix mutexes itself are obscure.)


So, when you think that POSIX mutexes are obscure, why don't we settle 
on the simpler Windows critical sections? Emulating them with pthreads 
should be child's play.



The implementation under discussion (well we did not discuss the
implementation a
whole lot yet) ...


There's not a whole lot to discuss: it must be rewritten from scratch 
(it's not just the memory barriers, it is everything else, too). But 
time is much better spent on an attr_start() solution.


-- Hannes



Re: [PATCH v2 1/2] read-cache: factor out get_sha1_from_index() helper

2016-10-27 Thread Junio C Hamano
tbo...@web.de writes:

> From: Torsten Bögershausen 
>
> Factor out the retrieval of the sha1 for a given path in
> read_blob_data_from_index() into the function get_sha1_from_index().
>
> This will be used in the next commit, when convert.c can do the
> analyze for "text=auto" without slurping the whole blob into memory
> at once.
>
> Add a wrapper definition get_sha1_from_cache().
>
> Signed-off-by: Torsten Bögershausen 
> ---
>  cache.h  |  3 +++
>  read-cache.c | 29 ++---
>  2 files changed, 21 insertions(+), 11 deletions(-)
>
> diff --git a/cache.h b/cache.h
> index 1604e29..04de209 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -380,6 +380,7 @@ extern void free_name_hash(struct index_state *istate);
>  #define unmerge_cache_entry_at(at) unmerge_index_entry_at(_index, at)
>  #define unmerge_cache(pathspec) unmerge_index(_index, pathspec)
>  #define read_blob_data_from_cache(path, sz) 
> read_blob_data_from_index(_index, (path), (sz))
> +#define get_sha1_from_cache(path)  get_sha1_from_index (_index, (path))

Should have caught this earlier, but there is an extra SP after "from_index"
which I'll remove (the topic is not in 'next' yet, lucky us).

I re-read this to ensure that it does not break read_blob_data_from_index()
the new function borrows the logic from.

Thanks.


Re: Expanding Includes in .gitignore

2016-10-27 Thread Jacob Keller
On Thu, Oct 27, 2016 at 3:50 AM, Jeff King  wrote:
> On Thu, Oct 27, 2016 at 01:22:43PM +1300, Aaron Pelly wrote:
>
>> The use case for this is where I did not write my own rules, but I want
>> to keep them updated. https://github.com/github/gitignore is a damn good
>> resource, but I want to pull it and include relevant bits project by
>> project and/or system wide. I don't want to have to update many projects
>> manually if that, or any other, repo changes.
>
> That seems like a reasonable thing to want.
>

Agreed, this seems useful to me.

>> A very brief look at dir.c would indicate that a recursive call from
>> add_excludes to itself when it parses some sort of include tag would do
>> it within a file. I'm sure it'd be pretty straight forward to hook into
>> something in dir.c to parse directories too.
>>
>> I'm thinking something like ". path/to/include/file" in an ignore file,
>> and/or creating .gitignore.d and/or allowing $HOME/.config/git/ignore
>> and $GIT_DIR/info/exclude to be directories. Or some sane and consistent
>> mixture of these things.
>
> I'd shy away from an actual include directive, as it raises a lot of
> complications:
>
>   - as you noted, cycles in the include graph need to be detected and
> broken
>
>   - we parse possibly-hostile .gitignore files from cloned repositories.
> What happens when I include ask to include /etc/passwd? Probably
> nothing, but there are setups where it might matter (e.g., something
> like Travis that auto-builds untrusted repositories, and you could
> potentially leak the contents of files via error messages). It's
> nice to avoid the issue entirely.
>
>   - finding a backwards-compatible syntax
>
> Whereas letting any of the user- or repo-level exclude files be a
> directory, and simply reading all of the files inside, seems simple and
> obvious. If you go that route, it probably makes sense to teach
> gitattributes the same trick.

Yep that would be easier and simpler.

>
>> In the case of a directory the plan would be to add links to files
>> stored/sourced elsewhere. This does pose a precedence question which I
>> haven't thought about yet, but probably makes it too hard for the
>> limited value it brings.
>
> I think the normal behavior in such "foo.d" directory is to just sort
> the contents lexically and read them in order, as if they were all
> concatenated together, and with no recursion. I.e., behave "as if" the
> user had run "cat $dir/*".

Yea, this is the normal behavior, and the user is expected to order
their files lexically such as "00-name", "50-name" and so on. Pretty
traditional for a lot of newer configurations.

>
> That lets you handle precedence via the filenames (or symlink names). It
> can't handle all cases (some items in "00foo" want precedence over "01bar"
> and vice versa), but I don't think there's an easy solution. That's a
> good sign that one or more of the files should be broken up.
>

Yea if you have these inter-dependent relationships it can't be
expressed, but then neither can it really be expressed in the simple
"include" case above.

I would strongly prefer rc.d style directories either with a "if the
.gitignore is a directory treat it like rc.d" or even "add support for
.gitignore.d as well as .gitignore"

One thing to keep in mind would be that we should make sure we can
handle the .gitignore being a submodule or a git repository, so that
users could just do something like

"git submodule add  .gitignore and then track git ignore
contents from a repository in a nice way.

By this I mean that the reading of files in .gitignore directory
should exclude reading .git or other hidden files in some documented
manor so as to avoid problems when linking to a git directory for its
contents.

Thanks,
Jake

> -Peff


Re: [PATCH] git-svn: do not reuse caches memoized for a different architecture

2016-10-27 Thread Junio C Hamano
Eric Wong  writes:

> Johannes Schindelin  wrote:
>> +++ b/perl/Git/SVN.pm
>> @@ -1658,6 +1658,11 @@ sub tie_for_persistent_memoization {
>>  if ($memo_backend > 0) {
>>  tie %$hash => 'Git::SVN::Memoize::YAML', "$path.yaml";
>>  } else {
>> +# first verify that any existing file can actually be loaded
>> +# (it may have been saved by an incompatible version)
>> +if (-e "$path.db") {
>> +unlink "$path.db" unless eval { retrieve("$path.db"); 1 
>> };
>> +}
>
> That retrieve() call is unlikely to work without "use Storable"
> to import it into the current package.
>
> I also favor setting "$path.db" once to detect typos and avoid
> going over 80 columns.  Additionally, having error-checking for
> unlink might be useful.
>
> So perhaps squashing this on top:
>
> diff --git a/perl/Git/SVN.pm b/perl/Git/SVN.pm
> index 025c894..b3c1460 100644
> --- a/perl/Git/SVN.pm
> +++ b/perl/Git/SVN.pm
> @@ -1660,10 +1660,15 @@ sub tie_for_persistent_memoization {
>   } else {
>   # first verify that any existing file can actually be loaded
>   # (it may have been saved by an incompatible version)
> - if (-e "$path.db") {
> - unlink "$path.db" unless eval { retrieve("$path.db"); 1 
> };
> + my $db = "$path.db";
> + if (-e $db) {
> + use Storable qw(retrieve);
> +
> + if (!eval { retrieve($db); 1 }) {
> + unlink $db or die "unlink $db failed: $!";
> + }
>   }
> - tie %$hash => 'Memoize::Storable', "$path.db", 'nstore';
> + tie %$hash => 'Memoize::Storable', $db, 'nstore';
>   }
>  }
>  
>
> Thoughts?  Thanks.

Just peeking from the sideline, but the your squash looks like an
improvement to me.

Hopefully the final version after your interaction with Dscho can
come to me via another "pull this now"?

Thanks.


Re: [PATCH] compat: Allow static initializer for pthreads on Windows

2016-10-27 Thread Stefan Beller
On Thu, Oct 27, 2016 at 11:49 AM, Junio C Hamano  wrote:
> Johannes Sixt  writes:
>
>> Am 27.10.2016 um 19:01 schrieb Stefan Beller:
>> ...
>> It is not possible to mark a mutex uninitialized on Windows without an
>> extra piece of data. A solution would become quite complicated quite
>> quickly, and at the cost of additional operations that are in the same
>> ballpark as an uncontended mutex. I'm not enthused.
>>
>>> The positive aspect of this way the patch proposes would be that any
>>> future contributor not knowing the details of how to do mutexes right
>>> on Windows, would not totally break the whole system, i.e. this seems
>>> to be more maintainable in the future as it reduces the friction between
>>> pthreads mutexes and the way we can do things in Git in a platform
>>> independent way
>>
>> This is a non-argument. Coders have to know their tools.

Windows is not my tool.

>
> The codebase should strive to give coders a coherent abstraction
> that can be implemented efficiently on platforms, so that coders do
> not have to care too deeply about quirks that exist on individual
> platforms.

Currently working as a coder I care about "submodules, that work on
linux." I do not care about Windows in the big picture. I am however
willing to go an extra step to not break Windows. However that requires
a little bit of effort from both me and you:
* I need to be aware of what I cannot do with "not-my-tools". (So please
  somebody tell me, also in the future when I break obscure platforms. Mind
  that I don't equate obscure with not widely used or in any other way negative.
  It's just that people working on linux find some quirks on Windows
not "obvious")
* the workaround should not be too time consuming in the bigger picture,
  which is why I propose to make the API a bit clearer by emulating posix
  mutexes harder. (From a Windows POV this might sound like making it
  more obscure because posix mutexes itself are obscure.)

>
> It is OK to argue that the particular solution Stefan lifted from
> somewhere (perhaps msdn article he cited???)

A stack overflow article that I found with my search engine of choice, because
I could not believe that Windows cannot have statically initialized mutexes.

> does not qualify as
> such an abstraction.

The implementation under discussion (well we did not discuss the
implementation a
whole lot yet) may even contain an error as the first memory barrier
needs to be in front
of the first condition.

>  But I do not agree with your "Coders have to
> know" as a blanket statement.

Well I do to some extent, I just disagree what my tools are.


Re: [PATCH] compat: Allow static initializer for pthreads on Windows

2016-10-27 Thread Junio C Hamano
Johannes Sixt  writes:

> Am 27.10.2016 um 19:01 schrieb Stefan Beller:
> ...
> It is not possible to mark a mutex uninitialized on Windows without an
> extra piece of data. A solution would become quite complicated quite
> quickly, and at the cost of additional operations that are in the same
> ballpark as an uncontended mutex. I'm not enthused.
>
>> The positive aspect of this way the patch proposes would be that any
>> future contributor not knowing the details of how to do mutexes right
>> on Windows, would not totally break the whole system, i.e. this seems
>> to be more maintainable in the future as it reduces the friction between
>> pthreads mutexes and the way we can do things in Git in a platform
>> independent way
>
> This is a non-argument. Coders have to know their tools.

The codebase should strive to give coders a coherent abstraction
that can be implemented efficiently on platforms, so that coders do
not have to care too deeply about quirks that exist on individual
platforms.

It is OK to argue that the particular solution Stefan lifted from
somewhere (perhaps msdn article he cited???) does not qualify as
such an abstraction.  But I do not agree with your "Coders have to
know" as a blanket statement.


Re: [PATCH] Update git rebase documentation to clarify HEAD behavior

2016-10-27 Thread Philip Oakley

From: "Junio C Hamano" 

Cody Sehl  writes:

The first few paragraphs in the git-rebase.txt documentation lay out the 
steps git takes during a rebase:

1. everything from `..HEAD` is saved to a temporary area
2. `HEAD` is set to ``
3. the changes held in the temporary area are applied one by one in order 
on top of the new `HEAD`


The second step was described using the phrase `The current branch is 
reset to `, which is true (because `HEAD` == current branch), 
but not clear.

---


Please wrap your lines to reasonable lengths like 70 columns or so.
Please sign off your patch.


 Documentation/git-rebase.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
index de222c8..c47ca11 100644
--- a/Documentation/git-rebase.txt
+++ b/Documentation/git-rebase.txt
@@ -33,7 +33,7 @@ of commits that would be shown by `git log 
..HEAD`; or by

 description on `--fork-point` below); or by `git log HEAD`, if the
 `--root` option is specified.

-The current branch is reset to , or  if the
+HEAD is reset to , or  if the
 --onto option was supplied.  This has the exact same effect as
 `git reset --hard ` (or ).  ORIG_HEAD is set
 to point at the tip of the branch before the reset.


This is describing an ancient behaviour before 6fd2f5e60d ("rebase:
operate on a detached HEAD", 2007-11-08) in v1.5.4 timeframe.  We
apparently failed to update the description.

This depends on the desired technical detail of the description, but
a correct rewrite would be "HEAD is detached at , or
, and ORIG_HEAD is set to point at the tip of the branch
before this happens".  Detaching the HEAD at  no longer
has the same effect as "git reset --hard " (or ),
so that sentence must go.  It was the primary point of the ancient
change at 6fd2f5e60d after all.

And then there is a new step (to be numbered 4. in your description
in the proposed log message), which updates the tip of the branch to
the resulting HEAD (after replaying all these changes) and check the
branch out, which needs to be added.  Perhaps after "one by one, in
order."  Oh, the mention of "reapplied to the current branch" also
needs to be updated to "reapplied to the detached HEAD", too.

On the other hand, if we do not aim for that deep level of technical
correctness, but want to tell a white lie to make it easier to
understand at the conceptual level to new readers who haven't
grasped the detached HEAD, then the current description is fine.  By
bringing up "HEAD", you seem to be aiming for techincal correctness
(which I tend to agree is a good direction to go in this part of the
documentation), so the existing text needs a bit more work than your
patch to be brought to the modern world.



A third option is to start with a short "Conceptually, .." description which 
gives that summary overview, and then have a few "In detail, .." paragraphs 
that cover the ORIG_HEAD and Detatched HEAD, etc., that users should at 
least be aware of (so they know they can come back and read the gory details 
when they need it;-)

--
Philip 



Re: [PATCH] Documentation/git-diff: document git diff with 3+ commits

2016-10-27 Thread Junio C Hamano
Michael J Gruber  writes:

> It did not exist, but even at that point in time, "git log A..B" listed
> only commits between the merge base and B, not those which are only in A
> and not in B. Whereas "git diff A B" shows the differences between the
> endpoints A and B.

I think you are speaking with 20/20 hindsight at this point.  

Do you genuinely think you would have thought that way when there
weren't any concept of "merge-base" or "git log A...B" known to you?

>>> @@ -12,6 +12,7 @@ SYNOPSIS
>>>  'git diff' [options] [] [--] [...]
>>>  'git diff' [options] --cached [] [--] [...]
>>>  'git diff' [options]   [--] [...]
>>> +'git diff' [options][...]
>> 
>> Made me wonder "is [...] 0-or-more As or 1-or-more As?".
>
> 0-or-more, at least that's the way it is used in all lines here.
>
>> Don't we allow pathspecs in this case?
>
> Yes, the combinded diff mode kicks in only with no blobs (not: no
> pathspec, which I had misread) and N>=3 commits. Maybe I should update
> the code comments in builtin/diff.c to describe this, too.

I don't know about the code comments, but the above SYNOPSIS needs
an update, I would think.  That is why I wondered if [...] were
0-or-more or 1-or-moer.

You can replace

>>>  'git diff' [options]   [--] [...]

with

>>>  'git diff' [options]   [...] [--] [...]

without adding a new line to cover the new(ly discovered) case.



Re: [PATCH 32/36] pathspec: allow querying for attributes

2016-10-27 Thread Junio C Hamano
Stefan Beller  writes:

> The pathspec mechanism is extended via the new
> ":(attr:eol=input)pattern/to/match" syntax to filter paths so that it
> requires paths to not just match the given pattern but also have the
> specified attrs attached for them to be chosen.

I was looking at preload-index.c and its history again, because I
wanted to ensure that Lars's process filter stuff introduces no
unexpected interactions with getting multi-threaded, similar to the
problem we had in the earlier incarnations of this step, which we
worked around with 
"pathspec: disable preload-index when attribute pathspec magic is in
use".

I think that Lars's series is safe, because the only thing among
what preload-index.c::preload_thread() does that can go outside the
simple stat data and pattern matching with the pathname is the
racy-git check in ie_match_stat(), and the object store access in
that call was explicitly disabled by 7c4ea599b0 ("Fix index
preloading for racy dirty case", 2008-11-17) long time ago.

By passing CE_MATCH_RACY_IS_DIRTY option to the call, this caller
effectively says "A cache entry whose stat data matches may be
actually dirty when the timestamp is racy, in which case we usually
compare data to determine if it really is clean, but it is OK to err
on the safe and lazy side by declaring it dirty and not marking it
up-to-date while we are preloading.  Do not bother to go to the
object store".

The reason why I am bringing this up in this discussion thread on
this patch is because I wonder if we would benefit by a similar
"let's not do too involved things and be cheap by erring on the safe
and lazy side" strategy in the call to ce_path_match() call made in
this function to avoid making calls to the attr subsystem.

In other words, would it help the system by either simplifying the
processing done or reducing the cycle spent in preload_thread() if
we could tell ce_path_match() "A pathspec we are checking may
require not just the pattern to match but also attributes given to
the path to satisfy the criteria, but for the purpose of preloading,
pretend that the attribute satisfies the match criteria" (or
"pretend that it does not match"), thereby not having to make any
call into the attribute subsystem at all from this codepath?

The strategy this round takes to make it unnecessary to punt
preloading (i.e. dropping "pathspec: disable preload-index when
attribute pathspec magic is in use" patch the old series had) is to
make the attribute subsystem thread-safe.  But that thread-safety in
the initial round is based on a single Big Attribute Lock, so it may
turn out that the end result performs better for this codepath if we
did not to make any call into the attribute subsystem.

I am probably being two step ahead of ourselves by saying the above,
which is just something to keep in mind as a possible solution if
performance in this preload codepath becomes an issue when the
pathspec has attributes match specified.


Re: [PATCH] compat: Allow static initializer for pthreads on Windows

2016-10-27 Thread Johannes Sixt

Am 27.10.2016 um 19:01 schrieb Stefan Beller:

Johannes Sixt  writes:
This is the pessimization that I am talking about. I would not mind at all if
it were only for the attribute subsystem, but the proposed patch would
pessimize *all* uses of pthread_mutex_lock.


It would only pessimize *uninitialized* mutexes? For initialized mutexes
the added burden is super cheap (one additional condition).


It is not possible to mark a mutex uninitialized on Windows without an 
extra piece of data. A solution would become quite complicated quite 
quickly, and at the cost of additional operations that are in the same 
ballpark as an uncontended mutex. I'm not enthused.



The positive aspect of this way the patch proposes would be that any
future contributor not knowing the details of how to do mutexes right
on Windows, would not totally break the whole system, i.e. this seems
to be more maintainable in the future as it reduces the friction between
pthreads mutexes and the way we can do things in Git in a platform
independent way


This is a non-argument. Coders have to know their tools.

-- Hannes



Re: [PATCH] git-compat-util: move content inside ifdef/endif guards

2016-10-27 Thread Junio C Hamano
Jeff King  writes:

> Commit 3f2e2297b9 (add an extra level of indirection to
> main(), 2016-07-01) added a declaration to git-compat-util.h,
> but it was accidentally placed after the final #endif that
> guards against multiple inclusions.
>
> This doesn't have any actual impact on the code, since it's
> not incorrect to repeat a function declaration in C. But
> it's a bad habit, and makes it more likely for somebody else
> to make the same mistake. It also defeats gcc's optimization
> to avoid opening header files whose contents are completely
> guarded.
>
> Signed-off-by: Jeff King 
> ---
> I just happened to notice this today while doing something unrelated in
> the file.

Thanks.

>
>  git-compat-util.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/git-compat-util.h b/git-compat-util.h
> index 91e366d1dd..771ea29f31 100644
> --- a/git-compat-util.h
> +++ b/git-compat-util.h
> @@ -1042,6 +1042,6 @@ struct tm *git_gmtime_r(const time_t *, struct tm *);
>  #define getc_unlocked(fh) getc(fh)
>  #endif
>  
> -#endif
> -
>  extern int cmd_main(int, const char **);
> +
> +#endif


[PATCH] git-compat-util: move content inside ifdef/endif guards

2016-10-27 Thread Jeff King
Commit 3f2e2297b9 (add an extra level of indirection to
main(), 2016-07-01) added a declaration to git-compat-util.h,
but it was accidentally placed after the final #endif that
guards against multiple inclusions.

This doesn't have any actual impact on the code, since it's
not incorrect to repeat a function declaration in C. But
it's a bad habit, and makes it more likely for somebody else
to make the same mistake. It also defeats gcc's optimization
to avoid opening header files whose contents are completely
guarded.

Signed-off-by: Jeff King 
---
I just happened to notice this today while doing something unrelated in
the file.

 git-compat-util.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/git-compat-util.h b/git-compat-util.h
index 91e366d1dd..771ea29f31 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -1042,6 +1042,6 @@ struct tm *git_gmtime_r(const time_t *, struct tm *);
 #define getc_unlocked(fh) getc(fh)
 #endif
 
-#endif
-
 extern int cmd_main(int, const char **);
+
+#endif
-- 
2.10.1.916.g0d2035c


Re: [PATCH] compat: Allow static initializer for pthreads on Windows

2016-10-27 Thread Jeff King
On Thu, Oct 27, 2016 at 10:01:02AM -0700, Stefan Beller wrote:

> >  That function can
> > be called from main() of platforms that cannot statically initialize
> > mutices,
> 
> By main you mean the main() in common-main.c or cmd_main in git.c ?

Any setup or initialization for library functions needs to go in main()
of common-main.c.  Otherwise, only builtins get it, and external
programs are left to segfault (or whatever).

> Those both look like the wrong place. Of course it would work adding it
> there, but it smells like a maintenance nightmare.

I agree it's ugly, but I suspect it would work OK in practice.
We don't have that many mutexes and initializing them is cheap.

Languages like C++ have non-constant initializers, and the compiler
takes care of running the startup code before main() begins. There's no
portable way to do that in C, but our cmd_main() is roughly the same
thing.

I still prefer the lazy initialization if it can work without incurring
measurable overhead. That's the normal way to do things in C; the only
complication here is the thread safety.

-Peff


Re: [PATCH v4 00/14] Mark strings in Perl scripts for translation

2016-10-27 Thread Junio C Hamano
> Vasco Almeida (14):
>   i18n: add--interactive: mark strings for translation
>   i18n: add--interactive: mark simple here-documents for translation
>   i18n: add--interactive: mark strings with interpolation for
> translation
>   i18n: clean.c: match string with git-add--interactive.perl
>   i18n: add--interactive: mark plural strings
>   i18n: add--interactive: mark patch prompt for translation
>   i18n: add--interactive: i18n of help_patch_cmd
>   i18n: add--interactive: mark edit_hunk_manually message for
> translation
>   i18n: add--interactive: remove %patch_modes entries
>   i18n: add--interactive: mark status words for translation
>   i18n: send-email: mark strings for translation
>   i18n: send-email: mark warnings and errors for translation
>   i18n: send-email: mark string with interpolation for translation
>   i18n: difftool: mark warnings for translation

It seems that nobody has anything to say on this series, either
positive or negative?

The topic has been in "Waiting for review" pile but I'll mark it as
"Will merge to 'next'" unless I hear otherwise within a few days
(positives may make it faster, negatives may require a reroll).

Thanks.




Re: [PATCH v2 0/2] Stream and fast search

2016-10-27 Thread Junio C Hamano
Cc'ed those who touched convert.c or read-cache.c in our relatively
recent past with a change that affects the eol conversion codepath.

Does any of you (and others on the list) have time and inclination
to review this series?

Thanks.


Re: [PATCH] compat: Allow static initializer for pthreads on Windows

2016-10-27 Thread Stefan Beller
> Johannes Sixt  writes:
> This is the pessimization that I am talking about. I would not mind at all if
> it were only for the attribute subsystem, but the proposed patch would
> pessimize *all* uses of pthread_mutex_lock.

It would only pessimize *uninitialized* mutexes? For initialized mutexes
the added burden is super cheap (one additional condition).

The positive aspect of this way the patch proposes would be that any
future contributor not knowing the details of how to do mutexes right
on Windows, would not totally break the whole system, i.e. this seems
to be more maintainable in the future as it reduces the friction between
pthreads mutexes and the way we can do things in Git in a platform
independent way

On Wed, Oct 26, 2016 at 11:33 PM, Junio C Hamano  wrote:
>>
>> Lazy on-demand initialization as needed, perhaps?  The on-demand
>> initialization mechanism may become no-op on some platforms that can
>> do static initialization.
>
> Ah, I think I misunderstood your "please rewrite".  Did you mean to
> add "void attr_start(void)" helper function to attr.c that does
> series of pthread_mutex_init() calls as needed?

Well one init for now.

>  That function can
> be called from main() of platforms that cannot statically initialize
> mutices,

By main you mean the main() in common-main.c or cmd_main in git.c ?

Those both look like the wrong place. Of course it would work adding it
there, but it smells like a maintenance nightmare.

And then we would modify the attr_start command depending on the
platform, i.e.

#ifdef WIN32
void attr_start(void)
{
pthread_mutex_init(..);
}
#else
void attr_start(void)
{
/* nothing as it is statically init'd */
}
#endif

> while on other platforms it can be a no-op as long as the
> variables are statically initialized?  If so, that would not pessimize
> any platform, I would think.

I would think this pessimizes ALL platforms from a maintenance perspective
(but what do I know about maintaining stuff as an eager young contributor ;)

So I am willing to go that route, though I do not quite understand where exactly
you'd expect me to put the initializer as all places I can think of
are not the right
place.

Thanks,
Stefan


Re: [PATCH v15 01/27] bisect--helper: use OPT_CMDMODE instead of OPT_BOOL

2016-10-27 Thread Junio C Hamano
Cc'ed those who touched either "git-bisect.sh" or "builtin/bisect-helper.c"
in our relatively recent past.

Does any of you (and others on the list) have time and inclination
to review this series?

Thanks.


Re: [PATCH] reset: --unmerge

2016-10-27 Thread Junio C Hamano
Andreas Schwab  writes:

> On Okt 25 2016, Junio C Hamano  wrote:
>
>> Somebody with a bright idea decided that vc-git-resolve-conflicts
>> variable should be on by default in Emacs 25.1 X-<
>
> This is consistent with the behaviour of the other VC backends, where it
> isn't even customizable.

The problem I had was "M-x save-buffer" (after resolving the
conflicts in it manually) running "git add" on the resulting file,
robbing from "git diff" an opportunity to help the user to see how
the result looks relative to both branches.

Do you mean that VC mode broke the same feature equally other SCMs,
too?  Do other SCM supported by VC backends take advantage of
unmerged stages in the index (until you say "$scm add") by allowing
you to verify the combined diff with "$scm diff"?





Re: [PATCH v1 16/19] read-cache: unlink old sharedindex files

2016-10-27 Thread Junio C Hamano
Duy Nguyen  writes:

> Christian, if we assume to go with Junio's suggestion to disable
> split-index on temporary files, the only files left we have to take
> care of are index and index.lock. I believe pruning here in this code
> will have an advantage over in "git gc --auto" because when this is
> executed, we know we're holding index.lock, so nobody else is updating
> the index, it's race-free.
>
> All we need to do is peek in $GIT_DIR/index
> to see what shared index file it requires and keep it alive too, the
> remaining of shared index files can be deleted safely. We don't even
> need to fall back to mtime.

Yes, that exactly was why I wondered if we can afford to limit
splitting only to the primary index, because it makes things a
lot simpler.

But I suspect that temporary index is where split-index shines most,
e.g. while creating a partial commit.  The mechanism penalizes the
read performance by making the format more complex in order to favor
the write performance, which is very much suited for temporary one
that is read only once after it is written before it gets discarded
(on the other hand, splitting the primary index will penalize reads
that happen a lot more than writes).

While I still find it attractive at the conceptual level to limit
splitting only to the primary index for the resulting simplicity,
I doubt it is a good way to go, as I meant to say in


> git-gc just can't match this because while it's running, somebody else
> may be updating $GIT_DIR/index. Handling races would be a lot harder.

It could attempt to take a lock on the primary index while it runs,
and refrain to do anything if it can't take the lock ("gc --auto"
may want to silently retry), and then the race is no longer an
issue, no?


Re: [PATCH] reset: --unmerge

2016-10-27 Thread Andreas Schwab
On Okt 25 2016, Junio C Hamano  wrote:

> Somebody with a bright idea decided that vc-git-resolve-conflicts
> variable should be on by default in Emacs 25.1 X-<

This is consistent with the behaviour of the other VC backends, where it
isn't even customizable.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: What's cooking in git.git (Oct 2016, #06; Mon, 24)

2016-10-27 Thread Johannes Schindelin
Hi Hannes,

On Tue, 25 Oct 2016, Johannes Sixt wrote:

> Am 25.10.2016 um 20:13 schrieb Stefan Beller:
> > On Tue, Oct 25, 2016 at 10:15 AM, Junio C Hamano  wrote:
> > >  - the "off-by-one fix" part of sb/submodule-ignore-trailing-slash
> > >needs to be in the upcoming release but the "trailing /. in base
> > >should not affect the resolution of ../relative/path" part that
> > >is still under discussion can wait.  Which means we'd need a few
> > >more !MINGW prerequisites in the tests by -rc0.
> > >[...]
> >
> > So maybe instead of adding !MINGW we rather want to apply
> > https://public-inbox.org/git/2908451e-4273-8826-8989-5572263cc...@kdbg.org/
> > instead for now?
> 
> I was about to submit this very patch again, and only then saw your message.
> So, yes, that's what I propose, too.
> 
> Dscho, does this patch fix the test failures that you observed, too?
> Unfortunately, it goes against our endeavor to reduce subshells.

I am fine with the patch, even if I did not have a chance to test it yet
(ran out of time today).

Ciao,
Dscho


Re: Expanding Includes in .gitignore

2016-10-27 Thread Jeff King
On Thu, Oct 27, 2016 at 01:22:43PM +1300, Aaron Pelly wrote:

> The use case for this is where I did not write my own rules, but I want
> to keep them updated. https://github.com/github/gitignore is a damn good
> resource, but I want to pull it and include relevant bits project by
> project and/or system wide. I don't want to have to update many projects
> manually if that, or any other, repo changes.

That seems like a reasonable thing to want.

> A very brief look at dir.c would indicate that a recursive call from
> add_excludes to itself when it parses some sort of include tag would do
> it within a file. I'm sure it'd be pretty straight forward to hook into
> something in dir.c to parse directories too.
> 
> I'm thinking something like ". path/to/include/file" in an ignore file,
> and/or creating .gitignore.d and/or allowing $HOME/.config/git/ignore
> and $GIT_DIR/info/exclude to be directories. Or some sane and consistent
> mixture of these things.

I'd shy away from an actual include directive, as it raises a lot of
complications:

  - as you noted, cycles in the include graph need to be detected and
broken

  - we parse possibly-hostile .gitignore files from cloned repositories.
What happens when I include ask to include /etc/passwd? Probably
nothing, but there are setups where it might matter (e.g., something
like Travis that auto-builds untrusted repositories, and you could
potentially leak the contents of files via error messages). It's
nice to avoid the issue entirely.

  - finding a backwards-compatible syntax

Whereas letting any of the user- or repo-level exclude files be a
directory, and simply reading all of the files inside, seems simple and
obvious. If you go that route, it probably makes sense to teach
gitattributes the same trick.

> In the case of a directory the plan would be to add links to files
> stored/sourced elsewhere. This does pose a precedence question which I
> haven't thought about yet, but probably makes it too hard for the
> limited value it brings.

I think the normal behavior in such "foo.d" directory is to just sort
the contents lexically and read them in order, as if they were all
concatenated together, and with no recursion. I.e., behave "as if" the
user had run "cat $dir/*".

That lets you handle precedence via the filenames (or symlink names). It
can't handle all cases (some items in "00foo" want precedence over "01bar"
and vice versa), but I don't think there's an easy solution. That's a
good sign that one or more of the files should be broken up.

-Peff


Re: [PATCH v1 16/19] read-cache: unlink old sharedindex files

2016-10-27 Thread Duy Nguyen
On Tue, Oct 25, 2016 at 5:43 PM, Duy Nguyen  wrote:
>>  static int write_shared_index(struct index_state *istate,
>> @@ -2211,8 +2269,11 @@ static int write_shared_index(struct index_state 
>> *istate,
>> }
>> ret = rename_tempfile(_sharedindex,
>>   git_path("sharedindex.%s", 
>> sha1_to_hex(si->base->sha1)));
>> -   if (!ret)
>> +   if (!ret) {
>> hashcpy(si->base_sha1, si->base->sha1);
>> +   clean_shared_index_files(sha1_to_hex(si->base->sha1));
>
> This operation is technically garbage collection and should belong to
> "git gc --auto", which is already called automatically in a few
> places. Is it not called often enough that we need to do the cleaning
> up right after a new shared index is created?

Christian, if we assume to go with Junio's suggestion to disable
split-index on temporary files, the only files left we have to take
care of are index and index.lock. I believe pruning here in this code
will have an advantage over in "git gc --auto" because when this is
executed, we know we're holding index.lock, so nobody else is updating
the index, it's race-free. All we need to do is peek in $GIT_DIR/index
to see what shared index file it requires and keep it alive too, the
remaining of shared index files can be deleted safely. We don't even
need to fall back to mtime.

git-gc just can't match this because while it's running, somebody else
may be updating $GIT_DIR/index. Handling races would be a lot harder.
-- 
Duy


Re: Expanding Includes in .gitignore

2016-10-27 Thread Aaron Pelly
On 27/10/16 21:19, Alexei Lozovsky wrote:
>> I'm thinking something like ". path/to/include/file" in an ignore file,
>> and/or creating .gitignore.d and/or allowing $HOME/.config/git/ignore
>> and $GIT_DIR/info/exclude to be directories. Or some sane and consistent
>> mixture of these things.
>
> I think the rc.d-like approach with directories is better as it
> does not add new magical filenames (what if I absolutely do need
> to name my directories ". path", with a space? :)

Yes. Another alternative I thought of was #include path/to/include/file.
That'd be backwards compatible, which is a good thing, but would involve
parsing comments, which obviously could start with #include. Or lines
like ^Include /path/file$ In the case of finding an invalid file,
passing it over and issuing a simple warning should surface any issues
with existing gitignores. Anyway, this conversation is why I bring it up
on the list.

Coming back to this, maybe :(include)/path/file might be more git-like

>> In the case of a directory the plan would be to add links to files
>> stored/sourced elsewhere. This does pose a precedence question which I
>> haven't thought about yet, but probably makes it too hard for the
>> limited value it brings.
>
> Now, if we consider the case of multiple .gitignore files, it
> could be unexpected and possibly annoying for negative patterns
> in one file to affect the patterns added by some other files.

That is a concern. It is non obvious; the worst kind of annoying.

> I would find it more conceptually simple to apply individual
> .gitignores one by one, as opposed to parsing them all and
> creating one giant exclusion rule. (In technical terms, this
> means keeping one struct exclude_list for each .gitignore,
> not merging them all into one single list.)

I agree. I haven't looked, but that sounds like touching significantly
more code though. Actually, thinking about it for 20 seconds more, it
shouldn't be too hard, should it?

> In this case there should be no precendence problems as applied
> gitignores only add new ignored files, without un-ignoring
> anything previously ignored by other files.

Again, I haven't looked yet, but there is still an issue of precedence
with other gitignore files in $HOME and the repo.

> However, if we allow textual inclusion, then it means that we
> can put a gitignore into our gitignore so that we can unignore
> while we ignore, which again brings us the question of whether
> it is actually needed and expected.

Gah! Yes. One way or the other.

>> I would like to know the desirability/practicality/stupidity of such a
>> feature as I believe it is within my skillset to implement it.
>
> However, I do not recall any precendent of git using rc.d-like
> configs.

Many things have adopted this technique recently. Well, the last 15
years. It is common, understood, and fairly simple. I see no issue with it.

> And some can argue that your goal can be achieved by
> generating the .gitignore by some external means and symlinking
> the result into .git/info/exclude, so this is not Git's problem
> and we should not be overcomplicating things with something as
> simple as a list exclude patterns. This line of argument also
> can be used to opposes any textual inclusion as well, because
> it can be expanded into 'why don't we add a Turing-complete
> programming language then to specify the patterns to ignore'.

I know. Another reason to bring the idea to the list. I sort of have
this attitude myself. My main objection to it is that I can't think of a
hook to automate it with.

But: What about some kind of :(exec) that executes a script and returns
a gitignore file? You write the script; you're responsible. And the
behaviour is obvious. I haven't thought that through. It just came to me
then, and might present security issues, but could greatly simplify things.







Re: [PATCH] rebase: add --forget to cleanup rebase, leave HEAD untouched

2016-10-27 Thread Duy Nguyen
On Wed, Oct 26, 2016 at 11:51 PM, Junio C Hamano  wrote:
> Nguyễn Thái Ngọc Duy   writes:
>
>> There are occasions when you decide to abort an in-progress rebase and
>> move on to do something else but you forget to do "git rebase --abort"
>> first. Or the rebase has been in progress for so long you forgot about
>> it. By the time you realize that (e.g. by starting another rebase)
>> it's already too late to retrace your steps. The solution is normally
>>
>> rm -r .git/
>>
>> and continue with your life. But there could be two different
>> directories for  (and it obviously requires some
>> knowledge of how rebase works), and the ".git" part could be much
>> longer if you are not at top-dir, or in a linked worktree. And
>> "rm -r" is very dangerous to do in .git, a mistake in there could
>> destroy object database or other important data.
>>
>> Provide "git rebase --forget" for this exact use case.
>
> Two and a half comments.
>
>  - The title says "leave HEAD untouched".  Are my working tree files
>and my index also safe from this operation, or is HEAD the only
>thing that is protected?

Everything is protected. I will rephrase the title a bit. The option
is basically a safe form of "rm -r .git/rebase-{apply,merge}".

>  - I think I saw a variant of this gotcha for an unconcluded
>cherry-pick that was left behind, which the bash-prompt script
>did not notice but the next "git cherry-pick" did by complaining
>"you are in the middle" or something like that.  Perhaps we would
>want to have a similarly sounding option to help that case, too,
>not in this patch but as another patch on the same theme?

That would be nice. I don't put lots of git info on my shell prompt
though, so it does not help me. And it's probably difficult to report
the right thing too. Sometimes in the middle of rebase I would switch
to another branch, look or do stuff, then "git checkout -" back. I
don't think we can make the prompt script clever enough to see my
intention.

>  - Would it have helped if bash-prompt were in use?  I am not saying
>that this patch becomes unnecessary if you use it; I am trying to
>see if it helps its users by reminding them what state they are
>in.

Since I don't use it because I want to keep shell prompt short and
light, it doe not help me. But it looks like git-prompt.sh does print
rebase in progress and others (only checked the code, didn't test it).
-- 
Duy


Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC

2016-10-27 Thread Jeff King
On Wed, Oct 26, 2016 at 02:15:33PM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > On Wed, Oct 26, 2016 at 10:52:41AM -0700, Junio C Hamano wrote:
> >
> >> > I actually wonder if it is worth carrying around the O_NOATIME hack at
> >> > all.
> >> 
> >> Yes, I share the thought.  We no longer have too many loose objects
> >> to matter.
> >> 
> >> I do not mind flipping the order, but I'd prefer to cook the result
> >> even longer.  I am tempted to suggest we take two step route:
> >> 
> >>  - ship 2.11 with the "atime has been there and we won't regress it"
> >>shape, while cooking the "cloexec is semantically more
> >>important" version in 'next' during the feature freeze
> >> 
> >>  - immediately after 2.11 merge it to 'master' for 2.12 to make sure
> >>there is no fallout.
> >
> > That sounds reasonable, though I'd consider jumping straight to "NOATIME
> > is not worth it; drop it" as the patch for post-2.11.
> 
> That endgame is fine by me too.  Thanks for a sanity-check.

So here's that endgame patch. My main concern with it was that there
might be non-Linux systems that could be affected. But when I dug into
it, I found that this code was never activated anywhere besides Linux in
the first place. So I really doubt this will have any negative impact at
all. I certainly don't mind cooking it until post-2.11, though.

+cc Linus as the original author of 144bde78e9 in case there is
something subtle I'm missing, but this really just seems like it's
an outdated optimization.

-- >8 --
Subject: [PATCH] sha1_file: stop opening files with O_NOATIME

When we open object files, we try to do so with O_NOATIME.
This dates back to 144bde78e9 (Use O_NOATIME when opening
the sha1 files., 2005-04-23), which is an optimization to
avoid creating a bunch of dirty inodes when we're accessing
many objects.  But a few things have changed since then:

  1. In June 2005, git learned about packfiles, which means
 we would do a lot fewer atime updates (rather than one
 per object access, we'd generally get one per packfile).

  2. In late 2006, Linux learned about "relatime", which is
 generally the default on modern installs. So
 performance around atimes updates is a non-issue there
 these days.

 All the world isn't Linux, but as it turns out, Linux
 is the only platform to implement O_NOATIME in the
 first place.

So it's very unlikely that this code is helping anybody
these days.

It's not a particularly large amount of code, but the
fallback-retry creates complexity. E.g., we do a similar
fallback for CLOEXEC; which one should take precedence, or
should we try all possible combinations? Dropping O_NOATIME
makes those questions go away.

Signed-off-by: Jeff King 
---
 sha1_file.c | 15 +--
 1 file changed, 1 insertion(+), 14 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 09045df1dc..6f02a57d8b 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -27,14 +27,6 @@
 #include "list.h"
 #include "mergesort.h"
 
-#ifndef O_NOATIME
-#if defined(__linux__) && (defined(__i386__) || defined(__PPC__))
-#define O_NOATIME 0100
-#else
-#define O_NOATIME 0
-#endif
-#endif
-
 #define SZ_FMT PRIuMAX
 static inline uintmax_t sz_fmt(size_t s) { return s; }
 
@@ -1561,7 +1553,7 @@ int check_sha1_signature(const unsigned char *sha1, void 
*map,
 
 int git_open(const char *name)
 {
-   static int sha1_file_open_flag = O_NOATIME | O_CLOEXEC;
+   static int sha1_file_open_flag = O_CLOEXEC;
 
for (;;) {
int fd;
@@ -1577,11 +1569,6 @@ int git_open(const char *name)
continue;
}
 
-   /* Might the failure be due to O_NOATIME? */
-   if (errno != ENOENT && (sha1_file_open_flag & O_NOATIME)) {
-   sha1_file_open_flag &= ~O_NOATIME;
-   continue;
-   }
return -1;
}
 }
-- 
2.10.1.916.g0d2035c



Re: Expanding Includes in .gitignore

2016-10-27 Thread Aaron Pelly
On 27/10/16 15:22, Stefan Beller wrote:
>> The use case for this is where I did not write my own rules, but I want
>> to keep them updated. https://github.com/github/gitignore is a damn good
>> resource, but I want to pull it and include relevant bits project by
>> project and/or system wide. I don't want to have to update many projects
>> manually if that, or any other, repo changes.
>
> .git/info/exclude could be a (sym)link to an up to date version
> of the gitignore repo as a hack?
>

Using links isn't a bad idea, but you still end up at some stage
combining the contents of several files that already exist. Well, in my
example, anyway.

I accept that I'm being pretty trivial, and once it's set up there's
never any pressing need to change anything, but it still irks me.

Even with a linked .gitignore, or .git/info/exclude there will be
sections that are project, language, editor, machine, whatever,
specific. So I still need to copy stuff from one file to another by
hand. By allowing includes, I only have to have a link to each file
describing the data types for each component of the environment.

And they are community maintained, so I don't have to google every time
I try a new editor.

note to self: reply-to isn't the list.


Re: [PATCH v1 16/19] read-cache: unlink old sharedindex files

2016-10-27 Thread Christian Couder
On Thu, Oct 27, 2016 at 12:25 PM, Duy Nguyen  wrote:
> On Tue, Oct 25, 2016 at 5:43 PM, Duy Nguyen  wrote:
>>>  static int write_shared_index(struct index_state *istate,
>>> @@ -2211,8 +2269,11 @@ static int write_shared_index(struct index_state 
>>> *istate,
>>> }
>>> ret = rename_tempfile(_sharedindex,
>>>   git_path("sharedindex.%s", 
>>> sha1_to_hex(si->base->sha1)));
>>> -   if (!ret)
>>> +   if (!ret) {
>>> hashcpy(si->base_sha1, si->base->sha1);
>>> +   clean_shared_index_files(sha1_to_hex(si->base->sha1));
>>
>> This operation is technically garbage collection and should belong to
>> "git gc --auto", which is already called automatically in a few
>> places. Is it not called often enough that we need to do the cleaning
>> up right after a new shared index is created?
>
> Christian, if we assume to go with Junio's suggestion to disable
> split-index on temporary files, the only files left we have to take
> care of are index and index.lock. I believe pruning here in this code
> will have an advantage over in "git gc --auto" because when this is
> executed, we know we're holding index.lock, so nobody else is updating
> the index, it's race-free. All we need to do is peek in $GIT_DIR/index
> to see what shared index file it requires and keep it alive too, the
> remaining of shared index files can be deleted safely. We don't even
> need to fall back to mtime.
>
> git-gc just can't match this because while it's running, somebody else
> may be updating $GIT_DIR/index. Handling races would be a lot harder.

Yeah, I agree that if we disable split-index on temporary files and on
commands like "git read-tree --index-output=" then it is the
right thing to do it in write_shared_index(). (In fact when I wrote
the previous RFC series I didn't think about those special cases, and
that was the reason why I did it like this. So I just need to go back
to the implementation that was in the previous RFC series.)

I am just still wondering if disabling split-index on temporary files
could not have a bad performance impact for some use cases, but I
guess we could always come back to problem again if that happens.

Thanks,
Christian.


Re: Expanding Includes in .gitignore

2016-10-27 Thread Alexei Lozovsky
> I'm thinking something like ". path/to/include/file" in an ignore file,
> and/or creating .gitignore.d and/or allowing $HOME/.config/git/ignore
> and $GIT_DIR/info/exclude to be directories. Or some sane and consistent
> mixture of these things.

I think the rc.d-like approach with directories is better as it
does not add new magical filenames (what if I absolutely do need
to name my directories ". path", with a space? :) keeping the
syntax of gitignores themselves simple, and allowing us to think
of files as still being separate entities. This can be useful
for the following case:

> In the case of a directory the plan would be to add links to files
> stored/sourced elsewhere. This does pose a precedence question which I
> haven't thought about yet, but probably makes it too hard for the
> limited value it brings.

As I understand, the precedence only matters for negative patterns
(the ones that start with an exclamation mark, !). For example,
suppose you have files 'foo1' and 'foo2'. This .gitignore

foo*
!foo1

will ignore foo2, but will show foo1. However, if the lines are
swapped:

!foo1
foo*

then both foo1 and foo2 will be ignored.

Now, if we consider the case of multiple .gitignore files, it
could be unexpected and possibly annoying for negative patterns
in one file to affect the patterns added by some other files.
I would find it more conceptually simple to apply individual
.gitignores one by one, as opposed to parsing them all and
creating one giant exclusion rule. (In technical terms, this
means keeping one struct exclude_list for each .gitignore,
not merging them all into one single list.)

In this case there should be no precendence problems as applied
gitignores only add new ignored files, without un-ignoring
anything previously ignored by other files.

However, if we allow textual inclusion, then it means that we
can put a gitignore into our gitignore so that we can unignore
while we ignore, which again brings us the question of whether
it is actually needed and expected.

> I would like to know the desirability/practicality/stupidity of such a
> feature as I believe it is within my skillset to implement it.

In my mind, this feature definitely has utility and it can be
implemented in backwards-compatible way, so why not.

However, I do not recall any precendent of git using rc.d-like
configs. And some can argue that your goal can be achieved by
generating the .gitignore by some external means and symlinking
the result into .git/info/exclude, so this is not Git's problem
and we should not be overcomplicating things with something as
simple as a list exclude patterns. This line of argument also
can be used to opposes any textual inclusion as well, because
it can be expanded into 'why don't we add a Turing-complete
programming language then to specify the patterns to ignore'.


Re: [PATCH] Documentation/git-diff: document git diff with 3+ commits

2016-10-27 Thread Michael J Gruber
Junio C Hamano venit, vidit, dixit 26.10.2016 20:11:
> Michael J Gruber  writes:
> 
>> That one is difficult to discover but super useful, so document it:
>> Specifying 3 or more commits makes git diff switch to combined diff.
>>
>> Signed-off-by: Michael J Gruber 
>> ---
>>
>> Notes:
>> Note that we have the following now:
>> ...
>> 'git diff A..B' equivalent to 'git diff A B'
>> in contrast to 'git log A..B' listing commits between M and B only
>> (without the commits between M and A unless they are "in" B).
> 
> The standard answer is: 
> 
> Do not use two-dot form with 'git diff', if you find it
> confusing.  Diff is about two endpoints, not about a range
> between two.
> 
> The reason why we do not reject can be easily guessed by any
> intelligent person when some historical background is given, I
> think.

That is very well true. I'm more concerned with the presence, though,
that is: How easy to use is git now? Users choose git because they care
about the history of their project, not necessarily that of Git ;)

>  - In the beginning A...B did not exist.  A..B was the only "range"
>notation.
> 
>  - "git log A..B" was in wide use.  Remember, "git log A...B" did
>not exist.
> 
>  - People started mistyping "git diff A..B", which looked as if the
>user typed "git diff ^A B" to the internal.
> 
>  - Git _could_ have rejected that as a bogus request to diff two
>points, ^A (what is that???) and B, but "What else could the user
>have meant with 'git diff A..B' other than 'git diff A B'?" was
>an argument to favor doing _something_ useful rather than
>erroring out.  Remember, "A...B" did not exist when this
>happened.

It did not exist, but even at that point in time, "git log A..B" listed
only commits between the merge base and B, not those which are only in A
and not in B. Whereas "git diff A B" shows the differences between the
endpoints A and B.

>> @@ -12,6 +12,7 @@ SYNOPSIS
>>  'git diff' [options] [] [--] [...]
>>  'git diff' [options] --cached [] [--] [...]
>>  'git diff' [options]   [--] [...]
>> +'git diff' [options][...]
> 
> Made me wonder "is [...] 0-or-more As or 1-or-more As?".

0-or-more, at least that's the way it is used in all lines here.

> Don't we allow pathspecs in this case?

Yes, the combinded diff mode kicks in only with no blobs (not: no
pathspec, which I had misread) and N>=3 commits. Maybe I should update
the code comments in builtin/diff.c to describe this, too.

Michael


Re: [PATCH] Update git rebase documentation to clarify HEAD behavior

2016-10-27 Thread Junio C Hamano
Cody Sehl  writes:

> The first few paragraphs in the git-rebase.txt documentation lay out the 
> steps git takes during a rebase:
> 1. everything from `..HEAD` is saved to a temporary area
> 2. `HEAD` is set to ``
> 3. the changes held in the temporary area are applied one by one in order on 
> top of the new `HEAD`
>
> The second step was described using the phrase `The current branch is reset 
> to `, which is true (because `HEAD` == current branch), but not 
> clear.
> ---

Please wrap your lines to reasonable lengths like 70 columns or so.
Please sign off your patch.

>  Documentation/git-rebase.txt | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
> index de222c8..c47ca11 100644
> --- a/Documentation/git-rebase.txt
> +++ b/Documentation/git-rebase.txt
> @@ -33,7 +33,7 @@ of commits that would be shown by `git log 
> ..HEAD`; or by
>  description on `--fork-point` below); or by `git log HEAD`, if the
>  `--root` option is specified.
>  
> -The current branch is reset to , or  if the
> +HEAD is reset to , or  if the
>  --onto option was supplied.  This has the exact same effect as
>  `git reset --hard ` (or ).  ORIG_HEAD is set
>  to point at the tip of the branch before the reset.

This is describing an ancient behaviour before 6fd2f5e60d ("rebase:
operate on a detached HEAD", 2007-11-08) in v1.5.4 timeframe.  We
apparently failed to update the description.  

This depends on the desired technical detail of the description, but
a correct rewrite would be "HEAD is detached at , or
, and ORIG_HEAD is set to point at the tip of the branch
before this happens".  Detaching the HEAD at  no longer
has the same effect as "git reset --hard " (or ),
so that sentence must go.  It was the primary point of the ancient
change at 6fd2f5e60d after all.

And then there is a new step (to be numbered 4. in your description
in the proposed log message), which updates the tip of the branch to
the resulting HEAD (after replaying all these changes) and check the
branch out, which needs to be added.  Perhaps after "one by one, in
order."  Oh, the mention of "reapplied to the current branch" also
needs to be updated to "reapplied to the detached HEAD", too.

On the other hand, if we do not aim for that deep level of technical
correctness, but want to tell a white lie to make it easier to
understand at the conceptual level to new readers who haven't
grasped the detached HEAD, then the current description is fine.  By
bringing up "HEAD", you seem to be aiming for techincal correctness
(which I tend to agree is a good direction to go in this part of the
documentation), so the existing text needs a bit more work than your
patch to be brought to the modern world.




Re: [PATCH] compat: Allow static initializer for pthreads on Windows

2016-10-27 Thread Junio C Hamano
Junio C Hamano  writes:

> Johannes Sixt  writes:
>
>>> As many codepaths may not even need access to the attributes, I
>>> doubt that would be a very productive direction to go.
>>
>> So, what is productive then? Pessimizing one (not exactly minor) platform?
>
> Lazy on-demand initialization as needed, perhaps?  The on-demand
> initialization mechanism may become no-op on some platforms that can
> do static initialization.

Ah, I think I misunderstood your "please rewrite".  Did you mean to
add "void attr_start(void)" helper function to attr.c that does
series of pthread_mutex_init() calls as needed?  That function can
be called from main() of platforms that cannot statically initialize
mutices, while on other platforms it can be a no-op as long as the
variables are statically initialized?  If so, that would not pessimize
any platform, I would think.





Re: [PATCH] compat: Allow static initializer for pthreads on Windows

2016-10-27 Thread Johannes Sixt

Am 27.10.2016 um 08:21 schrieb Junio C Hamano:

Johannes Sixt  writes:


As many codepaths may not even need access to the attributes, I
doubt that would be a very productive direction to go.


So, what is productive then? Pessimizing one (not exactly minor) platform?


Lazy on-demand initialization as needed, perhaps?  The on-demand
initialization mechanism may become no-op on some platforms that can
do static initialization.


This is the pessimization that I am talking about. I would not mind at 
all if it were only for the attribute subsystem, but the proposed patch 
would pessimize *all* uses of pthread_mutex_lock.


-- Hannes



[PATCH] Update git rebase documentation to clarify HEAD behavior

2016-10-27 Thread Cody Sehl
The first few paragraphs in the git-rebase.txt documentation lay out the steps 
git takes during a rebase:
1. everything from `..HEAD` is saved to a temporary area
2. `HEAD` is set to ``
3. the changes held in the temporary area are applied one by one in order on 
top of the new `HEAD`

The second step was described using the phrase `The current branch is reset to 
`, which is true (because `HEAD` == current branch), but not clear.
---
 Documentation/git-rebase.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
index de222c8..c47ca11 100644
--- a/Documentation/git-rebase.txt
+++ b/Documentation/git-rebase.txt
@@ -33,7 +33,7 @@ of commits that would be shown by `git log ..HEAD`; 
or by
 description on `--fork-point` below); or by `git log HEAD`, if the
 `--root` option is specified.
 
-The current branch is reset to , or  if the
+HEAD is reset to , or  if the
 --onto option was supplied.  This has the exact same effect as
 `git reset --hard ` (or ).  ORIG_HEAD is set
 to point at the tip of the branch before the reset.

--
https://github.com/git/git/pull/301


Re: [PATCH] compat: Allow static initializer for pthreads on Windows

2016-10-27 Thread Junio C Hamano
Johannes Sixt  writes:

>> As many codepaths may not even need access to the attributes, I
>> doubt that would be a very productive direction to go.
>
> So, what is productive then? Pessimizing one (not exactly minor) platform?

Lazy on-demand initialization as needed, perhaps?  The on-demand
initialization mechanism may become no-op on some platforms that can
do static initialization.


Re: [PATCH] compat: Allow static initializer for pthreads on Windows

2016-10-27 Thread Johannes Sixt

Am 27.10.2016 um 08:02 schrieb Junio C Hamano:

Johannes Sixt  writes:


Am 26.10.2016 um 23:57 schrieb Stefan Beller:

In Windows it is not possible to have a static initialized mutex as of
now, but that seems to be painful for the upcoming refactoring of the
attribute subsystem, as we have no good place to put the initialization
of the attr global lock.


Please rewrite the attribute system such that it can have a dynamic
initialization. If you find a global initialization in main() too
gross (I would agree) then setup_git_directory() might be the right
place.


As many codepaths may not even need access to the attributes, I
doubt that would be a very productive direction to go.


So, what is productive then? Pessimizing one (not exactly minor) platform?

-- Hannes



Re: [PATCH] compat: Allow static initializer for pthreads on Windows

2016-10-27 Thread Junio C Hamano
Johannes Sixt  writes:

> Am 26.10.2016 um 23:57 schrieb Stefan Beller:
>> In Windows it is not possible to have a static initialized mutex as of
>> now, but that seems to be painful for the upcoming refactoring of the
>> attribute subsystem, as we have no good place to put the initialization
>> of the attr global lock.
>
> Please rewrite the attribute system such that it can have a dynamic
> initialization. If you find a global initialization in main() too
> gross (I would agree) then setup_git_directory() might be the right
> place.

As many codepaths may not even need access to the attributes, I
doubt that would be a very productive direction to go.



Re: "git subtree --squash" interacts poorly with revert, merge, and rebase

2016-10-27 Thread Junio C Hamano
Junio C Hamano  writes:

> Peter Williams  writes:
>
>> However, for git commands such as diff/status whose job is to display
>> information it would be nice if they had a --recursive option to
>> override the default submodule diff/status and show details of the
>> changes in the submodules.  Sometimes you want to see the big picture
>> in detail.
>
> I won't disagree. My comment was only on this part from the original:
>
>>> - We have to make separate commits and manage corresponding topic
>>> branches for the superproject and subprojects.
>
> and on this point, we seem to be in agreement.

Oh, and as Stefan mentioned, a "git diff" that recurses into the
submodules to give you detailed big picture has been in 'next'
(perhaps aready in 'master' as of tonight, but I am not sure
offhand) to be tested, together with many other fixes and
enhancements that all are waiting to be included in future releases.

The more people try and give feedback to these branches early, the
more solid release with better support for more goodies you'd want
we will be able to give you.  Early adopters are always appreciated
but especially in time like this before the feature freeze for the
upcoming release (see tinyurl.com/gitCal for the schedule), they are
of great help.

Start by cloning from any one of these places

  git://git.kernel.org/pub/scm/git/git.git/
  https://kernel.googlesource.com/pub/scm/git/git
  git://repo.or.cz/alt-git.git/
  https://github.com/git/git/

and then

  $ git checkout -b next origin/next
  : read INSTALL to figure out if any custom options are needed
  : in the following 'make' invocations for your environment
  $ make && make install
  $ PATH=$HOME/bin:$PATH

to join the fun.