Re: [PATCH] compat: Allow static initializer for pthreads on Windows
Am 27.10.2016 um 23:49 schrieb Jacob Keller: Ok, so I've been reading this thread. I don't understand your objections to emulating in this way.. Could you clearly spell out why you believe this solution isn't acceptable? So far all I've understood was "it's not critical sections" and "it penalizes Windows too much" but... If Windows cannot statically initialize a pthread mutex, then we *have* to dynamically initialize it somewhere. This solution adds a single check before each lock and is safe due to use of memory barriers. Yes, this will cost a tiny bit extra overhead for each use of "pthread_mutex_lock" but I fail to see how that is a huge penalty... One point is that the DCLP idiom must be implemented correctly. There are solutions, of course, and when the initialization is over, we have a miniscule overhead at each pthread_mutex_lock call. The main point is that the initialization has to solve a chicken-and-egg problem: After we have found an uninitialized critical section, we have to have mutual exclusion for the initialization. We need another critical section for this, but we cannot have one that is initialized. For this reason, the solution uses a different kind of mutual exclusion primitive, which is more akin to POSIX semaphores and works across processes. In the patch proposed by Stefan, a *session-wide* mutex is used. That means that all concurrent git invocations in a user's session synchronize their initialization of critical section objects. That's just ridiculous. It's like waiting for a ... no, *the* ... battle ship just to get our bouncers in their position. We are talking milliseconds here, not nanoseconds. -- Hannes
Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC
Junio C Hamanowrote: > Junio C Hamano writes: > > > Linus Torvalds writes: > > > >> On Thu, Oct 27, 2016 at 4:36 PM, Junio C Hamano wrote: > >>> > >>> Would the best endgame shape for this function be to open with > >>> O_NOATIME (and retry without), and then add CLOEXEC with fcntl(2) > >>> but ignoring an error from it, I guess? That would be the closest > >>> to what we historically had, I would think. > >> > >> I think that's the best model. Actually, I would flip the order of flags. O_CLOEXEC is more important from a correctness standpoint. > > OK, so perhaps like this. > > Hmph. This may not fly well in practice, though. > > To Unix folks, CLOEXEC is not a huge correctness issue. A child > process may hold onto an open file descriptor a bit longer than the > lifetime of the parent but as long as the child eventually exits, I'm not too familiar with C internals of git; but I know we use threads in some places, and fork+execve in others. If our usage of threads and execve intersects, and we run untrusted code in an execve-ed child, then only having cloexec on open() will save us time when auditing for leaking FDs. fcntl(fd, F_SETFD, O_CLOEXEC) is racy in if there are other threads doing execve; so I wouldn't rely on it as a first choice. So I suppose something like this: static int noatime = 1; int fd = open(... | O_CLOEXEC); ...error checking and retrying... if (fd >= 0 && noatime && fcntl(fd, F_SETFL, O_NOATIME) != 0) noatime = 0; return fd;
Re: [RFC PATCH 0/5] recursively grep across submodules
On Thu, Oct 27, 2016 at 7:50 PM, Junio C Hamanowrote: > > Unless you are imagining "git grep" to initialize and checkout a > submodule that is not checked out on-demand, I do not think you have > any reason to even look at ".gitmodules" for the purpose of "I want > to grep both in superproject and submodules that are checked out." In tree-ish mode you may have this example: git -C superproject rm path/to/submodule git -C superproject commit -a -m "delete submodule" ... time passes ... git -C superproject grep --recurse-submodule -e \ HEAD~42 path/to/submodule In the last command you need to map the path to submodule to the name of the submodule to find out the place of the object store for that submodule and see if it exists. > If it is working with a tree-ish, again, go look at the object store > in that submodule repository. and to find out the object store for that submodule you need the path -> name mapping at that point in time, i.e. you want to look at the .gitmodules file at the given tree-ish.
Loan
-- Are you in need of easy qualifying business or personal loans? We offer all kinds of loans at 3% contact us today at. Carmona.gomez402@gmail.comFull Name:Need Loan Amount:Loan Duration:Phone Number:Applied before :?State:Monthly Income:Country:contact us today at. carmona.gomez402@gmail.comYours faithfullyGloria Gomez CarmonaMD / CEO
"git push" says "src refspec XYZ matches more than one" even without explicit XYZ argument.
1. My repo has a branch named 'v1' that is tracking 'origin/v1'. 2. My repo has a tag named 'v1'. 3. I have "push.default" set to "upstream". I made a commit on branch 'v1' and tried doing a push: # git push error: src refspec v1 matches more than one. error: failed to push some refs to 'g...@github.com:whatever/ns1-go.git' If I rename the branch to 'v1-dev', then the push goes through. I understand why the command "git push origin/v1 v1" is ambiguous. But if I do a plain "git push", I thought Git would know to push my current branch. [Git version 2.10.1 from Homebrew on Mac OS 10.11.6.]
Re: Expanding Includes in .gitignore
Jeff Kingwrites: > However, as I said elsewhere, I'm not convinced this feature is all that > helpful for in-repository .gitignore files, and I think it does > introduce compatibility complications. People with older git will not > respect your .gitignore.d files. Whereas $GIT_DIR/info is purely a local > matter. As I do not see the point of making in-tree .gitignore to a forest of .gitignore.d/ at all, compatibility complications is not worth even thinking about, I would have to say. Thanks.
Re: [RFC PATCH 0/5] recursively grep across submodules
Stefan Bellerwrites: >> Just a few brief comments, before reading the patches carefully. >> >> * It is somewhat surprising that [1/5] is even needed (in other >>words, I would have expected something like this to be already >>there, and my knee-jerk reaction was "Heh, how does 'git status' >>know how to show submodules that are and are not initialized >>differently without this?" > > The issue with much of the existing code is that it is submodule centric, > i.e. it is written to not care about the rest. > > git status for example just calls "git submodule summary" to > parse and display the submodule information additionally. > It doesn't integrate submodules and treats them "just like files". Oh, I know all that after/while writing the above "it is somewhat surprising" and reading what wt-status.c does. It was just that it was somewhat surprising ;-) > My reaction to 1/5 was that the implementation is sound, > but the design may need rethinking. > > Instead of asking all these question, "Is a submodule > * initialized > * checked out (== have a working dir) > * have a .git dir (think of deleted submodules that keep the > historical git dir around) > (* have commit X) > we would want to either extend the submodule-config API > to also carry these informations just like > name/path/sha1/url/shallow clone recommendation. I think you are going in a wrong direction with all the above. Unless you are imagining "git grep" to initialize and checkout a submodule that is not checked out on-demand, I do not think you have any reason to even look at ".gitmodules" for the purpose of "I want to grep both in superproject and submodules that are checked out." You only need to detect .gitlink that exists in the index of the superproject, and then there would be only two cases: * $path has an empty directory (not even .git in there). The user is not interested in that submodule. * $path has ".git", either a directory (old layout or we are dealing with the repository that originated the submodule) or a "gitdir:" file that points into .git/modules of the repository of the superproject. The user is interested in the submodule. If $path has ".git" and nothing else, the only explanation is that the user removed the working tree files in the submodule. If your grep is looking at working tree files, it is correct not to find anything in there. If it is working with "--cached", go look at the index of the submodule repository (either the ".git" directory, or the stashed-away repository in .git/modules/ in the superproject). If it is working with a tree-ish, again, go look at the object store in that submodule repository. >> * It is somewhat surprising that [4/5] does not even use the >>previous ls-files to find out the paths. Also it is a bit >>disappointing to see that the way processes are spawned and >>managed does not share much with Stefan's earlier work, i.e. >>run_processes_parallel(). I was somehow hoping that it can be >>extended to support this use case, but apparently there aren't >>much to be shared. > > I think there are 2 issues here: There is no issue here. I was just giving my impressions (i.e. "somewhat surprising"). > * git-grep already has its own thread pool I know. I was expecting that the previous "ls-files" that recurses will be used to feed into that thread pool, but I didn't find that in my cursory look at the patch, hence "somewhat surprising". I hate it when people become overly defensive and start making excuses when given harmless observations.
Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC
Junio C Hamanowrites: > Linus Torvalds writes: > >> On Thu, Oct 27, 2016 at 4:36 PM, Junio C Hamano wrote: >>> >>> Would the best endgame shape for this function be to open with >>> O_NOATIME (and retry without), and then add CLOEXEC with fcntl(2) >>> but ignoring an error from it, I guess? That would be the closest >>> to what we historically had, I would think. >> >> I think that's the best model. > > OK, so perhaps like this. Hmph. This may not fly well in practice, though. To Unix folks, CLOEXEC is not a huge correctness issue. A child process may hold onto an open file descriptor a bit longer than the lifetime of the parent but as long as the child eventually exits, nothing is affected. Over there, things are different. The parent cannot even rename(2) or unlink(2) a file it created and closed while the child is still holding the file descriptor open and the lack of CLOEXEC will make the parent fail. I do not know how well fcntl(2) emulation works on Windows, but I would not be surprised if J6t or Dscho comes back and says that FD_CLOEXEC given to F_SETFD would not work while O_CLOEXEC given to open(2) does.
Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC
Linus Torvaldswrites: > On Thu, Oct 27, 2016 at 4:36 PM, Junio C Hamano wrote: >> >> Would the best endgame shape for this function be to open with >> O_NOATIME (and retry without), and then add CLOEXEC with fcntl(2) >> but ignoring an error from it, I guess? That would be the closest >> to what we historically had, I would think. > > I think that's the best model. OK, so perhaps like this. -- >8 -- Subject: git_open(): untangle possible NOATIME and CLOEXEC interactions The way we structured the fallback-retry for opening with O_NOATIME and O_CLOEXEC meant that if we failed due to lack of support to open the file with O_NOATIME option (i.e. EINVAL), we would still try to drop O_CLOEXEC first and retry, and then drop O_NOATIME. A platform on which O_NOATIME is defined in the header without support from the kernel wouldn't have a chance to open with O_CLOEXEC option due to this code structure. Arguably, O_CLOEXEC is more important than O_NOATIME, as the latter is mostly about performance, while the former can affect correctness. Let's revert the recent changes to the way git_open() attempts to open a file with O_NOATIME and retries without to the original sequence, and then use a separate fcntl(fd, F_SETFD, FD_CLOEXEC) on the resulting file descriptor. The helper to do the latter can be usable in the codepath in ce_compare_data() that was recently added to open a file descriptor with O_CLOEXEC, so let's refactor that codepath with the helper while we are at it. Signed-off-by: Junio C Hamano --- git-compat-util.h | 5 +++-- read-cache.c | 12 sha1_file.c | 49 ++--- 3 files changed, 37 insertions(+), 29 deletions(-) diff --git a/git-compat-util.h b/git-compat-util.h index 43718dabae..a751630db5 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -679,9 +679,10 @@ char *gitstrdup(const char *s); #define getpagesize() sysconf(_SC_PAGESIZE) #endif -#ifndef O_CLOEXEC -#define O_CLOEXEC 0 +#ifndef FD_CLOEXEC +#define FD_CLOEXEC 0 #endif +extern int git_set_cloexec(int); #ifdef FREAD_READS_DIRECTORIES #ifdef fopen diff --git a/read-cache.c b/read-cache.c index db5d910642..fb91514885 100644 --- a/read-cache.c +++ b/read-cache.c @@ -156,17 +156,13 @@ void fill_stat_cache_info(struct cache_entry *ce, struct stat *st) static int ce_compare_data(const struct cache_entry *ce, struct stat *st) { int match = -1; - static int cloexec = O_CLOEXEC; - int fd = open(ce->name, O_RDONLY | cloexec); - - if ((cloexec & O_CLOEXEC) && fd < 0 && errno == EINVAL) { - /* Try again w/o O_CLOEXEC: the kernel might not support it */ - cloexec &= ~O_CLOEXEC; - fd = open(ce->name, O_RDONLY | cloexec); - } + int fd = open(ce->name, O_RDONLY); if (fd >= 0) { unsigned char sha1[20]; + + /* do not let child processes to hold onto the open fd */ + git_set_cloexec(fd); if (!index_fd(sha1, fd, st, OBJ_BLOB, ce->name, 0)) match = hashcmp(sha1, ce->oid.hash); /* index_fd() closed the file descriptor already */ diff --git a/sha1_file.c b/sha1_file.c index 09045df1dc..41383a6c20 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -1559,31 +1559,42 @@ int check_sha1_signature(const unsigned char *sha1, void *map, return hashcmp(sha1, real_sha1) ? -1 : 0; } -int git_open(const char *name) +int git_set_cloexec(int fd) { - static int sha1_file_open_flag = O_NOATIME | O_CLOEXEC; + static int cloexec = FD_CLOEXEC; - for (;;) { - int fd; + if (cloexec) { + if (fcntl(fd, F_SETFD, FD_CLOEXEC) < 0) + cloexec = 0; + /* +* We might want to diagnose and complain upon seeing +* an error from this call, but let's keep the same +* behaviour as before for now. +*/ + } + return 0; +} - errno = 0; - fd = open(name, O_RDONLY | sha1_file_open_flag); - if (fd >= 0) - return fd; +int git_open(const char *name) +{ + static int noatime = O_NOATIME; + int fd; - /* Try again w/o O_CLOEXEC: the kernel might not support it */ - if ((sha1_file_open_flag & O_CLOEXEC) && errno == EINVAL) { - sha1_file_open_flag &= ~O_CLOEXEC; - continue; - } + errno = 0; + fd = open(name, O_RDONLY | noatime); - /* Might the failure be due to O_NOATIME? */ - if (errno != ENOENT && (sha1_file_open_flag & O_NOATIME)) { - sha1_file_open_flag &= ~O_NOATIME; - continue; - } - return -1; + /* Might the
Re: [RFC PATCH 0/5] recursively grep across submodules
On Thu, Oct 27, 2016 at 4:26 PM, Junio C Hamanowrote: > Brandon Williams writes: > >> As for the rest of the series, it should be ready for review or comments. > > Just a few brief comments, before reading the patches carefully. > > * It is somewhat surprising that [1/5] is even needed (in other >words, I would have expected something like this to be already >there, and my knee-jerk reaction was "Heh, how does 'git status' >know how to show submodules that are and are not initialized >differently without this?" The issue with much of the existing code is that it is submodule centric, i.e. it is written to not care about the rest. git status for example just calls "git submodule summary" to parse and display the submodule information additionally. It doesn't integrate submodules and treats them "just like files". git submodule summary then proceeds to use "submodule--helper list" that lists submodules *only* ignoring all files. > >The implementation that reads from the config of the current >repository may be OK, but I actually would have expected that a >check would be "given a $path, check to see if $path/.git is >there and is a valid repository". In a repository where the >submodules originate, there may not even be submodule.$name.url >entries there yet. My reaction to 1/5 was that the implementation is sound, but the design may need rethinking. Instead of asking all these question, "Is a submodule * initialized * checked out (== have a working dir) * have a .git dir (think of deleted submodules that keep the historical git dir around) (* have commit X) we would want to either extend the submodule-config API to also carry these informations just like name/path/sha1/url/shallow clone recommendation. Obtaining the information above is however not as cheap, because we'd need to do extra work additionally to parsing the .gitmodules file. So the submodule-config would need to learn an input that will tell the submodule-config what informations should be evaluated and which can be omitted. > > * It is somewhat surprising that [4/5] does not even use the >previous ls-files to find out the paths. Also it is a bit >disappointing to see that the way processes are spawned and >managed does not share much with Stefan's earlier work, i.e. >run_processes_parallel(). I was somehow hoping that it can be >extended to support this use case, but apparently there aren't >much to be shared. I think there are 2 issues here: * The API I designed runs processes in parallel and the order or output is non-deterministic. git-grep uses threads and output is alphabetically sorted. The order is fixable though (by e.g. adding a flag that indicates which parallel processing output the caller wants). * git-grep already has its own thread pool; integrating/combining 2 worker pools doesn't sound trivial even to someone who wrote one of them. Maybe we could extend/rewrite the run_processes_parallel API to not just run processes, but instead you could also provide a function pointer that is used in a thread instead. Then we'd have one machinery that e.g. keeps track of the number of parallel processes/threads. Thanks, Stefan
Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC
On Thu, Oct 27, 2016 at 4:36 PM, Junio C Hamanowrote: > > Would the best endgame shape for this function be to open with > O_NOATIME (and retry without), and then add CLOEXEC with fcntl(2) > but ignoring an error from it, I guess? That would be the closest > to what we historically had, I would think. I think that's the best model. Note that the O_NOATIME code is very much designed to try O_NOATIME only _once_. Because even when the kernel supports O_NOATIME, if you have a shared object tree where you may not be the owner of all the files, a O_NOATIME open can fail with NOPERM ("You are not allowed to hide your accesses to this file"). This is why it uses that static unsigned int sha1_file_open_flag = O_NOATIME; and if the O_NOATIME open ever fails (and the no-O_NOATIME open succeeds), it clears that flag. Exactly so that it will *not* end up in some kind of "let's open and fail and re-open" loop. It's designed to fail once. Or at least that's how it used to be originally. This code has obviously changed since that early design. Now it seems to clear it for any non-ENOENT error. Which looks fine too. Linus
Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC
Linus Torvaldswrites: > But the basic issue still remains - I'd really prefer to have NOATIME > stay around for all those poor misguided souls that for some reason > don't like "relatime" or run old kernels. But whether it is with > O_NOATIME at open time or with F_SETFL, I don't care. Understood. Would the best endgame shape for this function be to open with O_NOATIME (and retry without), and then add CLOEXEC with fcntl(2) but ignoring an error from it, I guess? That would be the closest to what we historically had, I would think.
Re: feature request
On Thu, 27 Oct 2016, John Rood wrote: Thanks, I think changing the default for windows is a good idea. notepad doesn't work well with unix line endings, wordpad handles the files much more cleanly. David Lang
Re: [RFC PATCH 0/5] recursively grep across submodules
Brandon Williamswrites: > As for the rest of the series, it should be ready for review or comments. Just a few brief comments, before reading the patches carefully. * It is somewhat surprising that [1/5] is even needed (in other words, I would have expected something like this to be already there, and my knee-jerk reaction was "Heh, how does 'git status' know how to show submodules that are and are not initialized differently without this?" The implementation that reads from the config of the current repository may be OK, but I actually would have expected that a check would be "given a $path, check to see if $path/.git is there and is a valid repository". In a repository where the submodules originate, there may not even be submodule.$name.url entries there yet. * It is somewhat surprising that [4/5] does not even use the previous ls-files to find out the paths. Also it is a bit disappointing to see that the way processes are spawned and managed does not share much with Stefan's earlier work, i.e. run_processes_parallel(). I was somehow hoping that it can be extended to support this use case, but apparently there aren't much to be shared. Thanks.
Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC
On Thu, Oct 27, 2016 at 4:09 PM, Linus Torvaldswrote: > > That said, now that I think about it, I should double-check: maybe > open() doesn't actually set atime at all, and we *could* do NOATIME > with SETFL after all. Checked. Yup. O_NOATIME could easily be done with SETFL:, because as with O_CLOEXEC, it only affects operations _after_ the open. The open itself doesn't set the access time. So I was full of it. But the basic issue still remains - I'd really prefer to have NOATIME stay around for all those poor misguided souls that for some reason don't like "relatime" or run old kernels. But whether it is with O_NOATIME at open time or with F_SETFL, I don't care. Linus
Re: feature request
Thanks. I wasn't aware of --no-edit, but that is indeed exactly what I was looking for. I think your point about encouraging users to make good use of commit messages is good. My concern though is that vim isn't encouraging users to leave good messages as much as it is scaring them away from leaving messages at all. On Thu, Oct 27, 2016 at 5:51 PM, Junio C Hamanowrote: > John Rood writes: > > [administrivia: do not top post] > >> What I'm really seeking is not a make-shift solution for myself, but >> an intuitive solution for the novice user-base at large. > > Well, there are -m and --no-edit. Recording commits with useless > single liner is a bad habit to get into, and change to encourage > novice user-base at large to do so is not a good idea.
Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC
On Thu, Oct 27, 2016 at 3:56 PM, Junio C Hamanowrote: > > I also thought O_NOATIME shouldn't be an effective fcntl(2) thing, > for the reasons you stated above, when I was studying the area > because of the discussion on these patches. I was surprised to see > that http://man7.org/linux/man-pages/man2/fcntl.2.html contradicts > by saying F_SETFL can take O_NOATIME. > > Perhaps it deserves a bugreport? It's not a bug per se. You can indeed change O_NOATIME with F_SETFL. And it actually does matter, since atime is normally changed by mmap(), read() and write() calls. So F_SETFL O_NOATIME actually does make sense, but it doesn't cover the atime update done by open() itself. That said, now that I think about it, I should double-check: maybe open() doesn't actually set atime at all, and we *could* do NOATIME with SETFL after all. The exact [acm]time semantics are damn subtle. Linus
Re: Expanding Includes in .gitignore
On 28/10/16 10:07, Jeff King wrote: > I think it does > introduce compatibility complications. If this is not a show stopper, I am happy to knock out a short functional spec. I'll give it some hours before I begin.
Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC
Linus Torvaldswrites: > Note that you can *not* do the same thing with O_NOATIME, since the > whole point of O_NOATIME is that it changes the behavior of the open > itself (unlike O_CLOEXEC which changes _later_ behavior, and can > always be replaced by FD_CLOEXEC fnclt modulo races that are > immaterial for git) A tangent. I also thought O_NOATIME shouldn't be an effective fcntl(2) thing, for the reasons you stated above, when I was studying the area because of the discussion on these patches. I was surprised to see that http://man7.org/linux/man-pages/man2/fcntl.2.html contradicts by saying F_SETFL can take O_NOATIME. Perhaps it deserves a bugreport?
Re: feature request
John Roodwrites: [administrivia: do not top post] > What I'm really seeking is not a make-shift solution for myself, but > an intuitive solution for the novice user-base at large. Well, there are -m and --no-edit. Recording commits with useless single liner is a bad habit to get into, and change to encourage novice user-base at large to do so is not a good idea.
Re: feature request
What I'm really seeking is not a make-shift solution for myself, but an intuitive solution for the novice user-base at large. On Thu, Oct 27, 2016 at 5:27 PM, Junio C Hamanowrote: > John Rood writes: > >> I suppose I can do git config --global core.editor notepad >> However, this really only addresses my second concern. >> >> My first concern is that using a text editor at all seems like >> overkill in many scenarios. > > Nobody stops you from writing a "type whatever you want; I won't let > you edit any mistakes as I am not even a text editor; just hit > RETURN when you are done, as you can only write a single line" > program and set it as your GIT_EDITOR. > > I do not know what would happen when you need "git commit --amend", > though.
Re: feature request
John Roodwrites: > On Thu, Oct 27, 2016 at 5:30 PM, Stefan Beller wrote: >> On Thu, Oct 27, 2016 at 2:55 PM, John Rood wrote: >>> Users should be able to configure Git to not send them into a Vim editor. >>> >>> When users pull commits, and a new commit needs to be created for a >>> merge, Git's current way of determining a commit message is to send >>> the user into a Vim window so that they can write a message. There are >>> 2 reasons why this might not be the ideal way to prompt for a commit >>> message. >>> >>> 1. Many users are used to writing concise one-line commit messages and >>> would not expect to save a commit message in a multi-line file. Some >>> users will wonder why they are in a text editor or which file they are >>> editing. Others may not, in fact, realize at all that a text editor is >>> what they are in. >> >> Look at the -m option of git commit, >> [administrivia: do not top post] > Thanks, I think changing the default for windows is a good idea. > > The -m indeed accomplishes one-line messages when you are voluntarily > doing a commit. However, the scenario I mentioned is "When users pull > commits, and a new commit needs to be created for the merge" In this > situation, the user isn't issuing the "git commit" command, and so > he/she doesn't have the opportunity to use the -m flag. There is --no-edit there.
Re: feature request
Thanks, I think changing the default for windows is a good idea. The -m indeed accomplishes one-line messages when you are voluntarily doing a commit. However, the scenario I mentioned is "When users pull commits, and a new commit needs to be created for the merge" In this situation, the user isn't issuing the "git commit" command, and so he/she doesn't have the opportunity to use the -m flag. On Thu, Oct 27, 2016 at 5:30 PM, Stefan Bellerwrote: > On Thu, Oct 27, 2016 at 2:55 PM, John Rood wrote: >> Users should be able to configure Git to not send them into a Vim editor. >> >> When users pull commits, and a new commit needs to be created for a >> merge, Git's current way of determining a commit message is to send >> the user into a Vim window so that they can write a message. There are >> 2 reasons why this might not be the ideal way to prompt for a commit >> message. >> >> 1. Many users are used to writing concise one-line commit messages and >> would not expect to save a commit message in a multi-line file. Some >> users will wonder why they are in a text editor or which file they are >> editing. Others may not, in fact, realize at all that a text editor is >> what they are in. > > Look at the -m option of git commit, > > git commit -a -m "look a commit with no editor, and a precise one line > message" > > I do not advocate this use though, as I think commit messages should be > more wordy. > >> >> 2. Many users are not familiar with Vim, and do not understand how to >> modify, save, and exit. It is not very considerate to require a user >> to learn Vim in order to finish a commit that they are in the middle >> of. > > That is true, but vi is like the most available editor as a relict > from ancient times; > as you are on Windows, maybe notepad is the best on that platform. > > Maybe file a bug/issue at https://github.com/git-for-windows to change > the default? > >> >> The existing behavior should be optional, and there should be two new >> options: >> >> 1. Use a simple inline prompt for a commit message (in the same way >> Git might prompt for a username). >> >> 2. Automatically assign names for commits in the form of "Merged x into y".
[PATCH 4/5] grep: optionally recurse into submodules
Allow grep to recognize submodules and recursively search for patterns in each submodule. This is done by forking off a process to recursively call grep on each submodule. The top level --super-prefix option is used to pass a path to the submodule which can in turn be used to prepend to output or in pathspec matching logic. Recursion only occurs for submodules which have been initialized and checked out by the parent project. If a submodule hasn't been initialized and checked out it is simply skipped. In order to support the existing multi-threading infrastructure in grep, output from each child process is captured in a strbuf so that it can be later printed to the console in an ordered fashion. To limit the number of theads that are created, each child process has half the number of threads as its parents (minimum of 1), otherwise we potentailly have a fork-bomb. Signed-off-by: Brandon Williams--- Documentation/git-grep.txt | 5 + builtin/grep.c | 301 ++--- git.c | 2 +- t/t7814-grep-recurse-submodules.sh | 99 4 files changed, 386 insertions(+), 21 deletions(-) create mode 100755 t/t7814-grep-recurse-submodules.sh diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt index 0ecea6e49..17aa1ba70 100644 --- a/Documentation/git-grep.txt +++ b/Documentation/git-grep.txt @@ -26,6 +26,7 @@ SYNOPSIS [--threads ] [-f ] [-e] [--and|--or|--not|(|)|-e ...] + [--recurse-submodules] [ [--[no-]exclude-standard] [--cached | --no-index | --untracked] | ...] [--] [...] @@ -88,6 +89,10 @@ OPTIONS mechanism. Only useful when searching files in the current directory with `--no-index`. +--recurse-submodules:: + Recursively search in each submodule that has been initialized and + checked out in the repository. + -a:: --text:: Process binary files as if they were text. diff --git a/builtin/grep.c b/builtin/grep.c index 8887b6add..f34f16df9 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -18,12 +18,20 @@ #include "quote.h" #include "dir.h" #include "pathspec.h" +#include "submodule.h" static char const * const grep_usage[] = { N_("git grep [] [-e] [...] [[--] ...]"), NULL }; +static const char *super_prefix; +static int recurse_submodules; +static struct argv_array submodule_options = ARGV_ARRAY_INIT; + +static int grep_submodule_launch(struct grep_opt *opt, +const struct grep_source *gs); + #define GREP_NUM_THREADS_DEFAULT 8 static int num_threads; @@ -174,7 +182,10 @@ static void *run(void *arg) break; opt->output_priv = w; - hit |= grep_source(opt, >source); + if (w->source.type == GREP_SOURCE_SUBMODULE) + hit |= grep_submodule_launch(opt, >source); + else + hit |= grep_source(opt, >source); grep_source_clear_data(>source); work_done(w); } @@ -300,6 +311,10 @@ static int grep_sha1(struct grep_opt *opt, const unsigned char *sha1, if (opt->relative && opt->prefix_length) { quote_path_relative(filename + tree_name_len, opt->prefix, ); strbuf_insert(, 0, filename, tree_name_len); + } else if (super_prefix) { + strbuf_add(, filename, tree_name_len); + strbuf_addstr(, super_prefix); + strbuf_addstr(, filename + tree_name_len); } else { strbuf_addstr(, filename); } @@ -328,10 +343,13 @@ static int grep_file(struct grep_opt *opt, const char *filename) { struct strbuf buf = STRBUF_INIT; - if (opt->relative && opt->prefix_length) + if (opt->relative && opt->prefix_length) { quote_path_relative(filename, opt->prefix, ); - else + } else { + if (super_prefix) + strbuf_addstr(, super_prefix); strbuf_addstr(, filename); + } #ifndef NO_PTHREADS if (num_threads) { @@ -378,31 +396,259 @@ static void run_pager(struct grep_opt *opt, const char *prefix) exit(status); } -static int grep_cache(struct grep_opt *opt, const struct pathspec *pathspec, int cached) +static void compile_submodule_options(const struct grep_opt *opt, + const struct pathspec *pathspec, + int cached, int untracked, + int opt_exclude, int use_index, + int pattern_type_arg) +{ + struct grep_pat *pattern; + int i; + + if (recurse_submodules) + argv_array_push(_options, "--recurse-submodules"); + + if (cached) + argv_array_push(_options,
[PATCH 5/5] grep: enable recurse-submodules to work on objects
Teach grep to recursively search in submodules when provided with a object. This allows grep to search a submodule based on the state of the submodule that is present in a commit of the super project. When grep is provided with a object, the name of the object is prefixed to all output. In order to provide uniformity of output between the parent and child processes the option `--parent-basename` has been added so that the child can preface all of it's output with the name of the parent's object instead of the name of the commit SHA1 of the submodule. This changes output from the command `git grep -e. -l --recurse-submodules HEAD` from: HEAD:file :sub/file to: HEAD:file HEAD:sub/file Signed-off-by: Brandon Williams--- Documentation/git-grep.txt | 13 ++-- builtin/grep.c | 67 +++--- t/t7814-grep-recurse-submodules.sh | 44 - tree-walk.c| 17 +- 4 files changed, 125 insertions(+), 16 deletions(-) diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt index 17aa1ba70..386a868c6 100644 --- a/Documentation/git-grep.txt +++ b/Documentation/git-grep.txt @@ -26,7 +26,7 @@ SYNOPSIS [--threads ] [-f ] [-e] [--and|--or|--not|(|)|-e ...] - [--recurse-submodules] + [--recurse-submodules] [--parent-basename] [ [--[no-]exclude-standard] [--cached | --no-index | --untracked] | ...] [--] [...] @@ -91,7 +91,16 @@ OPTIONS --recurse-submodules:: Recursively search in each submodule that has been initialized and - checked out in the repository. + checked out in the repository. When used in combination with the +option the prefix of all submodule output will be the name of + the parent project's object. + +--parent-basename:: + For internal use only. In order to produce uniform output with the + --recurse-submodules option, this option can be used to provide the + basename of a parent's object to a submodule so the submodule + can prefix its output with the parent's name rather than the SHA1 of + the submodule. -a:: --text:: diff --git a/builtin/grep.c b/builtin/grep.c index f34f16df9..bdf1b9089 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -19,6 +19,7 @@ #include "dir.h" #include "pathspec.h" #include "submodule.h" +#include "submodule-config.h" static char const * const grep_usage[] = { N_("git grep [] [-e] [...] [[--] ...]"), @@ -28,6 +29,7 @@ static char const * const grep_usage[] = { static const char *super_prefix; static int recurse_submodules; static struct argv_array submodule_options = ARGV_ARRAY_INIT; +static const char *parent_basename; static int grep_submodule_launch(struct grep_opt *opt, const struct grep_source *gs); @@ -535,6 +537,7 @@ static int grep_submodule_launch(struct grep_opt *opt, { struct child_process cp = CHILD_PROCESS_INIT; int status, i; + const char *end_of_base; struct work_item *w = opt->output_priv; prepare_submodule_repo_env(_array); @@ -545,9 +548,36 @@ static int grep_submodule_launch(struct grep_opt *opt, gs->path); argv_array_push(, "grep"); + /* +* Add basename of parent project +* When performing grep on a object the filename is prefixed +* with the object's name: ':filename'. In order to +* provide uniformity of output we want to pass the name of the +* parent project's object name to the submodule so the submodule can +* prefix its output with the parent's name and not its own SHA1. +*/ + end_of_base = strchr(gs->name, ':'); + if (end_of_base) + argv_array_pushf(, "--parent-basename=%.*s", +(int) (end_of_base - gs->name), +gs->name); + /* Add options */ - for (i = 0; i < submodule_options.argc; i++) + for (i = 0; i < submodule_options.argc; i++) { + /* +* If there is a identifier for the submodule, add the +* rev after adding the submodule options but before the +* pathspecs. To do this we listen for the '--' and insert the +* sha1 before pushing the '--' onto the child process argv +* array. +*/ + if (gs->identifier && + !strcmp("--", submodule_options.argv[i])) { + argv_array_push(, sha1_to_hex(gs->identifier)); + } + argv_array_push(, submodule_options.argv[i]); + } cp.git_cmd = 1; cp.dir = gs->path; @@ -672,12 +702,21 @@ static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec, enum interesting match = entry_not_interesting;
[PATCH 1/5] submodules: add helper functions to determine presence of submodules
Add two helper functions to submodules.c. `is_submodule_initialized()` checks if a submodule has been initialized at a given path and `is_submodule_checked_out()` check if a submodule has been checked out at a given path. Signed-off-by: Brandon Williams--- submodule.c | 39 +++ submodule.h | 2 ++ 2 files changed, 41 insertions(+) diff --git a/submodule.c b/submodule.c index 6f7d883de..029b24440 100644 --- a/submodule.c +++ b/submodule.c @@ -198,6 +198,45 @@ void gitmodules_config(void) } } +/* + * Determine if a submodule has been initialized at a given 'path' + */ +int is_submodule_initialized(const char *path) +{ + int ret = 0; + const struct submodule *module = NULL; + + module = submodule_from_path(null_sha1, path); + + if (module) { + struct strbuf buf = STRBUF_INIT; + char *submodule_url = NULL; + + strbuf_addf(, "submodule.%s.url",module->name); + ret = !git_config_get_string(buf.buf, _url); + + free(submodule_url); + strbuf_release(); + } + + return ret; +} + +/* + * Determine if a submodule has been checked out at a given 'path' + */ +int is_submodule_checked_out(const char *path) +{ + int ret = 0; + struct strbuf buf = STRBUF_INIT; + + strbuf_addf(, "%s/.git", path); + ret = file_exists(buf.buf); + + strbuf_release(); + return ret; +} + int parse_submodule_update_strategy(const char *value, struct submodule_update_strategy *dst) { diff --git a/submodule.h b/submodule.h index d9e197a94..bd039ca98 100644 --- a/submodule.h +++ b/submodule.h @@ -37,6 +37,8 @@ void set_diffopt_flags_from_submodule_config(struct diff_options *diffopt, const char *path); int submodule_config(const char *var, const char *value, void *cb); void gitmodules_config(void); +extern int is_submodule_initialized(const char *path); +extern int is_submodule_checked_out(const char *path); int parse_submodule_update_strategy(const char *value, struct submodule_update_strategy *dst); const char *submodule_strategy_to_string(const struct submodule_update_strategy *s); -- 2.10.1.613.g6021889
[PATCH 3/5] grep: add submodules as a grep source type
Add `GREP_SOURCE_SUBMODULE` as a grep_source type and cases for this new type in the various switch statements in grep.c. When initializing a grep_source with type `GREP_SOURCE_SUBMODULE` the identifier can either be NULL (to indicate that the working tree will be used) or a SHA1 (the REV of the submodule to be grep'd). If the identifier is a SHA1 then we want to fall through to the `GREP_SOURCE_SHA1` case to handle the copying of the SHA1. Signed-off-by: Brandon Williams--- grep.c | 16 +++- grep.h | 1 + 2 files changed, 16 insertions(+), 1 deletion(-) diff --git a/grep.c b/grep.c index 1194d35b5..0dbdc1d00 100644 --- a/grep.c +++ b/grep.c @@ -1735,12 +1735,23 @@ void grep_source_init(struct grep_source *gs, enum grep_source_type type, case GREP_SOURCE_FILE: gs->identifier = xstrdup(identifier); break; + case GREP_SOURCE_SUBMODULE: + if (!identifier) { + gs->identifier = NULL; + break; + } + /* +* FALL THROUGH +* If the identifier is non-NULL (in the submodule case) it +* will be a SHA1 that needs to be copied. +*/ case GREP_SOURCE_SHA1: gs->identifier = xmalloc(20); hashcpy(gs->identifier, identifier); break; case GREP_SOURCE_BUF: gs->identifier = NULL; + break; } } @@ -1760,6 +1771,7 @@ void grep_source_clear_data(struct grep_source *gs) switch (gs->type) { case GREP_SOURCE_FILE: case GREP_SOURCE_SHA1: + case GREP_SOURCE_SUBMODULE: free(gs->buf); gs->buf = NULL; gs->size = 0; @@ -1831,8 +1843,10 @@ static int grep_source_load(struct grep_source *gs) return grep_source_load_sha1(gs); case GREP_SOURCE_BUF: return gs->buf ? 0 : -1; + case GREP_SOURCE_SUBMODULE: + break; } - die("BUG: invalid grep_source type"); + die("BUG: invalid grep_source type to load"); } void grep_source_load_driver(struct grep_source *gs) diff --git a/grep.h b/grep.h index 5856a23e4..267534ca2 100644 --- a/grep.h +++ b/grep.h @@ -161,6 +161,7 @@ struct grep_source { GREP_SOURCE_SHA1, GREP_SOURCE_FILE, GREP_SOURCE_BUF, + GREP_SOURCE_SUBMODULE, } type; void *identifier; -- 2.10.1.613.g6021889
[PATCH 2/5] submodules: load gitmodules file from commit sha1
teach submodules to load a '.gitmodules' file from a commit sha1. This enables the population of the submodule_cache to be based on the state of the '.gitmodules' file from a particular commit. Signed-off-by: Brandon Williams--- cache.h| 2 ++ config.c | 8 submodule-config.c | 6 +++--- submodule-config.h | 3 +++ submodule.c| 12 submodule.h| 1 + 6 files changed, 25 insertions(+), 7 deletions(-) diff --git a/cache.h b/cache.h index f7ee41456..74b0c3cba 100644 --- a/cache.h +++ b/cache.h @@ -1681,6 +1681,8 @@ extern int git_default_config(const char *, const char *, void *); extern int git_config_from_file(config_fn_t fn, const char *, void *); extern int git_config_from_mem(config_fn_t fn, const enum config_origin_type, const char *name, const char *buf, size_t len, void *data); +extern int git_config_from_blob_sha1(config_fn_t fn, const char *name, +const unsigned char *sha1, void *data); extern void git_config_push_parameter(const char *text); extern int git_config_from_parameters(config_fn_t fn, void *data); extern void git_config(config_fn_t fn, void *); diff --git a/config.c b/config.c index 83fdecb1b..4d78e7227 100644 --- a/config.c +++ b/config.c @@ -1214,10 +1214,10 @@ int git_config_from_mem(config_fn_t fn, const enum config_origin_type origin_typ return do_config_from(, fn, data); } -static int git_config_from_blob_sha1(config_fn_t fn, -const char *name, -const unsigned char *sha1, -void *data) +int git_config_from_blob_sha1(config_fn_t fn, + const char *name, + const unsigned char *sha1, + void *data) { enum object_type type; char *buf; diff --git a/submodule-config.c b/submodule-config.c index 098085be6..8b9a2ef28 100644 --- a/submodule-config.c +++ b/submodule-config.c @@ -379,9 +379,9 @@ static int parse_config(const char *var, const char *value, void *data) return ret; } -static int gitmodule_sha1_from_commit(const unsigned char *commit_sha1, - unsigned char *gitmodules_sha1, - struct strbuf *rev) +int gitmodule_sha1_from_commit(const unsigned char *commit_sha1, + unsigned char *gitmodules_sha1, + struct strbuf *rev) { int ret = 0; diff --git a/submodule-config.h b/submodule-config.h index d05c542d2..78584ba6a 100644 --- a/submodule-config.h +++ b/submodule-config.h @@ -29,6 +29,9 @@ const struct submodule *submodule_from_name(const unsigned char *commit_sha1, const char *name); const struct submodule *submodule_from_path(const unsigned char *commit_sha1, const char *path); +extern int gitmodule_sha1_from_commit(const unsigned char *commit_sha1, + unsigned char *gitmodules_sha1, + struct strbuf *rev); void submodule_free(void); #endif /* SUBMODULE_CONFIG_H */ diff --git a/submodule.c b/submodule.c index 029b24440..f2a56689f 100644 --- a/submodule.c +++ b/submodule.c @@ -198,6 +198,18 @@ void gitmodules_config(void) } } +void gitmodules_config_sha1(const unsigned char *commit_sha1) +{ + struct strbuf rev = STRBUF_INIT; + unsigned char sha1[20]; + + if (gitmodule_sha1_from_commit(commit_sha1, sha1, )) { + git_config_from_blob_sha1(submodule_config, rev.buf, + sha1, NULL); + } + strbuf_release(); +} + /* * Determine if a submodule has been initialized at a given 'path' */ diff --git a/submodule.h b/submodule.h index bd039ca98..9a24ac82e 100644 --- a/submodule.h +++ b/submodule.h @@ -37,6 +37,7 @@ void set_diffopt_flags_from_submodule_config(struct diff_options *diffopt, const char *path); int submodule_config(const char *var, const char *value, void *cb); void gitmodules_config(void); +extern void gitmodules_config_sha1(const unsigned char *commit_sha1); extern int is_submodule_initialized(const char *path); extern int is_submodule_checked_out(const char *path); int parse_submodule_update_strategy(const char *value, -- 2.10.1.613.g6021889
Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC
On Thu, Oct 27, 2016 at 3:24 AM, Jeff Kingwrote: > > +cc Linus as the original author of 144bde78e9 in case there is > something subtle I'm missing, but this really just seems like it's > an outdated optimization. I'd *really* like to keep O_NOATIME if at all possible. It made a huge difference on older kernels, and I'm not convinced that relatime really fixes it as well as O_NOATIME. There are people who don't like relatime. And even if you do have relatime enabled, it will update atime once every day, so then this makes your filesystem have a storm of nasty inode writebacks if you haven't touched that git repo in a while. If this is purely about mixing things up with O_CLOEXEC, then that is *trivially* fixable by just using fcntl(fd, F_SETFD, FD_CLOEXEC); after the open. Which is what you have to do anyway if you want to be portable, so its' not like you could avoid that complexity in the first place. Note that you can *not* do the same thing with O_NOATIME, since the whole point of O_NOATIME is that it changes the behavior of the open itself (unlike O_CLOEXEC which changes _later_ behavior, and can always be replaced by FD_CLOEXEC fnclt modulo races that are immaterial for git) Linus
[RFC PATCH 0/5] recursively grep across submodules
This patch series adds some basic api functions to the submodule interface as well as teaching grep to recursively search in submodules. The additions to the submodule interface allow grep to verify that a submodule has been initialized and checked out prior to launching a child process. One issue that still needs to be worked out is when greppig history, you could be in a state where the submodule doesn't have a working tree (or the path you had in the past doesn't match what currently exists) so instead of changing directory into the submdoule you need to look for the .git directory for the submodule in the parents .git/modules directory. If it exists we would need to change directory to .git/modules/ and then run the child process from there. This currently doesn't work due to commit <10f5c52656> since the GIT_DIR env variable is explicitly set to be '.git'. I'm going to spend some more time thinking about this problem and will address it as an additional patch in the series at a later time. As for the rest of the series, it should be ready for review or comments. Brandon Williams (5): submodules: add helper functions to determine presence of submodules submodules: load gitmodules file from commit sha1 grep: add submodules as a grep source type grep: optionally recurse into submodules grep: enable recurse-submodules to work on objects Documentation/git-grep.txt | 14 ++ builtin/grep.c | 364 ++--- cache.h| 2 + config.c | 8 +- git.c | 2 +- grep.c | 16 +- grep.h | 1 + submodule-config.c | 6 +- submodule-config.h | 3 + submodule.c| 51 ++ submodule.h| 3 + t/t7814-grep-recurse-submodules.sh | 141 ++ tree-walk.c| 17 +- 13 files changed, 588 insertions(+), 40 deletions(-) create mode 100755 t/t7814-grep-recurse-submodules.sh -- 2.10.1.613.g6021889
Re: Expanding Includes in .gitignore
On 28/10/16 10:07, Jeff King wrote: > I think it does > introduce compatibility complications. People with older git will not > respect your .gitignore.d files. Whereas $GIT_DIR/info is purely a local > matter. I know. I don't think it's a serious compatibility issue, but more important people may disagree, in which case I'll have to review my stance. I think the pros of it working everywhere outweigh the cons. I will say that I am a proponent of consistency and obviousness. I would rather this worked the same everywhere or it was a completely novel component in $GIT_DIR. I don't think I like the idea much, but it would be possible to magically generate an ignore file from a directory. > But perhaps there is a use case I'm missing. A new repo where the dev tools haven't been standardised yet. New tool, new file. Simple, and somewhat self documenting as a table of contents in .gitignore.d
Re: feature request
On Thu, Oct 27, 2016 at 2:55 PM, John Roodwrote: > Users should be able to configure Git to not send them into a Vim editor. > > When users pull commits, and a new commit needs to be created for a > merge, Git's current way of determining a commit message is to send > the user into a Vim window so that they can write a message. There are > 2 reasons why this might not be the ideal way to prompt for a commit > message. > > 1. Many users are used to writing concise one-line commit messages and > would not expect to save a commit message in a multi-line file. Some > users will wonder why they are in a text editor or which file they are > editing. Others may not, in fact, realize at all that a text editor is > what they are in. Look at the -m option of git commit, git commit -a -m "look a commit with no editor, and a precise one line message" I do not advocate this use though, as I think commit messages should be more wordy. > > 2. Many users are not familiar with Vim, and do not understand how to > modify, save, and exit. It is not very considerate to require a user > to learn Vim in order to finish a commit that they are in the middle > of. That is true, but vi is like the most available editor as a relict from ancient times; as you are on Windows, maybe notepad is the best on that platform. Maybe file a bug/issue at https://github.com/git-for-windows to change the default? > > The existing behavior should be optional, and there should be two new options: > > 1. Use a simple inline prompt for a commit message (in the same way > Git might prompt for a username). > > 2. Automatically assign names for commits in the form of "Merged x into y".
Re: feature request
John Roodwrites: > I suppose I can do git config --global core.editor notepad > However, this really only addresses my second concern. > > My first concern is that using a text editor at all seems like > overkill in many scenarios. Nobody stops you from writing a "type whatever you want; I won't let you edit any mistakes as I am not even a text editor; just hit RETURN when you are done, as you can only write a single line" program and set it as your GIT_EDITOR. I do not know what would happen when you need "git commit --amend", though.
Re: feature request
I suppose I can do git config --global core.editor notepad However, this really only addresses my second concern. My first concern is that using a text editor at all seems like overkill in many scenarios. For that reason, I still think the other two options I mentioned would be beneficial. On Thu, Oct 27, 2016 at 5:05 PM, John Roodwrote: > Unfortunately, in my case I'm on windows (my company's choice, not mine). > > On Thu, Oct 27, 2016 at 5:01 PM, Stefan Beller wrote: >> On Thu, Oct 27, 2016 at 2:55 PM, John Rood wrote: >>> Users should be able to configure Git to not send them into a Vim editor. >> >> See https://git-scm.com/docs/git-var >> >> GIT_EDITOR >> >> Text editor for use by Git commands. The value is meant to be interpreted >> by the shell when it is used. Examples: ~/bin/vi, $SOME_ENVIRONMENT_VARIABLE, >> "C:\Program Files\Vim\gvim.exe" --nofork. The order of preference is the >> $GIT_EDITOR environment variable, then core.editor configuration, then >> $VISUAL, then $EDITOR, and then the default chosen at compile time, >> which is usually vi. >> >> >> So maybe >> >> git config --global core.editor "nano" >> >> helps in your case?
[PATCH] attr: convert to new threadsafe API
This revamps the API of the attr subsystem to be thread safe. Before we had the question and its results in one struct type. The typical usage of the API was static struct git_attr_check *check; if (!check) check = git_attr_check_initl("text", NULL); git_check_attr(path, check); act_on(check->value[0]); This has a couple of issues when it comes to thread safety: * the initialization is racy in this implementation. To make it thread safe, we need to acquire a mutex, such that only one thread is executing the code in git_attr_check_initl. As we do not want to introduce a mutex at each call site, this is best done in the attr code. However to do so, we need to have access to the `check` variable, i.e. the API has to look like git_attr_check_initl(struct git_attr_check**, ...); Then one of the threads calling git_attr_check_initl will acquire the mutex and init the `check`, while all other threads will wait on the mutex just to realize they're late to the party and they'll return with no work done. * While the check for attributes to be questioned only need to be initalized once as that part will be read only after its initialisation, the answer may be different for each path. Because of that we need to decouple the check and the answer, such that each thread can obtain an answer for the path it is currently processing. This commit changes the API and adds locking in git_attr_check_initl that provides the thread safety for constructing `struct git_attr_check`. The usage of the new API will be: /* * The initl call will thread-safely check whether the * struct git_attr_check has been initialized. We only * want to do the initialization work once, hence we do * that work inside a thread safe environment. */ static struct git_attr_check *check; git_attr_check_initl(, "text", NULL); /* We're just asking for one attribute "text". */ git_attr_result myresult[1]; /* Perform the check and act on it: */ git_check_attr(path, check, myresult); act_on(myresult[0].value); /* * No need to free the check as it is static, hence doesn't leak * memory. The result is also static, so no need to free there either. */ Signed-off-by: Stefan Beller--- * use attr_start on Windows to dynamically initialize the Single Big Attr Mutex * rewrote the documentation by Junios guiding comments * interned struct git_attr, i.e. you hand "const char *" to the attr API, and internally it will make sense of it. When asking git_all_attrs, however you still need to know about git_attrs as you need to find out the name of the attribute via `git_attr_name(struct git_attr *)`. If there are no other huge design discussions, I'll reroll the whole series. Thanks, Stefan Documentation/technical/api-gitattributes.txt | 104 +++--- archive.c | 11 +- attr.c| 150 ++ attr.h| 76 +++-- builtin/check-attr.c | 47 builtin/pack-objects.c| 16 +-- compat/mingw.c| 3 + convert.c | 40 +++ ll-merge.c| 24 +++-- userdiff.c| 16 +-- ws.c | 8 +- 11 files changed, 305 insertions(+), 190 deletions(-) diff --git a/Documentation/technical/api-gitattributes.txt b/Documentation/technical/api-gitattributes.txt index 92fc32a..ba04c30 100644 --- a/Documentation/technical/api-gitattributes.txt +++ b/Documentation/technical/api-gitattributes.txt @@ -16,15 +16,19 @@ Data Structure of no interest to the calling programs. The name of the attribute can be retrieved by calling `git_attr_name()`. -`struct git_attr_check_elem`:: - - This structure represents one attribute and its value. - `struct git_attr_check`:: - This structure represents a collection of `git_attr_check_elem`. + This structure represents a collection of `struct git_attrs`. It is passed to `git_check_attr()` function, specifying the - attributes to check, and receives their values. + attributes to check, and receives their values into a corresponding + `struct git_attr_result`. + +`struct git_attr_result`:: + + This structure represents one results for a check, such that an + array of `struct git_attr_results` corresponds to a + `struct git_attr_check`. The answers given in that array are in + the the same order as the check struct. Attribute Values @@ -32,7 +36,7 @@ Attribute Values An attribute for a path can be in one of four states: Set, Unset, Unspecified or set to a string, and `.value` member of `struct -git_attr_check` records
Re: Expanding Includes in .gitignore
On 28/10/16 10:55, Aaron Pelly wrote: > 2) I fetch a repo with a hostile ignore file. It includes files from > $GIT_DIR/test-data/ssl/private or some such. Change. Don't pay > attention. Commit. Push. Problems if my test data comes from production. > > Is this mitigated currently? > > Not that git should be an enabler, but surely it falls on the user of > untrusted software to ensure their own security? Balls, I meant $GIT_WORK_TREE not $GIT_DIR
[PATCH] valgrind: support test helpers
Tests run with --valgrind call git commands through a wrapper script that invokes valgrind on them. This script (valgrind.sh) is in turn invoked through symlinks created for each command in t/valgrind/bin/. Since e6e7530d (test helpers: move test-* to t/helper/ subdirectory) these symlinks have been broken for test helpers -- they point to the old locations in the root of the build directory. Fix that by teaching the code for creating the links about the new location of the binaries, and do the same in the wrapper script to allow it to find its payload. Signed-off-by: Rene Scharfe--- t/test-lib.sh | 9 - t/valgrind/valgrind.sh | 12 ++-- 2 files changed, 18 insertions(+), 3 deletions(-) diff --git a/t/test-lib.sh b/t/test-lib.sh index b859db6..a724181 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -809,7 +809,14 @@ then return; base=$(basename "$1") - symlink_target=$GIT_BUILD_DIR/$base + case "$base" in + test-*) + symlink_target="$GIT_BUILD_DIR/t/helper/$base" + ;; + *) + symlink_target="$GIT_BUILD_DIR/$base" + ;; + esac # do not override scripts if test -x "$symlink_target" && test ! -d "$symlink_target" && diff --git a/t/valgrind/valgrind.sh b/t/valgrind/valgrind.sh index 4215303..669ebaf 100755 --- a/t/valgrind/valgrind.sh +++ b/t/valgrind/valgrind.sh @@ -1,11 +1,19 @@ #!/bin/sh base=$(basename "$0") +case "$base" in +test-*) + program="$GIT_VALGRIND/../../t/helper/$base" + ;; +*) + program="$GIT_VALGRIND/../../$base" + ;; +esac TOOL_OPTIONS='--leak-check=no' test -z "$GIT_VALGRIND_ENABLED" && -exec "$GIT_VALGRIND"/../../"$base" "$@" +exec "$program" "$@" case "$GIT_VALGRIND_MODE" in memcheck-fast) @@ -29,4 +37,4 @@ exec valgrind -q --error-exitcode=126 \ --log-fd=4 \ --input-fd=4 \ $GIT_VALGRIND_OPTIONS \ - "$GIT_VALGRIND"/../../"$base" "$@" + "$program" "$@" -- 2.10.1
Re: feature request
Unfortunately, in my case I'm on windows (my company's choice, not mine). On Thu, Oct 27, 2016 at 5:01 PM, Stefan Bellerwrote: > On Thu, Oct 27, 2016 at 2:55 PM, John Rood wrote: >> Users should be able to configure Git to not send them into a Vim editor. > > See https://git-scm.com/docs/git-var > > GIT_EDITOR > > Text editor for use by Git commands. The value is meant to be interpreted > by the shell when it is used. Examples: ~/bin/vi, $SOME_ENVIRONMENT_VARIABLE, > "C:\Program Files\Vim\gvim.exe" --nofork. The order of preference is the > $GIT_EDITOR environment variable, then core.editor configuration, then > $VISUAL, then $EDITOR, and then the default chosen at compile time, > which is usually vi. > > > So maybe > > git config --global core.editor "nano" > > helps in your case?
Re: feature request
On Thu, Oct 27, 2016 at 2:55 PM, John Roodwrote: > Users should be able to configure Git to not send them into a Vim editor. See https://git-scm.com/docs/git-var GIT_EDITOR Text editor for use by Git commands. The value is meant to be interpreted by the shell when it is used. Examples: ~/bin/vi, $SOME_ENVIRONMENT_VARIABLE, "C:\Program Files\Vim\gvim.exe" --nofork. The order of preference is the $GIT_EDITOR environment variable, then core.editor configuration, then $VISUAL, then $EDITOR, and then the default chosen at compile time, which is usually vi. So maybe git config --global core.editor "nano" helps in your case?
Re: [PATCH] compat: Allow static initializer for pthreads on Windows
> there isn't really a great place to put a dynamic initialization. Answering this question specifically (Where to put it?), I am about to send out a patch that puts it in compat/mingw.c:2232: diff --git a/compat/mingw.c b/compat/mingw.c index 3fbfda5..9881c3d 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -2232,6 +2232,9 @@ void mingw_startup(void) /* initialize critical section for waitpid pinfo_t list */ InitializeCriticalSection(_cs); +/* initialize critical sections in the attr code */ +start_attr(); + /* set up default file mode and file modes for stdin/out/err */ _fmode = _O_BINARY; _setmode(_fileno(stdin), _O_BINARY); If I understood Johannes correctly this is the place to put it. Junio seems to be ok with static mutex init for non windows platforms, so I put it into the mingw specifc startup code. Thanks, Stefan
Re: Expanding Includes in .gitignore
On 28/10/16 09:55, Jeff King wrote: > On Fri, Oct 28, 2016 at 09:28:23AM +1300, Aaron Pelly wrote: > >>> - we parse possibly-hostile .gitignore files from cloned repositories. >>> What happens when I include ask to include /etc/passwd? Probably >>> nothing, but there are setups where it might matter (e.g., something >>> like Travis that auto-builds untrusted repositories, and you could >>> potentially leak the contents of files via error messages). It's >>> nice to avoid the issue entirely. >> >> I understand the issue. >> >> It's not obvious to me how using a .d solves this problem though. > > It doesn't by itself. But we are worried only about tracked .gitignore > files (recall that even repo-level files in $GIT_DIR/info are generated > fresh by the clone process, and don't come from the remote). If we apply > the feature only to core.excludeFile and $GIT_DIR/info/exclude, those > are already under the user's control. The things you say make sense from this perspective. I was hoping to employ this mechanism throughout the git ecosystem. Thinking out loud for a minute: 1) I clone a repo with a hostile ignore file. It includes files from /etc/ssl/private or some such. Change. Don't pay attention. Commit. Push. Problems. What is the use case for reaching out of the repo in the first place? 2) I fetch a repo with a hostile ignore file. It includes files from $GIT_DIR/test-data/ssl/private or some such. Change. Don't pay attention. Commit. Push. Problems if my test data comes from production. Is this mitigated currently? Not that git should be an enabler, but surely it falls on the user of untrusted software to ensure their own security? > It's true that we could make a similar exception for an "include" > feature, and respect include directives only in those "safe" files. > Somehow that seems more confusing to me, though, than doing adding the > feature at the file level, as it introduces slightly varying syntax > between the locations. I'm quickly getting over the include file idea. But yes, that would be non obvious. >>> Whereas letting any of the user- or repo-level exclude files be a >>> directory, and simply reading all of the files inside, seems simple and >>> obvious. >> >> Apart from backwards compatibility, unless there's something I'm missing. > > I'm not sure what you mean. If we make: > > mkdir .git/info/exclude > echo whatever >.git/info/exclude/local > > work, I don't think we have to care about backwards compatibility. That > was nonsensical before, and never did anything useful (so somebody might > have done it, but we can assume anybody relying on it not to read the > contents is crazy). Seeing your perspective, now, I can see why you didn't understand me. In your context this makes perfect sense.
feature request
Users should be able to configure Git to not send them into a Vim editor. When users pull commits, and a new commit needs to be created for a merge, Git's current way of determining a commit message is to send the user into a Vim window so that they can write a message. There are 2 reasons why this might not be the ideal way to prompt for a commit message. 1. Many users are used to writing concise one-line commit messages and would not expect to save a commit message in a multi-line file. Some users will wonder why they are in a text editor or which file they are editing. Others may not, in fact, realize at all that a text editor is what they are in. 2. Many users are not familiar with Vim, and do not understand how to modify, save, and exit. It is not very considerate to require a user to learn Vim in order to finish a commit that they are in the middle of. The existing behavior should be optional, and there should be two new options: 1. Use a simple inline prompt for a commit message (in the same way Git might prompt for a username). 2. Automatically assign names for commits in the form of "Merged x into y".
Re: [PATCH] compat: Allow static initializer for pthreads on Windows
On Thu, Oct 27, 2016 at 1:05 PM, Johannes Sixtwrote: >> The implementation under discussion (well we did not discuss the >> implementation a >> whole lot yet) ... > > > There's not a whole lot to discuss: it must be rewritten from scratch (it's > not just the memory barriers, it is everything else, too). But time is much > better spent on an attr_start() solution. > > -- Hannes > Ok, so I've been reading this thread. I don't understand your objections to emulating in this way.. Could you clearly spell out why you believe this solution isn't acceptable? So far all I've understood was "it's not critical sections" and "it penalizes Windows too much" but... If Windows cannot statically initialize a pthread mutex, then we *have* to dynamically initialize it somewhere. This solution adds a single check before each lock and is safe due to use of memory barriers. Yes, this will cost a tiny bit extra overhead for each use of "pthread_mutex_lock" but I fail to see how that is a huge penalty... So I'm trying to understand if that really is your only concern, because I don't actually buy that it is a huge overhead to pay. I agree with Stefan that there isn't really a great place to put a dynamic initialization. To be clear I am trying to understand the objections since so far all I have read is "this is bad, and I object" without clearly explaining reasoning. Regards, Jake
Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC
Jeff Kingwrites: > +cc Linus as the original author of 144bde78e9 in case there is > something subtle I'm missing, but this really just seems like it's > an outdated optimization. > > -- >8 -- > Subject: [PATCH] sha1_file: stop opening files with O_NOATIME > > When we open object files, we try to do so with O_NOATIME. > This dates back to 144bde78e9 (Use O_NOATIME when opening > the sha1 files., 2005-04-23), which is an optimization to > avoid creating a bunch of dirty inodes when we're accessing > many objects. But a few things have changed since then: > > 1. In June 2005, git learned about packfiles, which means > we would do a lot fewer atime updates (rather than one > per object access, we'd generally get one per packfile). > > 2. In late 2006, Linux learned about "relatime", which is > generally the default on modern installs. So > performance around atimes updates is a non-issue there > these days. > > All the world isn't Linux, but as it turns out, Linux > is the only platform to implement O_NOATIME in the > first place. > > So it's very unlikely that this code is helping anybody > these days. > > It's not a particularly large amount of code, but the > fallback-retry creates complexity. E.g., we do a similar > fallback for CLOEXEC; which one should take precedence, or > should we try all possible combinations? Dropping O_NOATIME > makes those questions go away. > > Signed-off-by: Jeff King > --- We may want to lose the surrounding for (;;) loop as there is only one flag to retry without, which was the original code structure back when 144bde78e9 ("Use O_NOATIME when opening the sha1 files.", 2005-04-23) was written and refactored by 44d1c19ee8 ("Make loose object file reading more careful", 2008-06-14). IOW, this on top. The update to ce_compare_data() Lars has in a0a6cb9662 ("read-cache: make sure file handles are not inherited by child processes", 2016-10-24) could then made into a call to git_open(). sha1_file.c | 23 --- 1 file changed, 8 insertions(+), 15 deletions(-) diff --git a/sha1_file.c b/sha1_file.c index 6f02a57d8b..e18ea053e6 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -1553,24 +1553,17 @@ int check_sha1_signature(const unsigned char *sha1, void *map, int git_open(const char *name) { - static int sha1_file_open_flag = O_CLOEXEC; - - for (;;) { - int fd; - - errno = 0; - fd = open(name, O_RDONLY | sha1_file_open_flag); - if (fd >= 0) - return fd; + static int cloexec = O_CLOEXEC; + int fd; + errno = 0; + fd = open(name, O_RDONLY | cloexec); + if ((cloexec & O_CLOEXEC) && fd < 0 && errno == EINVAL) { /* Try again w/o O_CLOEXEC: the kernel might not support it */ - if ((sha1_file_open_flag & O_CLOEXEC) && errno == EINVAL) { - sha1_file_open_flag &= ~O_CLOEXEC; - continue; - } - - return -1; + cloexec &= ~O_CLOEXEC; + fd = open(name, O_RDONLY | cloexec); } + return fd; } static int stat_sha1_file(const unsigned char *sha1, struct stat *st)
Re: Expanding Includes in .gitignore
On Thu, Oct 27, 2016 at 2:04 PM, Jeff Kingwrote: > I'm not convinced this is needed for in-repo .gitignore files. The point > is that you are pulling together separate files that may be administered > independently. But git repositories inherently have a whole-project > view. I'm not sure that separate files buy you a lot there. And the > compatibility issues are more complicated. > I had just assumed it would be used everywhere, so I hadn't really considered whether that was any reason not to. I personally like the idea of splitting my git ignores into separate files just so each file can be about one thing, but I guess it's not really a problem either way. > I do agree that: > > cd .git/info > git clone /my/exclude/repo exclude ;# or exclude.d > > should work; ignoring dotfiles when reading the directory solves that, > and is a pretty standard solution. > > -Peff
Re: [PATCH 32/36] pathspec: allow querying for attributes
On Wed, Oct 26, 2016 at 6:33 AM, Duy Nguyenwrote: > (sorry if this should have been answered if I went through the series > patch by patch, I wanted to do a proper review but finally have to > admit to myself I won't, so I just skim through a single giant diff > instead) > > On Sun, Oct 23, 2016 at 6:32 AM, Stefan Beller wrote: >> +attr;; >> +After `attr:` comes a space separated list of "attribute >> +requirements", all of which must be met in order for the >> +path to be considered a match; > > What about (attr=abc def,attr=ghi lkj)? Does it mean (abc && def) || > (ghi && lkj), or abc && def && ghi && lkj? Or is it forbidden to have > multiple 'attr' attribute in the same pathspec? Good point. I'll add a test for that. Remembering the original discussion, multiple attrs are forbidden for now as it is unclear what you want to see as a user. To model (abc && def) || (ghi && lkj), you would need to give multiple pathspec items as these are naturally ORed: git ls-files :(attr:abc def) :(attr:ghi lkj) . (compare "git ls-files Makefile README" which gives 2 files to you, that are named respectively.) To get "abc && def && ghi && lkj" you go with git ls-files :(attr:abc def ghi lkj) . as then all things included into this one attr are ANDed. I hope the documentation is clear for one attr. Thanks, Stefan
Re: Expanding Includes in .gitignore
On 28/10/16 10:04, Jeff King wrote: > On Thu, Oct 27, 2016 at 12:48:34PM -0700, Jacob Keller wrote: > >>> I think the normal behavior in such "foo.d" directory is to just sort >>> the contents lexically and read them in order, as if they were all >>> concatenated together, and with no recursion. I.e., behave "as if" the >>> user had run "cat $dir/*". >> >> Yea, this is the normal behavior, and the user is expected to order >> their files lexically such as "00-name", "50-name" and so on. Pretty >> traditional for a lot of newer configurations. > > One thing I will say about this approach is that you can implement it > without any changes in git by doing: > > path=.git/info/exclude > cat $path.d/* >$path > > and I have seen several config mechanisms basically do that (e.g., > Debian packaging for a program that doesn't have its own ".d" mechanism, > but needs to grab config provided by several separate packages). > > The reason to teach that trick to git is convenience; you don't have to > remember to build the master file from its parts because it's done > dynamically whenever git needs to look at it. Precisely. >> One thing to keep in mind would be that we should make sure we can >> handle the .gitignore being a submodule or a git repository, so that >> users could just do something like > > I'm not convinced this is needed for in-repo .gitignore files. The point > is that you are pulling together separate files that may be administered > independently. But git repositories inherently have a whole-project > view. I'm not sure that separate files buy you a lot there. And the > compatibility issues are more complicated. > > I do agree that: > > cd .git/info > git clone /my/exclude/repo exclude ;# or exclude.d > > should work; ignoring dotfiles when reading the directory solves that, > and is a pretty standard solution. You raise a good point about the requirement. I was about to make an argument in favour of it, but I argued myself out of an argument. I will say; why have it work in one place and not others?
Re: [PATCH v2 2/2] convert.c: stream and fast search for binary
tbo...@web.de writes: > From: Torsten Bögershausen> > When statistics are done for the autocrlf handling, the search in > the content can be stopped, if e.g > - a search for binary is done, and a NUL character is found > - a search for CRLF is done, and the first CRLF is found. > > Similar when statistics for binary vs non-binary are gathered: > Whenever a lone CR or NUL is found, the search can be aborted. > > When checking out files in "auto" mode, any file that has a "lone CR" > or a CRLF will not be converted, so the search can be aborted early. > > Add the new bit, CONVERT_STAT_BITS_ANY_CR, > which is set for either lone CR or CRLF. > > Many binary files have a NUL very early and it is often not necessary > to load the whole content of a file or blob into memory. > > Split gather_stats() into gather_all_stats() and gather_stats_partly() > to do a streaming handling for blobs and files in the worktree. > > Signed-off-by: Torsten Bögershausen > --- I'll try to review in reverse order, as this seems to be doing too many things at once and cannot get my head around it without going top down. > @@ -338,11 +401,15 @@ static int crlf_to_worktree(const char *path, const > char *src, size_t len, > { > char *to_free = NULL; > struct text_stat stats; > + unsigned search_only = 0; > > if (!len || output_eol(crlf_action) != EOL_CRLF) > return 0; > > - gather_stats(src, len, ); > + if (crlf_action == CRLF_AUTO || crlf_action == CRLF_AUTO_CRLF) > + search_only = CONVERT_STAT_BITS_ANY_CR | CONVERT_STAT_BITS_BIN; > + > + gather_all_stats(src, len, , search_only); > if (!will_convert_lf_to_crlf(len, , crlf_action)) > return 0; This special case to decide whether we would limit the search_only flag has too much intimate knowledge of what happens inside will_convert_lf_to_crlf(). It knows that output_eol(crlf_action) not being EOL_CRLF is the very first thing the function checks, too. Makes one wonder if the check for output_eol(crlf_action) can be removed from will_convert_lf_to_crlf(), no? It is not apparent if that is a good idea for the other caller in crlf_to_git(). gather_all_stats() will give up immediately when it sees either ANY_CR or BIN. If CR appears before we see any BIN, stat_bits would not have BITS_BIN even if the buffer may have BIN byte later. It is OK because either lonecr or crlf would be non-zero, and will_convert_lf_to_crlf() would return 0. If BIN apepars before we see any CR, neither lonecr nor crlf will become non-zero even if the buffer may have CR byte later, but again it is OK because will_convert_lf_to_crlf() will return 0 in that case. This looks too brittle, even though it is correct. > @@ -253,7 +300,8 @@ static int crlf_to_git(const char *path, const char *src, > size_t len, > { > struct text_stat stats; > char *dst; > - int convert_crlf_into_lf; > + int has_crlf_to_convert; > + unsigned search_only = 0; > > if (crlf_action == CRLF_BINARY || > (src && !len)) > @@ -266,12 +314,16 @@ static int crlf_to_git(const char *path, const char > *src, size_t len, > if (!buf && !src) > return 1; > > - gather_stats(src, len, ); > + if (crlf_action == CRLF_AUTO || crlf_action == CRLF_AUTO_INPUT || > crlf_action == CRLF_AUTO_CRLF) > + search_only = CONVERT_STAT_BITS_BIN; > + > + gather_all_stats(src, len, , search_only); > + > /* Optimization: No CRLF? Nothing to convert, regardless. */ > - convert_crlf_into_lf = !!stats.crlf; > + has_crlf_to_convert = !!stats.crlf; The comment here may need to say a lot more, now we do not even count .crlf in some cases because of "search_only" setting. > if (crlf_action == CRLF_AUTO || crlf_action == CRLF_AUTO_INPUT || > crlf_action == CRLF_AUTO_CRLF) { The new "search_only" criteria above was added to match this if block; it is not as bad as the previous one in crlf_to_worktree() that knows, and must be kept in sync with, what a separate function will_convert_lf_to_crlf() does, but still it is horrible for both maintainability and readability. Can you devise some mechanism to ensure that these two if statements will stay in sync? > - if (convert_is_binary(len, )) > + if (stats.stat_bits & CONVERT_STAT_BITS_BIN) > return 0; We no longer need the helper function convert_is_binary() and instead need only STATS_BITS_BIN bit, so the control flow that reaches this point is obviously correct (assuming that gather_all_stats() that is limited with search_only option counts things correctly, that is). But what happens when we don't return here? We didn't get the full stats out of gather_all_stats() and continue. Let's see what happens in that case... > /* >* If the file in the index has any CR in it, do not convert. > @@ -280,24 +332,35 @@ static int
Re: Expanding Includes in .gitignore
On Thu, Oct 27, 2016 at 04:55:08PM -0400, Jeff King wrote: > On Fri, Oct 28, 2016 at 09:28:23AM +1300, Aaron Pelly wrote: > > > > - we parse possibly-hostile .gitignore files from cloned repositories. > > > What happens when I include ask to include /etc/passwd? Probably > > > nothing, but there are setups where it might matter (e.g., something > > > like Travis that auto-builds untrusted repositories, and you could > > > potentially leak the contents of files via error messages). It's > > > nice to avoid the issue entirely. > > > > I understand the issue. > > > > It's not obvious to me how using a .d solves this problem though. > > It doesn't by itself. But we are worried only about tracked .gitignore > files (recall that even repo-level files in $GIT_DIR/info are generated > fresh by the clone process, and don't come from the remote). If we apply > the feature only to core.excludeFile and $GIT_DIR/info/exclude, those > are already under the user's control. > > It's true that we could make a similar exception for an "include" > feature, and respect include directives only in those "safe" files. > Somehow that seems more confusing to me, though, than doing adding the > feature at the file level, as it introduces slightly varying syntax > between the locations. Actually, I suppose even a ".gitignore.d" inside the repo solves that problem, too, because the repository is not specifying paths, only content (it can specify symlinks outside the repository, but like other parts of git, we should be careful not to follow them in that case). However, as I said elsewhere, I'm not convinced this feature is all that helpful for in-repository .gitignore files, and I think it does introduce compatibility complications. People with older git will not respect your .gitignore.d files. Whereas $GIT_DIR/info is purely a local matter. But perhaps there is a use case I'm missing. -Peff
Re: Expanding Includes in .gitignore
On Thu, Oct 27, 2016 at 12:48:34PM -0700, Jacob Keller wrote: > > I think the normal behavior in such "foo.d" directory is to just sort > > the contents lexically and read them in order, as if they were all > > concatenated together, and with no recursion. I.e., behave "as if" the > > user had run "cat $dir/*". > > Yea, this is the normal behavior, and the user is expected to order > their files lexically such as "00-name", "50-name" and so on. Pretty > traditional for a lot of newer configurations. One thing I will say about this approach is that you can implement it without any changes in git by doing: path=.git/info/exclude cat $path.d/* >$path and I have seen several config mechanisms basically do that (e.g., Debian packaging for a program that doesn't have its own ".d" mechanism, but needs to grab config provided by several separate packages). The reason to teach that trick to git is convenience; you don't have to remember to build the master file from its parts because it's done dynamically whenever git needs to look at it. > One thing to keep in mind would be that we should make sure we can > handle the .gitignore being a submodule or a git repository, so that > users could just do something like I'm not convinced this is needed for in-repo .gitignore files. The point is that you are pulling together separate files that may be administered independently. But git repositories inherently have a whole-project view. I'm not sure that separate files buy you a lot there. And the compatibility issues are more complicated. I do agree that: cd .git/info git clone /my/exclude/repo exclude ;# or exclude.d should work; ignoring dotfiles when reading the directory solves that, and is a pretty standard solution. -Peff
Re: Expanding Includes in .gitignore
On 28/10/16 08:48, Jacob Keller wrote: > I would strongly prefer rc.d style directories either with a "if the > .gitignore is a directory treat it like rc.d" or even "add support for > .gitignore.d as well as .gitignore" I think adding .gitignore.d shouldn't break existing systems, is intuitive, and solves my issue. Does git know when it is in a repo that is too new to comprehend? My current thinking is that anywhere a .gitignore can go, so can a .gitignore.d (named appropriately of course.) Any existing .gitignore should take precedence to the result of parsing the directory. I haven't looked at the implementation of precedence yet, but I'd be surprised if the existing mechanism can't be employed. > One thing to keep in mind would be that we should make sure we can > handle the .gitignore being a submodule or a git repository, so that > users could just do something like > > "git submodule add .gitignore and then track git ignore > contents from a repository in a nice way. > > By this I mean that the reading of files in .gitignore directory > should exclude reading .git or other hidden files in some documented > manor so as to avoid problems when linking to a git directory for its > contents. Nice! I like this a lot.
Re: [PATCH 17/36] attr: expose validity check for attribute names
On Mon, Oct 24, 2016 at 2:07 PM, Stefan Bellerwrote: > On Sun, Oct 23, 2016 at 8:07 AM, Ramsay Jones > wrote: >> >> >> On 23/10/16 00:32, Stefan Beller wrote: >>> From: Junio C Hamano >>> >>> Export attr_name_valid() function, and a helper function that >>> returns the message to be given when a given pair >>> is not a good name for an attribute. >>> >>> We could later update the message to exactly spell out what the >>> rules for a good attribute name are, etc. >>> >>> Signed-off-by: Junio C Hamano >>> Signed-off-by: Stefan Beller >>> --- >> >> [snip] >> >>> +extern int attr_name_valid(const char *name, size_t namelen); >>> +extern void invalid_attr_name_message(struct strbuf *, const char *, int); >>> + >> >> The symbol 'attr_name_valid()' is not used outside of attr.c, even >> by the end of this series. Do you expect this function to be used >> in any future series? (The export is deliberate and it certainly >> seems like it should be part of the public interface, but ...) >> refactoring this series again, I will make use of attr_name_valid outside of attr.c, so I'll keep this patch.
Re: Expanding Includes in .gitignore
On Fri, Oct 28, 2016 at 09:28:23AM +1300, Aaron Pelly wrote: > > - we parse possibly-hostile .gitignore files from cloned repositories. > > What happens when I include ask to include /etc/passwd? Probably > > nothing, but there are setups where it might matter (e.g., something > > like Travis that auto-builds untrusted repositories, and you could > > potentially leak the contents of files via error messages). It's > > nice to avoid the issue entirely. > > I understand the issue. > > It's not obvious to me how using a .d solves this problem though. It doesn't by itself. But we are worried only about tracked .gitignore files (recall that even repo-level files in $GIT_DIR/info are generated fresh by the clone process, and don't come from the remote). If we apply the feature only to core.excludeFile and $GIT_DIR/info/exclude, those are already under the user's control. It's true that we could make a similar exception for an "include" feature, and respect include directives only in those "safe" files. Somehow that seems more confusing to me, though, than doing adding the feature at the file level, as it introduces slightly varying syntax between the locations. > > Whereas letting any of the user- or repo-level exclude files be a > > directory, and simply reading all of the files inside, seems simple and > > obvious. > > Apart from backwards compatibility, unless there's something I'm missing. I'm not sure what you mean. If we make: mkdir .git/info/exclude echo whatever >.git/info/exclude/local work, I don't think we have to care about backwards compatibility. That was nonsensical before, and never did anything useful (so somebody might have done it, but we can assume anybody relying on it not to read the contents is crazy). > > It > > can't handle all cases (some items in "00foo" want precedence over "01bar" > > and vice versa), but I don't think there's an easy solution. That's a > > good sign that one or more of the files should be broken up. > > I've been burned by this myself by packages interfering with each other > in /etc/sysctl.d > > Could we put this down to caveat emptor? I think this sorting should be > intuitive to most people these days, and simple to document and comprehend. Yeah, I think caveat emptor is fine there. Sorry if I implied otherwise. -Peff
Re: [PATCH] git-svn: do not reuse caches memoized for a different architecture
Junio C Hamanowrote: > Just peeking from the sideline, but the your squash looks like an > improvement to me. Thanks. > Hopefully the final version after your interaction with Dscho can > come to me via another "pull this now"? Not sure if I'll be online the next few days, but I've preeptively pushed the patch + squash to my repo: The following changes since commit 2cc2e70264e0fcba04f9ef791d144bbc8b501206: Eleventh batch for 2.11 (2016-10-26 13:28:47 -0700) are available in the git repository at: git://bogomips.org/git-svn.git svn-cache for you to fetch changes up to a2c761ce5b7a5fd8b505b036f3509a9e6617dee8: git-svn: do not reuse caches memoized for a different architecture (2016-10-27 20:17:36 +) Gavin Lambert (1): git-svn: do not reuse caches memoized for a different architecture perl/Git/SVN.pm | 12 +++- 1 file changed, 11 insertions(+), 1 deletion(-)
Re: Drastic jump in the time required for the test suite
Johannes Schindelinwrote: > I know you are a fan of testing things thoroughly in the test suite, but I > have to say that it is getting out of hand, in particular due to our > over-use of shell script idioms (which really only run fast on Linux, not > a good idea for a portable software). How much effort would it take to optimize a /bin/sh? Would replacing uses of fork+execve posix_spawn be fast and portable enough? Even on Linux, performance sucks for me. I've been hoping dash can use posix_spawn (or using vfork directly) to see if that can help things. That won't help with subshells, though... (I'm back to using a Centrino laptop from 2005)
Re: Expanding Includes in .gitignore
On 27/10/16 23:50, Jeff King wrote: > I'd shy away from an actual include directive, as it raises a lot of > complications: I'm leaning that way now too. > - we parse possibly-hostile .gitignore files from cloned repositories. > What happens when I include ask to include /etc/passwd? Probably > nothing, but there are setups where it might matter (e.g., something > like Travis that auto-builds untrusted repositories, and you could > potentially leak the contents of files via error messages). It's > nice to avoid the issue entirely. I understand the issue. It's not obvious to me how using a .d solves this problem though. > - finding a backwards-compatible syntax using .d directories solves this nicely in my opinion > Whereas letting any of the user- or repo-level exclude files be a > directory, and simply reading all of the files inside, seems simple and > obvious. Apart from backwards compatibility, unless there's something I'm missing. > If you go that route, it probably makes sense to teach > gitattributes the same trick. Understood. I'll keep that in mind. >> In the case of a directory the plan would be to add links to files >> stored/sourced elsewhere. This does pose a precedence question which I >> haven't thought about yet, but probably makes it too hard for the >> limited value it brings. > > I think the normal behavior in such "foo.d" directory is to just sort > the contents lexically and read them in order, as if they were all > concatenated together, and with no recursion. I.e., behave "as if" the > user had run "cat $dir/*". > > That lets you handle precedence via the filenames (or symlink names). That was my thinking at first, but I didn't want to bias the discussion. > It > can't handle all cases (some items in "00foo" want precedence over "01bar" > and vice versa), but I don't think there's an easy solution. That's a > good sign that one or more of the files should be broken up. I've been burned by this myself by packages interfering with each other in /etc/sysctl.d Could we put this down to caveat emptor? I think this sorting should be intuitive to most people these days, and simple to document and comprehend.
Re: [PATCH v15 01/27] bisect--helper: use OPT_CMDMODE instead of OPT_BOOL
On Thu, Oct 27, 2016 at 6:59 PM, Junio C Hamanowrote: > Cc'ed those who touched either "git-bisect.sh" or "builtin/bisect-helper.c" > in our relatively recent past. > > Does any of you (and others on the list) have time and inclination > to review this series? As part of my mentoring Pranit for the GSoC I already took a look at those patches some months ago on GitHub. I could take another look at them but my eyes will not be fresh anymore, so I don know if it will be valuable.
Re: [PATCH] compat: Allow static initializer for pthreads on Windows
Am 27.10.2016 um 21:08 schrieb Stefan Beller: On Thu, Oct 27, 2016 at 11:49 AM, Junio C Hamanowrote: Johannes Sixt writes: Am 27.10.2016 um 19:01 schrieb Stefan Beller: ... It is not possible to mark a mutex uninitialized on Windows without an extra piece of data. A solution would become quite complicated quite quickly, and at the cost of additional operations that are in the same ballpark as an uncontended mutex. I'm not enthused. The positive aspect of this way the patch proposes would be that any future contributor not knowing the details of how to do mutexes right on Windows, would not totally break the whole system, i.e. this seems to be more maintainable in the future as it reduces the friction between pthreads mutexes and the way we can do things in Git in a platform independent way This is a non-argument. Coders have to know their tools. Windows is not my tool. The tool I meant is pthreads. For example, you can't have a pthread_mutex_t variable and not initialize it with either PTHREAD_MUTEX_INITIALIZER or pthread_mutex_init. The codebase should strive to give coders a coherent abstraction that can be implemented efficiently on platforms, so that coders do not have to care too deeply about quirks that exist on individual platforms. Currently working as a coder I care about "submodules, that work on linux." I do not care about Windows in the big picture. I don't know what you meant to say with this sentence, but taken literally, are you on the right ship here, I have to ask? I am however willing to go an extra step to not break Windows. "Not break Windows" is equivalent to "make it work on Windows", mind you. We can't have a new feature only on Linux when there is no reason not to have it on Windows as well. Sorry, "Git is for Unix only" is long over. However that requires a little bit of effort from both me and you: * I need to be aware of what I cannot do with "not-my-tools". (So please somebody tell me, also in the future when I break obscure platforms. Mind that I don't equate obscure with not widely used or in any other way negative. It's just that people working on linux find some quirks on Windows not "obvious") That goes without saying. * the workaround should not be too time consuming in the bigger picture, That, however, is wishful thinking. If you want to have a feature dearly, you have to make it work, and that may take its time. I'm not saying that *you* have to make it work (there are platform experts around to lend a hand), but just an extra step to "not break" an important platform is not enough. which is why I propose to make the API a bit clearer by emulating posix mutexes harder. (From a Windows POV this might sound like making it more obscure because posix mutexes itself are obscure.) So, when you think that POSIX mutexes are obscure, why don't we settle on the simpler Windows critical sections? Emulating them with pthreads should be child's play. The implementation under discussion (well we did not discuss the implementation a whole lot yet) ... There's not a whole lot to discuss: it must be rewritten from scratch (it's not just the memory barriers, it is everything else, too). But time is much better spent on an attr_start() solution. -- Hannes
Re: [PATCH v2 1/2] read-cache: factor out get_sha1_from_index() helper
tbo...@web.de writes: > From: Torsten Bögershausen> > Factor out the retrieval of the sha1 for a given path in > read_blob_data_from_index() into the function get_sha1_from_index(). > > This will be used in the next commit, when convert.c can do the > analyze for "text=auto" without slurping the whole blob into memory > at once. > > Add a wrapper definition get_sha1_from_cache(). > > Signed-off-by: Torsten Bögershausen > --- > cache.h | 3 +++ > read-cache.c | 29 ++--- > 2 files changed, 21 insertions(+), 11 deletions(-) > > diff --git a/cache.h b/cache.h > index 1604e29..04de209 100644 > --- a/cache.h > +++ b/cache.h > @@ -380,6 +380,7 @@ extern void free_name_hash(struct index_state *istate); > #define unmerge_cache_entry_at(at) unmerge_index_entry_at(_index, at) > #define unmerge_cache(pathspec) unmerge_index(_index, pathspec) > #define read_blob_data_from_cache(path, sz) > read_blob_data_from_index(_index, (path), (sz)) > +#define get_sha1_from_cache(path) get_sha1_from_index (_index, (path)) Should have caught this earlier, but there is an extra SP after "from_index" which I'll remove (the topic is not in 'next' yet, lucky us). I re-read this to ensure that it does not break read_blob_data_from_index() the new function borrows the logic from. Thanks.
Re: Expanding Includes in .gitignore
On Thu, Oct 27, 2016 at 3:50 AM, Jeff Kingwrote: > On Thu, Oct 27, 2016 at 01:22:43PM +1300, Aaron Pelly wrote: > >> The use case for this is where I did not write my own rules, but I want >> to keep them updated. https://github.com/github/gitignore is a damn good >> resource, but I want to pull it and include relevant bits project by >> project and/or system wide. I don't want to have to update many projects >> manually if that, or any other, repo changes. > > That seems like a reasonable thing to want. > Agreed, this seems useful to me. >> A very brief look at dir.c would indicate that a recursive call from >> add_excludes to itself when it parses some sort of include tag would do >> it within a file. I'm sure it'd be pretty straight forward to hook into >> something in dir.c to parse directories too. >> >> I'm thinking something like ". path/to/include/file" in an ignore file, >> and/or creating .gitignore.d and/or allowing $HOME/.config/git/ignore >> and $GIT_DIR/info/exclude to be directories. Or some sane and consistent >> mixture of these things. > > I'd shy away from an actual include directive, as it raises a lot of > complications: > > - as you noted, cycles in the include graph need to be detected and > broken > > - we parse possibly-hostile .gitignore files from cloned repositories. > What happens when I include ask to include /etc/passwd? Probably > nothing, but there are setups where it might matter (e.g., something > like Travis that auto-builds untrusted repositories, and you could > potentially leak the contents of files via error messages). It's > nice to avoid the issue entirely. > > - finding a backwards-compatible syntax > > Whereas letting any of the user- or repo-level exclude files be a > directory, and simply reading all of the files inside, seems simple and > obvious. If you go that route, it probably makes sense to teach > gitattributes the same trick. Yep that would be easier and simpler. > >> In the case of a directory the plan would be to add links to files >> stored/sourced elsewhere. This does pose a precedence question which I >> haven't thought about yet, but probably makes it too hard for the >> limited value it brings. > > I think the normal behavior in such "foo.d" directory is to just sort > the contents lexically and read them in order, as if they were all > concatenated together, and with no recursion. I.e., behave "as if" the > user had run "cat $dir/*". Yea, this is the normal behavior, and the user is expected to order their files lexically such as "00-name", "50-name" and so on. Pretty traditional for a lot of newer configurations. > > That lets you handle precedence via the filenames (or symlink names). It > can't handle all cases (some items in "00foo" want precedence over "01bar" > and vice versa), but I don't think there's an easy solution. That's a > good sign that one or more of the files should be broken up. > Yea if you have these inter-dependent relationships it can't be expressed, but then neither can it really be expressed in the simple "include" case above. I would strongly prefer rc.d style directories either with a "if the .gitignore is a directory treat it like rc.d" or even "add support for .gitignore.d as well as .gitignore" One thing to keep in mind would be that we should make sure we can handle the .gitignore being a submodule or a git repository, so that users could just do something like "git submodule add .gitignore and then track git ignore contents from a repository in a nice way. By this I mean that the reading of files in .gitignore directory should exclude reading .git or other hidden files in some documented manor so as to avoid problems when linking to a git directory for its contents. Thanks, Jake > -Peff
Re: [PATCH] git-svn: do not reuse caches memoized for a different architecture
Eric Wongwrites: > Johannes Schindelin wrote: >> +++ b/perl/Git/SVN.pm >> @@ -1658,6 +1658,11 @@ sub tie_for_persistent_memoization { >> if ($memo_backend > 0) { >> tie %$hash => 'Git::SVN::Memoize::YAML', "$path.yaml"; >> } else { >> +# first verify that any existing file can actually be loaded >> +# (it may have been saved by an incompatible version) >> +if (-e "$path.db") { >> +unlink "$path.db" unless eval { retrieve("$path.db"); 1 >> }; >> +} > > That retrieve() call is unlikely to work without "use Storable" > to import it into the current package. > > I also favor setting "$path.db" once to detect typos and avoid > going over 80 columns. Additionally, having error-checking for > unlink might be useful. > > So perhaps squashing this on top: > > diff --git a/perl/Git/SVN.pm b/perl/Git/SVN.pm > index 025c894..b3c1460 100644 > --- a/perl/Git/SVN.pm > +++ b/perl/Git/SVN.pm > @@ -1660,10 +1660,15 @@ sub tie_for_persistent_memoization { > } else { > # first verify that any existing file can actually be loaded > # (it may have been saved by an incompatible version) > - if (-e "$path.db") { > - unlink "$path.db" unless eval { retrieve("$path.db"); 1 > }; > + my $db = "$path.db"; > + if (-e $db) { > + use Storable qw(retrieve); > + > + if (!eval { retrieve($db); 1 }) { > + unlink $db or die "unlink $db failed: $!"; > + } > } > - tie %$hash => 'Memoize::Storable', "$path.db", 'nstore'; > + tie %$hash => 'Memoize::Storable', $db, 'nstore'; > } > } > > > Thoughts? Thanks. Just peeking from the sideline, but the your squash looks like an improvement to me. Hopefully the final version after your interaction with Dscho can come to me via another "pull this now"? Thanks.
Re: [PATCH] compat: Allow static initializer for pthreads on Windows
On Thu, Oct 27, 2016 at 11:49 AM, Junio C Hamanowrote: > Johannes Sixt writes: > >> Am 27.10.2016 um 19:01 schrieb Stefan Beller: >> ... >> It is not possible to mark a mutex uninitialized on Windows without an >> extra piece of data. A solution would become quite complicated quite >> quickly, and at the cost of additional operations that are in the same >> ballpark as an uncontended mutex. I'm not enthused. >> >>> The positive aspect of this way the patch proposes would be that any >>> future contributor not knowing the details of how to do mutexes right >>> on Windows, would not totally break the whole system, i.e. this seems >>> to be more maintainable in the future as it reduces the friction between >>> pthreads mutexes and the way we can do things in Git in a platform >>> independent way >> >> This is a non-argument. Coders have to know their tools. Windows is not my tool. > > The codebase should strive to give coders a coherent abstraction > that can be implemented efficiently on platforms, so that coders do > not have to care too deeply about quirks that exist on individual > platforms. Currently working as a coder I care about "submodules, that work on linux." I do not care about Windows in the big picture. I am however willing to go an extra step to not break Windows. However that requires a little bit of effort from both me and you: * I need to be aware of what I cannot do with "not-my-tools". (So please somebody tell me, also in the future when I break obscure platforms. Mind that I don't equate obscure with not widely used or in any other way negative. It's just that people working on linux find some quirks on Windows not "obvious") * the workaround should not be too time consuming in the bigger picture, which is why I propose to make the API a bit clearer by emulating posix mutexes harder. (From a Windows POV this might sound like making it more obscure because posix mutexes itself are obscure.) > > It is OK to argue that the particular solution Stefan lifted from > somewhere (perhaps msdn article he cited???) A stack overflow article that I found with my search engine of choice, because I could not believe that Windows cannot have statically initialized mutexes. > does not qualify as > such an abstraction. The implementation under discussion (well we did not discuss the implementation a whole lot yet) may even contain an error as the first memory barrier needs to be in front of the first condition. > But I do not agree with your "Coders have to > know" as a blanket statement. Well I do to some extent, I just disagree what my tools are.
Re: [PATCH] compat: Allow static initializer for pthreads on Windows
Johannes Sixtwrites: > Am 27.10.2016 um 19:01 schrieb Stefan Beller: > ... > It is not possible to mark a mutex uninitialized on Windows without an > extra piece of data. A solution would become quite complicated quite > quickly, and at the cost of additional operations that are in the same > ballpark as an uncontended mutex. I'm not enthused. > >> The positive aspect of this way the patch proposes would be that any >> future contributor not knowing the details of how to do mutexes right >> on Windows, would not totally break the whole system, i.e. this seems >> to be more maintainable in the future as it reduces the friction between >> pthreads mutexes and the way we can do things in Git in a platform >> independent way > > This is a non-argument. Coders have to know their tools. The codebase should strive to give coders a coherent abstraction that can be implemented efficiently on platforms, so that coders do not have to care too deeply about quirks that exist on individual platforms. It is OK to argue that the particular solution Stefan lifted from somewhere (perhaps msdn article he cited???) does not qualify as such an abstraction. But I do not agree with your "Coders have to know" as a blanket statement.
Re: [PATCH] Update git rebase documentation to clarify HEAD behavior
From: "Junio C Hamano"Cody Sehl writes: The first few paragraphs in the git-rebase.txt documentation lay out the steps git takes during a rebase: 1. everything from `..HEAD` is saved to a temporary area 2. `HEAD` is set to `` 3. the changes held in the temporary area are applied one by one in order on top of the new `HEAD` The second step was described using the phrase `The current branch is reset to `, which is true (because `HEAD` == current branch), but not clear. --- Please wrap your lines to reasonable lengths like 70 columns or so. Please sign off your patch. Documentation/git-rebase.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt index de222c8..c47ca11 100644 --- a/Documentation/git-rebase.txt +++ b/Documentation/git-rebase.txt @@ -33,7 +33,7 @@ of commits that would be shown by `git log ..HEAD`; or by description on `--fork-point` below); or by `git log HEAD`, if the `--root` option is specified. -The current branch is reset to , or if the +HEAD is reset to , or if the --onto option was supplied. This has the exact same effect as `git reset --hard ` (or ). ORIG_HEAD is set to point at the tip of the branch before the reset. This is describing an ancient behaviour before 6fd2f5e60d ("rebase: operate on a detached HEAD", 2007-11-08) in v1.5.4 timeframe. We apparently failed to update the description. This depends on the desired technical detail of the description, but a correct rewrite would be "HEAD is detached at , or , and ORIG_HEAD is set to point at the tip of the branch before this happens". Detaching the HEAD at no longer has the same effect as "git reset --hard " (or ), so that sentence must go. It was the primary point of the ancient change at 6fd2f5e60d after all. And then there is a new step (to be numbered 4. in your description in the proposed log message), which updates the tip of the branch to the resulting HEAD (after replaying all these changes) and check the branch out, which needs to be added. Perhaps after "one by one, in order." Oh, the mention of "reapplied to the current branch" also needs to be updated to "reapplied to the detached HEAD", too. On the other hand, if we do not aim for that deep level of technical correctness, but want to tell a white lie to make it easier to understand at the conceptual level to new readers who haven't grasped the detached HEAD, then the current description is fine. By bringing up "HEAD", you seem to be aiming for techincal correctness (which I tend to agree is a good direction to go in this part of the documentation), so the existing text needs a bit more work than your patch to be brought to the modern world. A third option is to start with a short "Conceptually, .." description which gives that summary overview, and then have a few "In detail, .." paragraphs that cover the ORIG_HEAD and Detatched HEAD, etc., that users should at least be aware of (so they know they can come back and read the gory details when they need it;-) -- Philip
Re: [PATCH] Documentation/git-diff: document git diff with 3+ commits
Michael J Gruberwrites: > It did not exist, but even at that point in time, "git log A..B" listed > only commits between the merge base and B, not those which are only in A > and not in B. Whereas "git diff A B" shows the differences between the > endpoints A and B. I think you are speaking with 20/20 hindsight at this point. Do you genuinely think you would have thought that way when there weren't any concept of "merge-base" or "git log A...B" known to you? >>> @@ -12,6 +12,7 @@ SYNOPSIS >>> 'git diff' [options] [] [--] [...] >>> 'git diff' [options] --cached [] [--] [...] >>> 'git diff' [options] [--] [...] >>> +'git diff' [options][...] >> >> Made me wonder "is [...] 0-or-more As or 1-or-more As?". > > 0-or-more, at least that's the way it is used in all lines here. > >> Don't we allow pathspecs in this case? > > Yes, the combinded diff mode kicks in only with no blobs (not: no > pathspec, which I had misread) and N>=3 commits. Maybe I should update > the code comments in builtin/diff.c to describe this, too. I don't know about the code comments, but the above SYNOPSIS needs an update, I would think. That is why I wondered if [...] were 0-or-more or 1-or-moer. You can replace >>> 'git diff' [options] [--] [...] with >>> 'git diff' [options] [...] [--] [...] without adding a new line to cover the new(ly discovered) case.
Re: [PATCH 32/36] pathspec: allow querying for attributes
Stefan Bellerwrites: > The pathspec mechanism is extended via the new > ":(attr:eol=input)pattern/to/match" syntax to filter paths so that it > requires paths to not just match the given pattern but also have the > specified attrs attached for them to be chosen. I was looking at preload-index.c and its history again, because I wanted to ensure that Lars's process filter stuff introduces no unexpected interactions with getting multi-threaded, similar to the problem we had in the earlier incarnations of this step, which we worked around with "pathspec: disable preload-index when attribute pathspec magic is in use". I think that Lars's series is safe, because the only thing among what preload-index.c::preload_thread() does that can go outside the simple stat data and pattern matching with the pathname is the racy-git check in ie_match_stat(), and the object store access in that call was explicitly disabled by 7c4ea599b0 ("Fix index preloading for racy dirty case", 2008-11-17) long time ago. By passing CE_MATCH_RACY_IS_DIRTY option to the call, this caller effectively says "A cache entry whose stat data matches may be actually dirty when the timestamp is racy, in which case we usually compare data to determine if it really is clean, but it is OK to err on the safe and lazy side by declaring it dirty and not marking it up-to-date while we are preloading. Do not bother to go to the object store". The reason why I am bringing this up in this discussion thread on this patch is because I wonder if we would benefit by a similar "let's not do too involved things and be cheap by erring on the safe and lazy side" strategy in the call to ce_path_match() call made in this function to avoid making calls to the attr subsystem. In other words, would it help the system by either simplifying the processing done or reducing the cycle spent in preload_thread() if we could tell ce_path_match() "A pathspec we are checking may require not just the pattern to match but also attributes given to the path to satisfy the criteria, but for the purpose of preloading, pretend that the attribute satisfies the match criteria" (or "pretend that it does not match"), thereby not having to make any call into the attribute subsystem at all from this codepath? The strategy this round takes to make it unnecessary to punt preloading (i.e. dropping "pathspec: disable preload-index when attribute pathspec magic is in use" patch the old series had) is to make the attribute subsystem thread-safe. But that thread-safety in the initial round is based on a single Big Attribute Lock, so it may turn out that the end result performs better for this codepath if we did not to make any call into the attribute subsystem. I am probably being two step ahead of ourselves by saying the above, which is just something to keep in mind as a possible solution if performance in this preload codepath becomes an issue when the pathspec has attributes match specified.
Re: [PATCH] compat: Allow static initializer for pthreads on Windows
Am 27.10.2016 um 19:01 schrieb Stefan Beller: Johannes Sixtwrites: This is the pessimization that I am talking about. I would not mind at all if it were only for the attribute subsystem, but the proposed patch would pessimize *all* uses of pthread_mutex_lock. It would only pessimize *uninitialized* mutexes? For initialized mutexes the added burden is super cheap (one additional condition). It is not possible to mark a mutex uninitialized on Windows without an extra piece of data. A solution would become quite complicated quite quickly, and at the cost of additional operations that are in the same ballpark as an uncontended mutex. I'm not enthused. The positive aspect of this way the patch proposes would be that any future contributor not knowing the details of how to do mutexes right on Windows, would not totally break the whole system, i.e. this seems to be more maintainable in the future as it reduces the friction between pthreads mutexes and the way we can do things in Git in a platform independent way This is a non-argument. Coders have to know their tools. -- Hannes
Re: [PATCH] git-compat-util: move content inside ifdef/endif guards
Jeff Kingwrites: > Commit 3f2e2297b9 (add an extra level of indirection to > main(), 2016-07-01) added a declaration to git-compat-util.h, > but it was accidentally placed after the final #endif that > guards against multiple inclusions. > > This doesn't have any actual impact on the code, since it's > not incorrect to repeat a function declaration in C. But > it's a bad habit, and makes it more likely for somebody else > to make the same mistake. It also defeats gcc's optimization > to avoid opening header files whose contents are completely > guarded. > > Signed-off-by: Jeff King > --- > I just happened to notice this today while doing something unrelated in > the file. Thanks. > > git-compat-util.h | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/git-compat-util.h b/git-compat-util.h > index 91e366d1dd..771ea29f31 100644 > --- a/git-compat-util.h > +++ b/git-compat-util.h > @@ -1042,6 +1042,6 @@ struct tm *git_gmtime_r(const time_t *, struct tm *); > #define getc_unlocked(fh) getc(fh) > #endif > > -#endif > - > extern int cmd_main(int, const char **); > + > +#endif
[PATCH] git-compat-util: move content inside ifdef/endif guards
Commit 3f2e2297b9 (add an extra level of indirection to main(), 2016-07-01) added a declaration to git-compat-util.h, but it was accidentally placed after the final #endif that guards against multiple inclusions. This doesn't have any actual impact on the code, since it's not incorrect to repeat a function declaration in C. But it's a bad habit, and makes it more likely for somebody else to make the same mistake. It also defeats gcc's optimization to avoid opening header files whose contents are completely guarded. Signed-off-by: Jeff King--- I just happened to notice this today while doing something unrelated in the file. git-compat-util.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/git-compat-util.h b/git-compat-util.h index 91e366d1dd..771ea29f31 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -1042,6 +1042,6 @@ struct tm *git_gmtime_r(const time_t *, struct tm *); #define getc_unlocked(fh) getc(fh) #endif -#endif - extern int cmd_main(int, const char **); + +#endif -- 2.10.1.916.g0d2035c
Re: [PATCH] compat: Allow static initializer for pthreads on Windows
On Thu, Oct 27, 2016 at 10:01:02AM -0700, Stefan Beller wrote: > > That function can > > be called from main() of platforms that cannot statically initialize > > mutices, > > By main you mean the main() in common-main.c or cmd_main in git.c ? Any setup or initialization for library functions needs to go in main() of common-main.c. Otherwise, only builtins get it, and external programs are left to segfault (or whatever). > Those both look like the wrong place. Of course it would work adding it > there, but it smells like a maintenance nightmare. I agree it's ugly, but I suspect it would work OK in practice. We don't have that many mutexes and initializing them is cheap. Languages like C++ have non-constant initializers, and the compiler takes care of running the startup code before main() begins. There's no portable way to do that in C, but our cmd_main() is roughly the same thing. I still prefer the lazy initialization if it can work without incurring measurable overhead. That's the normal way to do things in C; the only complication here is the thread safety. -Peff
Re: [PATCH v4 00/14] Mark strings in Perl scripts for translation
> Vasco Almeida (14): > i18n: add--interactive: mark strings for translation > i18n: add--interactive: mark simple here-documents for translation > i18n: add--interactive: mark strings with interpolation for > translation > i18n: clean.c: match string with git-add--interactive.perl > i18n: add--interactive: mark plural strings > i18n: add--interactive: mark patch prompt for translation > i18n: add--interactive: i18n of help_patch_cmd > i18n: add--interactive: mark edit_hunk_manually message for > translation > i18n: add--interactive: remove %patch_modes entries > i18n: add--interactive: mark status words for translation > i18n: send-email: mark strings for translation > i18n: send-email: mark warnings and errors for translation > i18n: send-email: mark string with interpolation for translation > i18n: difftool: mark warnings for translation It seems that nobody has anything to say on this series, either positive or negative? The topic has been in "Waiting for review" pile but I'll mark it as "Will merge to 'next'" unless I hear otherwise within a few days (positives may make it faster, negatives may require a reroll). Thanks.
Re: [PATCH v2 0/2] Stream and fast search
Cc'ed those who touched convert.c or read-cache.c in our relatively recent past with a change that affects the eol conversion codepath. Does any of you (and others on the list) have time and inclination to review this series? Thanks.
Re: [PATCH] compat: Allow static initializer for pthreads on Windows
> Johannes Sixtwrites: > This is the pessimization that I am talking about. I would not mind at all if > it were only for the attribute subsystem, but the proposed patch would > pessimize *all* uses of pthread_mutex_lock. It would only pessimize *uninitialized* mutexes? For initialized mutexes the added burden is super cheap (one additional condition). The positive aspect of this way the patch proposes would be that any future contributor not knowing the details of how to do mutexes right on Windows, would not totally break the whole system, i.e. this seems to be more maintainable in the future as it reduces the friction between pthreads mutexes and the way we can do things in Git in a platform independent way On Wed, Oct 26, 2016 at 11:33 PM, Junio C Hamano wrote: >> >> Lazy on-demand initialization as needed, perhaps? The on-demand >> initialization mechanism may become no-op on some platforms that can >> do static initialization. > > Ah, I think I misunderstood your "please rewrite". Did you mean to > add "void attr_start(void)" helper function to attr.c that does > series of pthread_mutex_init() calls as needed? Well one init for now. > That function can > be called from main() of platforms that cannot statically initialize > mutices, By main you mean the main() in common-main.c or cmd_main in git.c ? Those both look like the wrong place. Of course it would work adding it there, but it smells like a maintenance nightmare. And then we would modify the attr_start command depending on the platform, i.e. #ifdef WIN32 void attr_start(void) { pthread_mutex_init(..); } #else void attr_start(void) { /* nothing as it is statically init'd */ } #endif > while on other platforms it can be a no-op as long as the > variables are statically initialized? If so, that would not pessimize > any platform, I would think. I would think this pessimizes ALL platforms from a maintenance perspective (but what do I know about maintaining stuff as an eager young contributor ;) So I am willing to go that route, though I do not quite understand where exactly you'd expect me to put the initializer as all places I can think of are not the right place. Thanks, Stefan
Re: [PATCH v15 01/27] bisect--helper: use OPT_CMDMODE instead of OPT_BOOL
Cc'ed those who touched either "git-bisect.sh" or "builtin/bisect-helper.c" in our relatively recent past. Does any of you (and others on the list) have time and inclination to review this series? Thanks.
Re: [PATCH] reset: --unmerge
Andreas Schwabwrites: > On Okt 25 2016, Junio C Hamano wrote: > >> Somebody with a bright idea decided that vc-git-resolve-conflicts >> variable should be on by default in Emacs 25.1 X-< > > This is consistent with the behaviour of the other VC backends, where it > isn't even customizable. The problem I had was "M-x save-buffer" (after resolving the conflicts in it manually) running "git add" on the resulting file, robbing from "git diff" an opportunity to help the user to see how the result looks relative to both branches. Do you mean that VC mode broke the same feature equally other SCMs, too? Do other SCM supported by VC backends take advantage of unmerged stages in the index (until you say "$scm add") by allowing you to verify the combined diff with "$scm diff"?
Re: [PATCH v1 16/19] read-cache: unlink old sharedindex files
Duy Nguyenwrites: > Christian, if we assume to go with Junio's suggestion to disable > split-index on temporary files, the only files left we have to take > care of are index and index.lock. I believe pruning here in this code > will have an advantage over in "git gc --auto" because when this is > executed, we know we're holding index.lock, so nobody else is updating > the index, it's race-free. > > All we need to do is peek in $GIT_DIR/index > to see what shared index file it requires and keep it alive too, the > remaining of shared index files can be deleted safely. We don't even > need to fall back to mtime. Yes, that exactly was why I wondered if we can afford to limit splitting only to the primary index, because it makes things a lot simpler. But I suspect that temporary index is where split-index shines most, e.g. while creating a partial commit. The mechanism penalizes the read performance by making the format more complex in order to favor the write performance, which is very much suited for temporary one that is read only once after it is written before it gets discarded (on the other hand, splitting the primary index will penalize reads that happen a lot more than writes). While I still find it attractive at the conceptual level to limit splitting only to the primary index for the resulting simplicity, I doubt it is a good way to go, as I meant to say in > git-gc just can't match this because while it's running, somebody else > may be updating $GIT_DIR/index. Handling races would be a lot harder. It could attempt to take a lock on the primary index while it runs, and refrain to do anything if it can't take the lock ("gc --auto" may want to silently retry), and then the race is no longer an issue, no?
Re: [PATCH] reset: --unmerge
On Okt 25 2016, Junio C Hamanowrote: > Somebody with a bright idea decided that vc-git-resolve-conflicts > variable should be on by default in Emacs 25.1 X-< This is consistent with the behaviour of the other VC backends, where it isn't even customizable. Andreas. -- Andreas Schwab, sch...@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different."
Re: What's cooking in git.git (Oct 2016, #06; Mon, 24)
Hi Hannes, On Tue, 25 Oct 2016, Johannes Sixt wrote: > Am 25.10.2016 um 20:13 schrieb Stefan Beller: > > On Tue, Oct 25, 2016 at 10:15 AM, Junio C Hamanowrote: > > > - the "off-by-one fix" part of sb/submodule-ignore-trailing-slash > > >needs to be in the upcoming release but the "trailing /. in base > > >should not affect the resolution of ../relative/path" part that > > >is still under discussion can wait. Which means we'd need a few > > >more !MINGW prerequisites in the tests by -rc0. > > >[...] > > > > So maybe instead of adding !MINGW we rather want to apply > > https://public-inbox.org/git/2908451e-4273-8826-8989-5572263cc...@kdbg.org/ > > instead for now? > > I was about to submit this very patch again, and only then saw your message. > So, yes, that's what I propose, too. > > Dscho, does this patch fix the test failures that you observed, too? > Unfortunately, it goes against our endeavor to reduce subshells. I am fine with the patch, even if I did not have a chance to test it yet (ran out of time today). Ciao, Dscho
Re: Expanding Includes in .gitignore
On Thu, Oct 27, 2016 at 01:22:43PM +1300, Aaron Pelly wrote: > The use case for this is where I did not write my own rules, but I want > to keep them updated. https://github.com/github/gitignore is a damn good > resource, but I want to pull it and include relevant bits project by > project and/or system wide. I don't want to have to update many projects > manually if that, or any other, repo changes. That seems like a reasonable thing to want. > A very brief look at dir.c would indicate that a recursive call from > add_excludes to itself when it parses some sort of include tag would do > it within a file. I'm sure it'd be pretty straight forward to hook into > something in dir.c to parse directories too. > > I'm thinking something like ". path/to/include/file" in an ignore file, > and/or creating .gitignore.d and/or allowing $HOME/.config/git/ignore > and $GIT_DIR/info/exclude to be directories. Or some sane and consistent > mixture of these things. I'd shy away from an actual include directive, as it raises a lot of complications: - as you noted, cycles in the include graph need to be detected and broken - we parse possibly-hostile .gitignore files from cloned repositories. What happens when I include ask to include /etc/passwd? Probably nothing, but there are setups where it might matter (e.g., something like Travis that auto-builds untrusted repositories, and you could potentially leak the contents of files via error messages). It's nice to avoid the issue entirely. - finding a backwards-compatible syntax Whereas letting any of the user- or repo-level exclude files be a directory, and simply reading all of the files inside, seems simple and obvious. If you go that route, it probably makes sense to teach gitattributes the same trick. > In the case of a directory the plan would be to add links to files > stored/sourced elsewhere. This does pose a precedence question which I > haven't thought about yet, but probably makes it too hard for the > limited value it brings. I think the normal behavior in such "foo.d" directory is to just sort the contents lexically and read them in order, as if they were all concatenated together, and with no recursion. I.e., behave "as if" the user had run "cat $dir/*". That lets you handle precedence via the filenames (or symlink names). It can't handle all cases (some items in "00foo" want precedence over "01bar" and vice versa), but I don't think there's an easy solution. That's a good sign that one or more of the files should be broken up. -Peff
Re: [PATCH v1 16/19] read-cache: unlink old sharedindex files
On Tue, Oct 25, 2016 at 5:43 PM, Duy Nguyenwrote: >> static int write_shared_index(struct index_state *istate, >> @@ -2211,8 +2269,11 @@ static int write_shared_index(struct index_state >> *istate, >> } >> ret = rename_tempfile(_sharedindex, >> git_path("sharedindex.%s", >> sha1_to_hex(si->base->sha1))); >> - if (!ret) >> + if (!ret) { >> hashcpy(si->base_sha1, si->base->sha1); >> + clean_shared_index_files(sha1_to_hex(si->base->sha1)); > > This operation is technically garbage collection and should belong to > "git gc --auto", which is already called automatically in a few > places. Is it not called often enough that we need to do the cleaning > up right after a new shared index is created? Christian, if we assume to go with Junio's suggestion to disable split-index on temporary files, the only files left we have to take care of are index and index.lock. I believe pruning here in this code will have an advantage over in "git gc --auto" because when this is executed, we know we're holding index.lock, so nobody else is updating the index, it's race-free. All we need to do is peek in $GIT_DIR/index to see what shared index file it requires and keep it alive too, the remaining of shared index files can be deleted safely. We don't even need to fall back to mtime. git-gc just can't match this because while it's running, somebody else may be updating $GIT_DIR/index. Handling races would be a lot harder. -- Duy
Re: Expanding Includes in .gitignore
On 27/10/16 21:19, Alexei Lozovsky wrote: >> I'm thinking something like ". path/to/include/file" in an ignore file, >> and/or creating .gitignore.d and/or allowing $HOME/.config/git/ignore >> and $GIT_DIR/info/exclude to be directories. Or some sane and consistent >> mixture of these things. > > I think the rc.d-like approach with directories is better as it > does not add new magical filenames (what if I absolutely do need > to name my directories ". path", with a space? :) Yes. Another alternative I thought of was #include path/to/include/file. That'd be backwards compatible, which is a good thing, but would involve parsing comments, which obviously could start with #include. Or lines like ^Include /path/file$ In the case of finding an invalid file, passing it over and issuing a simple warning should surface any issues with existing gitignores. Anyway, this conversation is why I bring it up on the list. Coming back to this, maybe :(include)/path/file might be more git-like >> In the case of a directory the plan would be to add links to files >> stored/sourced elsewhere. This does pose a precedence question which I >> haven't thought about yet, but probably makes it too hard for the >> limited value it brings. > > Now, if we consider the case of multiple .gitignore files, it > could be unexpected and possibly annoying for negative patterns > in one file to affect the patterns added by some other files. That is a concern. It is non obvious; the worst kind of annoying. > I would find it more conceptually simple to apply individual > .gitignores one by one, as opposed to parsing them all and > creating one giant exclusion rule. (In technical terms, this > means keeping one struct exclude_list for each .gitignore, > not merging them all into one single list.) I agree. I haven't looked, but that sounds like touching significantly more code though. Actually, thinking about it for 20 seconds more, it shouldn't be too hard, should it? > In this case there should be no precendence problems as applied > gitignores only add new ignored files, without un-ignoring > anything previously ignored by other files. Again, I haven't looked yet, but there is still an issue of precedence with other gitignore files in $HOME and the repo. > However, if we allow textual inclusion, then it means that we > can put a gitignore into our gitignore so that we can unignore > while we ignore, which again brings us the question of whether > it is actually needed and expected. Gah! Yes. One way or the other. >> I would like to know the desirability/practicality/stupidity of such a >> feature as I believe it is within my skillset to implement it. > > However, I do not recall any precendent of git using rc.d-like > configs. Many things have adopted this technique recently. Well, the last 15 years. It is common, understood, and fairly simple. I see no issue with it. > And some can argue that your goal can be achieved by > generating the .gitignore by some external means and symlinking > the result into .git/info/exclude, so this is not Git's problem > and we should not be overcomplicating things with something as > simple as a list exclude patterns. This line of argument also > can be used to opposes any textual inclusion as well, because > it can be expanded into 'why don't we add a Turing-complete > programming language then to specify the patterns to ignore'. I know. Another reason to bring the idea to the list. I sort of have this attitude myself. My main objection to it is that I can't think of a hook to automate it with. But: What about some kind of :(exec) that executes a script and returns a gitignore file? You write the script; you're responsible. And the behaviour is obvious. I haven't thought that through. It just came to me then, and might present security issues, but could greatly simplify things.
Re: [PATCH] rebase: add --forget to cleanup rebase, leave HEAD untouched
On Wed, Oct 26, 2016 at 11:51 PM, Junio C Hamanowrote: > Nguyễn Thái Ngọc Duy writes: > >> There are occasions when you decide to abort an in-progress rebase and >> move on to do something else but you forget to do "git rebase --abort" >> first. Or the rebase has been in progress for so long you forgot about >> it. By the time you realize that (e.g. by starting another rebase) >> it's already too late to retrace your steps. The solution is normally >> >> rm -r .git/ >> >> and continue with your life. But there could be two different >> directories for (and it obviously requires some >> knowledge of how rebase works), and the ".git" part could be much >> longer if you are not at top-dir, or in a linked worktree. And >> "rm -r" is very dangerous to do in .git, a mistake in there could >> destroy object database or other important data. >> >> Provide "git rebase --forget" for this exact use case. > > Two and a half comments. > > - The title says "leave HEAD untouched". Are my working tree files >and my index also safe from this operation, or is HEAD the only >thing that is protected? Everything is protected. I will rephrase the title a bit. The option is basically a safe form of "rm -r .git/rebase-{apply,merge}". > - I think I saw a variant of this gotcha for an unconcluded >cherry-pick that was left behind, which the bash-prompt script >did not notice but the next "git cherry-pick" did by complaining >"you are in the middle" or something like that. Perhaps we would >want to have a similarly sounding option to help that case, too, >not in this patch but as another patch on the same theme? That would be nice. I don't put lots of git info on my shell prompt though, so it does not help me. And it's probably difficult to report the right thing too. Sometimes in the middle of rebase I would switch to another branch, look or do stuff, then "git checkout -" back. I don't think we can make the prompt script clever enough to see my intention. > - Would it have helped if bash-prompt were in use? I am not saying >that this patch becomes unnecessary if you use it; I am trying to >see if it helps its users by reminding them what state they are >in. Since I don't use it because I want to keep shell prompt short and light, it doe not help me. But it looks like git-prompt.sh does print rebase in progress and others (only checked the code, didn't test it). -- Duy
Re: [PATCH v3 2/3] sha1_file: open window into packfiles with O_CLOEXEC
On Wed, Oct 26, 2016 at 02:15:33PM -0700, Junio C Hamano wrote: > Jeff Kingwrites: > > > On Wed, Oct 26, 2016 at 10:52:41AM -0700, Junio C Hamano wrote: > > > >> > I actually wonder if it is worth carrying around the O_NOATIME hack at > >> > all. > >> > >> Yes, I share the thought. We no longer have too many loose objects > >> to matter. > >> > >> I do not mind flipping the order, but I'd prefer to cook the result > >> even longer. I am tempted to suggest we take two step route: > >> > >> - ship 2.11 with the "atime has been there and we won't regress it" > >>shape, while cooking the "cloexec is semantically more > >>important" version in 'next' during the feature freeze > >> > >> - immediately after 2.11 merge it to 'master' for 2.12 to make sure > >>there is no fallout. > > > > That sounds reasonable, though I'd consider jumping straight to "NOATIME > > is not worth it; drop it" as the patch for post-2.11. > > That endgame is fine by me too. Thanks for a sanity-check. So here's that endgame patch. My main concern with it was that there might be non-Linux systems that could be affected. But when I dug into it, I found that this code was never activated anywhere besides Linux in the first place. So I really doubt this will have any negative impact at all. I certainly don't mind cooking it until post-2.11, though. +cc Linus as the original author of 144bde78e9 in case there is something subtle I'm missing, but this really just seems like it's an outdated optimization. -- >8 -- Subject: [PATCH] sha1_file: stop opening files with O_NOATIME When we open object files, we try to do so with O_NOATIME. This dates back to 144bde78e9 (Use O_NOATIME when opening the sha1 files., 2005-04-23), which is an optimization to avoid creating a bunch of dirty inodes when we're accessing many objects. But a few things have changed since then: 1. In June 2005, git learned about packfiles, which means we would do a lot fewer atime updates (rather than one per object access, we'd generally get one per packfile). 2. In late 2006, Linux learned about "relatime", which is generally the default on modern installs. So performance around atimes updates is a non-issue there these days. All the world isn't Linux, but as it turns out, Linux is the only platform to implement O_NOATIME in the first place. So it's very unlikely that this code is helping anybody these days. It's not a particularly large amount of code, but the fallback-retry creates complexity. E.g., we do a similar fallback for CLOEXEC; which one should take precedence, or should we try all possible combinations? Dropping O_NOATIME makes those questions go away. Signed-off-by: Jeff King --- sha1_file.c | 15 +-- 1 file changed, 1 insertion(+), 14 deletions(-) diff --git a/sha1_file.c b/sha1_file.c index 09045df1dc..6f02a57d8b 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -27,14 +27,6 @@ #include "list.h" #include "mergesort.h" -#ifndef O_NOATIME -#if defined(__linux__) && (defined(__i386__) || defined(__PPC__)) -#define O_NOATIME 0100 -#else -#define O_NOATIME 0 -#endif -#endif - #define SZ_FMT PRIuMAX static inline uintmax_t sz_fmt(size_t s) { return s; } @@ -1561,7 +1553,7 @@ int check_sha1_signature(const unsigned char *sha1, void *map, int git_open(const char *name) { - static int sha1_file_open_flag = O_NOATIME | O_CLOEXEC; + static int sha1_file_open_flag = O_CLOEXEC; for (;;) { int fd; @@ -1577,11 +1569,6 @@ int git_open(const char *name) continue; } - /* Might the failure be due to O_NOATIME? */ - if (errno != ENOENT && (sha1_file_open_flag & O_NOATIME)) { - sha1_file_open_flag &= ~O_NOATIME; - continue; - } return -1; } } -- 2.10.1.916.g0d2035c
Re: Expanding Includes in .gitignore
On 27/10/16 15:22, Stefan Beller wrote: >> The use case for this is where I did not write my own rules, but I want >> to keep them updated. https://github.com/github/gitignore is a damn good >> resource, but I want to pull it and include relevant bits project by >> project and/or system wide. I don't want to have to update many projects >> manually if that, or any other, repo changes. > > .git/info/exclude could be a (sym)link to an up to date version > of the gitignore repo as a hack? > Using links isn't a bad idea, but you still end up at some stage combining the contents of several files that already exist. Well, in my example, anyway. I accept that I'm being pretty trivial, and once it's set up there's never any pressing need to change anything, but it still irks me. Even with a linked .gitignore, or .git/info/exclude there will be sections that are project, language, editor, machine, whatever, specific. So I still need to copy stuff from one file to another by hand. By allowing includes, I only have to have a link to each file describing the data types for each component of the environment. And they are community maintained, so I don't have to google every time I try a new editor. note to self: reply-to isn't the list.
Re: [PATCH v1 16/19] read-cache: unlink old sharedindex files
On Thu, Oct 27, 2016 at 12:25 PM, Duy Nguyenwrote: > On Tue, Oct 25, 2016 at 5:43 PM, Duy Nguyen wrote: >>> static int write_shared_index(struct index_state *istate, >>> @@ -2211,8 +2269,11 @@ static int write_shared_index(struct index_state >>> *istate, >>> } >>> ret = rename_tempfile(_sharedindex, >>> git_path("sharedindex.%s", >>> sha1_to_hex(si->base->sha1))); >>> - if (!ret) >>> + if (!ret) { >>> hashcpy(si->base_sha1, si->base->sha1); >>> + clean_shared_index_files(sha1_to_hex(si->base->sha1)); >> >> This operation is technically garbage collection and should belong to >> "git gc --auto", which is already called automatically in a few >> places. Is it not called often enough that we need to do the cleaning >> up right after a new shared index is created? > > Christian, if we assume to go with Junio's suggestion to disable > split-index on temporary files, the only files left we have to take > care of are index and index.lock. I believe pruning here in this code > will have an advantage over in "git gc --auto" because when this is > executed, we know we're holding index.lock, so nobody else is updating > the index, it's race-free. All we need to do is peek in $GIT_DIR/index > to see what shared index file it requires and keep it alive too, the > remaining of shared index files can be deleted safely. We don't even > need to fall back to mtime. > > git-gc just can't match this because while it's running, somebody else > may be updating $GIT_DIR/index. Handling races would be a lot harder. Yeah, I agree that if we disable split-index on temporary files and on commands like "git read-tree --index-output=" then it is the right thing to do it in write_shared_index(). (In fact when I wrote the previous RFC series I didn't think about those special cases, and that was the reason why I did it like this. So I just need to go back to the implementation that was in the previous RFC series.) I am just still wondering if disabling split-index on temporary files could not have a bad performance impact for some use cases, but I guess we could always come back to problem again if that happens. Thanks, Christian.
Re: Expanding Includes in .gitignore
> I'm thinking something like ". path/to/include/file" in an ignore file, > and/or creating .gitignore.d and/or allowing $HOME/.config/git/ignore > and $GIT_DIR/info/exclude to be directories. Or some sane and consistent > mixture of these things. I think the rc.d-like approach with directories is better as it does not add new magical filenames (what if I absolutely do need to name my directories ". path", with a space? :) keeping the syntax of gitignores themselves simple, and allowing us to think of files as still being separate entities. This can be useful for the following case: > In the case of a directory the plan would be to add links to files > stored/sourced elsewhere. This does pose a precedence question which I > haven't thought about yet, but probably makes it too hard for the > limited value it brings. As I understand, the precedence only matters for negative patterns (the ones that start with an exclamation mark, !). For example, suppose you have files 'foo1' and 'foo2'. This .gitignore foo* !foo1 will ignore foo2, but will show foo1. However, if the lines are swapped: !foo1 foo* then both foo1 and foo2 will be ignored. Now, if we consider the case of multiple .gitignore files, it could be unexpected and possibly annoying for negative patterns in one file to affect the patterns added by some other files. I would find it more conceptually simple to apply individual .gitignores one by one, as opposed to parsing them all and creating one giant exclusion rule. (In technical terms, this means keeping one struct exclude_list for each .gitignore, not merging them all into one single list.) In this case there should be no precendence problems as applied gitignores only add new ignored files, without un-ignoring anything previously ignored by other files. However, if we allow textual inclusion, then it means that we can put a gitignore into our gitignore so that we can unignore while we ignore, which again brings us the question of whether it is actually needed and expected. > I would like to know the desirability/practicality/stupidity of such a > feature as I believe it is within my skillset to implement it. In my mind, this feature definitely has utility and it can be implemented in backwards-compatible way, so why not. However, I do not recall any precendent of git using rc.d-like configs. And some can argue that your goal can be achieved by generating the .gitignore by some external means and symlinking the result into .git/info/exclude, so this is not Git's problem and we should not be overcomplicating things with something as simple as a list exclude patterns. This line of argument also can be used to opposes any textual inclusion as well, because it can be expanded into 'why don't we add a Turing-complete programming language then to specify the patterns to ignore'.
Re: [PATCH] Documentation/git-diff: document git diff with 3+ commits
Junio C Hamano venit, vidit, dixit 26.10.2016 20:11: > Michael J Gruberwrites: > >> That one is difficult to discover but super useful, so document it: >> Specifying 3 or more commits makes git diff switch to combined diff. >> >> Signed-off-by: Michael J Gruber >> --- >> >> Notes: >> Note that we have the following now: >> ... >> 'git diff A..B' equivalent to 'git diff A B' >> in contrast to 'git log A..B' listing commits between M and B only >> (without the commits between M and A unless they are "in" B). > > The standard answer is: > > Do not use two-dot form with 'git diff', if you find it > confusing. Diff is about two endpoints, not about a range > between two. > > The reason why we do not reject can be easily guessed by any > intelligent person when some historical background is given, I > think. That is very well true. I'm more concerned with the presence, though, that is: How easy to use is git now? Users choose git because they care about the history of their project, not necessarily that of Git ;) > - In the beginning A...B did not exist. A..B was the only "range" >notation. > > - "git log A..B" was in wide use. Remember, "git log A...B" did >not exist. > > - People started mistyping "git diff A..B", which looked as if the >user typed "git diff ^A B" to the internal. > > - Git _could_ have rejected that as a bogus request to diff two >points, ^A (what is that???) and B, but "What else could the user >have meant with 'git diff A..B' other than 'git diff A B'?" was >an argument to favor doing _something_ useful rather than >erroring out. Remember, "A...B" did not exist when this >happened. It did not exist, but even at that point in time, "git log A..B" listed only commits between the merge base and B, not those which are only in A and not in B. Whereas "git diff A B" shows the differences between the endpoints A and B. >> @@ -12,6 +12,7 @@ SYNOPSIS >> 'git diff' [options] [] [--] [...] >> 'git diff' [options] --cached [] [--] [...] >> 'git diff' [options] [--] [...] >> +'git diff' [options][...] > > Made me wonder "is [...] 0-or-more As or 1-or-more As?". 0-or-more, at least that's the way it is used in all lines here. > Don't we allow pathspecs in this case? Yes, the combinded diff mode kicks in only with no blobs (not: no pathspec, which I had misread) and N>=3 commits. Maybe I should update the code comments in builtin/diff.c to describe this, too. Michael
Re: [PATCH] Update git rebase documentation to clarify HEAD behavior
Cody Sehlwrites: > The first few paragraphs in the git-rebase.txt documentation lay out the > steps git takes during a rebase: > 1. everything from `..HEAD` is saved to a temporary area > 2. `HEAD` is set to `` > 3. the changes held in the temporary area are applied one by one in order on > top of the new `HEAD` > > The second step was described using the phrase `The current branch is reset > to `, which is true (because `HEAD` == current branch), but not > clear. > --- Please wrap your lines to reasonable lengths like 70 columns or so. Please sign off your patch. > Documentation/git-rebase.txt | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt > index de222c8..c47ca11 100644 > --- a/Documentation/git-rebase.txt > +++ b/Documentation/git-rebase.txt > @@ -33,7 +33,7 @@ of commits that would be shown by `git log > ..HEAD`; or by > description on `--fork-point` below); or by `git log HEAD`, if the > `--root` option is specified. > > -The current branch is reset to , or if the > +HEAD is reset to , or if the > --onto option was supplied. This has the exact same effect as > `git reset --hard ` (or ). ORIG_HEAD is set > to point at the tip of the branch before the reset. This is describing an ancient behaviour before 6fd2f5e60d ("rebase: operate on a detached HEAD", 2007-11-08) in v1.5.4 timeframe. We apparently failed to update the description. This depends on the desired technical detail of the description, but a correct rewrite would be "HEAD is detached at , or , and ORIG_HEAD is set to point at the tip of the branch before this happens". Detaching the HEAD at no longer has the same effect as "git reset --hard " (or ), so that sentence must go. It was the primary point of the ancient change at 6fd2f5e60d after all. And then there is a new step (to be numbered 4. in your description in the proposed log message), which updates the tip of the branch to the resulting HEAD (after replaying all these changes) and check the branch out, which needs to be added. Perhaps after "one by one, in order." Oh, the mention of "reapplied to the current branch" also needs to be updated to "reapplied to the detached HEAD", too. On the other hand, if we do not aim for that deep level of technical correctness, but want to tell a white lie to make it easier to understand at the conceptual level to new readers who haven't grasped the detached HEAD, then the current description is fine. By bringing up "HEAD", you seem to be aiming for techincal correctness (which I tend to agree is a good direction to go in this part of the documentation), so the existing text needs a bit more work than your patch to be brought to the modern world.
Re: [PATCH] compat: Allow static initializer for pthreads on Windows
Junio C Hamanowrites: > Johannes Sixt writes: > >>> As many codepaths may not even need access to the attributes, I >>> doubt that would be a very productive direction to go. >> >> So, what is productive then? Pessimizing one (not exactly minor) platform? > > Lazy on-demand initialization as needed, perhaps? The on-demand > initialization mechanism may become no-op on some platforms that can > do static initialization. Ah, I think I misunderstood your "please rewrite". Did you mean to add "void attr_start(void)" helper function to attr.c that does series of pthread_mutex_init() calls as needed? That function can be called from main() of platforms that cannot statically initialize mutices, while on other platforms it can be a no-op as long as the variables are statically initialized? If so, that would not pessimize any platform, I would think.
Re: [PATCH] compat: Allow static initializer for pthreads on Windows
Am 27.10.2016 um 08:21 schrieb Junio C Hamano: Johannes Sixtwrites: As many codepaths may not even need access to the attributes, I doubt that would be a very productive direction to go. So, what is productive then? Pessimizing one (not exactly minor) platform? Lazy on-demand initialization as needed, perhaps? The on-demand initialization mechanism may become no-op on some platforms that can do static initialization. This is the pessimization that I am talking about. I would not mind at all if it were only for the attribute subsystem, but the proposed patch would pessimize *all* uses of pthread_mutex_lock. -- Hannes
[PATCH] Update git rebase documentation to clarify HEAD behavior
The first few paragraphs in the git-rebase.txt documentation lay out the steps git takes during a rebase: 1. everything from `..HEAD` is saved to a temporary area 2. `HEAD` is set to `` 3. the changes held in the temporary area are applied one by one in order on top of the new `HEAD` The second step was described using the phrase `The current branch is reset to `, which is true (because `HEAD` == current branch), but not clear. --- Documentation/git-rebase.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt index de222c8..c47ca11 100644 --- a/Documentation/git-rebase.txt +++ b/Documentation/git-rebase.txt @@ -33,7 +33,7 @@ of commits that would be shown by `git log ..HEAD`; or by description on `--fork-point` below); or by `git log HEAD`, if the `--root` option is specified. -The current branch is reset to , or if the +HEAD is reset to , or if the --onto option was supplied. This has the exact same effect as `git reset --hard ` (or ). ORIG_HEAD is set to point at the tip of the branch before the reset. -- https://github.com/git/git/pull/301
Re: [PATCH] compat: Allow static initializer for pthreads on Windows
Johannes Sixtwrites: >> As many codepaths may not even need access to the attributes, I >> doubt that would be a very productive direction to go. > > So, what is productive then? Pessimizing one (not exactly minor) platform? Lazy on-demand initialization as needed, perhaps? The on-demand initialization mechanism may become no-op on some platforms that can do static initialization.
Re: [PATCH] compat: Allow static initializer for pthreads on Windows
Am 27.10.2016 um 08:02 schrieb Junio C Hamano: Johannes Sixtwrites: Am 26.10.2016 um 23:57 schrieb Stefan Beller: In Windows it is not possible to have a static initialized mutex as of now, but that seems to be painful for the upcoming refactoring of the attribute subsystem, as we have no good place to put the initialization of the attr global lock. Please rewrite the attribute system such that it can have a dynamic initialization. If you find a global initialization in main() too gross (I would agree) then setup_git_directory() might be the right place. As many codepaths may not even need access to the attributes, I doubt that would be a very productive direction to go. So, what is productive then? Pessimizing one (not exactly minor) platform? -- Hannes
Re: [PATCH] compat: Allow static initializer for pthreads on Windows
Johannes Sixtwrites: > Am 26.10.2016 um 23:57 schrieb Stefan Beller: >> In Windows it is not possible to have a static initialized mutex as of >> now, but that seems to be painful for the upcoming refactoring of the >> attribute subsystem, as we have no good place to put the initialization >> of the attr global lock. > > Please rewrite the attribute system such that it can have a dynamic > initialization. If you find a global initialization in main() too > gross (I would agree) then setup_git_directory() might be the right > place. As many codepaths may not even need access to the attributes, I doubt that would be a very productive direction to go.
Re: "git subtree --squash" interacts poorly with revert, merge, and rebase
Junio C Hamanowrites: > Peter Williams writes: > >> However, for git commands such as diff/status whose job is to display >> information it would be nice if they had a --recursive option to >> override the default submodule diff/status and show details of the >> changes in the submodules. Sometimes you want to see the big picture >> in detail. > > I won't disagree. My comment was only on this part from the original: > >>> - We have to make separate commits and manage corresponding topic >>> branches for the superproject and subprojects. > > and on this point, we seem to be in agreement. Oh, and as Stefan mentioned, a "git diff" that recurses into the submodules to give you detailed big picture has been in 'next' (perhaps aready in 'master' as of tonight, but I am not sure offhand) to be tested, together with many other fixes and enhancements that all are waiting to be included in future releases. The more people try and give feedback to these branches early, the more solid release with better support for more goodies you'd want we will be able to give you. Early adopters are always appreciated but especially in time like this before the feature freeze for the upcoming release (see tinyurl.com/gitCal for the schedule), they are of great help. Start by cloning from any one of these places git://git.kernel.org/pub/scm/git/git.git/ https://kernel.googlesource.com/pub/scm/git/git git://repo.or.cz/alt-git.git/ https://github.com/git/git/ and then $ git checkout -b next origin/next : read INSTALL to figure out if any custom options are needed : in the following 'make' invocations for your environment $ make && make install $ PATH=$HOME/bin:$PATH to join the fun.