Re: [PATCH v7 1/4] sha1_file.c: support reading from a loose object of unknown type
karthik nayak writes: >> So, it makes me wonder what guarantee we have that this does not >> dereference a NULL here. >> > As per my code, oi->typename is only pointing to something when oi->typep > is ( As oi->typename is currently only used in cat-file.c). > But what you're saying also is true, there is no other guarantee, as a user > may > set oi->typename to point to a struct strbuf and leave out oi->typep. > > if (oi->typename && oi->typep) > strbuf_addstr(oi->typename, typename(*oi->typep)); > > This should suffice. Do you want me to re-roll this? I'd rather avoid the thinking along the lines of "at this moment, there happens to be only one caller that asks for typename and the caller also sets typep, so we will be safe as long as we make sure the caller passed typep before giving him typename back". Somebody else may write new code that wants to learn the typename, forgets to set typep, calls into this codepath, and ends up scratching his head wondering why the typename string is returned to him. Surely the code may not crash at the new code you wrote, but you are not helping him. If it semantically does not make sense to ask for the typename without asking for the type code, then we can and should make that as a new calling convention _all_ callers must follow. In other words, I think it is better to have if (oi->typename && !oi->typep) die("BUG"); somewhere near the beginning of the callchain that takes oi; that will ensure all callers understand the rule. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v7 1/4] sha1_file.c: support reading from a loose object of unknown type
On 04/05/2015 01:16 PM, Junio C Hamano wrote: karthik nayak writes: So, it makes me wonder what guarantee we have that this does not dereference a NULL here. As per my code, oi->typename is only pointing to something when oi->typep is ( As oi->typename is currently only used in cat-file.c). But what you're saying also is true, there is no other guarantee, as a user may set oi->typename to point to a struct strbuf and leave out oi->typep. if (oi->typename && oi->typep) strbuf_addstr(oi->typename, typename(*oi->typep)); This should suffice. Do you want me to re-roll this? I'd rather avoid the thinking along the lines of "at this moment, there happens to be only one caller that asks for typename and the caller also sets typep, so we will be safe as long as we make sure the caller passed typep before giving him typename back". Somebody else may write new code that wants to learn the typename, forgets to set typep, calls into this codepath, and ends up scratching his head wondering why the typename string is returned to him. Surely the code may not crash at the new code you wrote, but you are not helping him. Yes! I do agree with you, If you read my previous reply, I did mention what you said about not having a guarantee that typep and typename are both set and another user might have a bug with this. If it semantically does not make sense to ask for the typename without asking for the type code, then we can and should make that as a new calling convention _all_ callers must follow. In other words, I think it is better to have if (oi->typename && !oi->typep) die("BUG"); somewhere near the beginning of the callchain that takes oi; that will ensure all callers understand the rule. Yes! this is a better approach as it will enforce that typep must be set when typename is set. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Feature request: Update remote url if it is redirected.
Some git hosting services, like Github, a url to a git repository can be changed by changing the name of the repository by the owner. If someone tries to get the repository with the old url, usually the hosting service serves the request with the repository indiciated by the new url. It is very helpful for the users who don't know the url has been changed. In such case, I think it is recommended to update the user's remote url with the new one because the old url possibily does not work if the hosting service does not keep the old url anymore, or worse, works incorrectly if someone makes a repository with the old url. To fix the potential problem, it would be nice if Git client asks the user, while fetching objects from a remote url, to update the url if it is changed. If the client and the server is talking in HTTP, it is easy to know the url is changed if the server responses with 301 Moved Permantely or 308 Permanet Redirect. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
how do I ignore a directory for diff
Hello, I have been looking into ignoring a subdirectory of my tree for diffing with upstream. I'll explain the situation below : My tree is a fork of an upstream repo. There is a specific directory in my tree lets call it foo/bar that i would like to ignore for diff. This directory includes only files that i added to my repo and is therefore irrelevant for diffing (i know all files in there have been added and are not in upstream). Having there in the diff is just making a lot of files to appear and that is confusing to see what is changed from upstream. I have read the docs and found a way mentioning that i should add a line to .gitattributes with : foo/bar/* -diff But this still lists the files in there when i'm diffing. Is there any way to achieve this ? i cant find any clear explanation in the docs. Any help would be greatly appreciated :) Lionel. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: how do I ignore a directory for diff
On Sun, Apr 05, 2015 at 11:31:54AM +, LongChair . wrote: > I have been looking into ignoring a subdirectory of my tree for > diffing with upstream. I'll explain the situation below : > > My tree is a fork of an upstream repo. There is a specific directory > in my tree lets call it foo/bar that i would like to ignore for diff. > This directory includes only files that i added to my repo and is > therefore irrelevant for diffing (i know all files in there have been > added and are not in upstream). Having there in the diff is just > making a lot of files to appear and that is confusing to see what is > changed from upstream. > > I have read the docs and found a way mentioning that i should add a > line to .gitattributes with : foo/bar/* -diff > > But this still lists the files in there when i'm diffing. > > Is there any way to achieve this ? i cant find any clear explanation > in the docs. Since git-diff takes a pathspec you can use the exclude magic to exclude certain directories like this: git diff upstream -- ':(top)' ':(exclude)foo/bar' or equivalently: git diff upstream -- :/ ':!foo/bar' The documentation for the pathspec syntax is in git-glossary(7). -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
04.04.15
Need a Soft, Business, Home or Private Loan? Then Email Us Back as we would be glad to assist. Regards, John Ryder -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/6] address packed-refs speed regressions
Am 05.04.2015 um 03:06 schrieb Jeff King: > As I've mentioned before, I have some repositories with rather large > numbers of refs. The worst one has ~13 million refs, for a 1.6GB > packed-refs file. So I was saddened by this: > >$ time git.v2.0.0 rev-parse refs/heads/foo >/dev/null 2>&1 >real0m6.840s >user0m6.404s >sys 0m0.440s > >$ time git.v2.4.0-rc1 rev-parse refs/heads/foo >/dev/null 2>&1 >real0m19.432s >user0m18.996s >sys 0m0.456s > > The command isn't important; what I'm really measuring is loading the > packed-refs file. And yes, of course this repository is absolutely > ridiculous. But the slowdowns here are linear with the number of refs. > So _every_ git command got a little bit slower, even in less crazy > repositories. We just didn't notice it as much. > > Here are the numbers after this series: > >real0m8.539s >user0m8.052s >sys 0m0.496s > > Much better, but I'm frustrated that they are still 20% slower than the > original. > > The main culprits seem to be d0f810f (which introduced some extra > expensive code for each ref) and my 10c497a, which switched from fgets() > to strbuf_getwholeline. It turns out that strbuf_getwholeline is really > slow. 10c497a changed read_packed_refs(), which reads *all* packed refs. Each is checked for validity. That sounds expensive if the goal is just to look up a single (non-existing) ref. Would it help to defer any checks until a ref is actually accessed? Can a binary search be used instead of reading the whole file? I wonder if pluggable reference backends could help here. Storing refs in a database table indexed by refname should simplify things. Short-term, can we avoid the getc()/strbuf_grow() dance e.g. by mapping the packed refs file? What numbers do you get with the following patch? --- refs.c | 36 1 file changed, 28 insertions(+), 8 deletions(-) diff --git a/refs.c b/refs.c index 47e4e53..144255f 100644 --- a/refs.c +++ b/refs.c @@ -1153,16 +1153,35 @@ static const char *parse_ref_line(struct strbuf *line, unsigned char *sha1) * compatibility with older clients, but we do not require it * (i.e., "peeled" is a no-op if "fully-peeled" is set). */ -static void read_packed_refs(FILE *f, struct ref_dir *dir) +static void read_packed_refs(int fd, struct ref_dir *dir) { struct ref_entry *last = NULL; struct strbuf line = STRBUF_INIT; enum { PEELED_NONE, PEELED_TAGS, PEELED_FULLY } peeled = PEELED_NONE; + struct stat st; + void *map; + size_t mapsz, len; + const char *p; + + fstat(fd, &st); + mapsz = xsize_t(st.st_size); + if (!mapsz) + return; + map = xmmap(NULL, mapsz, PROT_READ, MAP_PRIVATE, fd, 0); - while (strbuf_getwholeline(&line, f, '\n') != EOF) { + for (p = map, len = mapsz; len; ) { unsigned char sha1[20]; const char *refname; const char *traits; + const char *nl; + size_t linelen; + + nl = memchr(p, '\n', len); + linelen = nl ? nl - p + 1 : len; + strbuf_reset(&line); + strbuf_add(&line, p, linelen); + p += linelen; + len -= linelen; if (skip_prefix(line.buf, "# pack-refs with:", &traits)) { if (strstr(traits, " fully-peeled ")) @@ -1204,6 +1223,7 @@ static void read_packed_refs(FILE *f, struct ref_dir *dir) } strbuf_release(&line); + munmap(map, mapsz); } /* @@ -1224,16 +1244,16 @@ static struct packed_ref_cache *get_packed_ref_cache(struct ref_cache *refs) clear_packed_ref_cache(refs); if (!refs->packed) { - FILE *f; + int fd; refs->packed = xcalloc(1, sizeof(*refs->packed)); acquire_packed_ref_cache(refs->packed); refs->packed->root = create_dir_entry(refs, "", 0, 0); - f = fopen(packed_refs_file, "r"); - if (f) { - stat_validity_update(&refs->packed->validity, fileno(f)); - read_packed_refs(f, get_ref_dir(refs->packed->root)); - fclose(f); + fd = open(packed_refs_file, O_RDONLY); + if (fd >= 0) { + stat_validity_update(&refs->packed->validity, fd); + read_packed_refs(fd, get_ref_dir(refs->packed->root)); + close(fd); } } return refs->packed; -- 2.3.5 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: how do I ignore a directory for diff
Hello John, Thanks for the answer. I am also using some GUI client (smartgit). Is there any way to make this part of the repo attributes / configuration so that my git GUI would use it ? Lionel. > Le 5 avr. 2015 à 14:17, John Keeping a écrit : > > On Sun, Apr 05, 2015 at 11:31:54AM +, LongChair . wrote: >> I have been looking into ignoring a subdirectory of my tree for >> diffing with upstream. I'll explain the situation below : >> >> My tree is a fork of an upstream repo. There is a specific directory >> in my tree lets call it foo/bar that i would like to ignore for diff. >> This directory includes only files that i added to my repo and is >> therefore irrelevant for diffing (i know all files in there have been >> added and are not in upstream). Having there in the diff is just >> making a lot of files to appear and that is confusing to see what is >> changed from upstream. >> >> I have read the docs and found a way mentioning that i should add a >> line to .gitattributes with : foo/bar/* -diff >> >> But this still lists the files in there when i'm diffing. >> >> Is there any way to achieve this ? i cant find any clear explanation >> in the docs. > > Since git-diff takes a pathspec you can use the exclude magic to exclude > certain directories like this: > > git diff upstream -- ':(top)' ':(exclude)foo/bar' > > or equivalently: > > git diff upstream -- :/ ':!foo/bar' > > The documentation for the pathspec syntax is in git-glossary(7). -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/6] strbuf_getwholeline: use getc_unlocked
On Sun, Apr 5, 2015 at 11:56 AM, Jeff King wrote: > So we'd have to either: > > 1. Decide that doesn't matter. > > 2. Have callers specify a "damn the NULs, I want it fast" flag. 2+. Avoid FILE* interface and go with syscalls for reading packed-refs? If mmaping the entire file could be a problem for some platform because it's too large, we have code for reading (with bufferring) from fd somewhere, e.g. index-pack. > 3. Find some alternative that is more robust than fgets, and faster > than getc. I don't think there is anything in stdio, but I am not > above dropping in a faster non-portable call if it is available, > and then falling back to the current code otherwise. -- Duy -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: how do I ignore a directory for diff
On Sun, Apr 05, 2015 at 04:19:50PM +0200, Lionel CHAZALLON wrote: > > Le 5 avr. 2015 à 14:17, John Keeping a écrit : > > > > On Sun, Apr 05, 2015 at 11:31:54AM +, LongChair . wrote: > >> I have been looking into ignoring a subdirectory of my tree for > >> diffing with upstream. I'll explain the situation below : > >> > >> My tree is a fork of an upstream repo. There is a specific directory > >> in my tree lets call it foo/bar that i would like to ignore for diff. > >> This directory includes only files that i added to my repo and is > >> therefore irrelevant for diffing (i know all files in there have been > >> added and are not in upstream). Having there in the diff is just > >> making a lot of files to appear and that is confusing to see what is > >> changed from upstream. > >> > >> I have read the docs and found a way mentioning that i should add a > >> line to .gitattributes with : foo/bar/* -diff > >> > >> But this still lists the files in there when i'm diffing. > >> > >> Is there any way to achieve this ? i cant find any clear explanation > >> in the docs. > > > > Since git-diff takes a pathspec you can use the exclude magic to exclude > > certain directories like this: > > > > git diff upstream -- ':(top)' ':(exclude)foo/bar' > > > > or equivalently: > > > > git diff upstream -- :/ ':!foo/bar' > > > > The documentation for the pathspec syntax is in git-glossary(7). > > Thanks for the answer. I am also using some GUI client (smartgit). Is > there any way to make this part of the repo attributes / configuration > so that my git GUI would use it ? I think you'll have to file a feature request with SmartGit if you want support for this in their UI. The standard way to set this up would be to create an alias that does what you want, such as: git config alias.d 'diff -- :/ ":!foo/bar"' and use "git d" instead of "git diff", but there is no way for other programs to inherit that. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
How to send a warning message from git hosting server?
Hello. I am serving a git hosting service for my company. Sometimes I want to send a warning message to users who use my service; e.g. the service will be shutdown tomorrow for a while temporary. I know it is possible to a remote message by hooks or HTTP body if an error occured. But it seems that there is no hooks for git-fetch and git does not print HTTP body if there is no error. I want a way to response a remote message when a client send any kind of request. Is it possible? -EungJun -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/6] strbuf_getwholeline: use getc_unlocked
On Sun, Apr 05, 2015 at 09:36:04PM +0700, Duy Nguyen wrote: > On Sun, Apr 5, 2015 at 11:56 AM, Jeff King wrote: > > So we'd have to either: > > > > 1. Decide that doesn't matter. > > > > 2. Have callers specify a "damn the NULs, I want it fast" flag. > > 2+. Avoid FILE* interface and go with syscalls for reading > packed-refs? If mmaping the entire file could be a problem for some > platform because it's too large, we have code for reading (with > bufferring) from fd somewhere, e.g. index-pack. There's strbuf_getwholeline_fd, but it's horrifically inefficient (one syscall per character). But the other option is to implement your own buffering, and we're generally better off letting stdio do that for us (the exception here is that stdio does not have a good NUL-safe "read until X" function). -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v7 1/4] sha1_file.c: support reading from a loose object of unknown type
Update sha1_loose_object_info() to optionally allow it to read from a loose object file of unknown/bogus type; as the function usually returns the type of the object it read in the form of enum for known types, add an optional "typename" field to receive the name of the type in textual form and a flag to indicate the reading of a loose object file of unknown/bogus type. Add parse_sha1_header_extended() which acts as a wrapper around parse_sha1_header() allowing more information to be obtained. Add unpack_sha1_header_to_strbuf() to unpack sha1 headers of unknown/corrupt objects which have a unknown sha1 header size to a strbuf structure. This was written by Junio C Hamano but tested by me. Helped-by: Junio C Hamano Helped-by: Eric Sunshine Signed-off-by: Karthik Nayak --- cache.h | 2 ++ sha1_file.c | 114 2 files changed, 94 insertions(+), 22 deletions(-) diff --git a/cache.h b/cache.h index c47323e..ea6faf0 100644 --- a/cache.h +++ b/cache.h @@ -881,6 +881,7 @@ extern int is_ntfs_dotgit(const char *name); /* object replacement */ #define LOOKUP_REPLACE_OBJECT 1 +#define LOOKUP_LITERALLY 2 extern void *read_sha1_file_extended(const unsigned char *sha1, enum object_type *type, unsigned long *size, unsigned flag); static inline void *read_sha1_file(const unsigned char *sha1, enum object_type *type, unsigned long *size) { @@ -1349,6 +1350,7 @@ struct object_info { unsigned long *sizep; unsigned long *disk_sizep; unsigned char *delta_base_sha1; + struct strbuf *typename; /* Response */ enum { diff --git a/sha1_file.c b/sha1_file.c index 980ce6b..ac8fffc 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -1564,6 +1564,34 @@ int unpack_sha1_header(git_zstream *stream, unsigned char *map, unsigned long ma return git_inflate(stream, 0); } +static int unpack_sha1_header_to_strbuf(git_zstream *stream, unsigned char *map, + unsigned long mapsize, + struct strbuf *header) +{ + unsigned char buffer[32], *cp; + unsigned long bufsiz = sizeof(buffer); + int status; + + status = unpack_sha1_header(stream, map, mapsize, buffer, bufsiz); + + if (status) { + strbuf_add(header, buffer, stream->next_out - buffer); + return status; + } + + do { + status = git_inflate(stream, 0); + strbuf_add(header, buffer, stream->next_out - buffer); + for (cp = buffer; cp < stream->next_out; cp++) + if (!*cp) + /* Found the NUL at the end of the header */ + return 0; + stream->next_out = buffer; + stream->avail_out = bufsiz; + } while (status != Z_STREAM_END); + return -1; +} + static void *unpack_sha1_rest(git_zstream *stream, void *buffer, unsigned long size, const unsigned char *sha1) { int bytes = strlen(buffer) + 1; @@ -1614,27 +1642,24 @@ static void *unpack_sha1_rest(git_zstream *stream, void *buffer, unsigned long s * too permissive for what we want to check. So do an anal * object header parse by hand. */ -int parse_sha1_header(const char *hdr, unsigned long *sizep) +int parse_sha1_header_extended(const char *hdr, struct object_info *oi, + unsigned int flags) { - char type[10]; - int i; + struct strbuf typename = STRBUF_INIT; unsigned long size; + int type; /* * The type can be at most ten bytes (including the * terminating '\0' that we add), and is followed by * a space. */ - i = 0; for (;;) { char c = *hdr++; if (c == ' ') break; - type[i++] = c; - if (i >= sizeof(type)) - return -1; + strbuf_addch(&typename, c); } - type[i] = 0; /* * The length must follow immediately, and be in canonical @@ -1652,12 +1677,39 @@ int parse_sha1_header(const char *hdr, unsigned long *sizep) size = size * 10 + c; } } - *sizep = size; + + type = type_from_string_gently(typename.buf, typename.len, 1); + if (oi->sizep) + *oi->sizep = size; + if (oi->typename) + strbuf_addbuf(oi->typename, &typename); + strbuf_release(&typename); /* +* Set type to 0 if its an unknown object and +* we're obtaining the type using '--literally' +* option. +*/ + if ((flags & LOOKUP_LITERALLY) && (type == -1)) + type = 0; + else if (type == -1) + die("invalid object type"); + if (oi->typep) + *oi->typep = type; + /* * The length
Re: [PATCH 0/6] address packed-refs speed regressions
On Sun, Apr 05, 2015 at 03:41:39PM +0200, René Scharfe wrote: > > The main culprits seem to be d0f810f (which introduced some extra > > expensive code for each ref) and my 10c497a, which switched from fgets() > > to strbuf_getwholeline. It turns out that strbuf_getwholeline is really > > slow. > > 10c497a changed read_packed_refs(), which reads *all* packed refs. > Each is checked for validity. That sounds expensive if the goal is > just to look up a single (non-existing) ref. > > Would it help to defer any checks until a ref is actually accessed? > Can a binary search be used instead of reading the whole file? Yes, but addressing that is much more invasive. Right now we parse all of the packed-refs file into an in-memory cache, and then do single lookups from that cache. Doing an mmap() and a binary search is way faster (and costs less memory) for doing individual lookups. It relies on the list being sorted. This is generally true, but not something we currently rely on (however, it would be easy to add a "sorted" flag to top of the file and have the readers fall back when the flag is missing). I've played with a patch to do this (it's not entirely trivial, because you jump into the middle of a line, and then have to walk backwards to find the start of the record). For traversals, it's more complicated. Obviously if you are traversing all refs, you have to read the whole thing anyway. If you are traversing a subset of the refs, you can binary-search the start of the subset, and then walk forward. But that's where it gets tricky with the current code. The ref_cache code expects to fill in from outer to inner. So if you have "refs/foo", you should also have filled in all of "refs/" (but not necessarily "refs/bar"). This matches the way we traverse loose ref directories; we opendir "refs/", find out that it has "foo" and "bar", and the descend into "foo", and so forth. But reading a subset of the packed-ref file is "inside out". You fill in all of "refs/foo", but you have no idea what else is in "refs/". So going in that direction would involve some surgery to the ref_cache code. It might even involve throwing it out entirely (i.e., just mmap the packed-refs file and look through it directly, without any kind of in-memory cache; we don't tend to do more than one ref-iteration per program anyway, so I'm not sure the caching is buying us much anyway). My big concern there would be that there are a lot of subtle race issues between packed and loose refs, and the current state is the result of a lot of tweaking. I'd be worried that a heavy rewrite there would risk introducing subtle and rare corruptions. Plus it would be a lot of work, which leads me to... > I wonder if pluggable reference backends could help here. Storing refs > in a database table indexed by refname should simplify things. ...this. I think that effort might be better spent on a ref storage format that's more efficient, simpler (with respect to subtle races and such), and could provide other features (e.g., transactional atomicity). The big plus side of packed-refs improvements is that they "just work" without worrying about compatibility issues. But ref storage is local, so I'm not sure how big a deal that is in practice. > Short-term, can we avoid the getc()/strbuf_grow() dance e.g. by mapping > the packed refs file? What numbers do you get with the following patch? It's about 9% faster than my series + the fgets optimization I posted (or about 25% than using getc). Which is certainly nice, but I was really hoping to just make strbuf_getline faster for all callers, rather than introducing special code for one call-site. Certainly we could generalize the technique (i.e., a struct with the mmap data), but then I feel we are somewhat reinventing stdio. Which is maybe a good thing, because stdio has a lot of rough edges (as seen here), but it does feel a bit like NIH syndrome. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/6] address packed-refs speed regressions
On Sun, Apr 05, 2015 at 02:52:59PM -0400, Jeff King wrote: > Right now we parse all of the packed-refs file into an in-memory cache, > and then do single lookups from that cache. Doing an mmap() and a binary > search is way faster (and costs less memory) for doing individual > lookups. It relies on the list being sorted. This is generally true, but > not something we currently rely on (however, it would be easy to add a > "sorted" flag to top of the file and have the readers fall back when the > flag is missing). I've played with a patch to do this (it's not entirely > trivial, because you jump into the middle of a line, and then have to > walk backwards to find the start of the record). > > For traversals, it's more complicated. Obviously if you are traversing > all refs, you have to read the whole thing anyway. If you are traversing > a subset of the refs, you can binary-search the start of the subset, and > then walk forward. But that's where it gets tricky with the current > code. In case you are curious, here is my proof-of-concept for the packed-refs binary search. You'll note that it's a separate program, and not integrated into refs.c. I wrote this last August, and after trying to integrate it into refs.c, I found the ref_cache problems I described, and I haven't touched it since. I also seem to have saved the patch for stuffing it into refs.c, but I am not sure if it even compiles (I wrote only "horrible wip" in the commit message ;) ). -- >8 -- Subject: [PATCH] add git-quick-list This is a proof of concept for binary-searching the packed-refs file in order to traverse an ordered subset of it. Note that it _only_ reads the packed-refs file currently. To really compare to for-each-ref, it would need to also walk the loose ref area for its prefix. On a mostly-packed repository that shouldn't make a big speed difference, though. And of course we don't _really_ want a separate command here at all. This should be part of refs.c, and everyone who calls for_each_ref should benefit from it. Still, the numbers are promising. Here's are comparisons against for-each-ref on torvalds/linux, which has a 218M packed-refs file: $ time git for-each-ref \ --format='%(objectname) %(refname)' \ refs/remotes/2325298/ | wc -c 44139 real0m1.649s user0m1.332s sys 0m0.304s $ time ~peff/git-quick-list refs/remotes/2325298/ | wc -c 44139 real0m0.012s user0m0.004s sys 0m0.004s --- Makefile | 1 + quick-list.c | 174 +++ 2 files changed, 175 insertions(+) create mode 100644 quick-list.c diff --git a/Makefile b/Makefile index 2457065..aa32598 100644 --- a/Makefile +++ b/Makefile @@ -541,6 +541,7 @@ PROGRAM_OBJS += shell.o PROGRAM_OBJS += show-index.o PROGRAM_OBJS += upload-pack.o PROGRAM_OBJS += remote-testsvn.o +PROGRAM_OBJS += quick-list.o # Binary suffix, set to .exe for Windows builds X = diff --git a/quick-list.c b/quick-list.c new file mode 100644 index 000..e423f1f --- /dev/null +++ b/quick-list.c @@ -0,0 +1,174 @@ +#include "cache.h" +#include "refs.h" + +struct packed_refs_iterator { + const char *start; + const char *end; + + const char *cur; + const char *ref; + const char *eol; + const char *next; +}; + +static void iterator_init(struct packed_refs_iterator *pos, + const char *buf, size_t len) +{ + pos->start = buf; + pos->end = buf + len; + + /* skip past header line */ + if (pos->start < pos->end && *pos->start == '#') { + while (pos->start < pos->end && *pos->start != '\n') + pos->start++; + if (pos->start < pos->end) + pos->start++; + } +} + +static int iterator_cmp(const char *key, struct packed_refs_iterator *pos) +{ + const char *ref = pos->ref; + for (; *key && ref < pos->eol; key++, ref++) + if (*key != *ref) + return (unsigned char)*key - (unsigned char)*ref; + return ref == pos->eol ? *key ? 1 : 0 : -1; +} + +static const char *find_eol(const char *p, const char *end) +{ + p = memchr(p, '\n', end - p); + return p ? p : end; +} + +static void parse_line(struct packed_refs_iterator *pos, const char *p) +{ + pos->cur = p; + if (pos->end - p < 41) + die("truncated packed-refs file"); + p += 41; + + pos->ref = p; + pos->eol = p = find_eol(p, pos->end); + + /* skip newline, and then past any peel records */ + if (p < pos->end) + p++; + while (p < pos->end && *p == '^') { + p = find_eol(p, pos->end); + if (p < pos->end) + p++; + } + pos->next = p; +} + +static void iterator_next(struct packed_refs_iterator *pos) +{ + if (pos->next < pos->end) + parse_line(pos, pos->next); + else
Re: [PATCH 0/2] git-p4: Improve client path detection
On 28/03/15 12:28, Vitor Antunes wrote: I'm adding a test case for a scenario I was confronted with when using branch detection and a client view specification. It is possible that the implemented fix may not cover all possible scenarios, but there is no regression in the available tests. Vitor, one thing I wondered about with this part of the change: -if entry["depotFile"] == depotPath: +if entry["depotFile"].find(depotPath) >= 0: Does this mean that if 'p4 where' produces multiple lines of output that this will get confused, as it's just going to search for an instance of depotPath. The example in the Perforce man page for 'p4 where' would trigger this for example: http://www.perforce.com/perforce/r14.2/manuals/cmdref/p4_where.html -//a/b/file.txt //client/a/b/file.txt //home/user/root/a/b/file.txt //a/b/file.txt //client/b/file.txt /home/user/root/b/file.txt As an experiment, I hacked git-p4 to always use p4Where rather than getClientRoot(), which I would have thought ought to work, but while most of the tests passed, Pete's client-spec torture tests failed. Luke Vitor Antunes (2): git-p4: Check branch detection and client view together git-p4: Improve client path detection when branches are used git-p4.py| 11 -- t/t9801-git-p4-branch.sh | 98 ++ 2 files changed, 105 insertions(+), 4 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v7 1/4] sha1_file.c: support reading from a loose object of unknown type
karthik nayak writes: > On 04/05/2015 01:16 PM, Junio C Hamano wrote: > >> If it semantically does not make sense to ask for the typename >> without asking for the type code, then we can and should make that >> as a new calling convention _all_ callers must follow. >> >> In other words, I think it is better to have >> >> if (oi->typename && !oi->typep) >> die("BUG"); >> >> somewhere near the beginning of the callchain that takes oi; that >> will ensure all callers understand the rule. > > Yes! this is a better approach as it will enforce that typep must be > set when typename is set. Not so fast ;-) The key phrase in what I wrote above is "If it does not make sense", and I am not yet convinced if that is the case or not. If we are to change the calling convention for the callers, the reason why it does not make sense to ask only for typename but not typep must be explained in the log message. And if it turns out that the answer to that question is "it is valid to ask only for typename", then it would be wrong to force everybody who wants to learn the stringified typename to supply typep. And in such a case it might be better to do something like this instead (on top of your patch I am responding to): sha1_file.c | 12 1 file changed, 12 insertions(+) diff --git a/sha1_file.c b/sha1_file.c index f4055dd..26fbb7c 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -2639,6 +2639,7 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi, struct cached_object *co; struct pack_entry e; int rtype; + enum object_type real_type; const unsigned char *real = lookup_replace_object_extended(sha1, flags); co = find_cached_object(real); @@ -2670,9 +2671,18 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi, return -1; } + /* +* packed_object_info() does not follow the delta chain to +* find out the real type, unless it is given oi->typep. +*/ + if (oi->typename && !oi->typep) + oi->typep = &real_type; + rtype = packed_object_info(e.p, e.offset, oi); if (rtype < 0) { mark_bad_packed_object(e.p, real); + if (oi->typep == &real_type) + oi->typep = NULL; return sha1_object_info_extended(real, oi, 0); } else if (in_delta_base_cache(e.p, e.offset)) { oi->whence = OI_DBCACHED; @@ -2686,6 +2696,8 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi, if (oi->typename) strbuf_addstr(oi->typename, typename(*oi->typep)); + if (oi->typep == &real_type) + oi->typep = NULL; return 0; } -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/6] strbuf_getwholeline: use getc_unlocked
Jeff King writes: > So we'd have to either: > > 1. Decide that doesn't matter. > > 2. Have callers specify a "damn the NULs, I want it fast" flag. The callers that used to call fgets and then later rewritten to strbuf_getwholeline(), either of the above obviously should be OK, and because the whole reason why we added strbuf_getline() interface was to avoid having to repeatedly call fgets() and concatenate the result if the fixed-size buffer we would give it is too small, I'd say the callers that want to read lines terminated by LF and have NUL as part of payload would be a tiny minority. It depends on what we would find out after auditing all callers of this function, but I would not be surprised if we decided #1 (i.e. "this is about a _line_; what are you doing by having a NUL on it?"), and the safest would be to do the inverse of #2, i.e. make it fast by default and make oddball callers that care about NULs to pass a flag. Thanks for working on this. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: How to send a warning message from git hosting server?
"Yi, EungJun" writes: > Hello. I am serving a git hosting service for my company. > > Sometimes I want to send a warning message to users who use my > service; e.g. the service will be shutdown tomorrow for a while > temporary. > > I know it is possible to a remote message by hooks or HTTP body if an > error occured. But it seems that there is no hooks for git-fetch and > git does not print HTTP body if there is no error. > > I want a way to response a remote message when a client send any kind > of request. Is it possible? I do not offhand know if there are such hooks, but I would imagine that I'd be mightily annoyed if I were forced to interact with such a server. I may not have a need to pull anything for a few days, working on my changes, and then I'd find out when the service is already down. I may pull many times a day, and for a few days of pre-announcement period, I'd be forced to see the same message over and over. I may have a cron job to fetch down the changes made by coworkers in other timezones while I am sleeping so that I can start my day from an up-to-date state, but it is very likely I would say "fetch --quiet" in the cron job because I want it to be quiet unless there is an error. I'd appreciate if the Gitmasters at the company sent an e-mail addressed to git-us...@mycompany.xz instead. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/6] strbuf_getwholeline: use getc_unlocked
Jeff King writes: > On Sun, Apr 05, 2015 at 01:27:32AM -0400, Jeff King wrote: > >> On Sun, Apr 05, 2015 at 12:56:14AM -0400, Jeff King wrote: >> >> > The big downside is that our input strings are no longer NUL-clean >> > (reading "foo\0bar\n" would yield just "foo". I doubt that matters in >> > the real world, but it does fail a few of the tests (e.g., t7008 tries >> > to read a list of patterns which includes NUL, and we silently truncate >> > the pattern rather than read in the NUL and barf). >> >> So there is this trick: >> >> diff --git a/strbuf.c b/strbuf.c >> index f319d8d..5ceebb7 100644 >> --- a/strbuf.c >> +++ b/strbuf.c >> @@ -445,12 +445,13 @@ int strbuf_getwholeline(struct strbuf *sb, FILE *fp, >> int term) >> strbuf_reset(sb); >> >> if (term == '\n') { >> +long pos = ftell(fp); >> strbuf_grow(sb, 256); >> if (!fgets(sb->buf, sb->alloc - 1, fp)) { >> strbuf_release(sb); >> return EOF; >> } >> -sb->len = strlen(sb->buf); >> +sb->len = ftell(fp) - pos; >> if (sb->buf[sb->len - 1] == '\n') >> return 0; >> } >> >> but much to my surprise it actually runs slower than the strlen version! The later loop you have "oops, the thing turns out to be longer than we thought, so let's do byte-by-byte" is protected with locking, but this part is not, and that suggests me that the ftell-fgets-ftell sequence we see here may have its own locking cost built-in in the stdio library, too, perhaps? >> It also has a 32-bit overflow issue. There's fgetpos() as an >> alternative, but fpos_t is an opaque type, and we might not be able to >> do arithmetic on it (for that matter, I am not sure if arithmetic is >> strictly guaranteed on ftell() results). POSIX gives us ftello(), which >> returns an off_t. That would probably be fine. > > Actually, scratch that idea. ftell() always returns 0 on a non-seekable > file, so we can't use it in the general case. And that probably explains > the performance difference, too, if it is not keeping its own counter > and relies on lseek(fileno(fp)) or similar. Looked so promising, though ;-) X-<. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: rev-list pretty format behavior
Oliver Runge writes: > I'm using git version 2.4.0-rc1. The same behavior exists in 2.1.0. > > Trying the same with rev-list results in: >> git rev-list --pretty=format:"%h ..." HEAD~3...HEAD > commit 826aed50cbb072d8f159e4c8ba0f9bd3df21a234 > 826aed5 ... > commit 915e44c6357f3bd9d5fa498a201872c4367302d3 > 915e44c ... > commit 067178ed8a7822e6bc88ad606b707fc33658e6fc > 067178e ... This is very much the designed behaviour, I would think. IIRC, the user-format support of "rev-list" was designed so that the scripts can customize the output from "rev-list -v", which was how scripts were expected to read various pieces of information for each commit originally. And the 40-hex commit object name and/or a line that begins with "commit ..." when a user format is used are meant to serve as stable record separator (in that sense, having %H or %h in the userformat given to rev-list is redundant) when these scripts are reading output from "rev-list". A new option to tell "rev-list" that "I am designing an output that is a-line-per-commit with the userformat and do not need the default record separator" or "I will arrange record separator myself" would be an acceptable thing to add, provided if many scripts yet to be written would benefit from such a feature, though. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 3/5] Documentation: add git-log --merges= option and log.merges config. var
Koosha Khajehmoogahi writes: > diff --git a/Documentation/rev-list-options.txt > b/Documentation/rev-list-options.txt > index f620ee4..0bb2390 100644 > --- a/Documentation/rev-list-options.txt > +++ b/Documentation/rev-list-options.txt > @@ -96,12 +96,24 @@ if it is part of the log message. > --remove-empty:: > Stop when a given path disappears from the tree. > > +--merges={show|hide|only}:: > ++ > +-- > +`show`: show both merge and non-merge commits > + > +`hide`: only show non-merge commits; same as `--max-parents=1` > + > +`only`: only show merge commits; same as `--min-parents=2` > + > +If `--merges=` is not specified, default value is `show`. > +-- > ++ > + I am not sure if the "default value is `show`" is something we would even want to mention like this. It does not tell the whole story and may even confuse the users, who did git log --merge git log --max-parent=... but did not say any "--merges=". I think the importat point we want to teach users is that this is an option to use when you want to limit what is output (and by default, we show all but nothing else in the manpage says we hide things, so...). And it would be beneficial to highlight that 'show' is only there to defeat an unusual log.merges setting in users' config. Also the formatting of this part looks rather unusual. I would have expected that these three items to be listed as a true AsciiDoc enumeration, not three hand-crafted enumration-looking separate paragraphs. Taking both points together, we may want to do something more like this, perhaps? --merges={show|hide|only}:: Limit the output by type of commits. `hide`;; Hide merge commits from the output. `only`;; Hide non-merge commits from the output (i.e showing only merge commits). `show`;; Do not hide either merge or non-merge commits. This is primarily useful when the user has non-standard setting of `log.merges` configuration variable that needs to be overriden from the command line. Thanks. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/6] address packed-refs speed regressions
Am 05.04.2015 um 20:52 schrieb Jeff King: On Sun, Apr 05, 2015 at 03:41:39PM +0200, René Scharfe wrote: I wonder if pluggable reference backends could help here. Storing refs in a database table indexed by refname should simplify things. ...this. I think that effort might be better spent on a ref storage format that's more efficient, simpler (with respect to subtle races and such), and could provide other features (e.g., transactional atomicity). Such as a DBMS? :-) Leaving storage details to SQLite or whatever sounds attractive to me because I'm lazy. The big plus side of packed-refs improvements is that they "just work" without worrying about compatibility issues. But ref storage is local, so I'm not sure how big a deal that is in practice. Adding a dependency is a big step, admittedly, so native improvements might be a better fit. There's a chance that we'd run into issues already solved by specialized database engines long ago, though. Short-term, can we avoid the getc()/strbuf_grow() dance e.g. by mapping the packed refs file? What numbers do you get with the following patch? It's about 9% faster than my series + the fgets optimization I posted (or about 25% than using getc). Which is certainly nice, but I was really hoping to just make strbuf_getline faster for all callers, rather than introducing special code for one call-site. Certainly we could generalize the technique (i.e., a struct with the mmap data), but then I feel we are somewhat reinventing stdio. Which is maybe a good thing, because stdio has a lot of rough edges (as seen here), but it does feel a bit like NIH syndrome. Forgot to say: I like your changes. But if strbuf_getline can only be made fast enough beyond that by duplicating stdio buffering then I feel it's better to take a different way. E.g. dropping the requirement to handle NUL chars and basing it on fgets as Junio suggested in his reply to patch 3 sounds good. In any case, the packed refs file seems special enough to receive special treatment. Using mmap would make the most sense if we could also avoid copying lines to a strbuf for parsing, though. René -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [ANNOUNCE] Git Merge Contributors Summit, April 8th, Paris
On Tue, Feb 24, 2015 at 11:09 PM, Jeff King wrote: > I wanted to make one more announcement about this, since a few more > details have been posted at: > > http://git-merge.com/ > > since my last announcement. Specifically, I wanted to call attention to > the contributor's summit on the 8th. Basically, there will be a space > that can hold up to 50 people, it's open only to git (and JGit and > libgit2) devs, and there isn't a planned agenda. So I want to: > > 1. Encourage developers to come. You might meet some folks in person > you've worked with online. And you can see how beautiful we all > are. > > 2. Get people thinking about what they would like to talk about. In > past GitTogethers, it's been a mix of people with prepared things > to talk about, group discussions of areas, and general kibitzing. > We can be spontaneous on the day of the event, but if you have a > topic you want to bring up, you may want to give it some thought > beforehand. > > If you are a git dev and want to come, please RSVP to Chris Kelly > who is organizing the event. If you would like > to come, but finances make it hard (either for travel, or for the > conference fee), please talk to me off-list, and we may be able to help. > > If you have questions, please feel free to ask me, and I'll try to get > answers from the GitHub folks who are organizing the event. > I'll be arriving around 11 am on the 8th, if anyone wants to record something for the GitMinutes podcast [1]. Send me an email directly, or just walk up to me at the conference and say hi! I'll hopefully be hanging around the contributor's summit area with some microphones, but I've been unable to get any feedback from GitHub about whether this is OK, so.. I guess we'll just wing it when I get there. [1] http://www.gitminutes.com/ -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/2] git-p4: Improve client path detection
Luke Diamand wrote on Sun, 05 Apr 2015 20:27:11 +0100 > On 28/03/15 12:28, Vitor Antunes wrote: > > I'm adding a test case for a scenario I was confronted with when using > > branch > > detection and a client view specification. It is possible that the > > implemented > > fix may not cover all possible scenarios, but there is no regression in the > > available tests. > > Vitor, one thing I wondered about with this part of the change: > > -if entry["depotFile"] == depotPath: > +if entry["depotFile"].find(depotPath) >= 0: > > Does this mean that if 'p4 where' produces multiple lines of output that > this will get confused, as it's just going to search for an instance of > depotPath. The reason why I introduced that was because in the test case I implemented (and which reflects a scenario I am confronted with in my workplace) the branches have a base directory that is removed in the client view mapping. As such, we will have a situation where depotPath is //depot/branch1/ while runninng "p4 where" will result in //depot/branch1/base/. To overcome this I used find() instead of a direct comparison. Now that I think about that, I could probably have used the simpler `if depotPath in entry["depotFile"]`... > The example in the Perforce man page for 'p4 where' would trigger this > for example: > > http://www.perforce.com/perforce/r14.2/manuals/cmdref/p4_where.html > > -//a/b/file.txt //client/a/b/file.txt //home/user/root/a/b/file.txt > //a/b/file.txt //client/b/file.txt /home/user/root/b/file.txt These are examples where a simple comparison as was implemented would work. > As an experiment, I hacked git-p4 to always use p4Where rather than > getClientRoot(), which I would have thought ought to work, but while > most of the tests passed, Pete's client-spec torture tests failed. That was exactly my first approach and got to the same conclusion. I would have investigated it further but since I haven't had much free time to invest in solving this problem I decided to implement an intermediary solution that would not introduce any regressions. Vitor -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/6] address packed-refs speed regressions
Am 05.04.2015 um 20:59 schrieb Jeff King: Still, the numbers are promising. Here's are comparisons against for-each-ref on torvalds/linux, which has a 218M packed-refs file: $ time git for-each-ref \ --format='%(objectname) %(refname)' \ refs/remotes/2325298/ | wc -c 44139 real0m1.649s user0m1.332s sys 0m0.304s $ time ~peff/git-quick-list refs/remotes/2325298/ | wc -c 44139 real0m0.012s user0m0.004s sys 0m0.004s Sweet numbers. :-P I'm not familiar with refs.c, but its sheer size alone suggests that it won't be easy to integrate this prototype code there. :-/ René -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH V3] t9814: Guarantee only one source exists in git-p4 copy tests
By using a tree with multiple identical files and allowing copy detection to choose any one of them, the check in the test is unnecessarily complex. We can simplify by: * Modify source file (file2) before copying the file. * Check that only file2 is the source in the output of "p4 filelog". * Remove all "case" statements and replace them with simple tests to check that source is "file2". Signed-off-by: Vitor Antunes Acked-by: Luke Diamand --- t/t9814-git-p4-rename.sh | 46 +++--- 1 file changed, 31 insertions(+), 15 deletions(-) diff --git a/t/t9814-git-p4-rename.sh b/t/t9814-git-p4-rename.sh index 8b9c295..99bb71b 100755 --- a/t/t9814-git-p4-rename.sh +++ b/t/t9814-git-p4-rename.sh @@ -132,6 +132,9 @@ test_expect_success 'detect copies' ' cd "$git" && git config git-p4.skipSubmitEdit true && + echo "file8" >>file2 && + git commit -a -m "Differentiate file2" && + git p4 submit && cp file2 file8 && git add file8 && git commit -a -m "Copy file2 to file8" && @@ -140,6 +143,10 @@ test_expect_success 'detect copies' ' p4 filelog //depot/file8 && p4 filelog //depot/file8 | test_must_fail grep -q "branch from" && + echo "file9" >>file2 && + git commit -a -m "Differentiate file2" && + git p4 submit && + cp file2 file9 && git add file9 && git commit -a -m "Copy file2 to file9" && @@ -149,28 +156,39 @@ test_expect_success 'detect copies' ' p4 filelog //depot/file9 && p4 filelog //depot/file9 | test_must_fail grep -q "branch from" && + echo "file10" >>file2 && + git commit -a -m "Differentiate file2" && + git p4 submit && + echo "file2" >>file2 && cp file2 file10 && git add file2 file10 && git commit -a -m "Modify and copy file2 to file10" && git diff-tree -r -C HEAD && + src=$(git diff-tree -r -C HEAD | sed 1d | sed 2d | cut -f2) && + test "$src" = file2 && git p4 submit && p4 filelog //depot/file10 && - p4 filelog //depot/file10 | grep -q "branch from //depot/file" && + p4 filelog //depot/file10 | grep -q "branch from //depot/file2" && + + echo "file11" >>file2 && + git commit -a -m "Differentiate file2" && + git p4 submit && cp file2 file11 && git add file11 && git commit -a -m "Copy file2 to file11" && git diff-tree -r -C --find-copies-harder HEAD && src=$(git diff-tree -r -C --find-copies-harder HEAD | sed 1d | cut -f2) && - case "$src" in - file2 | file10) : ;; # happy - *) false ;; # not - esac && + test "$src" = file2 && git config git-p4.detectCopiesHarder true && git p4 submit && p4 filelog //depot/file11 && - p4 filelog //depot/file11 | grep -q "branch from //depot/file" && + p4 filelog //depot/file11 | grep -q "branch from //depot/file2" && + + echo "file12" >>file2 && + git commit -a -m "Differentiate file2" && + git p4 submit && cp file2 file12 && echo "some text" >>file12 && @@ -180,15 +198,16 @@ test_expect_success 'detect copies' ' level=$(git diff-tree -r -C --find-copies-harder HEAD | sed 1d | cut -f1 | cut -d" " -f5 | sed "s/C0*//") && test -n "$level" && test "$level" -gt 0 && test "$level" -lt 98 && src=$(git diff-tree -r -C --find-copies-harder HEAD | sed 1d | cut -f2) && - case "$src" in - file10 | file11) : ;; # happy - *) false ;; # not - esac && + test "$src" = file2 && git config git-p4.detectCopies $(($level + 2)) && git p4 submit && p4 filelog //depot/file12 && p4 filelog //depot/file12 | test_must_fail grep -q "branch from" && + echo "file13" >>file2 && + git commit -a -m "Differentiate file2" && + git p4 submit && + cp file2 file13 && echo "different text" >>file13 && git add file13 && @@ -197,14 +216,11 @@ test_expect_success 'detect copies' ' level=$(git diff-tree -r -C --find-copies-harder HEAD | sed 1d | cut -f1 | cut -d" " -f5 | sed "s/C0*//") && test -n "$level" && test "$level" -gt 2 && test "$level" -lt 100 && src=$(git diff-tree -r -C --find-copies-harder HEAD | sed 1d | cut
Re: [PATCH 4/6] strbuf: add an optimized 1-character strbuf_grow
On Sat, Apr 4, 2015 at 9:11 PM, Jeff King wrote: > We have to call strbuf_grow anytime we are going to add data > to a strbuf. In most cases, it's a noop (since we grow the > buffer aggressively), and the cost of the function call and > size check is dwarfed by the actual buffer operation. > > For a tight loop of single-character additions, though, this > overhead is noticeable. Furthermore, the single-character > case is much easier to check; since the "extra" parameter is > 1, we can do it without worrying about overflow. > > This patch adds a simple inline function for checking > single-character growth. For the growth case, it just calls > into the regular strbuf_grow(). This is redundant, as > strbuf_grow will check again whether we need to grow. But it > keeps our inline code simple, and most calls will not need > to grow, so it's OK to treat this as a rare "slow path". > > We apply the new function to strbuf_getwholeline. [...] > > Signed-off-by: Jeff King > --- > diff --git a/strbuf.c b/strbuf.c > index af2bad4..2facd5f 100644 > --- a/strbuf.c > +++ b/strbuf.c > @@ -445,7 +445,7 @@ int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int > term) > strbuf_reset(sb); > flockfile(fp); > while ((ch = getc_unlocked(fp)) != EOF) { > - strbuf_grow(sb, 1); > + strbuf_grow_ch(sb); strbuf_grow_ch() seems overly special-case. What about instead taking advantage of inline strbuf_avail() to do something like this? if (!strbuf_avail()) strbuf_grow(sb, 1); (Minor tangent: The 1 is still slightly magical and potentially confusing for someone who doesn't know that the buffer is grown aggressively, so changing it to a larger number might make it more obvious to the casual reader that the buffer is in fact not being grown on every iteration.) > sb->buf[sb->len++] = ch; > if (ch == term) > break; > diff --git a/strbuf.h b/strbuf.h > index 1883494..ef41151 100644 > --- a/strbuf.h > +++ b/strbuf.h > @@ -137,6 +137,15 @@ static inline size_t strbuf_avail(const struct strbuf > *sb) > */ > extern void strbuf_grow(struct strbuf *, size_t); > > +/* > + * An optimized version of strbuf_grow() for a single character. > + */ > +static inline void strbuf_grow_ch(struct strbuf *sb) > +{ > + if (!sb->alloc || sb->alloc - 1 <= sb->len) > + strbuf_grow(sb, 1); > +} > + > /** > * Set the length of the buffer to a given value. This function does *not* > * allocate new memory, so you should not perform a `strbuf_setlen()` to a > -- > 2.4.0.rc0.363.gf9f328b -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Why can't I stash submodule changes?
I’m having trouble understanding why I cannot stash changes to a submodule. When adding a submodule to a repository (`git submodule add ./sub-repo`), I can then run `git stash` and `git stash pop` with expected results—the submodule disappears and reappears in the working tree. However, when I try stashing an update to a submodule, `git stash` reports “No local changes to save”. The following shell script illustrates this behavior: # Create repo mkdir test-repo cd test-repo git init git commit --allow-empty -m "Initial commit" # Create submodule mkdir sub-repo cd sub-repo git init git commit --allow-empty -m "Initial commit" cd - # Add submodule git submodule add ./sub-repo git commit -m "Add submodule" # Modify submodule cd sub-repo touch foo git add foo git commit -m "Submodule changed" cd - # Stash submodule change git stash # <---Displays "No local changes to save” I’m trying to wrap my head around why this is the current behavior, as I suspect this is intentional but it seems unexpected. If anyone can shed any light on this, I would really appreciate it! Thanks, Shane -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: gitk drawing bug
On Fri, Apr 03, 2015 at 09:28:00PM -0400, Martin d'Anjou wrote: > On 15-04-03 07:05 PM, Alex Henrie wrote: > >2015-02-18 12:27 GMT-07:00 Martin d'Anjou : > >>It appears I have uncovered inconsistent behaviour in gitk. Looks like > >>a bug. I have a picture here: > >>https://docs.google.com/document/d/19TTzGD94B9EEIrVU5mRMjfJFvF5Ar3MlPblRJfP5OdQ/edit?usp=sharing > >> > >>Essentially, when I hit shift-F5, it sometimes draw the history > >>differently (still valid, but drawn differently). There is no change > >>in the repository between the shift-F5 keystrokes. That's not a bug, it's a consequence of the fact that gitk is designed to be fast. It only lays out as much of the graph is visible plus a little more, not the whole graph, and it doesn't use any global analysis. The reason for that is speed. Gitk is usable on a repository with half a million commits, such as the linux kernel, and to achieve that we can't afford to do wait until we have all the commits read in and then do some computation over the whole topology; it all has to be done incrementally. Also, the underlying git log sometimes gives gitk a parent commit before one of its children, and when that happens the topology has to be modified and thus the graph does too, if any topology that has already been drawn gets modified. As long as the graph correctly shows the relationships between commits, it has achieved its purpose. If you (or anyone) can come up with improvements that make it look nicer, that's great, and I'll consider them as long as they don't slow down gitk on large repositories to any noticeable extent. Paul. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] gitk: Remove mc parameter from proc show_error
On Thu, Apr 02, 2015 at 03:05:35PM -0600, Alex Henrie wrote: > This partially reverts commit 8d849957d81fc0480a52570d66cc3c2a688ecb1b. ... and brings back the bug that 8d849957d81f solves, as far as I can see. If that's not the case then you need to explain that in the patch description. Paul. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] gitk: Rearrange window title to be more conventional.
On Mon, Mar 23, 2015 at 10:18:16AM -0400, Marc Branchaud wrote: > Signed-off-by: Marc Branchaud Thanks, applied. Paul. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH 2/2] gitk: Show the rev(s) the user specified on the command line in the window title.
On Tue, Jan 06, 2015 at 12:52:00PM -0500, Marc Branchaud wrote: > Signed-off-by: Marc Branchaud > --- > > I often open multiple gitk windows in the same working directory to examine > other branches or refs in the repo. This change allows me to distinguish > which window is showing what. > > This is an RFC because it doesn't behave great with views. I don't use views > at all, so this doesn't bother me. I'm not too sure what should be displayed > if the user changes views -- probably the view name, but I'm not sure how to > get a that in the code. The view name is in $viewname($curview). If that is "Command line" you probably want to show some selected command line arguments. Using $vrevs($curview) seems about right, though it will contain "--gitk-symmetric-diff-marker" in some situations, which is an internal thing that we don't want to show externally. You should probably filter it out. The patch will need a proper description before I can apply it, of course. Paul. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] gitk: Fix bad English grammar "Matches none Commit Info"
On Thu, Apr 02, 2015 at 03:05:06PM -0600, Alex Henrie wrote: > Signed-off-by: Alex Henrie > --- > gitk | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/gitk b/gitk > index 9a2daf3..30fcd30 100755 > --- a/gitk > +++ b/gitk > @@ -4066,7 +4066,7 @@ set known_view_options { > {committer t15 . "--committer=*" {mc "Committer:"}} > {loginfo t15 .. "--grep=*" {mc "Commit Message:"}} > {allmatch b.. "--all-match"{mc "Matches all Commit Info > criteria"}} > -{igrep b.. "--invert-grep" {mc "Matches none Commit Info > criteria"}} > +{igrep b.. "--invert-grep" {mc "Matches no Commit Info > criteria"}} Thanks, applied. Paul. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] gitk: sv.po: Update Swedish translation (305t0f0u)
On Fri, Mar 27, 2015 at 10:34:25AM +0100, Peter Krefting wrote: > Please find attached (for text encoding reasons) an update to the Swedish > translation for gitk. Thanks, applied. Paul. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/6] address packed-refs speed regressions
On Mon, Apr 06, 2015 at 12:39:15AM +0200, René Scharfe wrote: > >...this. I think that effort might be better spent on a ref storage > >format that's more efficient, simpler (with respect to subtle races and > >such), and could provide other features (e.g., transactional atomicity). > > Such as a DBMS? :-) Leaving storage details to SQLite or whatever sounds > attractive to me because I'm lazy. Exactly. Though I think some folks were worried about the extra dependency (e.g., I think SQLite is hard for JGit, because there's no pure-java implementation, which makes Eclipse unhappy). With pluggable backends we can make something like a SQLite backend optional. I.e., use it if you want the benefits and can accept the portability downsides. But that also risks fracturing the community, and people on the "old" format being left behind. > Forgot to say: I like your changes. But if strbuf_getline can only be made > fast enough beyond that by duplicating stdio buffering then I feel it's > better to take a different way. E.g. dropping the requirement to handle NUL > chars and basing it on fgets as Junio suggested in his reply to patch 3 > sounds good. Yeah, though we probably need to either audit the callers, or provide a flag for each caller to turn on the speed-over-NULs behavior. I'll look into that, but it may not be this week, as I'll be traveling starting tomorrow. > In any case, the packed refs file seems special enough to receive special > treatment. Using mmap would make the most sense if we could also avoid > copying lines to a strbuf for parsing, though. I had a similar thought. Below is hacky patch, on top of your mmap patch, that does this. It does shave off another 300ms (around 5%). I think we may be getting into a useless area of micro-optimizing here, though. The results are noticeable on this ridiculous repository, but probably not so much on real ones. The low-hanging fruit (e.g., dropping time in half by using getc_unlocked) seems to provide the most bang for the buck. --- diff --git a/refs.c b/refs.c index 144255f..708b49b 100644 --- a/refs.c +++ b/refs.c @@ -334,27 +334,40 @@ static int refname_is_safe(const char *refname) return 1; } -static struct ref_entry *create_ref_entry(const char *refname, - const unsigned char *sha1, int flag, - int check_name) +static struct ref_entry *create_ref_entry_len(const char *refname, size_t len, + const unsigned char *sha1, int flag, + int check_name) { - int len; struct ref_entry *ref; + /* +* allocate before checking, since the check functions require +* a NUL-terminated refname. And since we die() anyway if +* the check fails, the overhead of the extra malloc is negligible +*/ + ref = xmalloc(sizeof(struct ref_entry) + len + 1); + hashcpy(ref->u.value.sha1, sha1); + hashclr(ref->u.value.peeled); + memcpy(ref->name, refname, len); + ref->name[len] = '\0'; + ref->flag = flag; + if (check_name && check_refname_format(refname, REFNAME_ALLOW_ONELEVEL)) die("Reference has invalid format: '%s'", refname); if (!check_name && !refname_is_safe(refname)) die("Reference has invalid name: '%s'", refname); - len = strlen(refname) + 1; - ref = xmalloc(sizeof(struct ref_entry) + len); - hashcpy(ref->u.value.sha1, sha1); - hashclr(ref->u.value.peeled); - memcpy(ref->name, refname, len); - ref->flag = flag; return ref; } +static struct ref_entry *create_ref_entry(const char *refname, +const unsigned char *sha1, int flag, +int check_name) +{ + return create_ref_entry_len(refname, strlen(refname), sha1, flag, + check_name); +} + static void clear_ref_dir(struct ref_dir *dir); static void free_ref_entry(struct ref_entry *entry) @@ -1095,7 +1108,9 @@ static const char PACKED_REFS_HEADER[] = * Return a pointer to the refname within the line (null-terminated), * or NULL if there was a problem. */ -static const char *parse_ref_line(struct strbuf *line, unsigned char *sha1) +static const char *parse_ref_line(const char *line, int len, + unsigned char *sha1, + size_t *refname_len) { const char *ref; @@ -1107,22 +1122,22 @@ static const char *parse_ref_line(struct strbuf *line, unsigned char *sha1) * +1 (space in between hex and name) * +1 (newline at the end of the line) */ - if (line->len <= 42) + if (len <= 42) return NULL; - if (get_sha1_hex(line->buf, sha1) < 0) + if (get_sha1_hex(line, sha1) < 0) return
Re: [PATCH 4/6] strbuf: add an optimized 1-character strbuf_grow
On Sun, Apr 05, 2015 at 10:13:21PM -0400, Eric Sunshine wrote: > > - strbuf_grow(sb, 1); > > + strbuf_grow_ch(sb); > > strbuf_grow_ch() seems overly special-case. What about instead taking > advantage of inline strbuf_avail() to do something like this? > > if (!strbuf_avail()) > strbuf_grow(sb, 1); Thanks, I somehow missed that function (despite it being a few line above the one I added!). I agree that strbuf_avail is a much better generic interface, and it turns out to be just as fast (actually, a tiny bit faster in my tests). I'll use that in the re-roll. > (Minor tangent: The 1 is still slightly magical and potentially > confusing for someone who doesn't know that the buffer is grown > aggressively, so changing it to a larger number might make it more > obvious to the casual reader that the buffer is in fact not being > grown on every iteration.) I agree this is slightly confusing (and I had to double-check how strbuf_grow worked while writing this series). OTOH, this is not so much about the "1" here as about how strbufs work. We care about the amortized asymptotic cost. strbuf_add() has the same issue; we add more bytes in each chunk, but we would still want to make sure that there is a sub-linear relationship between the number of adds and the number of allocations). -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2] gitk: Remove mc parameter from proc show_error
This is a better fix for 8d849957d81fc0480a52570d66cc3c2a688ecb1b. All that was required to fix the original issue was to remove the extra mc call, i.e. change [mc "Sorry, gitk cannot run..."] to simply "Sorry, gitk cannot run..." Changing the signature of proc show_error was unnecessary and introduced two new bugs: It made "OK" untranslatable and "mc" translatable when the opposite should be true. This new fix makes the string "OK" translatable and the string "mc" not translatable, while leaving the string "Sorry, gitk cannot run..." not translatable. It will take effect the next time `make update-po` is run. Signed-off-by: Alex Henrie --- gitk | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/gitk b/gitk index 30fcd30..096389f 100755 --- a/gitk +++ b/gitk @@ -1894,13 +1894,13 @@ proc make_transient {window origin} { } } -proc show_error {w top msg {mc mc}} { +proc show_error {w top msg} { global NS if {![info exists NS]} {set NS ""} if {[wm state $top] eq "withdrawn"} { wm deiconify $top } message $w.m -text $msg -justify center -aspect 400 pack $w.m -side top -fill x -padx 20 -pady 20 -${NS}::button $w.ok -default active -text [$mc OK] -command "destroy $top" +${NS}::button $w.ok -default active -text [mc OK] -command "destroy $top" pack $w.ok -side bottom -fill x bind $top "grab $top; focus $top" bind $top "destroy $top" @@ -12011,7 +12011,7 @@ proc get_path_encoding {path} { # First check that Tcl/Tk is recent enough if {[catch {package require Tk 8.4} err]} { show_error {} . "Sorry, gitk cannot run with this version of Tcl/Tk.\n\ -Gitk requires at least Tcl/Tk 8.4." list +Gitk requires at least Tcl/Tk 8.4." exit 1 } -- 2.3.5 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html