Re: [PATCH] builtin/add: add detail to a 'cannot chmod' error message
Ramsay Joneswrites: > In addition to adding the missing newline, add the x-ecutable bit > 'mode change' character to the error message. This message now has > the same form as similar messages output by 'update-index'. > > Signed-off-by: Ramsay Jones > --- > > Hi Junio, > > This is v2 of the earlier "add a newline" patch. Thanks! Thanks; here is me reminding myself to apply this with Jonathan's reviewed-by in the morning. > > ATB, > Ramsay Jones > > builtin/add.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/builtin/add.c b/builtin/add.c > index e888fb8c5..5d5773d5c 100644 > --- a/builtin/add.c > +++ b/builtin/add.c > @@ -32,7 +32,7 @@ struct update_callback_data { > int add_errors; > }; > > -static void chmod_pathspec(struct pathspec *pathspec, int force_mode) > +static void chmod_pathspec(struct pathspec *pathspec, char flip) > { > int i; > > @@ -42,8 +42,8 @@ static void chmod_pathspec(struct pathspec *pathspec, int > force_mode) > if (pathspec && !ce_path_match(ce, pathspec, NULL)) > continue; > > - if (chmod_cache_entry(ce, force_mode) < 0) > - fprintf(stderr, "cannot chmod '%s'", ce->name); > + if (chmod_cache_entry(ce, flip) < 0) > + fprintf(stderr, "cannot chmod %cx '%s'\n", flip, > ce->name); > } > }
Re: [PATCH] t1200: remove t1200-tutorial.sh
Jonathan Niederwrites: > Correction: the tutorial is now called gitcore-tutorial and mostly > survives. A search for -p --root failed because of v1.5.5.1~19^2 > (core-tutorial.txt: Fix showing the current behaviour, 2008-04-10). Yeah, I was wondering why neither of you find it while reading your exchanges on a phone. > That said, the conclusion that this test has mostly bitrotted as far > as its original purpose goes still applies. > > An alternative method of addressing the goal you described would be to > fuzz object-id shaped things out of the sample output. I don't have a > strong preference, given how little this test contributes to coverage > (as you mentioned) and how difficult it is to make it continue to > match the documentation it is trying to test. > > Thanks and sorry for the confusion, OK, so can one of you give the final version of the patch with updated description? It _would_ be nice to have an executable tutorial/documentation or docs with built-in tests like some other systems have, but I realize that it would be a different matter. We'd need some toolchain to mark up tests in our tutorial, extract and run them, and compare the result, which we currently do not have and it won't fit under our test framework very well anyway. Thanks.
Re: [PATCH] builtin/add: add a missing newline to an stderr message
Jonathan Niederwrites: > I don't believe the force_mode without an 'x' provides a clear signal > to the end user. Perhaps you meant %cx? Indeed you are right. I think I saw Ramsay's v2 that has the 'x', so let's use that version. Thanks.
Re: [PATCH] sha1_file: avoid comparison if no packed hash matches the first byte
Jeff Kingwrites: > On Tue, Aug 08, 2017 at 06:52:31PM -0400, Jeff King wrote: > >> > Interesting. I see that we still have the conditional code to call >> > out to sha1-lookup.c::sha1_entry_pos(). Do we need a similar change >> > over there, I wonder? Alternatively, as we have had the experimental >> > sha1-lookup.c::sha1_entry_pos() long enough without anybody using it, >> > perhaps we should write it off as a failed experiment and retire it? >> >> There is also sha1_pos(), which seems to have the same problem (and is >> used in several places). > > Actually, I take it back. The problem happens when we enter the loop > with no entries to look at. But both sha1_pos() and sha1_entry_pos() > return early before hitting their do-while loops in that case. Ah, I was not looking at that part of the code. Thanks. I still wonder if we want to retire that conditional invocation of sha1_entry_pos(), though.
Re: [PATCH] t4062: stop using repetition in regex
René Scharfewrites: > Am 09.08.2017 um 00:26 schrieb Junio C Hamano: >> ... but in the meantime, I think replacing the test with "0$" to >> force the scanner to find either the end of line or the end of the >> buffer may be a good workaround. We do not have to care how many of >> random bytes are in front of the last "0" in order to ensure that >> the regexec_buf() does not overstep to 4097th byte, while seeing >> that regexec() that does not know how long the haystack is has to do >> so, no? > > Our regexec() calls strlen() (see my other reply). > > Using "0$" looks like the best option to me. Yeah, it seems that way. If we want to be close/faithful to the original, we could do "^0*$", but the part that is essential to trigger the old bug is not the "we have many zeroes" (or "we have 4096 zeroes") part, but "zero is at the end of the string" part, so "0$" would be the minimal pattern that also would work for OBSD. Dscho?
[PATCH v2 10/25] pack: move install_packed_git()
Signed-off-by: Jonathan Tan--- cache.h | 1 - pack.h | 4 ++-- packfile.c | 11 ++- sha1_file.c | 9 - 4 files changed, 12 insertions(+), 13 deletions(-) diff --git a/cache.h b/cache.h index bf93477e8..41562dc0b 100644 --- a/cache.h +++ b/cache.h @@ -1611,7 +1611,6 @@ extern void (*report_garbage)(unsigned seen_bits, const char *path); extern void prepare_packed_git(void); extern void reprepare_packed_git(void); -extern void install_packed_git(struct packed_git *pack); /* * Give a rough count of objects in the repository. This sacrifices accuracy diff --git a/pack.h b/pack.h index c1f3ff32d..576c4fc7c 100644 --- a/pack.h +++ b/pack.h @@ -124,8 +124,6 @@ extern char *sha1_pack_name(const unsigned char *sha1); */ extern char *sha1_pack_index_name(const unsigned char *sha1); -extern unsigned int pack_open_fds; - extern void pack_report(void); /* @@ -152,4 +150,6 @@ extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t extern void unuse_pack(struct pack_window **); extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local); +extern void install_packed_git(struct packed_git *pack); + #endif diff --git a/packfile.c b/packfile.c index efe0ed3e8..4eb65e460 100644 --- a/packfile.c +++ b/packfile.c @@ -28,7 +28,7 @@ static unsigned int pack_used_ctr; static unsigned int pack_mmap_calls; static unsigned int peak_pack_open_windows; static unsigned int pack_open_windows; -unsigned int pack_open_fds; +static unsigned int pack_open_fds; static unsigned int pack_max_fds; static size_t peak_pack_mapped; static size_t pack_mapped; @@ -658,3 +658,12 @@ struct packed_git *add_packed_git(const char *path, size_t path_len, int local) hashclr(p->sha1); return p; } + +void install_packed_git(struct packed_git *pack) +{ + if (pack->pack_fd != -1) + pack_open_fds++; + + pack->next = packed_git; + packed_git = pack; +} diff --git a/sha1_file.c b/sha1_file.c index 7f12b1ee0..b956ca0c9 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -717,15 +717,6 @@ void *xmmap(void *start, size_t length, return ret; } -void install_packed_git(struct packed_git *pack) -{ - if (pack->pack_fd != -1) - pack_open_fds++; - - pack->next = packed_git; - packed_git = pack; -} - void (*report_garbage)(unsigned seen_bits, const char *path); static void report_helper(const struct string_list *list, -- 2.14.0.434.g98096fd7a8-goog
[PATCH v2 06/25] pack: move pack-closing functions
The function close_pack_fd() needs to be temporarily made global. Its scope will be restored to static in a subsequent commit. Signed-off-by: Jonathan Tan--- builtin/am.c| 1 + builtin/clone.c | 1 + builtin/fetch.c | 1 + builtin/merge.c | 1 + cache.h | 8 pack.h | 9 + packfile.c | 54 ++ sha1_file.c | 55 --- 8 files changed, 67 insertions(+), 63 deletions(-) diff --git a/builtin/am.c b/builtin/am.c index c973bd96d..c38dd10a3 100644 --- a/builtin/am.c +++ b/builtin/am.c @@ -31,6 +31,7 @@ #include "mailinfo.h" #include "apply.h" #include "string-list.h" +#include "pack.h" /** * Returns 1 if the file is empty or does not exist, 0 otherwise. diff --git a/builtin/clone.c b/builtin/clone.c index 08b5cc433..53410a45d 100644 --- a/builtin/clone.c +++ b/builtin/clone.c @@ -25,6 +25,7 @@ #include "remote.h" #include "run-command.h" #include "connected.h" +#include "pack.h" /* * Overall FIXMEs: diff --git a/builtin/fetch.c b/builtin/fetch.c index c87e59f3b..196a3bfc4 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -17,6 +17,7 @@ #include "connected.h" #include "argv-array.h" #include "utf8.h" +#include "pack.h" static const char * const builtin_fetch_usage[] = { N_("git fetch [] [ [...]]"), diff --git a/builtin/merge.c b/builtin/merge.c index 900bafdb4..9cff4b276 100644 --- a/builtin/merge.c +++ b/builtin/merge.c @@ -32,6 +32,7 @@ #include "gpg-interface.h" #include "sequencer.h" #include "string-list.h" +#include "pack.h" #define DEFAULT_TWOHEAD (1<<0) #define DEFAULT_OCTOPUS (1<<1) diff --git a/cache.h b/cache.h index 5d6839525..25a21a61f 100644 --- a/cache.h +++ b/cache.h @@ -1637,15 +1637,7 @@ extern int odb_mkstemp(struct strbuf *template, const char *pattern); */ extern int odb_pack_keep(const char *name); -/* - * munmap the index file for the specified packfile (if it is - * currently mmapped). - */ -extern void close_pack_index(struct packed_git *); - extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, unsigned long *); -extern void close_pack_windows(struct packed_git *); -extern void close_all_packs(void); extern void unuse_pack(struct pack_window **); extern void clear_delta_base_cache(void); extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local); diff --git a/pack.h b/pack.h index c16220586..fd4668528 100644 --- a/pack.h +++ b/pack.h @@ -147,4 +147,13 @@ extern int unuse_one_window(struct packed_git *current); extern void release_pack_memory(size_t); +extern void close_pack_windows(struct packed_git *); +extern int close_pack_fd(struct packed_git *); +/* + * munmap the index file for the specified packfile (if it is + * currently mmapped). + */ +extern void close_pack_index(struct packed_git *); +extern void close_all_packs(void); + #endif diff --git a/packfile.c b/packfile.c index 8daa74ad1..c8e2dbdee 100644 --- a/packfile.c +++ b/packfile.c @@ -257,3 +257,57 @@ void release_pack_memory(size_t need) while (need >= (cur - pack_mapped) && unuse_one_window(NULL)) ; /* nothing */ } + +void close_pack_windows(struct packed_git *p) +{ + while (p->windows) { + struct pack_window *w = p->windows; + + if (w->inuse_cnt) + die("pack '%s' still has open windows to it", + p->pack_name); + munmap(w->base, w->len); + pack_mapped -= w->len; + pack_open_windows--; + p->windows = w->next; + free(w); + } +} + +int close_pack_fd(struct packed_git *p) +{ + if (p->pack_fd < 0) + return 0; + + close(p->pack_fd); + pack_open_fds--; + p->pack_fd = -1; + + return 1; +} + +void close_pack_index(struct packed_git *p) +{ + if (p->index_data) { + munmap((void *)p->index_data, p->index_size); + p->index_data = NULL; + } +} + +static void close_pack(struct packed_git *p) +{ + close_pack_windows(p); + close_pack_fd(p); + close_pack_index(p); +} + +void close_all_packs(void) +{ + struct packed_git *p; + + for (p = packed_git; p; p = p->next) + if (p->do_not_close) + die("BUG: want to close pack marked 'do-not-close'"); + else + close_pack(p); +} diff --git a/sha1_file.c b/sha1_file.c index 644876e4e..e2927244f 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -717,53 +717,6 @@ void *xmmap(void *start, size_t length, return ret; } -void close_pack_windows(struct packed_git *p) -{ - while (p->windows) { - struct pack_window *w = p->windows; - - if (w->inuse_cnt) - die("pack '%s' still has open windows to
[PATCH v2 11/25] pack: move {,re}prepare_packed_git and approximate_object_count
Signed-off-by: Jonathan Tan--- builtin/gc.c | 1 + cache.h| 15 http-backend.c | 1 + pack.h | 15 packfile.c | 216 + path.c | 1 + server-info.c | 1 + sha1_file.c| 214 8 files changed, 235 insertions(+), 229 deletions(-) diff --git a/builtin/gc.c b/builtin/gc.c index e6b84475a..f4fe023d3 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -19,6 +19,7 @@ #include "sigchain.h" #include "argv-array.h" #include "commit.h" +#include "pack.h" #define FAILED_RUN "failed to run %s" diff --git a/cache.h b/cache.h index 41562dc0b..f020dfade 100644 --- a/cache.h +++ b/cache.h @@ -1603,21 +1603,6 @@ struct pack_entry { struct packed_git *p; }; -/* A hook to report invalid files in pack directory */ -#define PACKDIR_FILE_PACK 1 -#define PACKDIR_FILE_IDX 2 -#define PACKDIR_FILE_GARBAGE 4 -extern void (*report_garbage)(unsigned seen_bits, const char *path); - -extern void prepare_packed_git(void); -extern void reprepare_packed_git(void); - -/* - * Give a rough count of objects in the repository. This sacrifices accuracy - * for speed. - */ -unsigned long approximate_object_count(void); - extern struct packed_git *find_sha1_pack(const unsigned char *sha1, struct packed_git *packs); diff --git a/http-backend.c b/http-backend.c index 519025d2c..12f7d421f 100644 --- a/http-backend.c +++ b/http-backend.c @@ -9,6 +9,7 @@ #include "string-list.h" #include "url.h" #include "argv-array.h" +#include "pack.h" static const char content_type[] = "Content-Type"; static const char content_length[] = "Content-Length"; diff --git a/pack.h b/pack.h index 576c4fc7c..cad5ed488 100644 --- a/pack.h +++ b/pack.h @@ -152,4 +152,19 @@ extern struct packed_git *add_packed_git(const char *path, size_t path_len, int extern void install_packed_git(struct packed_git *pack); +/* A hook to report invalid files in pack directory */ +#define PACKDIR_FILE_PACK 1 +#define PACKDIR_FILE_IDX 2 +#define PACKDIR_FILE_GARBAGE 4 +extern void (*report_garbage)(unsigned seen_bits, const char *path); + +extern void prepare_packed_git(void); +extern void reprepare_packed_git(void); + +/* + * Give a rough count of objects in the repository. This sacrifices accuracy + * for speed. + */ +unsigned long approximate_object_count(void); + #endif diff --git a/packfile.c b/packfile.c index 4eb65e460..a517172f7 100644 --- a/packfile.c +++ b/packfile.c @@ -1,6 +1,8 @@ #include "cache.h" #include "mru.h" #include "pack.h" +#include "dir.h" +#include "mergesort.h" char *odb_pack_name(struct strbuf *buf, const unsigned char *sha1, @@ -667,3 +669,217 @@ void install_packed_git(struct packed_git *pack) pack->next = packed_git; packed_git = pack; } + +void (*report_garbage)(unsigned seen_bits, const char *path); + +static void report_helper(const struct string_list *list, + int seen_bits, int first, int last) +{ + if (seen_bits == (PACKDIR_FILE_PACK|PACKDIR_FILE_IDX)) + return; + + for (; first < last; first++) + report_garbage(seen_bits, list->items[first].string); +} + +static void report_pack_garbage(struct string_list *list) +{ + int i, baselen = -1, first = 0, seen_bits = 0; + + if (!report_garbage) + return; + + string_list_sort(list); + + for (i = 0; i < list->nr; i++) { + const char *path = list->items[i].string; + if (baselen != -1 && + strncmp(path, list->items[first].string, baselen)) { + report_helper(list, seen_bits, first, i); + baselen = -1; + seen_bits = 0; + } + if (baselen == -1) { + const char *dot = strrchr(path, '.'); + if (!dot) { + report_garbage(PACKDIR_FILE_GARBAGE, path); + continue; + } + baselen = dot - path + 1; + first = i; + } + if (!strcmp(path + baselen, "pack")) + seen_bits |= 1; + else if (!strcmp(path + baselen, "idx")) + seen_bits |= 2; + } + report_helper(list, seen_bits, first, list->nr); +} + +static void prepare_packed_git_one(char *objdir, int local) +{ + struct strbuf path = STRBUF_INIT; + size_t dirnamelen; + DIR *dir; + struct dirent *de; + struct string_list garbage = STRING_LIST_INIT_DUP; + + strbuf_addstr(, objdir); + strbuf_addstr(, "/pack"); + dir = opendir(path.buf); + if (!dir) { + if (errno != ENOENT) + error_errno("unable to open
[PATCH v2 12/25] pack: move unpack_object_header()
Signed-off-by: Jonathan Tan--- cache.h | 1 - pack.h | 2 ++ packfile.c | 25 + sha1_file.c | 25 - 4 files changed, 27 insertions(+), 26 deletions(-) diff --git a/cache.h b/cache.h index f020dfade..9c70759a6 100644 --- a/cache.h +++ b/cache.h @@ -1661,7 +1661,6 @@ extern off_t find_pack_entry_one(const unsigned char *sha1, struct packed_git *) extern int is_pack_valid(struct packed_git *); extern void *unpack_entry(struct packed_git *, off_t, enum object_type *, unsigned long *); -extern unsigned long unpack_object_header_buffer(const unsigned char *buf, unsigned long len, enum object_type *type, unsigned long *sizep); extern unsigned long get_size_from_delta(struct packed_git *, struct pack_window **, off_t); extern int unpack_object_header(struct packed_git *, struct pack_window **, off_t *, unsigned long *); diff --git a/pack.h b/pack.h index cad5ed488..4a7f88a38 100644 --- a/pack.h +++ b/pack.h @@ -167,4 +167,6 @@ extern void reprepare_packed_git(void); */ unsigned long approximate_object_count(void); +extern unsigned long unpack_object_header_buffer(const unsigned char *buf, unsigned long len, enum object_type *type, unsigned long *sizep); + #endif diff --git a/packfile.c b/packfile.c index a517172f7..6e4f1c6e3 100644 --- a/packfile.c +++ b/packfile.c @@ -883,3 +883,28 @@ void reprepare_packed_git(void) prepare_packed_git_run_once = 0; prepare_packed_git(); } + +unsigned long unpack_object_header_buffer(const unsigned char *buf, + unsigned long len, enum object_type *type, unsigned long *sizep) +{ + unsigned shift; + unsigned long size, c; + unsigned long used = 0; + + c = buf[used++]; + *type = (c >> 4) & 7; + size = c & 15; + shift = 4; + while (c & 0x80) { + if (len <= used || bitsizeof(long) <= shift) { + error("bad object header"); + size = used = 0; + break; + } + c = buf[used++]; + size += (c & 0x7f) << shift; + shift += 7; + } + *sizep = size; + return used; +} diff --git a/sha1_file.c b/sha1_file.c index bbce60f1c..1f4b4ba2c 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -913,31 +913,6 @@ void *map_sha1_file(const unsigned char *sha1, unsigned long *size) return map_sha1_file_1(NULL, sha1, size); } -unsigned long unpack_object_header_buffer(const unsigned char *buf, - unsigned long len, enum object_type *type, unsigned long *sizep) -{ - unsigned shift; - unsigned long size, c; - unsigned long used = 0; - - c = buf[used++]; - *type = (c >> 4) & 7; - size = c & 15; - shift = 4; - while (c & 0x80) { - if (len <= used || bitsizeof(long) <= shift) { - error("bad object header"); - size = used = 0; - break; - } - c = buf[used++]; - size += (c & 0x7f) << shift; - shift += 7; - } - *sizep = size; - return used; -} - static int unpack_sha1_short_header(git_zstream *stream, unsigned char *map, unsigned long mapsize, void *buffer, unsigned long bufsiz) -- 2.14.0.434.g98096fd7a8-goog
[PATCH v2 09/25] pack: move add_packed_git()
Signed-off-by: Jonathan Tan--- cache.h | 1 - connected.c | 1 + pack.h | 1 + packfile.c | 53 + sha1_file.c | 61 - 5 files changed, 55 insertions(+), 62 deletions(-) diff --git a/cache.h b/cache.h index 4812f3a63..bf93477e8 100644 --- a/cache.h +++ b/cache.h @@ -1638,7 +1638,6 @@ extern int odb_mkstemp(struct strbuf *template, const char *pattern); extern int odb_pack_keep(const char *name); extern void clear_delta_base_cache(void); -extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local); /* * Make sure that a pointer access into an mmap'd index file is within bounds, diff --git a/connected.c b/connected.c index 136c2ac16..3e3f0148c 100644 --- a/connected.c +++ b/connected.c @@ -3,6 +3,7 @@ #include "sigchain.h" #include "connected.h" #include "transport.h" +#include "pack.h" /* * If we feed all the commits we want to verify to this command diff --git a/pack.h b/pack.h index 3876e9ae6..c1f3ff32d 100644 --- a/pack.h +++ b/pack.h @@ -150,5 +150,6 @@ extern int open_packed_git(struct packed_git *p); extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, unsigned long *); extern void unuse_pack(struct pack_window **); +extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local); #endif diff --git a/packfile.c b/packfile.c index 93526ea7b..efe0ed3e8 100644 --- a/packfile.c +++ b/packfile.c @@ -605,3 +605,56 @@ void unuse_pack(struct pack_window **w_cursor) *w_cursor = NULL; } } + +static void try_to_free_pack_memory(size_t size) +{ + release_pack_memory(size); +} + +struct packed_git *add_packed_git(const char *path, size_t path_len, int local) +{ + static int have_set_try_to_free_routine; + struct stat st; + size_t alloc; + struct packed_git *p; + + if (!have_set_try_to_free_routine) { + have_set_try_to_free_routine = 1; + set_try_to_free_routine(try_to_free_pack_memory); + } + + /* +* Make sure a corresponding .pack file exists and that +* the index looks sane. +*/ + if (!strip_suffix_mem(path, _len, ".idx")) + return NULL; + + /* +* ".pack" is long enough to hold any suffix we're adding (and +* the use xsnprintf double-checks that) +*/ + alloc = st_add3(path_len, strlen(".pack"), 1); + p = alloc_packed_git(alloc); + memcpy(p->pack_name, path, path_len); + + xsnprintf(p->pack_name + path_len, alloc - path_len, ".keep"); + if (!access(p->pack_name, F_OK)) + p->pack_keep = 1; + + xsnprintf(p->pack_name + path_len, alloc - path_len, ".pack"); + if (stat(p->pack_name, ) || !S_ISREG(st.st_mode)) { + free(p); + return NULL; + } + + /* ok, it looks sane as far as we can check without +* actually mapping the pack file. +*/ + p->pack_size = st.st_size; + p->pack_local = local; + p->mtime = st.st_mtime; + if (path_len < 40 || get_sha1_hex(path + path_len - 40, p->sha1)) + hashclr(p->sha1); + return p; +} diff --git a/sha1_file.c b/sha1_file.c index 12501ef06..7f12b1ee0 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -717,67 +717,6 @@ void *xmmap(void *start, size_t length, return ret; } -static struct packed_git *alloc_packed_git(int extra) -{ - struct packed_git *p = xmalloc(st_add(sizeof(*p), extra)); - memset(p, 0, sizeof(*p)); - p->pack_fd = -1; - return p; -} - -static void try_to_free_pack_memory(size_t size) -{ - release_pack_memory(size); -} - -struct packed_git *add_packed_git(const char *path, size_t path_len, int local) -{ - static int have_set_try_to_free_routine; - struct stat st; - size_t alloc; - struct packed_git *p; - - if (!have_set_try_to_free_routine) { - have_set_try_to_free_routine = 1; - set_try_to_free_routine(try_to_free_pack_memory); - } - - /* -* Make sure a corresponding .pack file exists and that -* the index looks sane. -*/ - if (!strip_suffix_mem(path, _len, ".idx")) - return NULL; - - /* -* ".pack" is long enough to hold any suffix we're adding (and -* the use xsnprintf double-checks that) -*/ - alloc = st_add3(path_len, strlen(".pack"), 1); - p = alloc_packed_git(alloc); - memcpy(p->pack_name, path, path_len); - - xsnprintf(p->pack_name + path_len, alloc - path_len, ".keep"); - if (!access(p->pack_name, F_OK)) - p->pack_keep = 1; - - xsnprintf(p->pack_name + path_len, alloc - path_len, ".pack"); - if (stat(p->pack_name, ) || !S_ISREG(st.st_mode)) { -
[PATCH v2 20/25] pack: move find_pack_entry_one(), is_pack_valid()
Signed-off-by: Jonathan Tan--- cache.h | 8 -- pack.h | 10 ++-- packfile.c | 85 + sha1_file.c | 84 4 files changed, 93 insertions(+), 94 deletions(-) diff --git a/cache.h b/cache.h index 7686ccb30..b944aca69 100644 --- a/cache.h +++ b/cache.h @@ -1618,14 +1618,6 @@ extern int odb_mkstemp(struct strbuf *template, const char *pattern); */ extern int odb_pack_keep(const char *name); -/* - * If the object named sha1 is present in the specified packfile, - * return its offset within the packfile; otherwise, return 0. - */ -extern off_t find_pack_entry_one(const unsigned char *sha1, struct packed_git *); - -extern int is_pack_valid(struct packed_git *); - /* * Iterate over the files in the loose-object parts of the object * directory "path", triggering the following callbacks: diff --git a/pack.h b/pack.h index e0e206e3c..f5bd94813 100644 --- a/pack.h +++ b/pack.h @@ -144,8 +144,6 @@ extern void close_pack_windows(struct packed_git *); extern void close_pack_index(struct packed_git *); extern void close_all_packs(void); -extern int open_packed_git(struct packed_git *p); - extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, unsigned long *); extern void unuse_pack(struct pack_window **); extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local); @@ -212,4 +210,12 @@ extern void check_pack_index_ptr(const struct packed_git *p, const void *ptr); */ extern off_t nth_packed_object_offset(const struct packed_git *, uint32_t n); +/* + * If the object named sha1 is present in the specified packfile, + * return its offset within the packfile; otherwise, return 0. + */ +extern off_t find_pack_entry_one(const unsigned char *sha1, struct packed_git *); + +extern int is_pack_valid(struct packed_git *); + #endif diff --git a/packfile.c b/packfile.c index 94c8af991..71017d2ec 100644 --- a/packfile.c +++ b/packfile.c @@ -6,6 +6,7 @@ #include "delta.h" #include "list.h" #include "streaming.h" +#include "sha1-lookup.h" char *odb_pack_name(struct strbuf *buf, const unsigned char *sha1, @@ -1698,3 +1699,87 @@ off_t nth_packed_object_offset(const struct packed_git *p, uint32_t n) ntohl(*((uint32_t *)(index + 4))); } } + +off_t find_pack_entry_one(const unsigned char *sha1, + struct packed_git *p) +{ + const uint32_t *level1_ofs = p->index_data; + const unsigned char *index = p->index_data; + unsigned hi, lo, stride; + static int use_lookup = -1; + static int debug_lookup = -1; + + if (debug_lookup < 0) + debug_lookup = !!getenv("GIT_DEBUG_LOOKUP"); + + if (!index) { + if (open_pack_index(p)) + return 0; + level1_ofs = p->index_data; + index = p->index_data; + } + if (p->index_version > 1) { + level1_ofs += 2; + index += 8; + } + index += 4 * 256; + hi = ntohl(level1_ofs[*sha1]); + lo = ((*sha1 == 0x0) ? 0 : ntohl(level1_ofs[*sha1 - 1])); + if (p->index_version > 1) { + stride = 20; + } else { + stride = 24; + index += 4; + } + + if (debug_lookup) + printf("%02x%02x%02x... lo %u hi %u nr %"PRIu32"\n", + sha1[0], sha1[1], sha1[2], lo, hi, p->num_objects); + + if (use_lookup < 0) + use_lookup = !!getenv("GIT_USE_LOOKUP"); + if (use_lookup) { + int pos = sha1_entry_pos(index, stride, 0, +lo, hi, p->num_objects, sha1); + if (pos < 0) + return 0; + return nth_packed_object_offset(p, pos); + } + + do { + unsigned mi = (lo + hi) / 2; + int cmp = hashcmp(index + mi * stride, sha1); + + if (debug_lookup) + printf("lo %u hi %u rg %u mi %u\n", + lo, hi, hi - lo, mi); + if (!cmp) + return nth_packed_object_offset(p, mi); + if (cmp > 0) + hi = mi; + else + lo = mi+1; + } while (lo < hi); + return 0; +} + +int is_pack_valid(struct packed_git *p) +{ + /* An already open pack is known to be valid. */ + if (p->pack_fd != -1) + return 1; + + /* If the pack has one window completely covering the +* file size, the pack is known to be valid even if +* the descriptor is not currently open. +*/ + if (p->windows) { + struct pack_window *w = p->windows; + + if (!w->offset && w->len ==
[PATCH v2 21/25] pack: move find_sha1_pack()
Signed-off-by: Jonathan Tan--- cache.h | 3 --- http-push.c | 1 + http-walker.c | 1 + pack.h| 3 +++ packfile.c| 13 + sha1_file.c | 13 - 6 files changed, 18 insertions(+), 16 deletions(-) diff --git a/cache.h b/cache.h index b944aca69..06a8caae6 100644 --- a/cache.h +++ b/cache.h @@ -1600,9 +1600,6 @@ struct pack_entry { struct packed_git *p; }; -extern struct packed_git *find_sha1_pack(const unsigned char *sha1, -struct packed_git *packs); - /* * Create a temporary file rooted in the object database directory, or * die on failure. The filename is taken from "pattern", which should have the diff --git a/http-push.c b/http-push.c index c91f40a61..4e8a227d1 100644 --- a/http-push.c +++ b/http-push.c @@ -11,6 +11,7 @@ #include "list-objects.h" #include "sigchain.h" #include "argv-array.h" +#include "pack.h" #ifdef EXPAT_NEEDS_XMLPARSE_H #include diff --git a/http-walker.c b/http-walker.c index ee049cb13..d6f0af944 100644 --- a/http-walker.c +++ b/http-walker.c @@ -4,6 +4,7 @@ #include "http.h" #include "list.h" #include "transport.h" +#include "pack.h" struct alt_base { char *base; diff --git a/pack.h b/pack.h index f5bd94813..0517d6542 100644 --- a/pack.h +++ b/pack.h @@ -218,4 +218,7 @@ extern off_t find_pack_entry_one(const unsigned char *sha1, struct packed_git *) extern int is_pack_valid(struct packed_git *); +extern struct packed_git *find_sha1_pack(const unsigned char *sha1, +struct packed_git *packs); + #endif diff --git a/packfile.c b/packfile.c index 71017d2ec..f16b56262 100644 --- a/packfile.c +++ b/packfile.c @@ -1783,3 +1783,16 @@ int is_pack_valid(struct packed_git *p) /* Force the pack to open to prove its valid. */ return !open_packed_git(p); } + +struct packed_git *find_sha1_pack(const unsigned char *sha1, + struct packed_git *packs) +{ + struct packed_git *p; + + for (p = packs; p; p = p->next) { + if (find_pack_entry_one(sha1, p)) + return p; + } + return NULL; + +} diff --git a/sha1_file.c b/sha1_file.c index 75b9ceb39..229358663 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -1126,19 +1126,6 @@ static int find_pack_entry(const unsigned char *sha1, struct pack_entry *e) return 0; } -struct packed_git *find_sha1_pack(const unsigned char *sha1, - struct packed_git *packs) -{ - struct packed_git *p; - - for (p = packs; p; p = p->next) { - if (find_pack_entry_one(sha1, p)) - return p; - } - return NULL; - -} - static int sha1_loose_object_info(const unsigned char *sha1, struct object_info *oi, int flags) -- 2.14.0.434.g98096fd7a8-goog
[PATCH v2 19/25] pack: move check_pack_index_ptr(), nth_packed_object_offset()
Signed-off-by: Jonathan Tan--- cache.h | 16 pack.h | 16 packfile.c | 33 + sha1_file.c | 33 - 4 files changed, 49 insertions(+), 49 deletions(-) diff --git a/cache.h b/cache.h index f083d532e..7686ccb30 100644 --- a/cache.h +++ b/cache.h @@ -1618,22 +1618,6 @@ extern int odb_mkstemp(struct strbuf *template, const char *pattern); */ extern int odb_pack_keep(const char *name); -/* - * Make sure that a pointer access into an mmap'd index file is within bounds, - * and can provide at least 8 bytes of data. - * - * Note that this is only necessary for variable-length segments of the file - * (like the 64-bit extended offset table), as we compare the size to the - * fixed-length parts when we open the file. - */ -extern void check_pack_index_ptr(const struct packed_git *p, const void *ptr); - -/* - * Return the offset of the nth object within the specified packfile. - * The index must already be opened. - */ -extern off_t nth_packed_object_offset(const struct packed_git *, uint32_t n); - /* * If the object named sha1 is present in the specified packfile, * return its offset within the packfile; otherwise, return 0. diff --git a/pack.h b/pack.h index 023c97b37..e0e206e3c 100644 --- a/pack.h +++ b/pack.h @@ -196,4 +196,20 @@ extern const unsigned char *nth_packed_object_sha1(struct packed_git *, uint32_t */ extern const struct object_id *nth_packed_object_oid(struct object_id *, struct packed_git *, uint32_t n); +/* + * Make sure that a pointer access into an mmap'd index file is within bounds, + * and can provide at least 8 bytes of data. + * + * Note that this is only necessary for variable-length segments of the file + * (like the 64-bit extended offset table), as we compare the size to the + * fixed-length parts when we open the file. + */ +extern void check_pack_index_ptr(const struct packed_git *p, const void *ptr); + +/* + * Return the offset of the nth object within the specified packfile. + * The index must already be opened. + */ +extern off_t nth_packed_object_offset(const struct packed_git *, uint32_t n); + #endif diff --git a/packfile.c b/packfile.c index b16cf648a..94c8af991 100644 --- a/packfile.c +++ b/packfile.c @@ -1665,3 +1665,36 @@ const struct object_id *nth_packed_object_oid(struct object_id *oid, hashcpy(oid->hash, hash); return oid; } + +void check_pack_index_ptr(const struct packed_git *p, const void *vptr) +{ + const unsigned char *ptr = vptr; + const unsigned char *start = p->index_data; + const unsigned char *end = start + p->index_size; + if (ptr < start) + die(_("offset before start of pack index for %s (corrupt index?)"), + p->pack_name); + /* No need to check for underflow; .idx files must be at least 8 bytes */ + if (ptr >= end - 8) + die(_("offset beyond end of pack index for %s (truncated index?)"), + p->pack_name); +} + +off_t nth_packed_object_offset(const struct packed_git *p, uint32_t n) +{ + const unsigned char *index = p->index_data; + index += 4 * 256; + if (p->index_version == 1) { + return ntohl(*((uint32_t *)(index + 24 * n))); + } else { + uint32_t off; + index += 8 + p->num_objects * (20 + 4); + off = ntohl(*((uint32_t *)(index + 4 * n))); + if (!(off & 0x8000)) + return off; + index += p->num_objects * 4 + (off & 0x7fff) * 8; + check_pack_index_ptr(p, index); + return (((uint64_t)ntohl(*((uint32_t *)(index + 0 << 32) | + ntohl(*((uint32_t *)(index + 4))); + } +} diff --git a/sha1_file.c b/sha1_file.c index 4cd2b1809..0f4d68c5a 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -1073,39 +1073,6 @@ int parse_sha1_header(const char *hdr, unsigned long *sizep) return parse_sha1_header_extended(hdr, , 0); } -void check_pack_index_ptr(const struct packed_git *p, const void *vptr) -{ - const unsigned char *ptr = vptr; - const unsigned char *start = p->index_data; - const unsigned char *end = start + p->index_size; - if (ptr < start) - die(_("offset before start of pack index for %s (corrupt index?)"), - p->pack_name); - /* No need to check for underflow; .idx files must be at least 8 bytes */ - if (ptr >= end - 8) - die(_("offset beyond end of pack index for %s (truncated index?)"), - p->pack_name); -} - -off_t nth_packed_object_offset(const struct packed_git *p, uint32_t n) -{ - const unsigned char *index = p->index_data; - index += 4 * 256; - if (p->index_version == 1) { - return ntohl(*((uint32_t *)(index + 24 * n))); - } else { -
[PATCH v2 16/25] sha1_file: remove read_packed_sha1()
Use read_object() in its place instead. This avoids duplication of code. This makes force_object_loose() slightly slower (because of a redundant check of loose object storage), but only in the error case. Signed-off-by: Jonathan Tan--- sha1_file.c | 26 +- 1 file changed, 1 insertion(+), 25 deletions(-) diff --git a/sha1_file.c b/sha1_file.c index 9eadda388..9e5444334 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -2091,30 +2091,6 @@ int sha1_object_info(const unsigned char *sha1, unsigned long *sizep) return type; } -static void *read_packed_sha1(const unsigned char *sha1, - enum object_type *type, unsigned long *size) -{ - struct pack_entry e; - void *data; - - if (!find_pack_entry(sha1, )) - return NULL; - data = cache_or_unpack_entry(e.p, e.offset, size, type); - if (!data) { - /* -* We're probably in deep shit, but let's try to fetch -* the required object anyway from another pack or loose. -* This should happen only in the presence of a corrupted -* pack, and is better than failing outright. -*/ - error("failed to read object %s at offset %"PRIuMAX" from %s", - sha1_to_hex(sha1), (uintmax_t)e.offset, e.p->pack_name); - mark_bad_packed_object(e.p, sha1); - data = read_object(sha1, type, size); - } - return data; -} - int pretend_sha1_file(void *buf, unsigned long len, enum object_type type, unsigned char *sha1) { @@ -2497,7 +2473,7 @@ int force_object_loose(const unsigned char *sha1, time_t mtime) if (has_loose_object(sha1)) return 0; - buf = read_packed_sha1(sha1, , ); + buf = read_object(sha1, , ); if (!buf) return error("cannot read sha1_file for %s", sha1_to_hex(sha1)); hdrlen = xsnprintf(hdr, sizeof(hdr), "%s %lu", typename(type), len) + 1; -- 2.14.0.434.g98096fd7a8-goog
[PATCH v2 18/25] pack: move nth_packed_object_{sha1,oid}
Signed-off-by: Jonathan Tan--- cache.h | 14 -- pack.h | 14 ++ packfile.c | 31 +++ sha1_file.c | 31 --- 4 files changed, 45 insertions(+), 45 deletions(-) diff --git a/cache.h b/cache.h index b14098bf1..f083d532e 100644 --- a/cache.h +++ b/cache.h @@ -1628,20 +1628,6 @@ extern int odb_pack_keep(const char *name); */ extern void check_pack_index_ptr(const struct packed_git *p, const void *ptr); -/* - * Return the SHA-1 of the nth object within the specified packfile. - * Open the index if it is not already open. The return value points - * at the SHA-1 within the mmapped index. Return NULL if there is an - * error. - */ -extern const unsigned char *nth_packed_object_sha1(struct packed_git *, uint32_t n); -/* - * Like nth_packed_object_sha1, but write the data into the object specified by - * the the first argument. Returns the first argument on success, and NULL on - * error. - */ -extern const struct object_id *nth_packed_object_oid(struct object_id *, struct packed_git *, uint32_t n); - /* * Return the offset of the nth object within the specified packfile. * The index must already be opened. diff --git a/pack.h b/pack.h index 2e6f357c3..023c97b37 100644 --- a/pack.h +++ b/pack.h @@ -182,4 +182,18 @@ extern void *unpack_entry(struct packed_git *, off_t, enum object_type *, unsign extern void mark_bad_packed_object(struct packed_git *p, const unsigned char *sha1); extern const struct packed_git *has_packed_and_bad(const unsigned char *sha1); +/* + * Return the SHA-1 of the nth object within the specified packfile. + * Open the index if it is not already open. The return value points + * at the SHA-1 within the mmapped index. Return NULL if there is an + * error. + */ +extern const unsigned char *nth_packed_object_sha1(struct packed_git *, uint32_t n); +/* + * Like nth_packed_object_sha1, but write the data into the object specified by + * the the first argument. Returns the first argument on success, and NULL on + * error. + */ +extern const struct object_id *nth_packed_object_oid(struct object_id *, struct packed_git *, uint32_t n); + #endif diff --git a/packfile.c b/packfile.c index a3745f9df..b16cf648a 100644 --- a/packfile.c +++ b/packfile.c @@ -1634,3 +1634,34 @@ void *unpack_entry(struct packed_git *p, off_t obj_offset, return data; } + +const unsigned char *nth_packed_object_sha1(struct packed_git *p, + uint32_t n) +{ + const unsigned char *index = p->index_data; + if (!index) { + if (open_pack_index(p)) + return NULL; + index = p->index_data; + } + if (n >= p->num_objects) + return NULL; + index += 4 * 256; + if (p->index_version == 1) { + return index + 24 * n + 4; + } else { + index += 8; + return index + 20 * n; + } +} + +const struct object_id *nth_packed_object_oid(struct object_id *oid, + struct packed_git *p, + uint32_t n) +{ + const unsigned char *hash = nth_packed_object_sha1(p, n); + if (!hash) + return NULL; + hashcpy(oid->hash, hash); + return oid; +} diff --git a/sha1_file.c b/sha1_file.c index fe7e0db76..4cd2b1809 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -1073,37 +1073,6 @@ int parse_sha1_header(const char *hdr, unsigned long *sizep) return parse_sha1_header_extended(hdr, , 0); } -const unsigned char *nth_packed_object_sha1(struct packed_git *p, - uint32_t n) -{ - const unsigned char *index = p->index_data; - if (!index) { - if (open_pack_index(p)) - return NULL; - index = p->index_data; - } - if (n >= p->num_objects) - return NULL; - index += 4 * 256; - if (p->index_version == 1) { - return index + 24 * n + 4; - } else { - index += 8; - return index + 20 * n; - } -} - -const struct object_id *nth_packed_object_oid(struct object_id *oid, - struct packed_git *p, - uint32_t n) -{ - const unsigned char *hash = nth_packed_object_sha1(p, n); - if (!hash) - return NULL; - hashcpy(oid->hash, hash); - return oid; -} - void check_pack_index_ptr(const struct packed_git *p, const void *vptr) { const unsigned char *ptr = vptr; -- 2.14.0.434.g98096fd7a8-goog
[PATCH v2 15/25] sha1_file: set whence in storage-specific info fn
Move the setting of oi->whence to sha1_loose_object_info() and packed_object_info(). This allows sha1_object_info_extended() to not need to know about the delta base cache. Signed-off-by: Jonathan Tan--- sha1_file.c | 13 ++--- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/sha1_file.c b/sha1_file.c index f3bcdae17..9eadda388 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -1473,6 +1473,9 @@ int packed_object_info(struct packed_git *p, off_t obj_offset, hashclr(oi->delta_base_sha1); } + oi->whence = in_delta_base_cache(p, obj_offset) ? OI_DBCACHED : + OI_PACKED; + out: unuse_pack(_curs); return type; @@ -2002,6 +2005,7 @@ static int sha1_loose_object_info(const unsigned char *sha1, if (oi->sizep == _scratch) oi->sizep = NULL; strbuf_release(); + oi->whence = OI_LOOSE; return (status < 0) ? status : 0; } @@ -2039,10 +2043,8 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi, if (!find_pack_entry(real, )) { /* Most likely it's a loose object. */ - if (!sha1_loose_object_info(real, oi, flags)) { - oi->whence = OI_LOOSE; + if (!sha1_loose_object_info(real, oi, flags)) return 0; - } /* Not a loose object; someone else may have just packed it. */ if (flags & OBJECT_INFO_QUICK) { @@ -2065,10 +2067,7 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi, if (rtype < 0) { mark_bad_packed_object(e.p, real); return sha1_object_info_extended(real, oi, 0); - } else if (in_delta_base_cache(e.p, e.offset)) { - oi->whence = OI_DBCACHED; - } else { - oi->whence = OI_PACKED; + } else if (oi->whence == OI_PACKED) { oi->u.packed.offset = e.offset; oi->u.packed.pack = e.p; oi->u.packed.is_delta = (rtype == OBJ_REF_DELTA || -- 2.14.0.434.g98096fd7a8-goog
[PATCH v2 17/25] pack: move packed_object_info(), unpack_entry()
Both sha1_file.c and packfile.c now need read_object(), so a copy of read_object() was created in packfile.c. This patch makes both mark_bad_packed_object() and has_packed_and_bad() global. Unlike most of the other patches in this series, these 2 functions need to remain global. Signed-off-by: Jonathan Tan--- cache.h | 7 - pack.h | 11 + packfile.c | 660 ++ sha1_file.c | 676 ++-- 4 files changed, 685 insertions(+), 669 deletions(-) diff --git a/cache.h b/cache.h index 1aea0..b14098bf1 100644 --- a/cache.h +++ b/cache.h @@ -1186,9 +1186,6 @@ extern void *map_sha1_file(const unsigned char *sha1, unsigned long *size); extern int unpack_sha1_header(git_zstream *stream, unsigned char *map, unsigned long mapsize, void *buffer, unsigned long bufsiz); extern int parse_sha1_header(const char *hdr, unsigned long *sizep); -/* global flag to enable extra checks when accessing packed objects */ -extern int do_check_packed_object_crc; - extern int check_sha1_signature(const unsigned char *sha1, void *buf, unsigned long size, const char *type); extern int finalize_object_file(const char *tmpfile, const char *filename); @@ -1621,8 +1618,6 @@ extern int odb_mkstemp(struct strbuf *template, const char *pattern); */ extern int odb_pack_keep(const char *name); -extern void clear_delta_base_cache(void); - /* * Make sure that a pointer access into an mmap'd index file is within bounds, * and can provide at least 8 bytes of data. @@ -1660,7 +1655,6 @@ extern off_t nth_packed_object_offset(const struct packed_git *, uint32_t n); extern off_t find_pack_entry_one(const unsigned char *sha1, struct packed_git *); extern int is_pack_valid(struct packed_git *); -extern void *unpack_entry(struct packed_git *, off_t, enum object_type *, unsigned long *); /* * Iterate over the files in the loose-object parts of the object @@ -1771,7 +1765,6 @@ struct object_info { /* Do not retry packed storage after checking packed and loose storage */ #define OBJECT_INFO_QUICK 8 extern int sha1_object_info_extended(const unsigned char *, struct object_info *, unsigned flags); -extern int packed_object_info(struct packed_git *pack, off_t offset, struct object_info *); /* Dumb servers support */ extern int update_server_info(int); diff --git a/pack.h b/pack.h index 5e3552392..2e6f357c3 100644 --- a/pack.h +++ b/pack.h @@ -171,4 +171,15 @@ extern unsigned long unpack_object_header_buffer(const unsigned char *buf, unsig extern unsigned long get_size_from_delta(struct packed_git *, struct pack_window **, off_t); extern int unpack_object_header(struct packed_git *, struct pack_window **, off_t *, unsigned long *); +extern void clear_delta_base_cache(void); + +/* global flag to enable extra checks when accessing packed objects */ +extern int do_check_packed_object_crc; + +extern int packed_object_info(struct packed_git *pack, off_t offset, struct object_info *); +extern void *unpack_entry(struct packed_git *, off_t, enum object_type *, unsigned long *); + +extern void mark_bad_packed_object(struct packed_git *p, const unsigned char *sha1); +extern const struct packed_git *has_packed_and_bad(const unsigned char *sha1); + #endif diff --git a/packfile.c b/packfile.c index a4db78ea0..a3745f9df 100644 --- a/packfile.c +++ b/packfile.c @@ -4,6 +4,8 @@ #include "dir.h" #include "mergesort.h" #include "delta.h" +#include "list.h" +#include "streaming.h" char *odb_pack_name(struct strbuf *buf, const unsigned char *sha1, @@ -974,3 +976,661 @@ int unpack_object_header(struct packed_git *p, return type; } + +void mark_bad_packed_object(struct packed_git *p, const unsigned char *sha1) +{ + unsigned i; + for (i = 0; i < p->num_bad_objects; i++) + if (!hashcmp(sha1, p->bad_object_sha1 + GIT_SHA1_RAWSZ * i)) + return; + p->bad_object_sha1 = xrealloc(p->bad_object_sha1, + st_mult(GIT_MAX_RAWSZ, + st_add(p->num_bad_objects, 1))); + hashcpy(p->bad_object_sha1 + GIT_SHA1_RAWSZ * p->num_bad_objects, sha1); + p->num_bad_objects++; +} + +const struct packed_git *has_packed_and_bad(const unsigned char *sha1) +{ + struct packed_git *p; + unsigned i; + + for (p = packed_git; p; p = p->next) + for (i = 0; i < p->num_bad_objects; i++) + if (!hashcmp(sha1, p->bad_object_sha1 + 20 * i)) + return p; + return NULL; +} + +static off_t get_delta_base(struct packed_git *p, + struct pack_window **w_curs, + off_t *curpos, + enum object_type type, + off_t delta_obj_offset) +{ +
[PATCH v2 23/25] pack: move has_sha1_pack()
Signed-off-by: Jonathan Tan--- builtin/prune-packed.c | 1 + cache.h| 2 -- diff.c | 1 + pack.h | 2 ++ packfile.c | 6 ++ revision.c | 1 + sha1_file.c| 6 -- 7 files changed, 11 insertions(+), 8 deletions(-) diff --git a/builtin/prune-packed.c b/builtin/prune-packed.c index ac978ad40..79130aa2e 100644 --- a/builtin/prune-packed.c +++ b/builtin/prune-packed.c @@ -2,6 +2,7 @@ #include "cache.h" #include "progress.h" #include "parse-options.h" +#include "pack.h" static const char * const prune_packed_usage[] = { N_("git prune-packed [-n | --dry-run] [-q | --quiet]"), diff --git a/cache.h b/cache.h index 06a8caae6..d96d36d50 100644 --- a/cache.h +++ b/cache.h @@ -1190,8 +1190,6 @@ extern int check_sha1_signature(const unsigned char *sha1, void *buf, unsigned l extern int finalize_object_file(const char *tmpfile, const char *filename); -extern int has_sha1_pack(const unsigned char *sha1); - /* * Open the loose object at path, check its sha1, and return the contents, * type, and size. If the object is a blob, then "contents" may return NULL, diff --git a/diff.c b/diff.c index 85e714f6c..6bbc46326 100644 --- a/diff.c +++ b/diff.c @@ -20,6 +20,7 @@ #include "string-list.h" #include "argv-array.h" #include "graph.h" +#include "pack.h" #ifdef NO_FAST_WORKING_DIRECTORY #define FAST_WORKING_DIRECTORY 0 diff --git a/pack.h b/pack.h index 1021a781c..ce0e15deb 100644 --- a/pack.h +++ b/pack.h @@ -223,4 +223,6 @@ extern struct packed_git *find_sha1_pack(const unsigned char *sha1, extern int find_pack_entry(const unsigned char *sha1, struct pack_entry *e); +extern int has_sha1_pack(const unsigned char *sha1); + #endif diff --git a/packfile.c b/packfile.c index 0f1e3338b..507f65236 100644 --- a/packfile.c +++ b/packfile.c @@ -1849,3 +1849,9 @@ int find_pack_entry(const unsigned char *sha1, struct pack_entry *e) } return 0; } + +int has_sha1_pack(const unsigned char *sha1) +{ + struct pack_entry e; + return find_pack_entry(sha1, ); +} diff --git a/revision.c b/revision.c index 6603af944..2868c4fc8 100644 --- a/revision.c +++ b/revision.c @@ -19,6 +19,7 @@ #include "dir.h" #include "cache-tree.h" #include "bisect.h" +#include "pack.h" volatile show_early_output_fn_t show_early_output; diff --git a/sha1_file.c b/sha1_file.c index 1a505eae5..2610ea057 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -1629,12 +1629,6 @@ int has_pack_index(const unsigned char *sha1) return 1; } -int has_sha1_pack(const unsigned char *sha1) -{ - struct pack_entry e; - return find_pack_entry(sha1, ); -} - int has_sha1_file_with_flags(const unsigned char *sha1, int flags) { if (!startup_info->have_repository) -- 2.14.0.434.g98096fd7a8-goog
[PATCH v2 24/25] pack: move has_pack_index()
Signed-off-by: Jonathan Tan--- cache.h | 2 -- pack.h | 2 ++ packfile.c | 8 sha1_file.c | 8 4 files changed, 10 insertions(+), 10 deletions(-) diff --git a/cache.h b/cache.h index d96d36d50..656b39d51 100644 --- a/cache.h +++ b/cache.h @@ -1225,8 +1225,6 @@ extern int has_object_file_with_flags(const struct object_id *oid, int flags); */ extern int has_loose_object_nonlocal(const unsigned char *sha1); -extern int has_pack_index(const unsigned char *sha1); - extern void assert_sha1_type(const unsigned char *sha1, enum object_type expect); /* Helper to check and "touch" a file */ diff --git a/pack.h b/pack.h index ce0e15deb..2c2a347ba 100644 --- a/pack.h +++ b/pack.h @@ -225,4 +225,6 @@ extern int find_pack_entry(const unsigned char *sha1, struct pack_entry *e); extern int has_sha1_pack(const unsigned char *sha1); +extern int has_pack_index(const unsigned char *sha1); + #endif diff --git a/packfile.c b/packfile.c index 507f65236..28a16206c 100644 --- a/packfile.c +++ b/packfile.c @@ -1855,3 +1855,11 @@ int has_sha1_pack(const unsigned char *sha1) struct pack_entry e; return find_pack_entry(sha1, ); } + +int has_pack_index(const unsigned char *sha1) +{ + struct stat st; + if (stat(sha1_pack_index_name(sha1), )) + return 0; + return 1; +} diff --git a/sha1_file.c b/sha1_file.c index 2610ea057..8584f6cf2 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -1621,14 +1621,6 @@ int force_object_loose(const unsigned char *sha1, time_t mtime) return ret; } -int has_pack_index(const unsigned char *sha1) -{ - struct stat st; - if (stat(sha1_pack_index_name(sha1), )) - return 0; - return 1; -} - int has_sha1_file_with_flags(const unsigned char *sha1, int flags) { if (!startup_info->have_repository) -- 2.14.0.434.g98096fd7a8-goog
[PATCH v2 25/25] pack: move for_each_packed_object()
Signed-off-by: Jonathan Tan--- builtin/cat-file.c | 1 + cache.h| 7 +-- pack.h | 11 +++ packfile.c | 40 reachable.c| 1 + sha1_file.c| 40 6 files changed, 54 insertions(+), 46 deletions(-) diff --git a/builtin/cat-file.c b/builtin/cat-file.c index 96b786e48..316ef5c98 100644 --- a/builtin/cat-file.c +++ b/builtin/cat-file.c @@ -12,6 +12,7 @@ #include "streaming.h" #include "tree-walk.h" #include "sha1-array.h" +#include "pack.h" struct batch_options { int enabled; diff --git a/cache.h b/cache.h index 656b39d51..6c3822783 100644 --- a/cache.h +++ b/cache.h @@ -1660,17 +1660,12 @@ int for_each_loose_file_in_objdir_buf(struct strbuf *path, void *data); /* - * Iterate over loose and packed objects in both the local + * Iterate over loose objects in both the local * repository and any alternates repositories (unless the * LOCAL_ONLY flag is set). */ #define FOR_EACH_OBJECT_LOCAL_ONLY 0x1 -typedef int each_packed_object_fn(const struct object_id *oid, - struct packed_git *pack, - uint32_t pos, - void *data); extern int for_each_loose_object(each_loose_object_fn, void *, unsigned flags); -extern int for_each_packed_object(each_packed_object_fn, void *, unsigned flags); struct object_info { /* Request */ diff --git a/pack.h b/pack.h index 2c2a347ba..905b05be5 100644 --- a/pack.h +++ b/pack.h @@ -227,4 +227,15 @@ extern int has_sha1_pack(const unsigned char *sha1); extern int has_pack_index(const unsigned char *sha1); +/* + * Iterate over packed objects in both the local + * repository and any alternates repositories (unless the + * FOR_EACH_OBJECT_LOCAL_ONLY flag, defined in cache.h, is set). + */ +typedef int each_packed_object_fn(const struct object_id *oid, + struct packed_git *pack, + uint32_t pos, + void *data); +extern int for_each_packed_object(each_packed_object_fn, void *, unsigned flags); + #endif diff --git a/packfile.c b/packfile.c index 28a16206c..031a40828 100644 --- a/packfile.c +++ b/packfile.c @@ -1863,3 +1863,43 @@ int has_pack_index(const unsigned char *sha1) return 0; return 1; } + +static int for_each_object_in_pack(struct packed_git *p, each_packed_object_fn cb, void *data) +{ + uint32_t i; + int r = 0; + + for (i = 0; i < p->num_objects; i++) { + struct object_id oid; + + if (!nth_packed_object_oid(, p, i)) + return error("unable to get sha1 of object %u in %s", +i, p->pack_name); + + r = cb(, p, i, data); + if (r) + break; + } + return r; +} + +int for_each_packed_object(each_packed_object_fn cb, void *data, unsigned flags) +{ + struct packed_git *p; + int r = 0; + int pack_errors = 0; + + prepare_packed_git(); + for (p = packed_git; p; p = p->next) { + if ((flags & FOR_EACH_OBJECT_LOCAL_ONLY) && !p->pack_local) + continue; + if (open_pack_index(p)) { + pack_errors = 1; + continue; + } + r = for_each_object_in_pack(p, cb, data); + if (r) + break; + } + return r ? r : pack_errors; +} diff --git a/reachable.c b/reachable.c index c62efbfd4..ef606ae17 100644 --- a/reachable.c +++ b/reachable.c @@ -9,6 +9,7 @@ #include "cache-tree.h" #include "progress.h" #include "list-objects.h" +#include "pack.h" struct connectivity_progress { struct progress *progress; diff --git a/sha1_file.c b/sha1_file.c index 8584f6cf2..3f3f9174f 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -2013,46 +2013,6 @@ int for_each_loose_object(each_loose_object_fn cb, void *data, unsigned flags) return foreach_alt_odb(loose_from_alt_odb, ); } -static int for_each_object_in_pack(struct packed_git *p, each_packed_object_fn cb, void *data) -{ - uint32_t i; - int r = 0; - - for (i = 0; i < p->num_objects; i++) { - struct object_id oid; - - if (!nth_packed_object_oid(, p, i)) - return error("unable to get sha1 of object %u in %s", -i, p->pack_name); - - r = cb(, p, i, data); - if (r) - break; - } - return r; -} - -int for_each_packed_object(each_packed_object_fn cb, void *data, unsigned flags) -{ - struct packed_git *p; - int r = 0; - int pack_errors = 0; - - prepare_packed_git(); -
[PATCH v2 22/25] pack: move find_pack_entry() and make it global
This function needs to be global as it is used by sha1_file.c and will be used by packfile.c. Signed-off-by: Jonathan Tan--- pack.h | 2 ++ packfile.c | 53 + sha1_file.c | 53 - 3 files changed, 55 insertions(+), 53 deletions(-) diff --git a/pack.h b/pack.h index 0517d6542..1021a781c 100644 --- a/pack.h +++ b/pack.h @@ -221,4 +221,6 @@ extern int is_pack_valid(struct packed_git *); extern struct packed_git *find_sha1_pack(const unsigned char *sha1, struct packed_git *packs); +extern int find_pack_entry(const unsigned char *sha1, struct pack_entry *e); + #endif diff --git a/packfile.c b/packfile.c index f16b56262..0f1e3338b 100644 --- a/packfile.c +++ b/packfile.c @@ -1796,3 +1796,56 @@ struct packed_git *find_sha1_pack(const unsigned char *sha1, return NULL; } + +static int fill_pack_entry(const unsigned char *sha1, + struct pack_entry *e, + struct packed_git *p) +{ + off_t offset; + + if (p->num_bad_objects) { + unsigned i; + for (i = 0; i < p->num_bad_objects; i++) + if (!hashcmp(sha1, p->bad_object_sha1 + 20 * i)) + return 0; + } + + offset = find_pack_entry_one(sha1, p); + if (!offset) + return 0; + + /* +* We are about to tell the caller where they can locate the +* requested object. We better make sure the packfile is +* still here and can be accessed before supplying that +* answer, as it may have been deleted since the index was +* loaded! +*/ + if (!is_pack_valid(p)) + return 0; + e->offset = offset; + e->p = p; + hashcpy(e->sha1, sha1); + return 1; +} + +/* + * Iff a pack file contains the object named by sha1, return true and + * store its location to e. + */ +int find_pack_entry(const unsigned char *sha1, struct pack_entry *e) +{ + struct mru_entry *p; + + prepare_packed_git(); + if (!packed_git) + return 0; + + for (p = packed_git_mru->head; p; p = p->next) { + if (fill_pack_entry(sha1, e, p->item)) { + mru_mark(packed_git_mru, p); + return 1; + } + } + return 0; +} diff --git a/sha1_file.c b/sha1_file.c index 229358663..1a505eae5 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -1073,59 +1073,6 @@ int parse_sha1_header(const char *hdr, unsigned long *sizep) return parse_sha1_header_extended(hdr, , 0); } -static int fill_pack_entry(const unsigned char *sha1, - struct pack_entry *e, - struct packed_git *p) -{ - off_t offset; - - if (p->num_bad_objects) { - unsigned i; - for (i = 0; i < p->num_bad_objects; i++) - if (!hashcmp(sha1, p->bad_object_sha1 + 20 * i)) - return 0; - } - - offset = find_pack_entry_one(sha1, p); - if (!offset) - return 0; - - /* -* We are about to tell the caller where they can locate the -* requested object. We better make sure the packfile is -* still here and can be accessed before supplying that -* answer, as it may have been deleted since the index was -* loaded! -*/ - if (!is_pack_valid(p)) - return 0; - e->offset = offset; - e->p = p; - hashcpy(e->sha1, sha1); - return 1; -} - -/* - * Iff a pack file contains the object named by sha1, return true and - * store its location to e. - */ -static int find_pack_entry(const unsigned char *sha1, struct pack_entry *e) -{ - struct mru_entry *p; - - prepare_packed_git(); - if (!packed_git) - return 0; - - for (p = packed_git_mru->head; p; p = p->next) { - if (fill_pack_entry(sha1, e, p->item)) { - mru_mark(packed_git_mru, p); - return 1; - } - } - return 0; -} - static int sha1_loose_object_info(const unsigned char *sha1, struct object_info *oi, int flags) -- 2.14.0.434.g98096fd7a8-goog
[PATCH v2 01/25] pack: move pack name-related functions
Currently, sha1_file.c and cache.h contain many functions, both related to and unrelated to packfiles. This makes both files very large and causes an unclear separation of concerns. Create a new file, packfile.c, to hold all packfile-related functions currently in sha1_file.c, and designate pack.h to hold these packfile-related functions. In this commit, the pack name-related functions are moved. Subsequent commits will move the other functions. Signed-off-by: Jonathan Tan--- Makefile | 1 + builtin/pack-redundant.c | 1 + cache.h | 23 --- pack.h | 23 +++ packfile.c | 23 +++ sha1_file.c | 22 -- 6 files changed, 48 insertions(+), 45 deletions(-) create mode 100644 packfile.c diff --git a/Makefile b/Makefile index 461c845d3..5cdecaa17 100644 --- a/Makefile +++ b/Makefile @@ -816,6 +816,7 @@ LIB_OBJS += notes-merge.o LIB_OBJS += notes-utils.o LIB_OBJS += object.o LIB_OBJS += oidset.o +LIB_OBJS += packfile.o LIB_OBJS += pack-bitmap.o LIB_OBJS += pack-bitmap-write.o LIB_OBJS += pack-check.o diff --git a/builtin/pack-redundant.c b/builtin/pack-redundant.c index cb1df1c76..df36d10e7 100644 --- a/builtin/pack-redundant.c +++ b/builtin/pack-redundant.c @@ -7,6 +7,7 @@ */ #include "builtin.h" +#include "pack.h" #define BLKSIZE 512 diff --git a/cache.h b/cache.h index 71fe09264..1f0f47819 100644 --- a/cache.h +++ b/cache.h @@ -902,20 +902,6 @@ extern void check_repository_format(void); */ extern const char *sha1_file_name(const unsigned char *sha1); -/* - * Return the name of the (local) packfile with the specified sha1 in - * its name. The return value is a pointer to memory that is - * overwritten each time this function is called. - */ -extern char *sha1_pack_name(const unsigned char *sha1); - -/* - * Return the name of the (local) pack index file with the specified - * sha1 in its name. The return value is a pointer to memory that is - * overwritten each time this function is called. - */ -extern char *sha1_pack_index_name(const unsigned char *sha1); - /* * Return an abbreviated sha1 unique within this repository's object database. * The result will be at least `len` characters long, and will be NUL @@ -1648,15 +1634,6 @@ extern void pack_report(void); */ extern int odb_mkstemp(struct strbuf *template, const char *pattern); -/* - * Generate the filename to be used for a pack file with checksum "sha1" and - * extension "ext". The result is written into the strbuf "buf", overwriting - * any existing contents. A pointer to buf->buf is returned as a convenience. - * - * Example: odb_pack_name(out, sha1, "idx") => ".git/objects/pack/pack-1234..idx" - */ -extern char *odb_pack_name(struct strbuf *buf, const unsigned char *sha1, const char *ext); - /* * Create a pack .keep file named "name" (which should generally be the output * of odb_pack_name). Returns a file descriptor opened for writing, or -1 on diff --git a/pack.h b/pack.h index 8294341af..63bfde00c 100644 --- a/pack.h +++ b/pack.h @@ -101,4 +101,27 @@ extern int read_pack_header(int fd, struct pack_header *); extern struct sha1file *create_tmp_packfile(char **pack_tmp_name); extern void finish_tmp_packfile(struct strbuf *name_buffer, const char *pack_tmp_name, struct pack_idx_entry **written_list, uint32_t nr_written, struct pack_idx_option *pack_idx_opts, unsigned char sha1[]); +/* + * Generate the filename to be used for a pack file with checksum "sha1" and + * extension "ext". The result is written into the strbuf "buf", overwriting + * any existing contents. A pointer to buf->buf is returned as a convenience. + * + * Example: odb_pack_name(out, sha1, "idx") => ".git/objects/pack/pack-1234..idx" + */ +extern char *odb_pack_name(struct strbuf *buf, const unsigned char *sha1, const char *ext); + +/* + * Return the name of the (local) packfile with the specified sha1 in + * its name. The return value is a pointer to memory that is + * overwritten each time this function is called. + */ +extern char *sha1_pack_name(const unsigned char *sha1); + +/* + * Return the name of the (local) pack index file with the specified + * sha1 in its name. The return value is a pointer to memory that is + * overwritten each time this function is called. + */ +extern char *sha1_pack_index_name(const unsigned char *sha1); + #endif diff --git a/packfile.c b/packfile.c new file mode 100644 index 0..0d191dfd6 --- /dev/null +++ b/packfile.c @@ -0,0 +1,23 @@ +#include "cache.h" + +char *odb_pack_name(struct strbuf *buf, + const unsigned char *sha1, + const char *ext) +{ + strbuf_reset(buf); + strbuf_addf(buf, "%s/pack/pack-%s.%s", get_object_directory(), + sha1_to_hex(sha1), ext); + return buf->buf; +} + +char *sha1_pack_name(const unsigned char *sha1) +{ +
[PATCH v2 02/25] pack: move static state variables
sha1_file.c declares some static variables that store packfile-related state. Move them to packfile.c. They are temporarily made global, but subsequent commits will restore their scope back to static. Signed-off-by: Jonathan Tan--- pack.h | 9 + packfile.c | 14 ++ sha1_file.c | 13 - 3 files changed, 23 insertions(+), 13 deletions(-) diff --git a/pack.h b/pack.h index 63bfde00c..7fcd45f7b 100644 --- a/pack.h +++ b/pack.h @@ -124,4 +124,13 @@ extern char *sha1_pack_name(const unsigned char *sha1); */ extern char *sha1_pack_index_name(const unsigned char *sha1); +extern unsigned int pack_used_ctr; +extern unsigned int pack_mmap_calls; +extern unsigned int peak_pack_open_windows; +extern unsigned int pack_open_windows; +extern unsigned int pack_open_fds; +extern unsigned int pack_max_fds; +extern size_t peak_pack_mapped; +extern size_t pack_mapped; + #endif diff --git a/packfile.c b/packfile.c index 0d191dfd6..0f46e0617 100644 --- a/packfile.c +++ b/packfile.c @@ -1,4 +1,5 @@ #include "cache.h" +#include "mru.h" char *odb_pack_name(struct strbuf *buf, const unsigned char *sha1, @@ -21,3 +22,16 @@ char *sha1_pack_index_name(const unsigned char *sha1) static struct strbuf buf = STRBUF_INIT; return odb_pack_name(, sha1, "idx"); } + +unsigned int pack_used_ctr; +unsigned int pack_mmap_calls; +unsigned int peak_pack_open_windows; +unsigned int pack_open_windows; +unsigned int pack_open_fds; +unsigned int pack_max_fds; +size_t peak_pack_mapped; +size_t pack_mapped; +struct packed_git *packed_git; + +static struct mru packed_git_mru_storage; +struct mru *packed_git_mru = _git_mru_storage; diff --git a/sha1_file.c b/sha1_file.c index 7e511ce9e..4d95e21eb 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -682,19 +682,6 @@ static int has_loose_object(const unsigned char *sha1) return check_and_freshen(sha1, 0); } -static unsigned int pack_used_ctr; -static unsigned int pack_mmap_calls; -static unsigned int peak_pack_open_windows; -static unsigned int pack_open_windows; -static unsigned int pack_open_fds; -static unsigned int pack_max_fds; -static size_t peak_pack_mapped; -static size_t pack_mapped; -struct packed_git *packed_git; - -static struct mru packed_git_mru_storage; -struct mru *packed_git_mru = _git_mru_storage; - void pack_report(void) { fprintf(stderr, -- 2.14.0.434.g98096fd7a8-goog
[PATCH v2 00/25] Move exported packfile funcs to its own file
Here is the complete patch set. I have only moved the exported functions that operate with packfiles and their static helpers - for example, static functions like freshen_packed_object() that are used only by non-pack-specific functions are not moved. In the end, 3 functions needed to be made global. They are find_pack_entry(), mark_bad_packed_object(), and has_packed_and_bad(). Of the 3, find_pack_entry() is probably legitimately promoted. But I think that the latter two functions needing to be accessed from sha1_file.c points to a design that could be improved - they are only used when packed_object_info() detects corruption, and used for marking as bad and printing messages to the user respectively, which packed_object_info() should probably do itself. But I have not made this change in this patch set. (Other than the 3 functions above, there are some variables and functions that are temporarily made global, but reduced back to static when the wide scope is no longer needed.) Jonathan Tan (25): pack: move pack name-related functions pack: move static state variables pack: move pack_report() pack: move open_pack_index(), parse_pack_index() pack: move release_pack_memory() pack: move pack-closing functions pack: move use_pack() pack: move unuse_pack() pack: move add_packed_git() pack: move install_packed_git() pack: move {,re}prepare_packed_git and approximate_object_count pack: move unpack_object_header() pack: move get_size_from_delta() pack: move unpack_object_header() sha1_file: set whence in storage-specific info fn sha1_file: remove read_packed_sha1() pack: move packed_object_info(), unpack_entry() pack: move nth_packed_object_{sha1,oid} pack: move check_pack_index_ptr(), nth_packed_object_offset() pack: move find_pack_entry_one(), is_pack_valid() pack: move find_sha1_pack() pack: move find_pack_entry() and make it global pack: move has_sha1_pack() pack: move has_pack_index() pack: move for_each_packed_object() Makefile |1 + builtin/am.c |1 + builtin/cat-file.c |1 + builtin/clone.c |1 + builtin/count-objects.c |1 + builtin/fetch.c |1 + builtin/gc.c |1 + builtin/merge.c |1 + builtin/pack-redundant.c |1 + builtin/prune-packed.c |1 + cache.h | 122 +-- connected.c |1 + diff.c |1 + git-compat-util.h|2 - http-backend.c |1 + http-push.c |1 + http-walker.c|1 + pack.h | 137 +++ packfile.c | 1905 +++ path.c |1 + reachable.c |1 + revision.c |1 + server-info.c|1 + sha1_file.c | 2484 ++ sha1_name.c |1 + streaming.c |1 + 26 files changed, 2350 insertions(+), 2321 deletions(-) create mode 100644 packfile.c -- 2.14.0.434.g98096fd7a8-goog
[PATCH v2 08/25] pack: move unuse_pack()
Signed-off-by: Jonathan Tan--- cache.h | 1 - pack.h | 1 + packfile.c | 9 + sha1_file.c | 9 - 4 files changed, 10 insertions(+), 10 deletions(-) diff --git a/cache.h b/cache.h index dd9f9a9ae..4812f3a63 100644 --- a/cache.h +++ b/cache.h @@ -1637,7 +1637,6 @@ extern int odb_mkstemp(struct strbuf *template, const char *pattern); */ extern int odb_pack_keep(const char *name); -extern void unuse_pack(struct pack_window **); extern void clear_delta_base_cache(void); extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local); diff --git a/pack.h b/pack.h index bf2b99bf9..3876e9ae6 100644 --- a/pack.h +++ b/pack.h @@ -149,5 +149,6 @@ extern void close_all_packs(void); extern int open_packed_git(struct packed_git *p); extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, unsigned long *); +extern void unuse_pack(struct pack_window **); #endif diff --git a/packfile.c b/packfile.c index 85cb65558..93526ea7b 100644 --- a/packfile.c +++ b/packfile.c @@ -596,3 +596,12 @@ unsigned char *use_pack(struct packed_git *p, *left = win->len - xsize_t(offset); return win->base + offset; } + +void unuse_pack(struct pack_window **w_cursor) +{ + struct pack_window *w = *w_cursor; + if (w) { + w->inuse_cnt--; + *w_cursor = NULL; + } +} diff --git a/sha1_file.c b/sha1_file.c index 8f17a07e9..12501ef06 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -717,15 +717,6 @@ void *xmmap(void *start, size_t length, return ret; } -void unuse_pack(struct pack_window **w_cursor) -{ - struct pack_window *w = *w_cursor; - if (w) { - w->inuse_cnt--; - *w_cursor = NULL; - } -} - static struct packed_git *alloc_packed_git(int extra) { struct packed_git *p = xmalloc(st_add(sizeof(*p), extra)); -- 2.14.0.434.g98096fd7a8-goog
[PATCH v2 14/25] pack: move unpack_object_header()
Signed-off-by: Jonathan Tan--- cache.h | 1 - pack.h | 1 + packfile.c | 26 ++ sha1_file.c | 26 -- 4 files changed, 27 insertions(+), 27 deletions(-) diff --git a/cache.h b/cache.h index e29918c75..1aea0 100644 --- a/cache.h +++ b/cache.h @@ -1661,7 +1661,6 @@ extern off_t find_pack_entry_one(const unsigned char *sha1, struct packed_git *) extern int is_pack_valid(struct packed_git *); extern void *unpack_entry(struct packed_git *, off_t, enum object_type *, unsigned long *); -extern int unpack_object_header(struct packed_git *, struct pack_window **, off_t *, unsigned long *); /* * Iterate over the files in the loose-object parts of the object diff --git a/pack.h b/pack.h index 69c92d8d2..5e3552392 100644 --- a/pack.h +++ b/pack.h @@ -169,5 +169,6 @@ unsigned long approximate_object_count(void); extern unsigned long unpack_object_header_buffer(const unsigned char *buf, unsigned long len, enum object_type *type, unsigned long *sizep); extern unsigned long get_size_from_delta(struct packed_git *, struct pack_window **, off_t); +extern int unpack_object_header(struct packed_git *, struct pack_window **, off_t *, unsigned long *); #endif diff --git a/packfile.c b/packfile.c index 511afad85..a4db78ea0 100644 --- a/packfile.c +++ b/packfile.c @@ -948,3 +948,29 @@ unsigned long get_size_from_delta(struct packed_git *p, /* Read the result size */ return get_delta_hdr_size(, delta_head+sizeof(delta_head)); } + +int unpack_object_header(struct packed_git *p, +struct pack_window **w_curs, +off_t *curpos, +unsigned long *sizep) +{ + unsigned char *base; + unsigned long left; + unsigned long used; + enum object_type type; + + /* use_pack() assures us we have [base, base + 20) available +* as a range that we can look at. (Its actually the hash +* size that is assured.) With our object header encoding +* the maximum deflated object size is 2^137, which is just +* insane, so we know won't exceed what we have been given. +*/ + base = use_pack(p, w_curs, *curpos, ); + used = unpack_object_header_buffer(base, left, , sizep); + if (!used) { + type = OBJ_BAD; + } else + *curpos += used; + + return type; +} diff --git a/sha1_file.c b/sha1_file.c index 7d354d9b6..f3bcdae17 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -1170,32 +1170,6 @@ static const unsigned char *get_delta_base_sha1(struct packed_git *p, return NULL; } -int unpack_object_header(struct packed_git *p, -struct pack_window **w_curs, -off_t *curpos, -unsigned long *sizep) -{ - unsigned char *base; - unsigned long left; - unsigned long used; - enum object_type type; - - /* use_pack() assures us we have [base, base + 20) available -* as a range that we can look at. (Its actually the hash -* size that is assured.) With our object header encoding -* the maximum deflated object size is 2^137, which is just -* insane, so we know won't exceed what we have been given. -*/ - base = use_pack(p, w_curs, *curpos, ); - used = unpack_object_header_buffer(base, left, , sizep); - if (!used) { - type = OBJ_BAD; - } else - *curpos += used; - - return type; -} - static int retry_bad_packed_offset(struct packed_git *p, off_t obj_offset) { int type; -- 2.14.0.434.g98096fd7a8-goog
[PATCH v2 03/25] pack: move pack_report()
Signed-off-by: Jonathan Tan--- cache.h | 2 -- pack.h | 2 ++ packfile.c | 24 sha1_file.c | 24 4 files changed, 26 insertions(+), 26 deletions(-) diff --git a/cache.h b/cache.h index 1f0f47819..c7f802e4a 100644 --- a/cache.h +++ b/cache.h @@ -1624,8 +1624,6 @@ unsigned long approximate_object_count(void); extern struct packed_git *find_sha1_pack(const unsigned char *sha1, struct packed_git *packs); -extern void pack_report(void); - /* * Create a temporary file rooted in the object database directory, or * die on failure. The filename is taken from "pattern", which should have the diff --git a/pack.h b/pack.h index 7fcd45f7b..6098bfe40 100644 --- a/pack.h +++ b/pack.h @@ -133,4 +133,6 @@ extern unsigned int pack_max_fds; extern size_t peak_pack_mapped; extern size_t pack_mapped; +extern void pack_report(void); + #endif diff --git a/packfile.c b/packfile.c index 0f46e0617..60d9fc3b0 100644 --- a/packfile.c +++ b/packfile.c @@ -35,3 +35,27 @@ struct packed_git *packed_git; static struct mru packed_git_mru_storage; struct mru *packed_git_mru = _git_mru_storage; + +#define SZ_FMT PRIuMAX +static inline uintmax_t sz_fmt(size_t s) { return s; } + +void pack_report(void) +{ + fprintf(stderr, + "pack_report: getpagesize()= %10" SZ_FMT "\n" + "pack_report: core.packedGitWindowSize = %10" SZ_FMT "\n" + "pack_report: core.packedGitLimit = %10" SZ_FMT "\n", + sz_fmt(getpagesize()), + sz_fmt(packed_git_window_size), + sz_fmt(packed_git_limit)); + fprintf(stderr, + "pack_report: pack_used_ctr= %10u\n" + "pack_report: pack_mmap_calls = %10u\n" + "pack_report: pack_open_windows= %10u / %10u\n" + "pack_report: pack_mapped = " + "%10" SZ_FMT " / %10" SZ_FMT "\n", + pack_used_ctr, + pack_mmap_calls, + pack_open_windows, peak_pack_open_windows, + sz_fmt(pack_mapped), sz_fmt(peak_pack_mapped)); +} diff --git a/sha1_file.c b/sha1_file.c index 4d95e21eb..0de39f480 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -29,9 +29,6 @@ #include "mergesort.h" #include "quote.h" -#define SZ_FMT PRIuMAX -static inline uintmax_t sz_fmt(size_t s) { return s; } - const unsigned char null_sha1[20]; const struct object_id null_oid; const struct object_id empty_tree_oid = { @@ -682,27 +679,6 @@ static int has_loose_object(const unsigned char *sha1) return check_and_freshen(sha1, 0); } -void pack_report(void) -{ - fprintf(stderr, - "pack_report: getpagesize()= %10" SZ_FMT "\n" - "pack_report: core.packedGitWindowSize = %10" SZ_FMT "\n" - "pack_report: core.packedGitLimit = %10" SZ_FMT "\n", - sz_fmt(getpagesize()), - sz_fmt(packed_git_window_size), - sz_fmt(packed_git_limit)); - fprintf(stderr, - "pack_report: pack_used_ctr= %10u\n" - "pack_report: pack_mmap_calls = %10u\n" - "pack_report: pack_open_windows= %10u / %10u\n" - "pack_report: pack_mapped = " - "%10" SZ_FMT " / %10" SZ_FMT "\n", - pack_used_ctr, - pack_mmap_calls, - pack_open_windows, peak_pack_open_windows, - sz_fmt(pack_mapped), sz_fmt(peak_pack_mapped)); -} - /* * Open and mmap the index file at path, perform a couple of * consistency checks, then record its information to p. Return 0 on -- 2.14.0.434.g98096fd7a8-goog
[PATCH v2 05/25] pack: move release_pack_memory()
The function unuse_one_window() needs to be temporarily made global. Its scope will be restored to static in a subsequent commit. Signed-off-by: Jonathan Tan--- git-compat-util.h | 2 -- pack.h| 4 packfile.c| 49 + sha1_file.c | 49 - 4 files changed, 53 insertions(+), 51 deletions(-) diff --git a/git-compat-util.h b/git-compat-util.h index db9c22de7..201056e2d 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -749,8 +749,6 @@ const char *inet_ntop(int af, const void *src, char *dst, size_t size); extern int git_atexit(void (*handler)(void)); #endif -extern void release_pack_memory(size_t); - typedef void (*try_to_free_t)(size_t); extern try_to_free_t set_try_to_free_routine(try_to_free_t); diff --git a/pack.h b/pack.h index 5be0ed42a..c16220586 100644 --- a/pack.h +++ b/pack.h @@ -143,4 +143,8 @@ extern int open_pack_index(struct packed_git *); extern struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path); +extern int unuse_one_window(struct packed_git *current); + +extern void release_pack_memory(size_t); + #endif diff --git a/packfile.c b/packfile.c index 6edc43228..8daa74ad1 100644 --- a/packfile.c +++ b/packfile.c @@ -208,3 +208,52 @@ struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path) return p; } + +static void scan_windows(struct packed_git *p, + struct packed_git **lru_p, + struct pack_window **lru_w, + struct pack_window **lru_l) +{ + struct pack_window *w, *w_l; + + for (w_l = NULL, w = p->windows; w; w = w->next) { + if (!w->inuse_cnt) { + if (!*lru_w || w->last_used < (*lru_w)->last_used) { + *lru_p = p; + *lru_w = w; + *lru_l = w_l; + } + } + w_l = w; + } +} + +int unuse_one_window(struct packed_git *current) +{ + struct packed_git *p, *lru_p = NULL; + struct pack_window *lru_w = NULL, *lru_l = NULL; + + if (current) + scan_windows(current, _p, _w, _l); + for (p = packed_git; p; p = p->next) + scan_windows(p, _p, _w, _l); + if (lru_p) { + munmap(lru_w->base, lru_w->len); + pack_mapped -= lru_w->len; + if (lru_l) + lru_l->next = lru_w->next; + else + lru_p->windows = lru_w->next; + free(lru_w); + pack_open_windows--; + return 1; + } + return 0; +} + +void release_pack_memory(size_t need) +{ + size_t cur = pack_mapped; + while (need >= (cur - pack_mapped) && unuse_one_window(NULL)) + ; /* nothing */ +} diff --git a/sha1_file.c b/sha1_file.c index 2e414f5f5..644876e4e 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -679,55 +679,6 @@ static int has_loose_object(const unsigned char *sha1) return check_and_freshen(sha1, 0); } -static void scan_windows(struct packed_git *p, - struct packed_git **lru_p, - struct pack_window **lru_w, - struct pack_window **lru_l) -{ - struct pack_window *w, *w_l; - - for (w_l = NULL, w = p->windows; w; w = w->next) { - if (!w->inuse_cnt) { - if (!*lru_w || w->last_used < (*lru_w)->last_used) { - *lru_p = p; - *lru_w = w; - *lru_l = w_l; - } - } - w_l = w; - } -} - -static int unuse_one_window(struct packed_git *current) -{ - struct packed_git *p, *lru_p = NULL; - struct pack_window *lru_w = NULL, *lru_l = NULL; - - if (current) - scan_windows(current, _p, _w, _l); - for (p = packed_git; p; p = p->next) - scan_windows(p, _p, _w, _l); - if (lru_p) { - munmap(lru_w->base, lru_w->len); - pack_mapped -= lru_w->len; - if (lru_l) - lru_l->next = lru_w->next; - else - lru_p->windows = lru_w->next; - free(lru_w); - pack_open_windows--; - return 1; - } - return 0; -} - -void release_pack_memory(size_t need) -{ - size_t cur = pack_mapped; - while (need >= (cur - pack_mapped) && unuse_one_window(NULL)) - ; /* nothing */ -} - static void mmap_limit_check(size_t length) { static size_t limit = 0; -- 2.14.0.434.g98096fd7a8-goog
[PATCH v2 13/25] pack: move get_size_from_delta()
Signed-off-by: Jonathan Tan--- cache.h | 1 - pack.h | 1 + packfile.c | 40 sha1_file.c | 39 --- 4 files changed, 41 insertions(+), 40 deletions(-) diff --git a/cache.h b/cache.h index 9c70759a6..e29918c75 100644 --- a/cache.h +++ b/cache.h @@ -1661,7 +1661,6 @@ extern off_t find_pack_entry_one(const unsigned char *sha1, struct packed_git *) extern int is_pack_valid(struct packed_git *); extern void *unpack_entry(struct packed_git *, off_t, enum object_type *, unsigned long *); -extern unsigned long get_size_from_delta(struct packed_git *, struct pack_window **, off_t); extern int unpack_object_header(struct packed_git *, struct pack_window **, off_t *, unsigned long *); /* diff --git a/pack.h b/pack.h index 4a7f88a38..69c92d8d2 100644 --- a/pack.h +++ b/pack.h @@ -168,5 +168,6 @@ extern void reprepare_packed_git(void); unsigned long approximate_object_count(void); extern unsigned long unpack_object_header_buffer(const unsigned char *buf, unsigned long len, enum object_type *type, unsigned long *sizep); +extern unsigned long get_size_from_delta(struct packed_git *, struct pack_window **, off_t); #endif diff --git a/packfile.c b/packfile.c index 6e4f1c6e3..511afad85 100644 --- a/packfile.c +++ b/packfile.c @@ -3,6 +3,7 @@ #include "pack.h" #include "dir.h" #include "mergesort.h" +#include "delta.h" char *odb_pack_name(struct strbuf *buf, const unsigned char *sha1, @@ -908,3 +909,42 @@ unsigned long unpack_object_header_buffer(const unsigned char *buf, *sizep = size; return used; } + +unsigned long get_size_from_delta(struct packed_git *p, + struct pack_window **w_curs, + off_t curpos) +{ + const unsigned char *data; + unsigned char delta_head[20], *in; + git_zstream stream; + int st; + + memset(, 0, sizeof(stream)); + stream.next_out = delta_head; + stream.avail_out = sizeof(delta_head); + + git_inflate_init(); + do { + in = use_pack(p, w_curs, curpos, _in); + stream.next_in = in; + st = git_inflate(, Z_FINISH); + curpos += stream.next_in - in; + } while ((st == Z_OK || st == Z_BUF_ERROR) && +stream.total_out < sizeof(delta_head)); + git_inflate_end(); + if ((st != Z_STREAM_END) && stream.total_out != sizeof(delta_head)) { + error("delta data unpack-initial failed"); + return 0; + } + + /* Examine the initial part of the delta to figure out +* the result size. +*/ + data = delta_head; + + /* ignore base size */ + get_delta_hdr_size(, delta_head+sizeof(delta_head)); + + /* Read the result size */ + return get_delta_hdr_size(, delta_head+sizeof(delta_head)); +} diff --git a/sha1_file.c b/sha1_file.c index 1f4b4ba2c..7d354d9b6 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -1099,45 +1099,6 @@ int parse_sha1_header(const char *hdr, unsigned long *sizep) return parse_sha1_header_extended(hdr, , 0); } -unsigned long get_size_from_delta(struct packed_git *p, - struct pack_window **w_curs, - off_t curpos) -{ - const unsigned char *data; - unsigned char delta_head[20], *in; - git_zstream stream; - int st; - - memset(, 0, sizeof(stream)); - stream.next_out = delta_head; - stream.avail_out = sizeof(delta_head); - - git_inflate_init(); - do { - in = use_pack(p, w_curs, curpos, _in); - stream.next_in = in; - st = git_inflate(, Z_FINISH); - curpos += stream.next_in - in; - } while ((st == Z_OK || st == Z_BUF_ERROR) && -stream.total_out < sizeof(delta_head)); - git_inflate_end(); - if ((st != Z_STREAM_END) && stream.total_out != sizeof(delta_head)) { - error("delta data unpack-initial failed"); - return 0; - } - - /* Examine the initial part of the delta to figure out -* the result size. -*/ - data = delta_head; - - /* ignore base size */ - get_delta_hdr_size(, delta_head+sizeof(delta_head)); - - /* Read the result size */ - return get_delta_hdr_size(, delta_head+sizeof(delta_head)); -} - static off_t get_delta_base(struct packed_git *p, struct pack_window **w_curs, off_t *curpos, -- 2.14.0.434.g98096fd7a8-goog
[PATCH v2 04/25] pack: move open_pack_index(), parse_pack_index()
alloc_packed_git() in packfile.c is duplicated from sha1_file.c. In a subsequent commit, alloc_packed_git() will be removed from sha1_file.c. Signed-off-by: Jonathan Tan--- builtin/count-objects.c | 1 + cache.h | 8 --- pack.h | 8 +++ packfile.c | 149 sha1_file.c | 140 - sha1_name.c | 1 + 6 files changed, 159 insertions(+), 148 deletions(-) diff --git a/builtin/count-objects.c b/builtin/count-objects.c index 1d82e61f2..185d3190a 100644 --- a/builtin/count-objects.c +++ b/builtin/count-objects.c @@ -10,6 +10,7 @@ #include "builtin.h" #include "parse-options.h" #include "quote.h" +#include "pack.h" static unsigned long garbage; static off_t size_garbage; diff --git a/cache.h b/cache.h index c7f802e4a..5d6839525 100644 --- a/cache.h +++ b/cache.h @@ -1603,8 +1603,6 @@ struct pack_entry { struct packed_git *p; }; -extern struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path); - /* A hook to report invalid files in pack directory */ #define PACKDIR_FILE_PACK 1 #define PACKDIR_FILE_IDX 2 @@ -1639,12 +1637,6 @@ extern int odb_mkstemp(struct strbuf *template, const char *pattern); */ extern int odb_pack_keep(const char *name); -/* - * mmap the index file for the specified packfile (if it is not - * already mmapped). Return 0 on success. - */ -extern int open_pack_index(struct packed_git *); - /* * munmap the index file for the specified packfile (if it is * currently mmapped). diff --git a/pack.h b/pack.h index 6098bfe40..5be0ed42a 100644 --- a/pack.h +++ b/pack.h @@ -135,4 +135,12 @@ extern size_t pack_mapped; extern void pack_report(void); +/* + * mmap the index file for the specified packfile (if it is not + * already mmapped). Return 0 on success. + */ +extern int open_pack_index(struct packed_git *); + +extern struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path); + #endif diff --git a/packfile.c b/packfile.c index 60d9fc3b0..6edc43228 100644 --- a/packfile.c +++ b/packfile.c @@ -1,5 +1,6 @@ #include "cache.h" #include "mru.h" +#include "pack.h" char *odb_pack_name(struct strbuf *buf, const unsigned char *sha1, @@ -59,3 +60,151 @@ void pack_report(void) pack_open_windows, peak_pack_open_windows, sz_fmt(pack_mapped), sz_fmt(peak_pack_mapped)); } + +/* + * Open and mmap the index file at path, perform a couple of + * consistency checks, then record its information to p. Return 0 on + * success. + */ +static int check_packed_git_idx(const char *path, struct packed_git *p) +{ + void *idx_map; + struct pack_idx_header *hdr; + size_t idx_size; + uint32_t version, nr, i, *index; + int fd = git_open(path); + struct stat st; + + if (fd < 0) + return -1; + if (fstat(fd, )) { + close(fd); + return -1; + } + idx_size = xsize_t(st.st_size); + if (idx_size < 4 * 256 + 20 + 20) { + close(fd); + return error("index file %s is too small", path); + } + idx_map = xmmap(NULL, idx_size, PROT_READ, MAP_PRIVATE, fd, 0); + close(fd); + + hdr = idx_map; + if (hdr->idx_signature == htonl(PACK_IDX_SIGNATURE)) { + version = ntohl(hdr->idx_version); + if (version < 2 || version > 2) { + munmap(idx_map, idx_size); + return error("index file %s is version %"PRIu32 +" and is not supported by this binary" +" (try upgrading GIT to a newer version)", +path, version); + } + } else + version = 1; + + nr = 0; + index = idx_map; + if (version > 1) + index += 2; /* skip index header */ + for (i = 0; i < 256; i++) { + uint32_t n = ntohl(index[i]); + if (n < nr) { + munmap(idx_map, idx_size); + return error("non-monotonic index %s", path); + } + nr = n; + } + + if (version == 1) { + /* +* Total size: +* - 256 index entries 4 bytes each +* - 24-byte entries * nr (20-byte sha1 + 4-byte offset) +* - 20-byte SHA1 of the packfile +* - 20-byte SHA1 file checksum +*/ + if (idx_size != 4*256 + nr * 24 + 20 + 20) { + munmap(idx_map, idx_size); + return error("wrong index v1 file size in %s", path); + } + } else if (version == 2) { + /* +* Minimum size: +
[PATCH v2 07/25] pack: move use_pack()
The function open_packed_git() needs to be temporarily made global. Its scope will be restored to static in a subsequent commit. Signed-off-by: Jonathan Tan--- cache.h | 1 - pack.h | 14 +-- packfile.c | 303 ++-- sha1_file.c | 285 streaming.c | 1 + 5 files changed, 299 insertions(+), 305 deletions(-) diff --git a/cache.h b/cache.h index 25a21a61f..dd9f9a9ae 100644 --- a/cache.h +++ b/cache.h @@ -1637,7 +1637,6 @@ extern int odb_mkstemp(struct strbuf *template, const char *pattern); */ extern int odb_pack_keep(const char *name); -extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, unsigned long *); extern void unuse_pack(struct pack_window **); extern void clear_delta_base_cache(void); extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local); diff --git a/pack.h b/pack.h index fd4668528..bf2b99bf9 100644 --- a/pack.h +++ b/pack.h @@ -124,14 +124,7 @@ extern char *sha1_pack_name(const unsigned char *sha1); */ extern char *sha1_pack_index_name(const unsigned char *sha1); -extern unsigned int pack_used_ctr; -extern unsigned int pack_mmap_calls; -extern unsigned int peak_pack_open_windows; -extern unsigned int pack_open_windows; extern unsigned int pack_open_fds; -extern unsigned int pack_max_fds; -extern size_t peak_pack_mapped; -extern size_t pack_mapped; extern void pack_report(void); @@ -143,12 +136,9 @@ extern int open_pack_index(struct packed_git *); extern struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path); -extern int unuse_one_window(struct packed_git *current); - extern void release_pack_memory(size_t); extern void close_pack_windows(struct packed_git *); -extern int close_pack_fd(struct packed_git *); /* * munmap the index file for the specified packfile (if it is * currently mmapped). @@ -156,4 +146,8 @@ extern int close_pack_fd(struct packed_git *); extern void close_pack_index(struct packed_git *); extern void close_all_packs(void); +extern int open_packed_git(struct packed_git *p); + +extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, unsigned long *); + #endif diff --git a/packfile.c b/packfile.c index c8e2dbdee..85cb65558 100644 --- a/packfile.c +++ b/packfile.c @@ -24,14 +24,14 @@ char *sha1_pack_index_name(const unsigned char *sha1) return odb_pack_name(, sha1, "idx"); } -unsigned int pack_used_ctr; -unsigned int pack_mmap_calls; -unsigned int peak_pack_open_windows; -unsigned int pack_open_windows; +static unsigned int pack_used_ctr; +static unsigned int pack_mmap_calls; +static unsigned int peak_pack_open_windows; +static unsigned int pack_open_windows; unsigned int pack_open_fds; -unsigned int pack_max_fds; -size_t peak_pack_mapped; -size_t pack_mapped; +static unsigned int pack_max_fds; +static size_t peak_pack_mapped; +static size_t pack_mapped; struct packed_git *packed_git; static struct mru packed_git_mru_storage; @@ -228,7 +228,7 @@ static void scan_windows(struct packed_git *p, } } -int unuse_one_window(struct packed_git *current) +static int unuse_one_window(struct packed_git *current) { struct packed_git *p, *lru_p = NULL; struct pack_window *lru_w = NULL, *lru_l = NULL; @@ -274,7 +274,7 @@ void close_pack_windows(struct packed_git *p) } } -int close_pack_fd(struct packed_git *p) +static int close_pack_fd(struct packed_git *p) { if (p->pack_fd < 0) return 0; @@ -311,3 +311,288 @@ void close_all_packs(void) else close_pack(p); } + +/* + * The LRU pack is the one with the oldest MRU window, preferring packs + * with no used windows, or the oldest mtime if it has no windows allocated. + */ +static void find_lru_pack(struct packed_git *p, struct packed_git **lru_p, struct pack_window **mru_w, int *accept_windows_inuse) +{ + struct pack_window *w, *this_mru_w; + int has_windows_inuse = 0; + + /* +* Reject this pack if it has windows and the previously selected +* one does not. If this pack does not have windows, reject +* it if the pack file is newer than the previously selected one. +*/ + if (*lru_p && !*mru_w && (p->windows || p->mtime > (*lru_p)->mtime)) + return; + + for (w = this_mru_w = p->windows; w; w = w->next) { + /* +* Reject this pack if any of its windows are in use, +* but the previously selected pack did not have any +* inuse windows. Otherwise, record that this pack +* has windows in use. +*/ + if (w->inuse_cnt) { + if (*accept_windows_inuse) + has_windows_inuse = 1; + else +
[PATCH] builtin/add: add detail to a 'cannot chmod' error message
In addition to adding the missing newline, add the x-ecutable bit 'mode change' character to the error message. This message now has the same form as similar messages output by 'update-index'. Signed-off-by: Ramsay Jones--- Hi Junio, This is v2 of the earlier "add a newline" patch. Thanks! ATB, Ramsay Jones builtin/add.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/builtin/add.c b/builtin/add.c index e888fb8c5..5d5773d5c 100644 --- a/builtin/add.c +++ b/builtin/add.c @@ -32,7 +32,7 @@ struct update_callback_data { int add_errors; }; -static void chmod_pathspec(struct pathspec *pathspec, int force_mode) +static void chmod_pathspec(struct pathspec *pathspec, char flip) { int i; @@ -42,8 +42,8 @@ static void chmod_pathspec(struct pathspec *pathspec, int force_mode) if (pathspec && !ce_path_match(ce, pathspec, NULL)) continue; - if (chmod_cache_entry(ce, force_mode) < 0) - fprintf(stderr, "cannot chmod '%s'", ce->name); + if (chmod_cache_entry(ce, flip) < 0) + fprintf(stderr, "cannot chmod %cx '%s'\n", flip, ce->name); } } -- 2.14.0
Re: [PATCH] t1200: remove t1200-tutorial.sh
Jonathan Nieder wrote: > Stefan Beller wrote: >>> Stefan Beller wrote: Nowadays there are better tutorials out there such as "Git from bottom up" or others, easily found online. Additionally to that a tutorial in our test suite is not as easy to discover as e.g. online tutorials. [...] >> 2ae6c70674 (Adapt tutorial to cygwin and add test case, 2005-10-13) >> seemed to imply that it was testing some part for Documentation/tutorial.txt >> though. > > Oh, good point. > > v1.2.0~121 (New tutorial, 2006-01-22) means that the test is no longer > testing what is in the tutorial in any meaningful sense. That's why > my search for "git whatchanged -p --root" in manpages didn't find > anything. Correction: the tutorial is now called gitcore-tutorial and mostly survives. A search for -p --root failed because of v1.5.5.1~19^2 (core-tutorial.txt: Fix showing the current behaviour, 2008-04-10). That said, the conclusion that this test has mostly bitrotted as far as its original purpose goes still applies. An alternative method of addressing the goal you described would be to fuzz object-id shaped things out of the sample output. I don't have a strong preference, given how little this test contributes to coverage (as you mentioned) and how difficult it is to make it continue to match the documentation it is trying to test. Thanks and sorry for the confusion, Jonathan
Re: [PATCH] t1200: remove t1200-tutorial.sh
Stefan Beller wrote: > On Tue, Aug 8, 2017 at 5:07 PM, Jonathan Niederwrote: >> Stefan Beller wrote: >>> Nowadays there are better tutorials out there such as "Git from bottom up" >>> or others, easily found online. Additionally to that a tutorial in our >>> test suite is not as easy to discover as e.g. online tutorials. [...] >>> Signed-off-by: Stefan Beller >>> --- >>> t/t1200-tutorial.sh | 268 >>> >>> 1 file changed, 268 deletions(-) >>> delete mode 100755 t/t1200-tutorial.sh >> >> Interesting. When I first saw the diffstat I assumed you were talking >> about a test that validates the examples in some manpage are correct. >> But this is not that. [...] > 2ae6c70674 (Adapt tutorial to cygwin and add test case, 2005-10-13) > seemed to imply that it was testing some part for Documentation/tutorial.txt > though. Oh, good point. v1.2.0~121 (New tutorial, 2006-01-22) means that the test is no longer testing what is in the tutorial in any meaningful sense. That's why my search for "git whatchanged -p --root" in manpages didn't find anything. So what your patch does still seems reasonable (we have lived fine without a test validating the examples in that tutorial, and if we really want a test validating the examples then we should find a way to automatically extract it), but the description is misleading. With a corrected description, my Reviewed-by would apply. Thanks for catching it. Jonathan
Re: [PATCH] t1200: remove t1200-tutorial.sh
On Tue, Aug 8, 2017 at 5:07 PM, Jonathan Niederwrote: > Hi, > > Stefan Beller wrote: > >> Nowadays there are better tutorials out there such as "Git from bottom up" >> or others, easily found online. Additionally to that a tutorial in our >> test suite is not as easy to discover as e.g. online tutorials. >> >> This test/tutorial was discovered by the patch author in the effort to >> migrate our tests in preparation to switch the hashing function. >> Transforming this tutorial to be agnostic of the underlying hash function >> would hurt its readability, hence being even less useful as a tutorial. >> >> Instead delete this test as >> (a) the functionality is tested elsewhere as well and >> (b) reducing the test suite to its core improves performance, which >> aids developers in keeping their development velocity. >> >> Signed-off-by: Stefan Beller >> --- >> t/t1200-tutorial.sh | 268 >> >> 1 file changed, 268 deletions(-) >> delete mode 100755 t/t1200-tutorial.sh > > Interesting. When I first saw the diffstat I assumed you were talking > about a test that validates the examples in some manpage are correct. > But this is not that. > > There indeed appear to be other good tests for these commands, even > "git whatchanged", so for what it's worth, > > Reviewed-by: Jonathan Nieder > > Thanks. 2ae6c70674 (Adapt tutorial to cygwin and add test case, 2005-10-13) seemed to imply that it was testing some part for Documentation/tutorial.txt though.
Re: [PATCH] t1200: remove t1200-tutorial.sh
Hi, Stefan Beller wrote: > Nowadays there are better tutorials out there such as "Git from bottom up" > or others, easily found online. Additionally to that a tutorial in our > test suite is not as easy to discover as e.g. online tutorials. > > This test/tutorial was discovered by the patch author in the effort to > migrate our tests in preparation to switch the hashing function. > Transforming this tutorial to be agnostic of the underlying hash function > would hurt its readability, hence being even less useful as a tutorial. > > Instead delete this test as > (a) the functionality is tested elsewhere as well and > (b) reducing the test suite to its core improves performance, which > aids developers in keeping their development velocity. > > Signed-off-by: Stefan Beller> --- > t/t1200-tutorial.sh | 268 > > 1 file changed, 268 deletions(-) > delete mode 100755 t/t1200-tutorial.sh Interesting. When I first saw the diffstat I assumed you were talking about a test that validates the examples in some manpage are correct. But this is not that. There indeed appear to be other good tests for these commands, even "git whatchanged", so for what it's worth, Reviewed-by: Jonathan Nieder Thanks.
Re: reftable [v6]: new ref storage format
On Tue, Aug 8, 2017 at 4:34 PM, Junio C Hamanowrote: > Shawn Pearce writes: > >> Given that the index can now also be multi-level, I don't expect to >> see a 2G index. A 2G index forces the reader to load the entire 2G to >> take advantage of the restart table. It may be more efficient for such >> a reader to have had the writer make a mutli-level index, instead of a >> single monster index block. And so perhaps the writer shouldn't make a >> 2G index block that she is forced to buffer. :) > > Ah, OK, then it is sensible to have all table blocks to have the > same format, and restart at the beginning to help readers would be a > fine choice. For the same "let's make them as consistent" sake, I > am tempted to suggest that we lift "the index block can be 2G" and > have it also be within uint_24(), perhaps? Otherwise the readers > would have to read (or mmap) the whole 2G. Gah. I just finished moving the restart table back to the end of the block. :) However, I think I can agree with the index fitting into the uint24 size of 15M, and asking writers making an index that exceeds that to use multi-level indexing.
Re: [PATCH] builtin/add: add a missing newline to an stderr message
On 08/08/17 22:45, René Scharfe wrote: > Am 08.08.2017 um 23:36 schrieb Ramsay Jones: >> >> Signed-off-by: Ramsay Jones>> --- >> >> Hi Junio, >> >> I noticed this while looking into the t3700 failure on cygwin tonight. >> Also, I couldn't decide whether or not to add the i18n '_()' brackets >> around the message. In the end I didn't, but will happily add them >> if you think I should. >> >> Thanks! >> >> ATB, >> Ramsay Jones >> >> builtin/add.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/builtin/add.c b/builtin/add.c >> index e888fb8c5..385b53ae7 100644 >> --- a/builtin/add.c >> +++ b/builtin/add.c >> @@ -43,7 +43,7 @@ static void chmod_pathspec(struct pathspec *pathspec, int >> force_mode) >> continue; >> >> if (chmod_cache_entry(ce, force_mode) < 0) >> -fprintf(stderr, "cannot chmod '%s'", ce->name); >> +fprintf(stderr, "cannot chmod '%s'\n", ce->name); >> } >> } >> > > FYI: I brought this up yesterday in the original thread, along with a > few other observations: > > https://public-inbox.org/git/3c61d9f6-e0fd-22a4-68e0-89fd9ce9b...@web.de/ Ah, I missed that. Hmm, I just looked at the code in builtin/update-index.c. Yes, it would probably be a good idea to harmonize the messages - but just where did 'flip' come from? ;-) ATB, Ramsay Jones
Re: reftable [v6]: new ref storage format
On Mon, Aug 7, 2017 at 11:30 AM, Shawn Pearcewrote: > On Mon, Aug 7, 2017 at 11:27 AM, Stefan Beller wrote: >> On Sun, Aug 6, 2017 at 6:47 PM, Shawn Pearce wrote: >>> 6th iteration of the reftable storage format. >>> >>> You can read a rendered version of this here: >>> https://googlers.googlesource.com/sop/jgit/+/reftable/Documentation/technical/reftable.md >>> >>> The index may be organized into a multi-level index, where ... >>> which may in turn point to either index blocks (3rd level) or ref blocks >>> (leaf level). >> >> So we allow 3 levels at most? > > No, its just an example. Large ref sets with small block size need 4 > levels. Or more. A malicious (or buggy) writer can produce indexes pointing to each other producing a circle. (Who would do that?) A reader should - instead of segfaulting due to unbounded recursion or being stuck in an infinite loop - ignore the indexes in this case and fallback to the slow non-indexed behavior, i.e. while the file format allows for unbounded levels, a reader should not.
[PATCH] t1200: remove t1200-tutorial.sh
Nowadays there are better tutorials out there such as "Git from bottom up" or others, easily found online. Additionally to that a tutorial in our test suite is not as easy to discover as e.g. online tutorials. This test/tutorial was discovered by the patch author in the effort to migrate our tests in preparation to switch the hashing function. Transforming this tutorial to be agnostic of the underlying hash function would hurt its readability, hence being even less useful as a tutorial. Instead delete this test as (a) the functionality is tested elsewhere as well and (b) reducing the test suite to its core improves performance, which aids developers in keeping their development velocity. Signed-off-by: Stefan Beller--- t/t1200-tutorial.sh | 268 1 file changed, 268 deletions(-) delete mode 100755 t/t1200-tutorial.sh diff --git a/t/t1200-tutorial.sh b/t/t1200-tutorial.sh deleted file mode 100755 index 397ccb6909..00 --- a/t/t1200-tutorial.sh +++ /dev/null @@ -1,268 +0,0 @@ -#!/bin/sh -# -# Copyright (c) 2005 Johannes Schindelin -# - -test_description='A simple turial in the form of a test case' - -. ./test-lib.sh - -test_expect_success 'blob' ' - echo "Hello World" > hello && - echo "Silly example" > example && - - git update-index --add hello example && - - test blob = "$(git cat-file -t 557db03)" -' - -test_expect_success 'blob 557db03' ' - test "Hello World" = "$(git cat-file blob 557db03)" -' - -echo "It's a new day for git" >>hello -cat > diff.expect << EOF -diff --git a/hello b/hello -index 557db03..263414f 100644 a/hello -+++ b/hello -@@ -1 +1,2 @@ - Hello World -+It's a new day for git -EOF - -test_expect_success 'git diff-files -p' ' - git diff-files -p > diff.output && - test_cmp diff.expect diff.output -' - -test_expect_success 'git diff' ' - git diff > diff.output && - test_cmp diff.expect diff.output -' - -test_expect_success 'tree' ' - tree=$(git write-tree 2>/dev/null) && - test 8988da15d077d4829fc51d8544c097def6644dbb = $tree -' - -test_expect_success 'git diff-index -p HEAD' ' - test_tick && - tree=$(git write-tree) && - commit=$(echo "Initial commit" | git commit-tree $tree) && - git update-ref HEAD $commit && - git diff-index -p HEAD > diff.output && - test_cmp diff.expect diff.output -' - -test_expect_success 'git diff HEAD' ' - git diff HEAD > diff.output && - test_cmp diff.expect diff.output -' - -cat > whatchanged.expect << EOF -commit VARIABLE -Author: VARIABLE -Date: VARIABLE - -Initial commit - -diff --git a/example b/example -new file mode 100644 -index 000..f24c74a /dev/null -+++ b/example -@@ -0,0 +1 @@ -+Silly example -diff --git a/hello b/hello -new file mode 100644 -index 000..557db03 /dev/null -+++ b/hello -@@ -0,0 +1 @@ -+Hello World -EOF - -test_expect_success 'git whatchanged -p --root' ' - git whatchanged -p --root | - sed -e "1s/^\(.\{7\}\).\{40\}/\1VARIABLE/" \ - -e "2,3s/^\(.\{8\}\).*$/\1VARIABLE/" \ - > whatchanged.output && - test_cmp whatchanged.expect whatchanged.output -' - -test_expect_success 'git tag my-first-tag' ' - git tag my-first-tag && - test_cmp .git/refs/heads/master .git/refs/tags/my-first-tag -' - -test_expect_success 'git checkout -b mybranch' ' - git checkout -b mybranch && - test_cmp .git/refs/heads/master .git/refs/heads/mybranch -' - -cat > branch.expect < branch.output && - test_cmp branch.expect branch.output -' - -test_expect_success 'git resolve now fails' ' - git checkout mybranch && - echo "Work, work, work" >>hello && - test_tick && - git commit -m "Some work." -i hello && - - git checkout master && - - echo "Play, play, play" >>hello && - echo "Lots of fun" >>example && - test_tick && - git commit -m "Some fun." -i hello example && - - test_must_fail git merge -m "Merge work in mybranch" mybranch -' - -cat > hello << EOF -Hello World -It's a new day for git -Play, play, play -Work, work, work -EOF - -cat > show-branch.expect << EOF -* [master] Merge work in mybranch - ! [mybranch] Some work. --- -- [master] Merge work in mybranch -*+ [mybranch] Some work. -* [master^] Some fun. -EOF - -test_expect_success 'git show-branch' ' - test_tick && - git commit -m "Merge work in mybranch" -i hello && - git show-branch --topo-order --more=1 master mybranch \ - > show-branch.output && - test_cmp show-branch.expect show-branch.output -' - -cat > resolve.expect << EOF -Updating VARIABLE..VARIABLE -FASTFORWARD (no commit created; -m option ignored) - example | 1 + - hello | 1 + - 2 files changed, 2 insertions(+) -EOF - -test_expect_success 'git resolve' ' - git checkout mybranch && - git merge -m "Merge upstream changes." master |
Re: reftable [v6]: new ref storage format
Shawn Pearcewrites: > Given that the index can now also be multi-level, I don't expect to > see a 2G index. A 2G index forces the reader to load the entire 2G to > take advantage of the restart table. It may be more efficient for such > a reader to have had the writer make a mutli-level index, instead of a > single monster index block. And so perhaps the writer shouldn't make a > 2G index block that she is forced to buffer. :) Ah, OK, then it is sensible to have all table blocks to have the same format, and restart at the beginning to help readers would be a fine choice. For the same "let's make them as consistent" sake, I am tempted to suggest that we lift "the index block can be 2G" and have it also be within uint_24(), perhaps? Otherwise the readers would have to read (or mmap) the whole 2G.
Re: [PATCH] sha1_file: avoid comparison if no packed hash matches the first byte
On Tue, Aug 08, 2017 at 06:52:31PM -0400, Jeff King wrote: > > Interesting. I see that we still have the conditional code to call > > out to sha1-lookup.c::sha1_entry_pos(). Do we need a similar change > > over there, I wonder? Alternatively, as we have had the experimental > > sha1-lookup.c::sha1_entry_pos() long enough without anybody using it, > > perhaps we should write it off as a failed experiment and retire it? > > There is also sha1_pos(), which seems to have the same problem (and is > used in several places). Actually, I take it back. The problem happens when we enter the loop with no entries to look at. But both sha1_pos() and sha1_entry_pos() return early before hitting their do-while loops in that case. -Peff
Re: [PATCH] sha1_file: avoid comparison if no packed hash matches the first byte
On Tue, Aug 08, 2017 at 03:43:13PM -0700, Junio C Hamano wrote: > > @@ -2812,7 +2812,7 @@ off_t find_pack_entry_one(const unsigned char *sha1, > > hi = mi; > > else > > lo = mi+1; > > - } while (lo < hi); > > + } > > return 0; > > } > > Interesting. I see that we still have the conditional code to call > out to sha1-lookup.c::sha1_entry_pos(). Do we need a similar change > over there, I wonder? Alternatively, as we have had the experimental > sha1-lookup.c::sha1_entry_pos() long enough without anybody using it, > perhaps we should write it off as a failed experiment and retire it? There is also sha1_pos(), which seems to have the same problem (and is used in several places). -Peff
Re: [PATCH] sha1_file: avoid comparison if no packed hash matches the first byte
René Scharfewrites: > find_pack_entry_one() uses the fan-out table of pack indexes to find out > which entries match the first byte of the searched hash and does a > binary search on this subset of the main index table. > > If there are no matching entries then lo and hi will have the same > value. The binary search still starts and compares the hash of the > following entry (which has a non-matching first byte, so won't cause any > trouble), or whatever comes after the sorted list of entries. > > The probability of that stray comparison matching by mistake is low, but > let's not take any chances and check when entering the binary search > loop if we're actually done already. > > Signed-off-by: Rene Scharfe > --- > sha1_file.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/sha1_file.c b/sha1_file.c > index b60ae15f70..11ee69a99d 100644 > --- a/sha1_file.c > +++ b/sha1_file.c > @@ -2799,7 +2799,7 @@ off_t find_pack_entry_one(const unsigned char *sha1, > return nth_packed_object_offset(p, pos); > } > > - do { > + while (lo < hi) { > unsigned mi = (lo + hi) / 2; > int cmp = hashcmp(index + mi * stride, sha1); > > @@ -2812,7 +2812,7 @@ off_t find_pack_entry_one(const unsigned char *sha1, > hi = mi; > else > lo = mi+1; > - } while (lo < hi); > + } > return 0; > } Interesting. I see that we still have the conditional code to call out to sha1-lookup.c::sha1_entry_pos(). Do we need a similar change over there, I wonder? Alternatively, as we have had the experimental sha1-lookup.c::sha1_entry_pos() long enough without anybody using it, perhaps we should write it off as a failed experiment and retire it?
Re: [PATCH] sha1_file: avoid comparison if no packed hash matches the first byte
René Scharfe wrote: > find_pack_entry_one() uses the fan-out table of pack indexes to find out > which entries match the first byte of the searched hash and does a > binary search on this subset of the main index table. > > If there are no matching entries then lo and hi will have the same > value. The binary search still starts and compares the hash of the > following entry (which has a non-matching first byte, so won't cause any > trouble), or whatever comes after the sorted list of entries. > > The probability of that stray comparison matching by mistake is low, but > let's not take any chances and check when entering the binary search > loop if we're actually done already. > > Signed-off-by: Rene Scharfe> --- > sha1_file.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) Thanks for a clear explanation. Sanity checking: is this correct in the sha1[0] == 0 case? In that case, we have lo = 0, hi = the 0th offset from the fanout table. The offsets in the fanout table are defined as "the number of objects in the corresponding pack, the first byte of whose object name is less than or equal to N." So hi == lo would mean there are no objects with id starting with 0, as hoped. Or in other words, the [lo, hi) interval we're trying to search is indeed a half-open interval. Reviewed-by: Jonathan Nieder
Re: [PATCH] t4062: stop using repetition in regex
Am 09.08.2017 um 00:26 schrieb Junio C Hamano: > Junio C Hamanowrites: > >> So I find Dscho's concern quite valid, even though I do believe you >> when you say the code somehow segfaults. I just can not tell >> how/why it would segfault, though---it is possible that regexec() >> implementation is stupid and does not realize that it can return early >> reporting success without looking at the rest of the buffer, but >> somehow I find it unlikely. >> >> Puzzled. >> >>> You get different results? How is that possible? The search string is >>> NUL-terminated in each case, while the point of the test is that the >>> file contents isn't, right? > > Intellectual curiosity tells me we may want to find out why it > fails, but in the meantime, I think replacing the test with "0$" to > force the scanner to find either the end of line or the end of the > buffer may be a good workaround. We do not have to care how many of > random bytes are in front of the last "0" in order to ensure that > the regexec_buf() does not overstep to 4097th byte, while seeing > that regexec() that does not know how long the haystack is has to do > so, no? Our regexec() calls strlen() (see my other reply). Using "0$" looks like the best option to me. René
Re: [PATCH] builtin/add: add a missing newline to an stderr message
Ramsay Jones wrote: > Signed-off-by: Ramsay JonesReviewed-by: Jonathan Nieder Thanks for catching it.
Re: [PATCH] builtin/add: add a missing newline to an stderr message
Junio C Hamano wrote: > René Scharfewrites: >>> diff --git a/builtin/add.c b/builtin/add.c >>> index e888fb8c5..385b53ae7 100644 >>> --- a/builtin/add.c >>> +++ b/builtin/add.c >>> @@ -43,7 +43,7 @@ static void chmod_pathspec(struct pathspec *pathspec, int >>> force_mode) >>> continue; >>> >>> if (chmod_cache_entry(ce, force_mode) < 0) >>> - fprintf(stderr, "cannot chmod '%s'", ce->name); >>> + fprintf(stderr, "cannot chmod '%s'\n", ce->name); >>> } >>> } >> >> FYI: I brought this up yesterday in the original thread, along with a >> few other observations: >> >> https://public-inbox.org/git/3c61d9f6-e0fd-22a4-68e0-89fd9ce9b...@web.de/ >> >> Not sure if the discussion can or should be revived after all this >> time, though; just sending patches like yours might be the way to go. > > Thanks, so it should become > > fprintf(stderr, "cannot chmod %c '%s'\n", force_mode, ce->name); > > in the final version to be queued? I don't believe the force_mode without an 'x' provides a clear signal to the end user. Perhaps you meant %cx? Thanks, Jonathan
Re: reftable [v6]: new ref storage format
On Tue, Aug 8, 2017 at 12:25 PM, Junio C Hamanowrote: > Shawn Pearce writes: > >> For `log_type = 0x4..0x7` the `log_chained` section is used instead to >> compress information that already appeared in a prior log record. The >> `log_chained` always includes `old_id` for this record, as `new_id` is >> implied by the prior (by file order, more recent) record's `old_id`. >> >> The `not_same_committer` block appears if `log_type & 0x1` is true, >> `not_same_message` block appears if `log_type & 0x2` is true. When >> one of these blocks is missing, its values are implied by the prior >> (more recent) log record. > > Two comments. > > * not-same-committer would be what I would use when I switch >timezones, even if I stay to be me, right? Correct. This is based on the theory that the timezone in a reflog is actually the system timezone, not your timezone. If you push to a remote system, that system's reflog will be using that system's timezone, not your timezone. So you aren't really that different, and we can compress the timezone part away. Also, if you do move timezones, you are likely to remain in that timezone for some period of time, and such we can compress many log records again with the same timezone+name+email. Its ancient history from my research with "pack v4", but people don't really change timezones very often in the Git committer data. I suspect its even more true with reflog data. > I am just wondering >if it is clear to everybody that "committer" in that phrase is a >short-hand for "committer information other than the timestamp". Maybe not. I will try to come up with another shorthand name for this. > * Should the set of entries that are allowed to use of "chained" >log be related to the set of entries that appear in the restart >table in any way? For a reader that scans starting at a restart >point, it would be very cumbersome if the entry were chained from >the previous entry, as it would force it to backtrack entries to >find the first non-chained log entry. A simple "log_chained must >not be used for an entry that appear in the restart table" rule >would solve that, but I didn't see it in the document. Good catch! This is implemented as you described in JGit (for the reasons you described), but not documented. I'll fix it.
Re: [PATCH] t4062: stop using repetition in regex
Am 09.08.2017 um 00:09 schrieb Junio C Hamano: > René Scharfewrites: > >> Am 08.08.2017 um 16:49 schrieb Johannes Schindelin: >>> Hi René, >>> >>> On Tue, 8 Aug 2017, René Scharfe wrote: >>> OpenBSD's regex library has a repetition limit (RE_DUP_MAX) of 255. That's the minimum acceptable value according to POSIX. In t4062 we use 4096 repetitions in the test "-G matches", though, causing it to fail. Do the same as the test "-S --pickaxe-regex" in the same file and search for a single zero instead. That still suffices to trigger the buffer overrun in older versions (checked with b7d36ffca02^ and --valgrind on Linux), simplifies the test a bit, and avoids exceeding OpenBSD's limit. >>> >>> I am afraid not. The 4096 is precisely the page size required to trigger >>> the bug on Windows against which this regression test tries to safeguard. >> >> Checked with b7d36ffca02^ on MinGW now as well and found that it >> segfaults with the proposed change ten out of ten times. > > That is a strange result but I can believe it. > > The reason why I find it strange is that the test wants to run > diff_grep() in diffcore-pickaxe.c with one == NULL (because we are > looking at difference between an initial empty commit and the > current commit that adds 4096-zeroes.txt file), which makes the > current blob (i.e. a page of '0' that may be mmap(2)ed without > trailing NUL to terminate it) scanned via regexec() to look for the > search string. > > I can understand why Dscho originally did "^0{4096}$"; it is to > ensure that the whole page is scanned for 4096 zeroes and then the > code would somehow make sure that there is no more byte until the > end of line, which will force regexec (but not regexec_buf that knows > where the buffer ends) to look at the 4097th byte that does not exist. > > If you change the pattern to just "0" that is not anchored, I'd expect > regexec() that does not know how big the haystack is to just find "0" > at the first byte and happily return without segfaulting (because it > does not even have to scan the remainder of the buffer). > > So I find Dscho's concern quite valid, even though I do believe you > when you say the code somehow segfaults. I just can not tell > how/why it would segfault, though---it is possible that regexec() > implementation is stupid and does not realize that it can return early > reporting success without looking at the rest of the buffer, but > somehow I find it unlikely. > > Puzzled. Good point. Valgrind reports: ==57466== Process terminating with default action of signal 11 (SIGSEGV): dumping core ==57466== Access not within mapped region at address 0x4027000 ==57466==at 0x4C2EDF4: strlen (vg_replace_strmem.c:458) ==57466==by 0x59D9F76: regexec@@GLIBC_2.3.4 (regexec.c:240) ==57466==by 0x54D96E: diff_grep (diffcore-pickaxe.c:0) ==57466==by 0x54DAC3: pickaxe_match (diffcore-pickaxe.c:149) And you can see in our version in compat/regex/regexec.c:241 that the first thing regexec() does is calling strlen(). So to avoid depending on that implementation detail we'd need to use a search string that won't be found (e.g. "1") or with unlimited repetition (e.g. "0*"), right? René
Re: reftable [v6]: new ref storage format
On Tue, Aug 8, 2017 at 12:01 PM, Junio C Hamanowrote: > I notice that you raised the location of restart table within a > block in this iteration (or maybe it happened in v5). > > This forces you to hold all contents in core before the first byte > is written out. You start from the first entry (which will become > the first restart entry), emit a handful as prefix compressed > entries, emit a full entry (which will become the next restart > entry), ... until you have enough to fill both the data and the > restart table, then start writing from the header (which needs the > length of the block), restart table and then data. > > I think it is OK to do so for the blocks whose size is limited to > 16M, but I wonder if it is sensible to do the same for the index > block whose limit is 2G. If you keep the restart table after the > data, then you could stream out the entries as you emit, write the > restart table, and then seek back to fix the length in the header, > without holding the 2G in core, no? Yes. I'm torn on the ordering: restart table first or restart table last. The advantage of it first is the reader can immediately work with it, without necessarily touching the rest of the block. The disadvantage is a writer can only stream at block sizes, as the writer is forced to buffer the entire block. As it happens my implementation in JGit buffers the entire block anyway, so this didn't really factor as an issue for me. Given that the index can now also be multi-level, I don't expect to see a 2G index. A 2G index forces the reader to load the entire 2G to take advantage of the restart table. It may be more efficient for such a reader to have had the writer make a mutli-level index, instead of a single monster index block. And so perhaps the writer shouldn't make a 2G index block that she is forced to buffer. :) Perhaps I'll move it back to the tail of the block. I can see the streaming writer code is maybe more straightforward that way.
Re: [PATCH] t4062: stop using repetition in regex
Junio C Hamanowrites: > So I find Dscho's concern quite valid, even though I do believe you > when you say the code somehow segfaults. I just can not tell > how/why it would segfault, though---it is possible that regexec() > implementation is stupid and does not realize that it can return early > reporting success without looking at the rest of the buffer, but > somehow I find it unlikely. > > Puzzled. > >> You get different results? How is that possible? The search string is >> NUL-terminated in each case, while the point of the test is that the >> file contents isn't, right? Intellectual curiosity tells me we may want to find out why it fails, but in the meantime, I think replacing the test with "0$" to force the scanner to find either the end of line or the end of the buffer may be a good workaround. We do not have to care how many of random bytes are in front of the last "0" in order to ensure that the regexec_buf() does not overstep to 4097th byte, while seeing that regexec() that does not know how long the haystack is has to do so, no?
Re: [PATCH] builtin/add: add a missing newline to an stderr message
René Scharfewrites: >> diff --git a/builtin/add.c b/builtin/add.c >> index e888fb8c5..385b53ae7 100644 >> --- a/builtin/add.c >> +++ b/builtin/add.c >> @@ -43,7 +43,7 @@ static void chmod_pathspec(struct pathspec *pathspec, int >> force_mode) >> continue; >> >> if (chmod_cache_entry(ce, force_mode) < 0) >> -fprintf(stderr, "cannot chmod '%s'", ce->name); >> +fprintf(stderr, "cannot chmod '%s'\n", ce->name); >> } >> } > > FYI: I brought this up yesterday in the original thread, along with a > few other observations: > > https://public-inbox.org/git/3c61d9f6-e0fd-22a4-68e0-89fd9ce9b...@web.de/ > > Not sure if the discussion can or should be revived after all this > time, though; just sending patches like yours might be the way to go. Thanks, so it should become fprintf(stderr, "cannot chmod %c '%s'\n", force_mode, ce->name); in the final version to be queued?
Re: [PATCH] t4062: stop using repetition in regex
René Scharfewrites: > Am 08.08.2017 um 16:49 schrieb Johannes Schindelin: >> Hi René, >> >> On Tue, 8 Aug 2017, René Scharfe wrote: >> >>> OpenBSD's regex library has a repetition limit (RE_DUP_MAX) of 255. >>> That's the minimum acceptable value according to POSIX. In t4062 we use >>> 4096 repetitions in the test "-G matches", though, causing it to fail. >>> >>> Do the same as the test "-S --pickaxe-regex" in the same file and search >>> for a single zero instead. That still suffices to trigger the buffer >>> overrun in older versions (checked with b7d36ffca02^ and --valgrind on >>> Linux), simplifies the test a bit, and avoids exceeding OpenBSD's limit. >> >> I am afraid not. The 4096 is precisely the page size required to trigger >> the bug on Windows against which this regression test tries to safeguard. > > Checked with b7d36ffca02^ on MinGW now as well and found that it > segfaults with the proposed change ten out of ten times. That is a strange result but I can believe it. The reason why I find it strange is that the test wants to run diff_grep() in diffcore-pickaxe.c with one == NULL (because we are looking at difference between an initial empty commit and the current commit that adds 4096-zeroes.txt file), which makes the current blob (i.e. a page of '0' that may be mmap(2)ed without trailing NUL to terminate it) scanned via regexec() to look for the search string. I can understand why Dscho originally did "^0{4096}$"; it is to ensure that the whole page is scanned for 4096 zeroes and then the code would somehow make sure that there is no more byte until the end of line, which will force regexec (but not regexec_buf that knows where the buffer ends) to look at the 4097th byte that does not exist. If you change the pattern to just "0" that is not anchored, I'd expect regexec() that does not know how big the haystack is to just find "0" at the first byte and happily return without segfaulting (because it does not even have to scan the remainder of the buffer). So I find Dscho's concern quite valid, even though I do believe you when you say the code somehow segfaults. I just can not tell how/why it would segfault, though---it is possible that regexec() implementation is stupid and does not realize that it can return early reporting success without looking at the rest of the buffer, but somehow I find it unlikely. Puzzled. > You get different results? How is that possible? The search string is > NUL-terminated in each case, while the point of the test is that the > file contents isn't, right?
[PATCH] sha1_file: avoid comparison if no packed hash matches the first byte
find_pack_entry_one() uses the fan-out table of pack indexes to find out which entries match the first byte of the searched hash and does a binary search on this subset of the main index table. If there are no matching entries then lo and hi will have the same value. The binary search still starts and compares the hash of the following entry (which has a non-matching first byte, so won't cause any trouble), or whatever comes after the sorted list of entries. The probability of that stray comparison matching by mistake is low, but let's not take any chances and check when entering the binary search loop if we're actually done already. Signed-off-by: Rene Scharfe--- sha1_file.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/sha1_file.c b/sha1_file.c index b60ae15f70..11ee69a99d 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -2799,7 +2799,7 @@ off_t find_pack_entry_one(const unsigned char *sha1, return nth_packed_object_offset(p, pos); } - do { + while (lo < hi) { unsigned mi = (lo + hi) / 2; int cmp = hashcmp(index + mi * stride, sha1); @@ -2812,7 +2812,7 @@ off_t find_pack_entry_one(const unsigned char *sha1, hi = mi; else lo = mi+1; - } while (lo < hi); + } return 0; } -- 2.14.0
Re: [PATCH] Convert size datatype to size_t
Martin Koeglerwrites: > diff --git a/apply.c b/apply.c > index f2d5991..ea97fd2 100644 > --- a/apply.c > +++ b/apply.c > @@ -3085,7 +3085,7 @@ static int apply_binary_fragment(struct apply_state > *state, >struct patch *patch) > { > struct fragment *fragment = patch->fragments; > - unsigned long len; > + size_t len; > void *dst; > > if (!fragment) This variable is made size_t because it receives the result size of patch_delta(). And it later is assigned to img->len field, which already is size_t before this patch. Curiously, the patch_delta() invocation with this patch reads like this: dst = patch_delta(img->buf, img->len, fragment->patch, fragment->size, ); where patch_delta() is updated (correctly) to: void *patch_delta(const void *src_buf, size_t src_size, const void *delta_buf, size_t delta_size, size_t *dst_size) with this patch. But "size" field in "struct fragment" is still "int". So we'd need to update it as well, not necessarily in this patch (which is already too big) but as part of a larger whole. > @@ -3174,7 +3174,7 @@ static int apply_binary(struct apply_state *state, > if (has_sha1_file(oid.hash)) { > /* We already have the postimage */ > enum object_type type; > - unsigned long size; > + size_t size; > char *result; > > result = read_sha1_file(oid.hash, , ); This is to receive the resulting size from read_sha1_file(). It is assigned to img->len, which is already size_t, so all is good here. > @@ -3236,7 +3236,7 @@ static int read_blob_object(struct strbuf *buf, const > struct object_id *oid, uns > strbuf_addf(buf, "Subproject commit %s\n", oid_to_hex(oid)); > } else { > enum object_type type; > - unsigned long sz; > + size_t sz; > char *result; > > result = read_sha1_file(oid->hash, , ); By reading the remainder of this function, this conversion also is good. sz that is now size_t is used as the size attached to an existing strbuf like so: result = read_sha1_file(oid->hash, , ); if (!result) return -1; /* XXX read_sha1_file NUL-terminates */ strbuf_attach(buf, result, sz, sz + 1); in the part beyond the post context of this hunk. In the longer term, sz+1 we see here may want to become the overflow-safe variant st_add(). As you said in the comment after three-dashes in the patch, a lot more work is needed and your patch is a good starting point. I am not sure if we can split the patch somehow to make it easier to review. The deceptively small part of your patch, i.e. apply.c | 6 +++--- needs the above analysis to see if they are correct and what more work is necessary, and there are 65 more files with ~190 lines changed.
Re: [PATCH] builtin/add: add a missing newline to an stderr message
Am 08.08.2017 um 23:36 schrieb Ramsay Jones: > > Signed-off-by: Ramsay Jones> --- > > Hi Junio, > > I noticed this while looking into the t3700 failure on cygwin tonight. > Also, I couldn't decide whether or not to add the i18n '_()' brackets > around the message. In the end I didn't, but will happily add them > if you think I should. > > Thanks! > > ATB, > Ramsay Jones > > builtin/add.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/builtin/add.c b/builtin/add.c > index e888fb8c5..385b53ae7 100644 > --- a/builtin/add.c > +++ b/builtin/add.c > @@ -43,7 +43,7 @@ static void chmod_pathspec(struct pathspec *pathspec, int > force_mode) > continue; > > if (chmod_cache_entry(ce, force_mode) < 0) > - fprintf(stderr, "cannot chmod '%s'", ce->name); > + fprintf(stderr, "cannot chmod '%s'\n", ce->name); > } > } > FYI: I brought this up yesterday in the original thread, along with a few other observations: https://public-inbox.org/git/3c61d9f6-e0fd-22a4-68e0-89fd9ce9b...@web.de/ Not sure if the discussion can or should be revived after all this time, though; just sending patches like yours might be the way to go. René
[PATCH] builtin/add: add a missing newline to an stderr message
Signed-off-by: Ramsay Jones--- Hi Junio, I noticed this while looking into the t3700 failure on cygwin tonight. Also, I couldn't decide whether or not to add the i18n '_()' brackets around the message. In the end I didn't, but will happily add them if you think I should. Thanks! ATB, Ramsay Jones builtin/add.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/builtin/add.c b/builtin/add.c index e888fb8c5..385b53ae7 100644 --- a/builtin/add.c +++ b/builtin/add.c @@ -43,7 +43,7 @@ static void chmod_pathspec(struct pathspec *pathspec, int force_mode) continue; if (chmod_cache_entry(ce, force_mode) < 0) - fprintf(stderr, "cannot chmod '%s'", ce->name); + fprintf(stderr, "cannot chmod '%s'\n", ce->name); } } -- 2.14.0
Re: [PATCH] Convert size datatype to size_t
Martin Koeglerwrites: > From: Martin Koegler > > It changes the signature of the core object access function > including any other functions to assure a clean compile if > sizeof(size_t) != sizeof(unsigned long). > > Signed-off-by: Martin Koegler > --- > ... > 66 files changed, 193 insertions(+), 191 deletions(-) Wow, that's a lot of changes. What version did you worked this patch on? The reason I ask is because... > diff --git a/diff-delta.c b/diff-delta.c > index cd238c8..3d5e1ef 100644 ... I do not see any version of Git that had blob cd238c8 at that path, so "git am -3" is having hard time applying this pach to allow me reviewing it.
Re: [RFC PATCH 00/10] An attempt to move packfile funcs to its own file
Jonathan Tanwrites: > What do you mean by "keep the exposed surface area small enough"? If you > mean the total number of exposed functions in sha1_file and pack (once > everything is done), I think it will be almost the same as that > currently in sha1_file. > ... > During this patch set, there might be some functions that need to be > temporarily made global, but those are reverted to static in the end. That is exactly what I meant. > As stated above, I don't think so, but I'll make a list of the functions > needing to be made global. Good. Thanks.
Re: [PATCH 2/2] filter-branch: Handle rewritting (very) old style tags which lack tagger
Ian Campbellwrites: > Such as v2.6.12-rc2..v2.6.13-rc3 in the Linux kernel source tree. > > Insert a fake tag header, since newer `git mktag` wont accept the input > otherwise: > > $ git cat-file tag v2.6.12-rc2 > object 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 > type commit > tag v2.6.12-rc2 > > Linux v2.6.12-rc2 release > -BEGIN PGP SIGNATURE- > Version: GnuPG v1.2.4 (GNU/Linux) > > iD8DBQBCbW8ZF3YsRnbiHLsRAgFRAKCq/TkuDaEombFABkPqYgGCgWN2lQCcC0qc > wznDbFU45A54dZC8RZ5JxyE= > =ESRP > -END PGP SIGNATURE- > > $ git cat-file tag v2.6.12-rc2 | git mktag > error: char76: could not find "tagger " > fatal: invalid tag signature file > $ git cat-file tag v2.6.13-rc4 | git mktag > 7eab951de91d95875ba34ec4c599f37e1208db93 > > Signed-off-by: Ian Campbell > --- > git-filter-branch.sh | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/git-filter-branch.sh b/git-filter-branch.sh > index d07db3fee..6927aa2da 100755 > --- a/git-filter-branch.sh > +++ b/git-filter-branch.sh > @@ -540,6 +540,9 @@ if [ "$filter_tag_name" ]; then > new_sha1=$( ( printf 'object %s\ntype commit\ntag %s\n' > \ > "$new_sha1" "$new_ref" > git cat-file tag "$ref" | > + awk '/^tagger/ { tagged=1 } > + /^$/ { if (!tagged && !done) { print > "tagger Unknown 0 +" } ; done=1 } > + // { print }' | > sed -n \ > -e '1,/^$/{ > /^object /d What the change wants to do makes perfect sense, but piping output from awk into sed looks somewhat gross. Perhaps we'd want to roll what the existing sed script is trying to do into this new awk script?
Re: [PATCH 1/2] filter-branch: Add --state-branch to hold pickled copy of ref map
Ian Campbellwrites: > Allowing for incremental updates of large trees. "by doing what" is missing. And ... > > I have been using this as part of the device tree extraction from the Linux > kernel source since 2013, about time I sent the patch upstream! ... this does not help understanding what is going on. It belongs to the space after three dashes. Perhaps Subject: filter-branch: stash away ref map in a branch With "--state-branch=" option, the mapping from old object names and filtered ones in ./map/ directory is stashed away in the object database, and the one from the previous run is read to populate the ./map/ directory, allowing for incremental updates of large trees. or something? > > Signed-off-by: Ian Campbell > --- > git-filter-branch.sh | 39 ++- > 1 file changed, 38 insertions(+), 1 deletion(-) > > diff --git a/git-filter-branch.sh b/git-filter-branch.sh > index 3a74602ef..d07db3fee 100755 > --- a/git-filter-branch.sh > +++ b/git-filter-branch.sh > @@ -86,7 +86,7 @@ USAGE="[--setup ] [--env-filter ] > [--parent-filter ] [--msg-filter ] > [--commit-filter ] [--tag-name-filter ] > [--subdirectory-filter ] [--original ] > - [-d ] [-f | --force] > + [-d ] [-f | --force] [--state-branch ] > [--] [...]" > > OPTIONS_SPEC= > @@ -106,6 +106,7 @@ filter_msg=cat > filter_commit= > filter_tag_name= > filter_subdir= > +state_branch= > orig_namespace=refs/original/ > force= > prune_empty= > @@ -181,6 +182,9 @@ do > --original) > orig_namespace=$(expr "$OPTARG/" : '\(.*[^/]\)/*$')/ > ;; > + --state-branch) > + state_branch="$OPTARG" > + ;; > *) > usage > ;; > @@ -252,6 +256,20 @@ export GIT_INDEX_FILE > # map old->new commit ids for rewriting parents > mkdir ../map || die "Could not create map/ directory" > > +if [ -n "$state_branch" ] ; then > + state_commit=`git show-ref -s "$state_branch"` I hate to nitpick styles, especially on this script that already has existing violations, but for completeness: Style: we prefer to write $(command substitution) instead. Style: we prefer to write "if test", not "if [". Style: we prefer to avoid ';' and write "if test condtion" and "then" on different lines. It is a bit curious use of "show-ref". It is not wrong per-se, but "git rev-parse" may be more common. I do not care too deeply either way, though. Don't we want to make sure the value given to --state-branch is a full refname, not just a branch name? What happens when you say filter-branch --state-branch master by mistake? "show-ref -s" is likely to show your refs/heads/master, and other master branches that appear as remote-tracking branches for the remotes you interact with. > + if [ -n "$state_commit" ] ; then > + echo "Populating map from $state_branch ($state_commit)" 1>&2 > + git show "$state_commit":filter.map | > + perl -n -e 'm/(.*):(.*)/ or die; > + open F, ">../map/$1" or die; > + print F "$2" or die; > + close(F) or die' The process calling this perl script, which carefully diagnoses malformed input and dies, does not seem to do anything when it sees errors. Intended? > + else > + echo "Branch $state_branch does not exist. Will create" 1>&2 > + fi > +fi > + > # we need "--" only if there are no path arguments in $@ > nonrevs=$(git rev-parse --no-revs "$@") || exit > if test -z "$nonrevs" > @@ -544,6 +562,25 @@ if [ "$filter_tag_name" ]; then > done > fi > > +if [ -n "$state_branch" ] ; then > + echo "Saving rewrite state to $state_branch" 1>&2 > + STATE_BLOB=$(ls ../map | > + perl -n -e 'chomp(); > + open F, "<../map/$_" or die; > + chomp($f = ); print "$_:$f\n";' | I see it somewhat gross to pipe the output of "/bin/ls" to a Perl script, instead of iterating over "while (<../map/*>)" inside the script itself. > + git hash-object -w --stdin ) > + STATE_TREE=$(/bin/echo -e "100644 blob $STATE_BLOB\tfilter.map" | git > mktree) > + STATE_PARENT=$(git show-ref -s "$state_branch") Don't you already have this in $state_commit? One advantage of reading $state_branch again at this point is to detect mistakes of running more than one filter-branch (which may cause you to read $STATE_PARENT that is different from $state_commit you read earlier), but I do not think that is being done here, so... > + unset GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_AUTHOR_DATE > + unset GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL GIT_COMMITTER_DATE Hmph. I can see that you are trying not to be affected by the committers and authors of the commits on the branch being filtered (which are set
Re: [RFC PATCH 01/10] pack: move pack name-related functions
On Tue, 8 Aug 2017 13:36:24 -0700 Stefan Bellerwrote: > On Tue, Aug 8, 2017 at 12:32 PM, Jonathan Tan > wrote: > > Currently, sha1_file.c and cache.h contain many functions, both related > > to and unrelated to packfiles. This makes both files very large and > > causes an unclear separation of concerns. > > > > Create a new file, pack.c, to hold all packfile-related functions > > currently in sha1_file.c, and designate pack.h to hold these > > packfile-related functions. > > There are also packed refs, so one could (like I did) think that > pack.c is for generic packing of things, maybe packfile.c > would be more clear? Good point. I'll use packfile.c and packfile.h in the next version.
Re: [RFC PATCH 04/10] pack: move open_pack_index(), parse_pack_index()
On Tue, 08 Aug 2017 13:19:23 -0700 Junio C Hamanowrote: > Jonathan Tan writes: > > > Signed-off-by: Jonathan Tan > > --- > > builtin/count-objects.c | 1 + > > cache.h | 8 --- > > pack.c | 149 > > > > pack.h | 8 +++ > > sha1_file.c | 140 - > > sha1_name.c | 1 + > > 6 files changed, 159 insertions(+), 148 deletions(-) > > This patch is a bit strange... > > > diff --git a/pack.c b/pack.c > > index 60d9fc3b0..6edc43228 100644 > > --- a/pack.c > > +++ b/pack.c > > ... > > +static struct packed_git *alloc_packed_git(int extra) > > +{ > > + struct packed_git *p = xmalloc(st_add(sizeof(*p), extra)); > > + memset(p, 0, sizeof(*p)); > > + p->pack_fd = -1; > > + return p; > > +} > > + > > +struct packed_git *parse_pack_index(unsigned char *sha1, const char > > *idx_path) > > +{ > > + const char *path = sha1_pack_name(sha1); > > + size_t alloc = st_add(strlen(path), 1); > > + struct packed_git *p = alloc_packed_git(alloc); > > + > > + memcpy(p->pack_name, path, alloc); /* includes NUL */ > > + hashcpy(p->sha1, sha1); > > + if (check_packed_git_idx(idx_path, p)) { > > + free(p); > > + return NULL; > > + } > > + > > + return p; > > +} > > We see these two functions appear in pack.c > > > diff --git a/sha1_file.c b/sha1_file.c > > index 0de39f480..2e414f5f5 100644 > > --- a/sha1_file.c > > +++ b/sha1_file.c > > ... > > @@ -1300,22 +1176,6 @@ struct packed_git *add_packed_git(const char *path, > > size_t path_len, int local) > > return p; > > } > > > > -struct packed_git *parse_pack_index(unsigned char *sha1, const char > > *idx_path) > > -{ > > - const char *path = sha1_pack_name(sha1); > > - size_t alloc = st_add(strlen(path), 1); > > - struct packed_git *p = alloc_packed_git(alloc); > > - > > - memcpy(p->pack_name, path, alloc); /* includes NUL */ > > - hashcpy(p->sha1, sha1); > > - if (check_packed_git_idx(idx_path, p)) { > > - free(p); > > - return NULL; > > - } > > - > > - return p; > > -} > > - > > And we see parse_pack_index() came from sha1_file.c > > But where did alloc_packed_git() come from? Was the patch split > incorrectly or something? > > When I applied the whole series and did > > git blame -s -w -M -C -C master.. pack.c > > expecting that pretty much everything has come from sha1_file.c but > noticed that some lines were actually blamed to a version of pack.c > and these functions were among them. alloc_packed_git() in pack.c is a duplicate of the function of the same name in sha1_file.c in this patch, because at this patch, there are still functions in both files using this function. A subsequent patch in this patch set will remove it from pack.c. I'll add a note explaining this to this patch in the next version.
Re: [RFC PATCH 00/10] An attempt to move packfile funcs to its own file
On Tue, 08 Aug 2017 13:05:05 -0700 Junio C Hamanowrote: > Jonathan Tan writes: > > > While investigating annotating packfiles and loose objects to support > > connectivity checks in partial clones [1], I decided to make the effort > > to separate packfile-related code from sha1_file.c into its own file, to > > make it easier to both code such changes and review them. Here is the > > beginning of those efforts. > > > > Is this something worth doing, and if yes, is this in the right > > direction? > > Overall I think it is a good idea to slim down sha1_file.c *if* we > can keep the exposed surface area small enough. What do you mean by "keep the exposed surface area small enough"? If you mean the total number of exposed functions in sha1_file and pack (once everything is done), I think it will be almost the same as that currently in sha1_file. find_pack_entry() and has_packed_and_bad() (not yet in this patch set) might need to be changed from static to global, but those are the only 2 I can think of. Anyway, I'll report the functions that need to be changed from static to global at the end. During this patch set, there might be some functions that need to be temporarily made global, but those are reverted to static in the end. > I wonder if the names "pack.[ch]" communicate well that these are > "object access layer that is about reading from packfiles". The > writer side is called "pack-objects.[ch]". This file will end up being slightly broader than reading from packfiles - in particular, things like pack_report() (reporting some statistics not only on the in-repo packfiles themselves) and parse_pack_index() (which parses an idx file obtained through http) are there too. Hence the generic name, but I agree that there might be a better name (or better set of names). > This may have to make some symbols that used to be private to the > "object access" layer, which was what sha1_file.c was about, global > symbols. After moving things around, do we end up exposing too many > implementation details to the world outside the "object access" > layer? I'd assume they are limited to the resulting pack.h and it > would be OK as long as nobody other than sha1_file.c and pack.c > would inculde it, though. As stated above, I don't think so, but I'll make a list of the functions needing to be made global.
Re: [RFC PATCH 01/10] pack: move pack name-related functions
On Tue, Aug 8, 2017 at 12:32 PM, Jonathan Tanwrote: > Currently, sha1_file.c and cache.h contain many functions, both related > to and unrelated to packfiles. This makes both files very large and > causes an unclear separation of concerns. > > Create a new file, pack.c, to hold all packfile-related functions > currently in sha1_file.c, and designate pack.h to hold these > packfile-related functions. There are also packed refs, so one could (like I did) think that pack.c is for generic packing of things, maybe packfile.c would be more clear?
Re: [PATCH v2 5/5] sha1_file: support loading lazy objects
On 7/31/2017 5:02 PM, Jonathan Tan wrote: Teach sha1_file to invoke the command configured in extensions.lazyObject whenever an object is requested and unavailable. The usage of the hook can be suppressed through a flag when invoking has_object_file_with_flags() and other similar functions. This is meant as a temporary measure to ensure that all Git commands work in such a situation. Future patches will update some commands to either tolerate missing objects (without invoking the command) or be more efficient in invoking this command. To prevent fetch from downloading all missing objects, you will also need to add logic in check_connected. The simplest model is to simply return 0 if repository_format_lazy_object is set. /* * Running a with lazy_objects there will be objects that are * missing locally and we don't want to download a bunch of * commits, trees, and blobs just to make sure everything is * reachable locally so this option will skip reachablility * checks below that use rev-list. This will stop the check * before uploadpack runs to determine if there is anything to * fetch. Returning zero for the first check will also prevent the * uploadpack from happening. It will also skip the check after * the fetch is finished to make sure all the objects where * downloaded in the pack file. This will allow the fetch to * run and get all the latest tip commit ids for all the branches * in the fetch but not pull down commits, trees, or blobs via * upload pack. */ if (repository_format_lazy_object) return 0; [...] diff --git a/sha1_file.c b/sha1_file.c index b60ae15f7..1785c61d8 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -28,6 +28,11 @@ #include "list.h" #include "mergesort.h" #include "quote.h" +#include "iterator.h" +#include "dir-iterator.h" +#include "sha1-lookup.h" +#include "lazy-object.h" +#include "sha1-array.h" #define SZ_FMT PRIuMAX static inline uintmax_t sz_fmt(size_t s) { return s; } @@ -2984,6 +2989,7 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi, const unsigned char *real = (flags & OBJECT_INFO_LOOKUP_REPLACE) ? lookup_replace_object(sha1) : sha1; + int already_retried = 0; if (!oi) oi = _oi; @@ -3008,30 +3014,38 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi, } } - if (!find_pack_entry(real, )) { - /* Most likely it's a loose object. */ - if (!sha1_loose_object_info(real, oi, flags)) { - oi->whence = OI_LOOSE; - return 0; - } +retry: + if (find_pack_entry(real, )) + goto found_packed; - /* Not a loose object; someone else may have just packed it. */ - if (flags & OBJECT_INFO_QUICK) { - return -1; - } else { - reprepare_packed_git(); - if (!find_pack_entry(real, )) - return -1; - } + /* Most likely it's a loose object. */ + if (!sha1_loose_object_info(real, oi, flags)) { + oi->whence = OI_LOOSE; + return 0; } + /* Not a loose object; someone else may have just packed it. */ + reprepare_packed_git(); + if (find_pack_entry(real, )) + goto found_packed; Same feedback as before. I like to avoid using goto's as flow control other than in error handling. Also, this patch looses the OBJECT_INFO_QUICK logic which could be restored. [...] diff --git a/t/t0410/lazy-object b/t/t0410/lazy-object new file mode 100755 index 0..4f4a9c38a --- /dev/null +++ b/t/t0410/lazy-object @@ -0,0 +1,102 @@ +#!/usr/bin/perl +# +# Example implementation for the Git lazyObject protocol version 1. See +# the documentation for extensions.lazyObject in +# Documentation/technical/repository-version.txt +# +# Allows you to test the ability for blobs to be pulled from a host git repo +# "on demand." Called when git needs a blob it couldn't find locally due to +# a lazy clone that only cloned the commits and trees. +# +# Please note, this sample is a minimal skeleton. No proper error handling +# was implemented. + +use strict; +use warnings; + +# +# Point $DIR to the folder where your host git repo is located so we can pull +# missing objects from it +# +my $DIR = $ARGV[0]; + At some point, this should be based on the refactored pkt_* functions currently contained in the ObjectDB patch series. +sub packet_bin_read { + my $buffer; + my $bytes_read = read STDIN, $buffer, 4; + if ( $bytes_read == 0 ) { + + # EOF - Git stopped talking to us! + exit(); + } + elsif ( $bytes_read != 4 ) { + die "invalid packet: '$buffer'"; + } + my $pkt_size =
Re: [RFC PATCH 04/10] pack: move open_pack_index(), parse_pack_index()
Jonathan Tanwrites: > Signed-off-by: Jonathan Tan > --- > builtin/count-objects.c | 1 + > cache.h | 8 --- > pack.c | 149 > > pack.h | 8 +++ > sha1_file.c | 140 - > sha1_name.c | 1 + > 6 files changed, 159 insertions(+), 148 deletions(-) This patch is a bit strange... > diff --git a/pack.c b/pack.c > index 60d9fc3b0..6edc43228 100644 > --- a/pack.c > +++ b/pack.c > ... > +static struct packed_git *alloc_packed_git(int extra) > +{ > + struct packed_git *p = xmalloc(st_add(sizeof(*p), extra)); > + memset(p, 0, sizeof(*p)); > + p->pack_fd = -1; > + return p; > +} > + > +struct packed_git *parse_pack_index(unsigned char *sha1, const char > *idx_path) > +{ > + const char *path = sha1_pack_name(sha1); > + size_t alloc = st_add(strlen(path), 1); > + struct packed_git *p = alloc_packed_git(alloc); > + > + memcpy(p->pack_name, path, alloc); /* includes NUL */ > + hashcpy(p->sha1, sha1); > + if (check_packed_git_idx(idx_path, p)) { > + free(p); > + return NULL; > + } > + > + return p; > +} We see these two functions appear in pack.c > diff --git a/sha1_file.c b/sha1_file.c > index 0de39f480..2e414f5f5 100644 > --- a/sha1_file.c > +++ b/sha1_file.c > ... > @@ -1300,22 +1176,6 @@ struct packed_git *add_packed_git(const char *path, > size_t path_len, int local) > return p; > } > > -struct packed_git *parse_pack_index(unsigned char *sha1, const char > *idx_path) > -{ > - const char *path = sha1_pack_name(sha1); > - size_t alloc = st_add(strlen(path), 1); > - struct packed_git *p = alloc_packed_git(alloc); > - > - memcpy(p->pack_name, path, alloc); /* includes NUL */ > - hashcpy(p->sha1, sha1); > - if (check_packed_git_idx(idx_path, p)) { > - free(p); > - return NULL; > - } > - > - return p; > -} > - And we see parse_pack_index() came from sha1_file.c But where did alloc_packed_git() come from? Was the patch split incorrectly or something? When I applied the whole series and did git blame -s -w -M -C -C master.. pack.c expecting that pretty much everything has come from sha1_file.c but noticed that some lines were actually blamed to a version of pack.c and these functions were among them.
Re: [PATCH v2 2/2 / RFC] branch: quote branch/ref names to improve readability
On Tue, Aug 8, 2017 at 12:43 PM, Junio C Hamanowrote: > Stefan Beller writes: > >> (Though wondering for non-submodule users, if they perceive it as >> inconsistency as other parts of the code may not follow the rigorous quoting) > > Do you mean that we may instead want to remove the excessive quoting > of branch names and stuff from submodule.c code, because they are > newer ones that broke the consistency existed before them (i.e. not > quoting)? No, I do not. I was just wondering if a non-submodule user may see differences between different commands now. For example "checkout -b" already quotes 'external data', which would be inline with this proposal, but there may be others. My question was rather an encouragement to check the code base if there are other occurrences left that do not quote. In an ideal code base we could just grep for any %s that has no surrounding quotes, but of course it is not as easy in the real world: * some outputs use %s construction for non-human consumption in e.g. the diff machinery * sometimes we play sentence lego, stringing words together which also is using %s unquoted correctly. > That certainly is tempting, but I personally find it easier to read > a message that marks parts that holds "external data" differently > from the message's text, so I think this patch 2/2 goes in the right > direction. Yes. I like the direction this patch is going. A note on 'external data': For branchs, paths, submodule names a single quote seems to be best (my opinion), whereas in e.g. git-status: Submodule 'sm1' 000...1beffeb (new submodule) parens seem to do a better job as they describe the state, not reproducing external data. (This is also the place where I was reminded of potential sentence lego)
Re: [RFC PATCH 00/10] An attempt to move packfile funcs to its own file
Jonathan Tanwrites: > While investigating annotating packfiles and loose objects to support > connectivity checks in partial clones [1], I decided to make the effort > to separate packfile-related code from sha1_file.c into its own file, to > make it easier to both code such changes and review them. Here is the > beginning of those efforts. > > Is this something worth doing, and if yes, is this in the right > direction? Overall I think it is a good idea to slim down sha1_file.c *if* we can keep the exposed surface area small enough. I wonder if the names "pack.[ch]" communicate well that these are "object access layer that is about reading from packfiles". The writer side is called "pack-objects.[ch]". This may have to make some symbols that used to be private to the "object access" layer, which was what sha1_file.c was about, global symbols. After moving things around, do we end up exposing too many implementation details to the world outside the "object access" layer? I'd assume they are limited to the resulting pack.h and it would be OK as long as nobody other than sha1_file.c and pack.c would inculde it, though. Thanks. > > [1] > https://public-inbox.org/git/20170804145113.5ceaf...@twelve2.svl.corp.google.com/ > > Jonathan Tan (10): > pack: move pack name-related functions > pack: move static state variables > pack: move pack_report() > pack: move open_pack_index(), parse_pack_index() > pack: move release_pack_memory() > pack: move pack-closing functions > pack: move use_pack() > pack: move unuse_pack() > pack: move add_packed_git() > pack: move install_packed_git() > > Makefile | 1 + > builtin/am.c | 1 + > builtin/clone.c | 1 + > builtin/count-objects.c | 1 + > builtin/fetch.c | 1 + > builtin/merge.c | 1 + > builtin/pack-redundant.c | 1 + > cache.h | 45 > connected.c | 1 + > git-compat-util.h| 2 - > pack.c | 669 > +++ > pack.h | 51 > sha1_file.c | 667 -- > sha1_name.c | 1 + > streaming.c | 1 + > 15 files changed, 730 insertions(+), 714 deletions(-) > create mode 100644 pack.c
[PATCH] Convert size datatype to size_t
From: Martin KoeglerIt changes the signature of the core object access function including any other functions to assure a clean compile if sizeof(size_t) != sizeof(unsigned long). Signed-off-by: Martin Koegler --- There is much more to change in the codebase. It JUST fixes the signature of some functions. apply.c | 6 +++--- archive-tar.c| 4 ++-- archive-zip.c| 2 +- archive.c| 2 +- archive.h| 2 +- blame.c | 4 ++-- blame.h | 2 +- builtin/cat-file.c | 10 +- builtin/difftool.c | 2 +- builtin/fast-export.c| 6 +++--- builtin/fmt-merge-msg.c | 2 +- builtin/fsck.c | 2 +- builtin/grep.c | 6 +++--- builtin/index-pack.c | 12 ++-- builtin/log.c| 4 ++-- builtin/ls-tree.c| 2 +- builtin/merge-tree.c | 6 +++--- builtin/mktag.c | 2 +- builtin/notes.c | 6 +++--- builtin/pack-objects.c | 10 +- builtin/reflog.c | 2 +- builtin/tag.c| 4 ++-- builtin/unpack-file.c| 2 +- builtin/unpack-objects.c | 14 +++--- builtin/verify-commit.c | 2 +- bundle.c | 2 +- cache.h | 22 +++--- combine-diff.c | 9 + commit.c | 6 +++--- config.c | 2 +- delta.h | 26 +- diff-delta.c | 16 diff.c | 14 +++--- diff.h | 2 +- diffcore.h | 2 +- dir.c| 2 +- entry.c | 4 ++-- fast-import.c| 18 +- fsck.c | 2 +- grep.h | 2 +- http-push.c | 2 +- mailmap.c| 2 +- match-trees.c| 4 ++-- merge-blobs.c| 6 +++--- merge-blobs.h| 2 +- merge-recursive.c| 4 ++-- notes-cache.c| 2 +- notes-merge.c| 2 +- notes.c | 6 +++--- object.c | 2 +- pack-check.c | 2 +- pack-objects.h | 6 +++--- patch-delta.c| 11 ++- read-cache.c | 4 ++-- ref-filter.c | 4 ++-- remote-testsvn.c | 4 ++-- rerere.c | 2 +- sha1_file.c | 44 ++-- streaming.c | 8 streaming.h | 2 +- submodule-config.c | 2 +- t/helper/test-delta.c| 2 +- tag.c| 4 ++-- tree-walk.c | 8 tree.c | 2 +- xdiff-interface.c| 2 +- 66 files changed, 193 insertions(+), 191 deletions(-) diff --git a/apply.c b/apply.c index f2d5991..ea97fd2 100644 --- a/apply.c +++ b/apply.c @@ -3085,7 +3085,7 @@ static int apply_binary_fragment(struct apply_state *state, struct patch *patch) { struct fragment *fragment = patch->fragments; - unsigned long len; + size_t len; void *dst; if (!fragment) @@ -3174,7 +3174,7 @@ static int apply_binary(struct apply_state *state, if (has_sha1_file(oid.hash)) { /* We already have the postimage */ enum object_type type; - unsigned long size; + size_t size; char *result; result = read_sha1_file(oid.hash, , ); @@ -3236,7 +3236,7 @@ static int read_blob_object(struct strbuf *buf, const struct object_id *oid, uns strbuf_addf(buf, "Subproject commit %s\n", oid_to_hex(oid)); } else { enum object_type type; - unsigned long sz; + size_t sz; char *result; result = read_sha1_file(oid->hash, , ); diff --git a/archive-tar.c b/archive-tar.c index c6ed96e..719673d 100644 --- a/archive-tar.c +++ b/archive-tar.c @@ -115,7 +115,7 @@ static int stream_blocked(const unsigned char *sha1) { struct git_istream *st; enum object_type type; - unsigned long sz; + size_t sz; char buf[BLOCKSIZE]; ssize_t readlen; @@ -240,7 +240,7 @@ static int write_tar_entry(struct archiver_args *args, struct ustar_header header; struct strbuf ext_header = STRBUF_INIT; unsigned int old_mode = mode; - unsigned long size, size_in_header; + size_t size, size_in_header; void *buffer; int err = 0; diff --git a/archive-zip.c b/archive-zip.c index e8913e5..4492d64 100644 --- a/archive-zip.c +++ b/archive-zip.c @@ -295,7 +295,7 @@ static int write_zip_entry(struct archiver_args *args, void *buffer; struct git_istream *stream = NULL; unsigned long flags = 0; -
Re: [PATCH v2 2/2 / RFC] branch: quote branch/ref names to improve readability
Stefan Bellerwrites: > (Though wondering for non-submodule users, if they perceive it as > inconsistency as other parts of the code may not follow the rigorous quoting) Do you mean that we may instead want to remove the excessive quoting of branch names and stuff from submodule.c code, because they are newer ones that broke the consistency existed before them (i.e. not quoting)? That certainly is tempting, but I personally find it easier to read a message that marks parts that holds "external data" differently from the message's text, so I think this patch 2/2 goes in the right direction.
Re: [PATCH v2] t3700: fix broken test under !POSIXPERM
On 08/08/17 20:32, Ramsay Jones wrote: > > > On 08/08/17 20:21, René Scharfe wrote: >> 76e368c378 (t3700: fix broken test under !SANITY) explains that the test >> 'git add --chmod=[+-]x changes index with already added file' can fail >> if xfoo3 is still present as a symlink from a previous test and deletes >> it with rm(1). That still leaves it present in the index, which causes >> the test to fail if POSIXPERM is not defined. Get rid of it by calling >> "git reset --hard" as well, as 76e368c378 already mentioned in passing. > > Hmm, I don't think its POSIXPERM (which is defined on cygwin) but > the lack of SANITY that is the problem. The test would fail on Linux > as well, if it skipped the prior tests marked with SANITY (they get > rid of the xfoo3->xfoo2 symlinks, among others). Yep, I didn't read the commit message properly! ;-) > > Ack, this fixes it for me. > > ATB, > Ramsay Jones > >> >> Helped-by: Adam Dinwoodie>> Signed-off-by: Rene Scharfe >> --- >> Change since v1: Keep the rm(1) call to avoid a problem on Cygwin. >> >> t/t3700-add.sh | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/t/t3700-add.sh b/t/t3700-add.sh >> index f3a4b4a913..0aae21d698 100755 >> --- a/t/t3700-add.sh >> +++ b/t/t3700-add.sh >> @@ -356,6 +356,7 @@ test_expect_success POSIXPERM,SYMLINKS 'git add >> --chmod=+x with symlinks' ' >> >> test_expect_success 'git add --chmod=[+-]x changes index with already added >> file' ' >> rm -f foo3 xfoo3 && >> +git reset --hard && >> echo foo >foo3 && >> git add foo3 && >> git add --chmod=+x foo3 && >> >
Re: [PATCH v2] am: fix signoff when other trailers are present
Phillip Woodwrites: > test_expect_success 'am --signoff does not add Signed-off-by: line if > already there' ' > - git format-patch --stdout HEAD^ >patch3 && > - sed -e "/^Subject/ s,\[PATCH,Re: Re: Re: & 1/5 v2] [foo," patch3 > >patch4 && > - rm -fr .git/rebase-apply && > - git reset --hard && > - git checkout HEAD^ && > - git am --signoff patch4 && > - git cat-file commit HEAD >actual && > - test $(grep -c "^Signed-off-by:" actual) -eq 1 > + git format-patch --stdout first >patch3 && > + git reset --hard first && > + git am --signoff + git log --pretty=%B -2 HEAD >actual && > + test_cmp expected-log actual > +' > + > +test_expect_success 'am --signoff adds Signed-off-by: if another author is > preset' ' Present? I think the motivation for adding this is to make sure that what the previous test checks will be true only when the one we are about to add is already at the end. So perhaps the previous test needs to be retitled from "if already there" to something a bit tighter, e.g. "if already at the end"? Also, strictly speaking, I think this isn't testing if another author is present---it is specifying what should happen when the last one is not us. > + NAME="A N Other" && > + EMAIL="a.n.ot...@example.com" && > + { > + printf "third\n\nSigned-off-by: %s <%s>\nSigned-off-by: %s > <%s>\n\n" \ > + "$GIT_COMMITTER_NAME" "$GIT_COMMITTER_EMAIL" \ > + "$NAME" "$EMAIL" && A "printf" tip: you can do printf "third\n\n" printf "S-o-b: %s <%s>\n" A B C D to get two lines of the latter. That would clarify what the next test does which wants to add three of them. > + cat msg && > + printf "Signed-off-by: %s <%s>\nSigned-off-by: %s <%s>\n\n" \ > + "$GIT_COMMITTER_NAME" "$GIT_COMMITTER_EMAIL" \ > + "$NAME" "$EMAIL" > + } >expected-log && > + git reset --hard first && > + GIT_COMMITTER_NAME="$NAME" GIT_COMMITTER_EMAIL="$EMAIL" \ > + git am --signoff + git log --pretty=%B -2 HEAD >actual && > + test_cmp expected-log actual > +' > + > +test_expect_success 'am --signoff duplicates Signed-off-by: if it is not the > last one' ' > + NAME="A N Other" && > + EMAIL="a.n.ot...@example.com" && > + { > + printf "third\n\nSigned-off-by: %s <%s>\n\ > +Signed-off-by: %s <%s>\nSigned-off-by: %s <%s>\n\n" \ > + "$GIT_COMMITTER_NAME" "$GIT_COMMITTER_EMAIL" \ > + "$NAME" "$EMAIL" \ > + "$GIT_COMMITTER_NAME" "$GIT_COMMITTER_EMAIL" && > + cat msg && > + printf "Signed-off-by: %s <%s>\nSigned-off-by: %s <%s>\n\ > +Signed-off-by: %s <%s>\n\n" \ > + "$GIT_COMMITTER_NAME" "$GIT_COMMITTER_EMAIL" \ > + "$NAME" "$EMAIL" \ > + "$GIT_COMMITTER_NAME" "$GIT_COMMITTER_EMAIL" > + } >expected-log && > + git format-patch --stdout first >patch3 && > + git reset --hard first && > + git am --signoff + git log --pretty=%B -2 HEAD >actual && > + test_cmp expected-log actual > ' > > test_expect_success 'am without --keep removes Re: and [PATCH] stuff' ' > + git format-patch --stdout HEAD^ >tmp && > + sed -e "/^Subject/ s,\[PATCH,Re: Re: Re: & 1/5 v2] [foo," tmp >patch4 && > + git reset --hard HEAD^ && > + git amgit rev-parse HEAD >expected && > git rev-parse master2 >actual && > test_cmp expected actual
[RFC PATCH 04/10] pack: move open_pack_index(), parse_pack_index()
Signed-off-by: Jonathan Tan--- builtin/count-objects.c | 1 + cache.h | 8 --- pack.c | 149 pack.h | 8 +++ sha1_file.c | 140 - sha1_name.c | 1 + 6 files changed, 159 insertions(+), 148 deletions(-) diff --git a/builtin/count-objects.c b/builtin/count-objects.c index 1d82e61f2..185d3190a 100644 --- a/builtin/count-objects.c +++ b/builtin/count-objects.c @@ -10,6 +10,7 @@ #include "builtin.h" #include "parse-options.h" #include "quote.h" +#include "pack.h" static unsigned long garbage; static off_t size_garbage; diff --git a/cache.h b/cache.h index c7f802e4a..5d6839525 100644 --- a/cache.h +++ b/cache.h @@ -1603,8 +1603,6 @@ struct pack_entry { struct packed_git *p; }; -extern struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path); - /* A hook to report invalid files in pack directory */ #define PACKDIR_FILE_PACK 1 #define PACKDIR_FILE_IDX 2 @@ -1639,12 +1637,6 @@ extern int odb_mkstemp(struct strbuf *template, const char *pattern); */ extern int odb_pack_keep(const char *name); -/* - * mmap the index file for the specified packfile (if it is not - * already mmapped). Return 0 on success. - */ -extern int open_pack_index(struct packed_git *); - /* * munmap the index file for the specified packfile (if it is * currently mmapped). diff --git a/pack.c b/pack.c index 60d9fc3b0..6edc43228 100644 --- a/pack.c +++ b/pack.c @@ -1,5 +1,6 @@ #include "cache.h" #include "mru.h" +#include "pack.h" char *odb_pack_name(struct strbuf *buf, const unsigned char *sha1, @@ -59,3 +60,151 @@ void pack_report(void) pack_open_windows, peak_pack_open_windows, sz_fmt(pack_mapped), sz_fmt(peak_pack_mapped)); } + +/* + * Open and mmap the index file at path, perform a couple of + * consistency checks, then record its information to p. Return 0 on + * success. + */ +static int check_packed_git_idx(const char *path, struct packed_git *p) +{ + void *idx_map; + struct pack_idx_header *hdr; + size_t idx_size; + uint32_t version, nr, i, *index; + int fd = git_open(path); + struct stat st; + + if (fd < 0) + return -1; + if (fstat(fd, )) { + close(fd); + return -1; + } + idx_size = xsize_t(st.st_size); + if (idx_size < 4 * 256 + 20 + 20) { + close(fd); + return error("index file %s is too small", path); + } + idx_map = xmmap(NULL, idx_size, PROT_READ, MAP_PRIVATE, fd, 0); + close(fd); + + hdr = idx_map; + if (hdr->idx_signature == htonl(PACK_IDX_SIGNATURE)) { + version = ntohl(hdr->idx_version); + if (version < 2 || version > 2) { + munmap(idx_map, idx_size); + return error("index file %s is version %"PRIu32 +" and is not supported by this binary" +" (try upgrading GIT to a newer version)", +path, version); + } + } else + version = 1; + + nr = 0; + index = idx_map; + if (version > 1) + index += 2; /* skip index header */ + for (i = 0; i < 256; i++) { + uint32_t n = ntohl(index[i]); + if (n < nr) { + munmap(idx_map, idx_size); + return error("non-monotonic index %s", path); + } + nr = n; + } + + if (version == 1) { + /* +* Total size: +* - 256 index entries 4 bytes each +* - 24-byte entries * nr (20-byte sha1 + 4-byte offset) +* - 20-byte SHA1 of the packfile +* - 20-byte SHA1 file checksum +*/ + if (idx_size != 4*256 + nr * 24 + 20 + 20) { + munmap(idx_map, idx_size); + return error("wrong index v1 file size in %s", path); + } + } else if (version == 2) { + /* +* Minimum size: +* - 8 bytes of header +* - 256 index entries 4 bytes each +* - 20-byte sha1 entry * nr +* - 4-byte crc entry * nr +* - 4-byte offset entry * nr +* - 20-byte SHA1 of the packfile +* - 20-byte SHA1 file checksum +* And after the 4-byte offset table might be a +* variable sized table containing 8-byte entries +* for offsets larger than 2^31. +*/ + unsigned long min_size = 8 + 4*256 + nr*(20 + 4 + 4) + 20 + 20; +
[RFC PATCH 08/10] pack: move unuse_pack()
Signed-off-by: Jonathan Tan--- cache.h | 1 - pack.c | 9 + pack.h | 1 + sha1_file.c | 9 - 4 files changed, 10 insertions(+), 10 deletions(-) diff --git a/cache.h b/cache.h index dd9f9a9ae..4812f3a63 100644 --- a/cache.h +++ b/cache.h @@ -1637,7 +1637,6 @@ extern int odb_mkstemp(struct strbuf *template, const char *pattern); */ extern int odb_pack_keep(const char *name); -extern void unuse_pack(struct pack_window **); extern void clear_delta_base_cache(void); extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local); diff --git a/pack.c b/pack.c index 85cb65558..93526ea7b 100644 --- a/pack.c +++ b/pack.c @@ -596,3 +596,12 @@ unsigned char *use_pack(struct packed_git *p, *left = win->len - xsize_t(offset); return win->base + offset; } + +void unuse_pack(struct pack_window **w_cursor) +{ + struct pack_window *w = *w_cursor; + if (w) { + w->inuse_cnt--; + *w_cursor = NULL; + } +} diff --git a/pack.h b/pack.h index bf2b99bf9..3876e9ae6 100644 --- a/pack.h +++ b/pack.h @@ -149,5 +149,6 @@ extern void close_all_packs(void); extern int open_packed_git(struct packed_git *p); extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, unsigned long *); +extern void unuse_pack(struct pack_window **); #endif diff --git a/sha1_file.c b/sha1_file.c index 8f17a07e9..12501ef06 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -717,15 +717,6 @@ void *xmmap(void *start, size_t length, return ret; } -void unuse_pack(struct pack_window **w_cursor) -{ - struct pack_window *w = *w_cursor; - if (w) { - w->inuse_cnt--; - *w_cursor = NULL; - } -} - static struct packed_git *alloc_packed_git(int extra) { struct packed_git *p = xmalloc(st_add(sizeof(*p), extra)); -- 2.14.0.434.g98096fd7a8-goog
[RFC PATCH 10/10] pack: move install_packed_git()
Signed-off-by: Jonathan Tan--- cache.h | 1 - pack.c | 11 ++- pack.h | 4 ++-- sha1_file.c | 9 - 4 files changed, 12 insertions(+), 13 deletions(-) diff --git a/cache.h b/cache.h index bf93477e8..41562dc0b 100644 --- a/cache.h +++ b/cache.h @@ -1611,7 +1611,6 @@ extern void (*report_garbage)(unsigned seen_bits, const char *path); extern void prepare_packed_git(void); extern void reprepare_packed_git(void); -extern void install_packed_git(struct packed_git *pack); /* * Give a rough count of objects in the repository. This sacrifices accuracy diff --git a/pack.c b/pack.c index efe0ed3e8..4eb65e460 100644 --- a/pack.c +++ b/pack.c @@ -28,7 +28,7 @@ static unsigned int pack_used_ctr; static unsigned int pack_mmap_calls; static unsigned int peak_pack_open_windows; static unsigned int pack_open_windows; -unsigned int pack_open_fds; +static unsigned int pack_open_fds; static unsigned int pack_max_fds; static size_t peak_pack_mapped; static size_t pack_mapped; @@ -658,3 +658,12 @@ struct packed_git *add_packed_git(const char *path, size_t path_len, int local) hashclr(p->sha1); return p; } + +void install_packed_git(struct packed_git *pack) +{ + if (pack->pack_fd != -1) + pack_open_fds++; + + pack->next = packed_git; + packed_git = pack; +} diff --git a/pack.h b/pack.h index c1f3ff32d..576c4fc7c 100644 --- a/pack.h +++ b/pack.h @@ -124,8 +124,6 @@ extern char *sha1_pack_name(const unsigned char *sha1); */ extern char *sha1_pack_index_name(const unsigned char *sha1); -extern unsigned int pack_open_fds; - extern void pack_report(void); /* @@ -152,4 +150,6 @@ extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t extern void unuse_pack(struct pack_window **); extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local); +extern void install_packed_git(struct packed_git *pack); + #endif diff --git a/sha1_file.c b/sha1_file.c index 7f12b1ee0..b956ca0c9 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -717,15 +717,6 @@ void *xmmap(void *start, size_t length, return ret; } -void install_packed_git(struct packed_git *pack) -{ - if (pack->pack_fd != -1) - pack_open_fds++; - - pack->next = packed_git; - packed_git = pack; -} - void (*report_garbage)(unsigned seen_bits, const char *path); static void report_helper(const struct string_list *list, -- 2.14.0.434.g98096fd7a8-goog
[RFC PATCH 00/10] An attempt to move packfile funcs to its own file
While investigating annotating packfiles and loose objects to support connectivity checks in partial clones [1], I decided to make the effort to separate packfile-related code from sha1_file.c into its own file, to make it easier to both code such changes and review them. Here is the beginning of those efforts. Is this something worth doing, and if yes, is this in the right direction? [1] https://public-inbox.org/git/20170804145113.5ceaf...@twelve2.svl.corp.google.com/ Jonathan Tan (10): pack: move pack name-related functions pack: move static state variables pack: move pack_report() pack: move open_pack_index(), parse_pack_index() pack: move release_pack_memory() pack: move pack-closing functions pack: move use_pack() pack: move unuse_pack() pack: move add_packed_git() pack: move install_packed_git() Makefile | 1 + builtin/am.c | 1 + builtin/clone.c | 1 + builtin/count-objects.c | 1 + builtin/fetch.c | 1 + builtin/merge.c | 1 + builtin/pack-redundant.c | 1 + cache.h | 45 connected.c | 1 + git-compat-util.h| 2 - pack.c | 669 +++ pack.h | 51 sha1_file.c | 667 -- sha1_name.c | 1 + streaming.c | 1 + 15 files changed, 730 insertions(+), 714 deletions(-) create mode 100644 pack.c -- 2.14.0.434.g98096fd7a8-goog
[RFC PATCH 09/10] pack: move add_packed_git()
Signed-off-by: Jonathan Tan--- cache.h | 1 - connected.c | 1 + pack.c | 53 + pack.h | 1 + sha1_file.c | 61 - 5 files changed, 55 insertions(+), 62 deletions(-) diff --git a/cache.h b/cache.h index 4812f3a63..bf93477e8 100644 --- a/cache.h +++ b/cache.h @@ -1638,7 +1638,6 @@ extern int odb_mkstemp(struct strbuf *template, const char *pattern); extern int odb_pack_keep(const char *name); extern void clear_delta_base_cache(void); -extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local); /* * Make sure that a pointer access into an mmap'd index file is within bounds, diff --git a/connected.c b/connected.c index 136c2ac16..3e3f0148c 100644 --- a/connected.c +++ b/connected.c @@ -3,6 +3,7 @@ #include "sigchain.h" #include "connected.h" #include "transport.h" +#include "pack.h" /* * If we feed all the commits we want to verify to this command diff --git a/pack.c b/pack.c index 93526ea7b..efe0ed3e8 100644 --- a/pack.c +++ b/pack.c @@ -605,3 +605,56 @@ void unuse_pack(struct pack_window **w_cursor) *w_cursor = NULL; } } + +static void try_to_free_pack_memory(size_t size) +{ + release_pack_memory(size); +} + +struct packed_git *add_packed_git(const char *path, size_t path_len, int local) +{ + static int have_set_try_to_free_routine; + struct stat st; + size_t alloc; + struct packed_git *p; + + if (!have_set_try_to_free_routine) { + have_set_try_to_free_routine = 1; + set_try_to_free_routine(try_to_free_pack_memory); + } + + /* +* Make sure a corresponding .pack file exists and that +* the index looks sane. +*/ + if (!strip_suffix_mem(path, _len, ".idx")) + return NULL; + + /* +* ".pack" is long enough to hold any suffix we're adding (and +* the use xsnprintf double-checks that) +*/ + alloc = st_add3(path_len, strlen(".pack"), 1); + p = alloc_packed_git(alloc); + memcpy(p->pack_name, path, path_len); + + xsnprintf(p->pack_name + path_len, alloc - path_len, ".keep"); + if (!access(p->pack_name, F_OK)) + p->pack_keep = 1; + + xsnprintf(p->pack_name + path_len, alloc - path_len, ".pack"); + if (stat(p->pack_name, ) || !S_ISREG(st.st_mode)) { + free(p); + return NULL; + } + + /* ok, it looks sane as far as we can check without +* actually mapping the pack file. +*/ + p->pack_size = st.st_size; + p->pack_local = local; + p->mtime = st.st_mtime; + if (path_len < 40 || get_sha1_hex(path + path_len - 40, p->sha1)) + hashclr(p->sha1); + return p; +} diff --git a/pack.h b/pack.h index 3876e9ae6..c1f3ff32d 100644 --- a/pack.h +++ b/pack.h @@ -150,5 +150,6 @@ extern int open_packed_git(struct packed_git *p); extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, unsigned long *); extern void unuse_pack(struct pack_window **); +extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local); #endif diff --git a/sha1_file.c b/sha1_file.c index 12501ef06..7f12b1ee0 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -717,67 +717,6 @@ void *xmmap(void *start, size_t length, return ret; } -static struct packed_git *alloc_packed_git(int extra) -{ - struct packed_git *p = xmalloc(st_add(sizeof(*p), extra)); - memset(p, 0, sizeof(*p)); - p->pack_fd = -1; - return p; -} - -static void try_to_free_pack_memory(size_t size) -{ - release_pack_memory(size); -} - -struct packed_git *add_packed_git(const char *path, size_t path_len, int local) -{ - static int have_set_try_to_free_routine; - struct stat st; - size_t alloc; - struct packed_git *p; - - if (!have_set_try_to_free_routine) { - have_set_try_to_free_routine = 1; - set_try_to_free_routine(try_to_free_pack_memory); - } - - /* -* Make sure a corresponding .pack file exists and that -* the index looks sane. -*/ - if (!strip_suffix_mem(path, _len, ".idx")) - return NULL; - - /* -* ".pack" is long enough to hold any suffix we're adding (and -* the use xsnprintf double-checks that) -*/ - alloc = st_add3(path_len, strlen(".pack"), 1); - p = alloc_packed_git(alloc); - memcpy(p->pack_name, path, path_len); - - xsnprintf(p->pack_name + path_len, alloc - path_len, ".keep"); - if (!access(p->pack_name, F_OK)) - p->pack_keep = 1; - - xsnprintf(p->pack_name + path_len, alloc - path_len, ".pack"); - if (stat(p->pack_name, ) || !S_ISREG(st.st_mode)) { - free(p); -
[RFC PATCH 03/10] pack: move pack_report()
Signed-off-by: Jonathan Tan--- cache.h | 2 -- pack.c | 24 pack.h | 2 ++ sha1_file.c | 24 4 files changed, 26 insertions(+), 26 deletions(-) diff --git a/cache.h b/cache.h index 1f0f47819..c7f802e4a 100644 --- a/cache.h +++ b/cache.h @@ -1624,8 +1624,6 @@ unsigned long approximate_object_count(void); extern struct packed_git *find_sha1_pack(const unsigned char *sha1, struct packed_git *packs); -extern void pack_report(void); - /* * Create a temporary file rooted in the object database directory, or * die on failure. The filename is taken from "pattern", which should have the diff --git a/pack.c b/pack.c index 0f46e0617..60d9fc3b0 100644 --- a/pack.c +++ b/pack.c @@ -35,3 +35,27 @@ struct packed_git *packed_git; static struct mru packed_git_mru_storage; struct mru *packed_git_mru = _git_mru_storage; + +#define SZ_FMT PRIuMAX +static inline uintmax_t sz_fmt(size_t s) { return s; } + +void pack_report(void) +{ + fprintf(stderr, + "pack_report: getpagesize()= %10" SZ_FMT "\n" + "pack_report: core.packedGitWindowSize = %10" SZ_FMT "\n" + "pack_report: core.packedGitLimit = %10" SZ_FMT "\n", + sz_fmt(getpagesize()), + sz_fmt(packed_git_window_size), + sz_fmt(packed_git_limit)); + fprintf(stderr, + "pack_report: pack_used_ctr= %10u\n" + "pack_report: pack_mmap_calls = %10u\n" + "pack_report: pack_open_windows= %10u / %10u\n" + "pack_report: pack_mapped = " + "%10" SZ_FMT " / %10" SZ_FMT "\n", + pack_used_ctr, + pack_mmap_calls, + pack_open_windows, peak_pack_open_windows, + sz_fmt(pack_mapped), sz_fmt(peak_pack_mapped)); +} diff --git a/pack.h b/pack.h index 7fcd45f7b..6098bfe40 100644 --- a/pack.h +++ b/pack.h @@ -133,4 +133,6 @@ extern unsigned int pack_max_fds; extern size_t peak_pack_mapped; extern size_t pack_mapped; +extern void pack_report(void); + #endif diff --git a/sha1_file.c b/sha1_file.c index 4d95e21eb..0de39f480 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -29,9 +29,6 @@ #include "mergesort.h" #include "quote.h" -#define SZ_FMT PRIuMAX -static inline uintmax_t sz_fmt(size_t s) { return s; } - const unsigned char null_sha1[20]; const struct object_id null_oid; const struct object_id empty_tree_oid = { @@ -682,27 +679,6 @@ static int has_loose_object(const unsigned char *sha1) return check_and_freshen(sha1, 0); } -void pack_report(void) -{ - fprintf(stderr, - "pack_report: getpagesize()= %10" SZ_FMT "\n" - "pack_report: core.packedGitWindowSize = %10" SZ_FMT "\n" - "pack_report: core.packedGitLimit = %10" SZ_FMT "\n", - sz_fmt(getpagesize()), - sz_fmt(packed_git_window_size), - sz_fmt(packed_git_limit)); - fprintf(stderr, - "pack_report: pack_used_ctr= %10u\n" - "pack_report: pack_mmap_calls = %10u\n" - "pack_report: pack_open_windows= %10u / %10u\n" - "pack_report: pack_mapped = " - "%10" SZ_FMT " / %10" SZ_FMT "\n", - pack_used_ctr, - pack_mmap_calls, - pack_open_windows, peak_pack_open_windows, - sz_fmt(pack_mapped), sz_fmt(peak_pack_mapped)); -} - /* * Open and mmap the index file at path, perform a couple of * consistency checks, then record its information to p. Return 0 on -- 2.14.0.434.g98096fd7a8-goog
[RFC PATCH 05/10] pack: move release_pack_memory()
The function unuse_one_window() needs to be temporarily made global. Its scope will be restored to static in a subsequent commit. Signed-off-by: Jonathan Tan--- git-compat-util.h | 2 -- pack.c| 49 + pack.h| 4 sha1_file.c | 49 - 4 files changed, 53 insertions(+), 51 deletions(-) diff --git a/git-compat-util.h b/git-compat-util.h index db9c22de7..201056e2d 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -749,8 +749,6 @@ const char *inet_ntop(int af, const void *src, char *dst, size_t size); extern int git_atexit(void (*handler)(void)); #endif -extern void release_pack_memory(size_t); - typedef void (*try_to_free_t)(size_t); extern try_to_free_t set_try_to_free_routine(try_to_free_t); diff --git a/pack.c b/pack.c index 6edc43228..8daa74ad1 100644 --- a/pack.c +++ b/pack.c @@ -208,3 +208,52 @@ struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path) return p; } + +static void scan_windows(struct packed_git *p, + struct packed_git **lru_p, + struct pack_window **lru_w, + struct pack_window **lru_l) +{ + struct pack_window *w, *w_l; + + for (w_l = NULL, w = p->windows; w; w = w->next) { + if (!w->inuse_cnt) { + if (!*lru_w || w->last_used < (*lru_w)->last_used) { + *lru_p = p; + *lru_w = w; + *lru_l = w_l; + } + } + w_l = w; + } +} + +int unuse_one_window(struct packed_git *current) +{ + struct packed_git *p, *lru_p = NULL; + struct pack_window *lru_w = NULL, *lru_l = NULL; + + if (current) + scan_windows(current, _p, _w, _l); + for (p = packed_git; p; p = p->next) + scan_windows(p, _p, _w, _l); + if (lru_p) { + munmap(lru_w->base, lru_w->len); + pack_mapped -= lru_w->len; + if (lru_l) + lru_l->next = lru_w->next; + else + lru_p->windows = lru_w->next; + free(lru_w); + pack_open_windows--; + return 1; + } + return 0; +} + +void release_pack_memory(size_t need) +{ + size_t cur = pack_mapped; + while (need >= (cur - pack_mapped) && unuse_one_window(NULL)) + ; /* nothing */ +} diff --git a/pack.h b/pack.h index 5be0ed42a..c16220586 100644 --- a/pack.h +++ b/pack.h @@ -143,4 +143,8 @@ extern int open_pack_index(struct packed_git *); extern struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path); +extern int unuse_one_window(struct packed_git *current); + +extern void release_pack_memory(size_t); + #endif diff --git a/sha1_file.c b/sha1_file.c index 2e414f5f5..644876e4e 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -679,55 +679,6 @@ static int has_loose_object(const unsigned char *sha1) return check_and_freshen(sha1, 0); } -static void scan_windows(struct packed_git *p, - struct packed_git **lru_p, - struct pack_window **lru_w, - struct pack_window **lru_l) -{ - struct pack_window *w, *w_l; - - for (w_l = NULL, w = p->windows; w; w = w->next) { - if (!w->inuse_cnt) { - if (!*lru_w || w->last_used < (*lru_w)->last_used) { - *lru_p = p; - *lru_w = w; - *lru_l = w_l; - } - } - w_l = w; - } -} - -static int unuse_one_window(struct packed_git *current) -{ - struct packed_git *p, *lru_p = NULL; - struct pack_window *lru_w = NULL, *lru_l = NULL; - - if (current) - scan_windows(current, _p, _w, _l); - for (p = packed_git; p; p = p->next) - scan_windows(p, _p, _w, _l); - if (lru_p) { - munmap(lru_w->base, lru_w->len); - pack_mapped -= lru_w->len; - if (lru_l) - lru_l->next = lru_w->next; - else - lru_p->windows = lru_w->next; - free(lru_w); - pack_open_windows--; - return 1; - } - return 0; -} - -void release_pack_memory(size_t need) -{ - size_t cur = pack_mapped; - while (need >= (cur - pack_mapped) && unuse_one_window(NULL)) - ; /* nothing */ -} - static void mmap_limit_check(size_t length) { static size_t limit = 0; -- 2.14.0.434.g98096fd7a8-goog
[RFC PATCH 06/10] pack: move pack-closing functions
The function close_pack_fd() needs to be temporarily made global. Its scope will be restored to static in a subsequent commit. Signed-off-by: Jonathan Tan--- In doing this, I discovered that some builtins close the packs even though they, in theory, should not know anything about how objects are stored. Can we remove those calls? (The tests pass with those calls removed.) --- builtin/am.c| 1 + builtin/clone.c | 1 + builtin/fetch.c | 1 + builtin/merge.c | 1 + cache.h | 8 pack.c | 54 ++ pack.h | 9 + sha1_file.c | 55 --- 8 files changed, 67 insertions(+), 63 deletions(-) diff --git a/builtin/am.c b/builtin/am.c index c973bd96d..c38dd10a3 100644 --- a/builtin/am.c +++ b/builtin/am.c @@ -31,6 +31,7 @@ #include "mailinfo.h" #include "apply.h" #include "string-list.h" +#include "pack.h" /** * Returns 1 if the file is empty or does not exist, 0 otherwise. diff --git a/builtin/clone.c b/builtin/clone.c index 08b5cc433..53410a45d 100644 --- a/builtin/clone.c +++ b/builtin/clone.c @@ -25,6 +25,7 @@ #include "remote.h" #include "run-command.h" #include "connected.h" +#include "pack.h" /* * Overall FIXMEs: diff --git a/builtin/fetch.c b/builtin/fetch.c index c87e59f3b..196a3bfc4 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -17,6 +17,7 @@ #include "connected.h" #include "argv-array.h" #include "utf8.h" +#include "pack.h" static const char * const builtin_fetch_usage[] = { N_("git fetch [] [ [...]]"), diff --git a/builtin/merge.c b/builtin/merge.c index 900bafdb4..9cff4b276 100644 --- a/builtin/merge.c +++ b/builtin/merge.c @@ -32,6 +32,7 @@ #include "gpg-interface.h" #include "sequencer.h" #include "string-list.h" +#include "pack.h" #define DEFAULT_TWOHEAD (1<<0) #define DEFAULT_OCTOPUS (1<<1) diff --git a/cache.h b/cache.h index 5d6839525..25a21a61f 100644 --- a/cache.h +++ b/cache.h @@ -1637,15 +1637,7 @@ extern int odb_mkstemp(struct strbuf *template, const char *pattern); */ extern int odb_pack_keep(const char *name); -/* - * munmap the index file for the specified packfile (if it is - * currently mmapped). - */ -extern void close_pack_index(struct packed_git *); - extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, unsigned long *); -extern void close_pack_windows(struct packed_git *); -extern void close_all_packs(void); extern void unuse_pack(struct pack_window **); extern void clear_delta_base_cache(void); extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local); diff --git a/pack.c b/pack.c index 8daa74ad1..c8e2dbdee 100644 --- a/pack.c +++ b/pack.c @@ -257,3 +257,57 @@ void release_pack_memory(size_t need) while (need >= (cur - pack_mapped) && unuse_one_window(NULL)) ; /* nothing */ } + +void close_pack_windows(struct packed_git *p) +{ + while (p->windows) { + struct pack_window *w = p->windows; + + if (w->inuse_cnt) + die("pack '%s' still has open windows to it", + p->pack_name); + munmap(w->base, w->len); + pack_mapped -= w->len; + pack_open_windows--; + p->windows = w->next; + free(w); + } +} + +int close_pack_fd(struct packed_git *p) +{ + if (p->pack_fd < 0) + return 0; + + close(p->pack_fd); + pack_open_fds--; + p->pack_fd = -1; + + return 1; +} + +void close_pack_index(struct packed_git *p) +{ + if (p->index_data) { + munmap((void *)p->index_data, p->index_size); + p->index_data = NULL; + } +} + +static void close_pack(struct packed_git *p) +{ + close_pack_windows(p); + close_pack_fd(p); + close_pack_index(p); +} + +void close_all_packs(void) +{ + struct packed_git *p; + + for (p = packed_git; p; p = p->next) + if (p->do_not_close) + die("BUG: want to close pack marked 'do-not-close'"); + else + close_pack(p); +} diff --git a/pack.h b/pack.h index c16220586..fd4668528 100644 --- a/pack.h +++ b/pack.h @@ -147,4 +147,13 @@ extern int unuse_one_window(struct packed_git *current); extern void release_pack_memory(size_t); +extern void close_pack_windows(struct packed_git *); +extern int close_pack_fd(struct packed_git *); +/* + * munmap the index file for the specified packfile (if it is + * currently mmapped). + */ +extern void close_pack_index(struct packed_git *); +extern void close_all_packs(void); + #endif diff --git a/sha1_file.c b/sha1_file.c index 644876e4e..e2927244f 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -717,53 +717,6 @@ void *xmmap(void *start, size_t length, return ret; } -void
[RFC PATCH 07/10] pack: move use_pack()
open_packed_git is made global. Signed-off-by: Jonathan Tan--- Unlike the other commits where variables and functions are made global and then remade static, open_packed_git is not remade static (because a function in sha1_file.c still uses it). --- cache.h | 1 - pack.c | 303 ++-- pack.h | 14 +-- sha1_file.c | 285 streaming.c | 1 + 5 files changed, 299 insertions(+), 305 deletions(-) diff --git a/cache.h b/cache.h index 25a21a61f..dd9f9a9ae 100644 --- a/cache.h +++ b/cache.h @@ -1637,7 +1637,6 @@ extern int odb_mkstemp(struct strbuf *template, const char *pattern); */ extern int odb_pack_keep(const char *name); -extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, unsigned long *); extern void unuse_pack(struct pack_window **); extern void clear_delta_base_cache(void); extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local); diff --git a/pack.c b/pack.c index c8e2dbdee..85cb65558 100644 --- a/pack.c +++ b/pack.c @@ -24,14 +24,14 @@ char *sha1_pack_index_name(const unsigned char *sha1) return odb_pack_name(, sha1, "idx"); } -unsigned int pack_used_ctr; -unsigned int pack_mmap_calls; -unsigned int peak_pack_open_windows; -unsigned int pack_open_windows; +static unsigned int pack_used_ctr; +static unsigned int pack_mmap_calls; +static unsigned int peak_pack_open_windows; +static unsigned int pack_open_windows; unsigned int pack_open_fds; -unsigned int pack_max_fds; -size_t peak_pack_mapped; -size_t pack_mapped; +static unsigned int pack_max_fds; +static size_t peak_pack_mapped; +static size_t pack_mapped; struct packed_git *packed_git; static struct mru packed_git_mru_storage; @@ -228,7 +228,7 @@ static void scan_windows(struct packed_git *p, } } -int unuse_one_window(struct packed_git *current) +static int unuse_one_window(struct packed_git *current) { struct packed_git *p, *lru_p = NULL; struct pack_window *lru_w = NULL, *lru_l = NULL; @@ -274,7 +274,7 @@ void close_pack_windows(struct packed_git *p) } } -int close_pack_fd(struct packed_git *p) +static int close_pack_fd(struct packed_git *p) { if (p->pack_fd < 0) return 0; @@ -311,3 +311,288 @@ void close_all_packs(void) else close_pack(p); } + +/* + * The LRU pack is the one with the oldest MRU window, preferring packs + * with no used windows, or the oldest mtime if it has no windows allocated. + */ +static void find_lru_pack(struct packed_git *p, struct packed_git **lru_p, struct pack_window **mru_w, int *accept_windows_inuse) +{ + struct pack_window *w, *this_mru_w; + int has_windows_inuse = 0; + + /* +* Reject this pack if it has windows and the previously selected +* one does not. If this pack does not have windows, reject +* it if the pack file is newer than the previously selected one. +*/ + if (*lru_p && !*mru_w && (p->windows || p->mtime > (*lru_p)->mtime)) + return; + + for (w = this_mru_w = p->windows; w; w = w->next) { + /* +* Reject this pack if any of its windows are in use, +* but the previously selected pack did not have any +* inuse windows. Otherwise, record that this pack +* has windows in use. +*/ + if (w->inuse_cnt) { + if (*accept_windows_inuse) + has_windows_inuse = 1; + else + return; + } + + if (w->last_used > this_mru_w->last_used) + this_mru_w = w; + + /* +* Reject this pack if it has windows that have been +* used more recently than the previously selected pack. +* If the previously selected pack had windows inuse and +* we have not encountered a window in this pack that is +* inuse, skip this check since we prefer a pack with no +* inuse windows to one that has inuse windows. +*/ + if (*mru_w && *accept_windows_inuse == has_windows_inuse && + this_mru_w->last_used > (*mru_w)->last_used) + return; + } + + /* +* Select this pack. +*/ + *mru_w = this_mru_w; + *lru_p = p; + *accept_windows_inuse = has_windows_inuse; +} + +static int close_one_pack(void) +{ + struct packed_git *p, *lru_p = NULL; + struct pack_window *mru_w = NULL; + int accept_windows_inuse = 1; + + for (p = packed_git; p; p = p->next) { + if (p->pack_fd == -1) + continue; +
Re: [PATCH v2] t3700: fix broken test under !POSIXPERM
On 08/08/17 20:21, René Scharfe wrote: > 76e368c378 (t3700: fix broken test under !SANITY) explains that the test > 'git add --chmod=[+-]x changes index with already added file' can fail > if xfoo3 is still present as a symlink from a previous test and deletes > it with rm(1). That still leaves it present in the index, which causes > the test to fail if POSIXPERM is not defined. Get rid of it by calling > "git reset --hard" as well, as 76e368c378 already mentioned in passing. Hmm, I don't think its POSIXPERM (which is defined on cygwin) but the lack of SANITY that is the problem. The test would fail on Linux as well, if it skipped the prior tests marked with SANITY (they get rid of the xfoo3->xfoo2 symlinks, among others). Ack, this fixes it for me. ATB, Ramsay Jones > > Helped-by: Adam Dinwoodie> Signed-off-by: Rene Scharfe > --- > Change since v1: Keep the rm(1) call to avoid a problem on Cygwin. > > t/t3700-add.sh | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/t/t3700-add.sh b/t/t3700-add.sh > index f3a4b4a913..0aae21d698 100755 > --- a/t/t3700-add.sh > +++ b/t/t3700-add.sh > @@ -356,6 +356,7 @@ test_expect_success POSIXPERM,SYMLINKS 'git add > --chmod=+x with symlinks' ' > > test_expect_success 'git add --chmod=[+-]x changes index with already added > file' ' > rm -f foo3 xfoo3 && > + git reset --hard && > echo foo >foo3 && > git add foo3 && > git add --chmod=+x foo3 && >
[RFC PATCH 02/10] pack: move static state variables
sha1_file.c declares some static variables that store packfile-related state. Move them to pack.c. They are temporarily made global, but subsequent commits will restore their scope back to static. Signed-off-by: Jonathan Tan--- pack.c | 14 ++ pack.h | 9 + sha1_file.c | 13 - 3 files changed, 23 insertions(+), 13 deletions(-) diff --git a/pack.c b/pack.c index 0d191dfd6..0f46e0617 100644 --- a/pack.c +++ b/pack.c @@ -1,4 +1,5 @@ #include "cache.h" +#include "mru.h" char *odb_pack_name(struct strbuf *buf, const unsigned char *sha1, @@ -21,3 +22,16 @@ char *sha1_pack_index_name(const unsigned char *sha1) static struct strbuf buf = STRBUF_INIT; return odb_pack_name(, sha1, "idx"); } + +unsigned int pack_used_ctr; +unsigned int pack_mmap_calls; +unsigned int peak_pack_open_windows; +unsigned int pack_open_windows; +unsigned int pack_open_fds; +unsigned int pack_max_fds; +size_t peak_pack_mapped; +size_t pack_mapped; +struct packed_git *packed_git; + +static struct mru packed_git_mru_storage; +struct mru *packed_git_mru = _git_mru_storage; diff --git a/pack.h b/pack.h index 63bfde00c..7fcd45f7b 100644 --- a/pack.h +++ b/pack.h @@ -124,4 +124,13 @@ extern char *sha1_pack_name(const unsigned char *sha1); */ extern char *sha1_pack_index_name(const unsigned char *sha1); +extern unsigned int pack_used_ctr; +extern unsigned int pack_mmap_calls; +extern unsigned int peak_pack_open_windows; +extern unsigned int pack_open_windows; +extern unsigned int pack_open_fds; +extern unsigned int pack_max_fds; +extern size_t peak_pack_mapped; +extern size_t pack_mapped; + #endif diff --git a/sha1_file.c b/sha1_file.c index 7e511ce9e..4d95e21eb 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -682,19 +682,6 @@ static int has_loose_object(const unsigned char *sha1) return check_and_freshen(sha1, 0); } -static unsigned int pack_used_ctr; -static unsigned int pack_mmap_calls; -static unsigned int peak_pack_open_windows; -static unsigned int pack_open_windows; -static unsigned int pack_open_fds; -static unsigned int pack_max_fds; -static size_t peak_pack_mapped; -static size_t pack_mapped; -struct packed_git *packed_git; - -static struct mru packed_git_mru_storage; -struct mru *packed_git_mru = _git_mru_storage; - void pack_report(void) { fprintf(stderr, -- 2.14.0.434.g98096fd7a8-goog
[RFC PATCH 01/10] pack: move pack name-related functions
Currently, sha1_file.c and cache.h contain many functions, both related to and unrelated to packfiles. This makes both files very large and causes an unclear separation of concerns. Create a new file, pack.c, to hold all packfile-related functions currently in sha1_file.c, and designate pack.h to hold these packfile-related functions. In this commit, the pack name-related functions are moved. Subsequent commits will move the other functions. Signed-off-by: Jonathan Tan--- Makefile | 1 + builtin/pack-redundant.c | 1 + cache.h | 23 --- pack.c | 23 +++ pack.h | 23 +++ sha1_file.c | 22 -- 6 files changed, 48 insertions(+), 45 deletions(-) create mode 100644 pack.c diff --git a/Makefile b/Makefile index 461c845d3..a7b901a18 100644 --- a/Makefile +++ b/Makefile @@ -816,6 +816,7 @@ LIB_OBJS += notes-merge.o LIB_OBJS += notes-utils.o LIB_OBJS += object.o LIB_OBJS += oidset.o +LIB_OBJS += pack.o LIB_OBJS += pack-bitmap.o LIB_OBJS += pack-bitmap-write.o LIB_OBJS += pack-check.o diff --git a/builtin/pack-redundant.c b/builtin/pack-redundant.c index cb1df1c76..df36d10e7 100644 --- a/builtin/pack-redundant.c +++ b/builtin/pack-redundant.c @@ -7,6 +7,7 @@ */ #include "builtin.h" +#include "pack.h" #define BLKSIZE 512 diff --git a/cache.h b/cache.h index 71fe09264..1f0f47819 100644 --- a/cache.h +++ b/cache.h @@ -902,20 +902,6 @@ extern void check_repository_format(void); */ extern const char *sha1_file_name(const unsigned char *sha1); -/* - * Return the name of the (local) packfile with the specified sha1 in - * its name. The return value is a pointer to memory that is - * overwritten each time this function is called. - */ -extern char *sha1_pack_name(const unsigned char *sha1); - -/* - * Return the name of the (local) pack index file with the specified - * sha1 in its name. The return value is a pointer to memory that is - * overwritten each time this function is called. - */ -extern char *sha1_pack_index_name(const unsigned char *sha1); - /* * Return an abbreviated sha1 unique within this repository's object database. * The result will be at least `len` characters long, and will be NUL @@ -1648,15 +1634,6 @@ extern void pack_report(void); */ extern int odb_mkstemp(struct strbuf *template, const char *pattern); -/* - * Generate the filename to be used for a pack file with checksum "sha1" and - * extension "ext". The result is written into the strbuf "buf", overwriting - * any existing contents. A pointer to buf->buf is returned as a convenience. - * - * Example: odb_pack_name(out, sha1, "idx") => ".git/objects/pack/pack-1234..idx" - */ -extern char *odb_pack_name(struct strbuf *buf, const unsigned char *sha1, const char *ext); - /* * Create a pack .keep file named "name" (which should generally be the output * of odb_pack_name). Returns a file descriptor opened for writing, or -1 on diff --git a/pack.c b/pack.c new file mode 100644 index 0..0d191dfd6 --- /dev/null +++ b/pack.c @@ -0,0 +1,23 @@ +#include "cache.h" + +char *odb_pack_name(struct strbuf *buf, + const unsigned char *sha1, + const char *ext) +{ + strbuf_reset(buf); + strbuf_addf(buf, "%s/pack/pack-%s.%s", get_object_directory(), + sha1_to_hex(sha1), ext); + return buf->buf; +} + +char *sha1_pack_name(const unsigned char *sha1) +{ + static struct strbuf buf = STRBUF_INIT; + return odb_pack_name(, sha1, "pack"); +} + +char *sha1_pack_index_name(const unsigned char *sha1) +{ + static struct strbuf buf = STRBUF_INIT; + return odb_pack_name(, sha1, "idx"); +} diff --git a/pack.h b/pack.h index 8294341af..63bfde00c 100644 --- a/pack.h +++ b/pack.h @@ -101,4 +101,27 @@ extern int read_pack_header(int fd, struct pack_header *); extern struct sha1file *create_tmp_packfile(char **pack_tmp_name); extern void finish_tmp_packfile(struct strbuf *name_buffer, const char *pack_tmp_name, struct pack_idx_entry **written_list, uint32_t nr_written, struct pack_idx_option *pack_idx_opts, unsigned char sha1[]); +/* + * Generate the filename to be used for a pack file with checksum "sha1" and + * extension "ext". The result is written into the strbuf "buf", overwriting + * any existing contents. A pointer to buf->buf is returned as a convenience. + * + * Example: odb_pack_name(out, sha1, "idx") => ".git/objects/pack/pack-1234..idx" + */ +extern char *odb_pack_name(struct strbuf *buf, const unsigned char *sha1, const char *ext); + +/* + * Return the name of the (local) packfile with the specified sha1 in + * its name. The return value is a pointer to memory that is + * overwritten each time this function is called. + */ +extern char *sha1_pack_name(const unsigned char *sha1); + +/* + * Return the name of the (local) pack index
Re: reftable [v6]: new ref storage format
Shawn Pearcewrites: > For `log_type = 0x4..0x7` the `log_chained` section is used instead to > compress information that already appeared in a prior log record. The > `log_chained` always includes `old_id` for this record, as `new_id` is > implied by the prior (by file order, more recent) record's `old_id`. > > The `not_same_committer` block appears if `log_type & 0x1` is true, > `not_same_message` block appears if `log_type & 0x2` is true. When > one of these blocks is missing, its values are implied by the prior > (more recent) log record. Two comments. * not-same-committer would be what I would use when I switch timezones, even if I stay to be me, right? I am just wondering if it is clear to everybody that "committer" in that phrase is a short-hand for "committer information other than the timestamp". * Should the set of entries that are allowed to use of "chained" log be related to the set of entries that appear in the restart table in any way? For a reader that scans starting at a restart point, it would be very cumbersome if the entry were chained from the previous entry, as it would force it to backtrack entries to find the first non-chained log entry. A simple "log_chained must not be used for an entry that appear in the restart table" rule would solve that, but I didn't see it in the document.
[PATCH v2] t3700: fix broken test under !POSIXPERM
76e368c378 (t3700: fix broken test under !SANITY) explains that the test 'git add --chmod=[+-]x changes index with already added file' can fail if xfoo3 is still present as a symlink from a previous test and deletes it with rm(1). That still leaves it present in the index, which causes the test to fail if POSIXPERM is not defined. Get rid of it by calling "git reset --hard" as well, as 76e368c378 already mentioned in passing. Helped-by: Adam DinwoodieSigned-off-by: Rene Scharfe --- Change since v1: Keep the rm(1) call to avoid a problem on Cygwin. t/t3700-add.sh | 1 + 1 file changed, 1 insertion(+) diff --git a/t/t3700-add.sh b/t/t3700-add.sh index f3a4b4a913..0aae21d698 100755 --- a/t/t3700-add.sh +++ b/t/t3700-add.sh @@ -356,6 +356,7 @@ test_expect_success POSIXPERM,SYMLINKS 'git add --chmod=+x with symlinks' ' test_expect_success 'git add --chmod=[+-]x changes index with already added file' ' rm -f foo3 xfoo3 && + git reset --hard && echo foo >foo3 && git add foo3 && git add --chmod=+x foo3 && -- 2.14.0
Re: reftable [v6]: new ref storage format
I notice that you raised the location of restart table within a block in this iteration (or maybe it happened in v5). This forces you to hold all contents in core before the first byte is written out. You start from the first entry (which will become the first restart entry), emit a handful as prefix compressed entries, emit a full entry (which will become the next restart entry), ... until you have enough to fill both the data and the restart table, then start writing from the header (which needs the length of the block), restart table and then data. I think it is OK to do so for the blocks whose size is limited to 16M, but I wonder if it is sensible to do the same for the index block whose limit is 2G. If you keep the restart table after the data, then you could stream out the entries as you emit, write the restart table, and then seek back to fix the length in the header, without holding the 2G in core, no?
Re: [PATCH] git svn fetch: Create correct commit timestamp when using --localtime
Junio C Hamanowrites: > > Yep, seems alright. Can you apply directly? > > Been a bit preoccupied as of late. Thanks. > > Surely, I'll just add your Reviewed-by: myself ;-) OK, thanks. This will fix the bug I've reported here a week or so ago (see the References header). urs
Re: [PATCH v2 2/2 / RFC] branch: quote branch/ref names to improve readability
On Tue, Aug 8, 2017 at 10:11 AM, Kaartic Sivaraamwrote: > Signed-off-by: Kaartic Sivaraam > --- > branch.c | 16 > 1 file changed, 8 insertions(+), 8 deletions(-) I like this patch. In submodule.c we quote a lot of things (branches, submodules, paths), so this is another step to make the output as a whole more consistent. (Though wondering for non-submodule users, if they perceive it as inconsistency as other parts of the code may not follow the rigorous quoting) > > diff --git a/branch.c b/branch.c > index ad5a2299b..a40721f3c 100644 > --- a/branch.c > +++ b/branch.c > @@ -90,24 +90,24 @@ int install_branch_config(int flag, const char *local, > const char *origin, const > if (shortname) { > if (origin) > printf_ln(rebasing ? > - _("Branch %s set up to track remote > branch %s from %s by rebasing.") : > - _("Branch %s set up to track remote > branch %s from %s."), > + _("Branch '%s' set up to track > remote branch '%s' from '%s' by rebasing.") : > + _("Branch '%s' set up to track > remote branch '%s' from '%s'."), > local, shortname, origin); > else > printf_ln(rebasing ? > - _("Branch %s set up to track local > branch %s by rebasing.") : > - _("Branch %s set up to track local > branch %s."), > + _("Branch '%s' set up to track > local branch '%s' by rebasing.") : > + _("Branch '%s' set up to track > local branch '%s'."), > local, shortname); > } else { > if (origin) > printf_ln(rebasing ? > - _("Branch %s set up to track remote > ref %s by rebasing.") : > - _("Branch %s set up to track remote > ref %s."), > + _("Branch '%s' set up to track > remote ref '%s' by rebasing.") : > + _("Branch '%s' set up to track > remote ref '%s'."), > local, remote); > else > printf_ln(rebasing ? > - _("Branch %s set up to track local > ref %s by rebasing.") : > - _("Branch %s set up to track local > ref %s."), > + _("Branch '%s' set up to track > local ref '%s' by rebasing.") : > + _("Branch '%s' set up to track > local ref '%s'."), > local, remote); > } > } > -- > 2.14.0.rc1.434.g6eded367a >
Re: What's cooking in git.git (Jul 2017, #09; Mon, 31)
On 08/08/2017 15:14, Ævar Arnfjörð Bjarmason wrote: > On Mon, Aug 07 2017, Igor Djordjevic jotted: >> On 07/08/2017 23:25, Igor Djordjevic wrote: >>> On 06/08/2017 22:26, Ævar Arnfjörð Bjarmason wrote: On Sat, Aug 05 2017, Junio C. Hamano jotted: > I actually consider "branch" to *never* invoking a checkout. Even > when "git branch -m A B" happens to be done when your checked out > branch is A and you end up being on B. That is not a "checkout". I think we just have a different mental model of what "checkout" means. In my mind any operation that updates the HEAD to point to a new branch is a checkout of that branch. >>> >>> If I may, from a side-viewer`s point of view, it seems you`re >>> thinking in low-level implementation details, where what Junio >>> describes seems more as a high-level, conceptual/end-user`s point of >>> view. > > Yeah, I think that's a fair summary. Also I didn't mean to de-rail this > whole thread on what "checkout" really means, just explain what I meant > with previous comments, since there seemed to be confusion about that. > >>> Needing to update HEAD reference once we "rename" a branch, too, what >>> you consider a "checkout", seems to be required only because branch >>> name _is_ the branch reference in Git, so we need to update HEAD to >>> point to a new/renamed branch reference -- but it`s still the same >>> branch, conceptually. > > It's not *required* we could do one of three things: > > 1) Do what we do now, i.e. rename the branch/reflog & check out the new > name. > > 2) Rename the branch/reflog and checkout HEAD^0, i.e. say "the branch > is now elsewhere, but we haven't moved your commit". > > 3) Just not run replace_each_worktree_head_symref() which would end up > on a branch with no commits, i.e. an orphan branch. > > Now, I think 2 & 3 are pretty nonsensical and wouldn't ever propose we > should do that, but it's illustrative that #1 is not some required > inevitability in terms of explaining what's happening with the new name > being checked out (i.e. HEAD being updated). I think we agree, but we`re talking from different standpoints again. You say "it *can* be done *but* it doesn`t make sense", where I say "it *can`t* be done *because* it doesn`t make sense". Implementation wise, of course it can be done differently, but conceptually it would be wrong (or confusing, at least), thus different implementation, while theoretically possible, may not be a sane option, thus it`s impractical (not to say "impossible"). By "required", I really meant "required in order to be conceptually sane". >>> Documentation for "git-checkout" states that it is used to "*Switch >>> branches*...[snip]", and that is not what happens here. > > That's just the summary at the top but not the full story of what > git-checkout does. E.g. you can checkout a bare SHA1 which is not > switching branches, or a tag or whatever. I disagree, by checking out a bare SHA1 (or tag or whatever) I`d say you *are* still switching branches - conceptually, at least. When you move from a named branch to (yet) unnamed one, you are switching branches. Same goes for when you switch from one unnamed branch to another (named or unnamed). Might be "detached HEAD" is not something we call an "unnamed branch", but that`s what it practically is. >>> Implementation-wise it does because we can`t do it differently at the >>> moment, but in user`s eyes it`s still the same branch, so no switch >>> is made as far as the user is concerned. > > Kind of, it's also worthwhile to think about that in some sense no > switch would be performed as far as the user is concerned by taking > option #2, i.e. we'd be in the same working tree / you could still make > commits. > > You just couldn't make new commits on your "master" which is now called > "topic" and get new commits on "topic". I think it makes sense to do > that, but again, it's illustrative that it's not inevitable for > discussing the implementation. But your second paragraph exactly explains why it`s not the same with option #2 (I guess you mean "no HEAD update to point to renamed branch" here) - if user renames branch "master" to "topic", it`s natural to expect that next commit is made on "topic" (which is now a new name for "master", conceptually still being the same, just renamed branch). Ending up in limbo, in a detached HEAD state, or even worse, still being on a branch with an old name "master" (which conceptually shouldn`t exist anymore, being known as "topic" now), would be confusing, to say the least, and conceptually wrong, I would add. >>> In a different implementation, where branches would have permanent >>> references other than their names, no HEAD update would be needed as >>> the reference would still be the same, no matter the name change, >>> making the `git branch -m` situation clear even from your standpoint, >>> I`d say. >>> > Really from the end-user's point of
Re: [PATCH v2 1/2 / RFC] builtin/branch: stop supporting the use of --set-upstream option
On 8 August 2017 at 19:11, Kaartic Sivaraamwrote: > The '--set-upstream' option of branch was deprecated in, > > b347d06bf branch: deprecate --set-upstream and show help if we > detect possible mistaken use (Thu, 30 Aug 2012 19:23:13 +0200) > > It was deprecated for the reasons specified in the commit message of > the referenced commit. > > Refactor 'branch' so that it dies with an appropraite error message > when the '--set-upstream' is used. appropriate. (Also, is this really a refactoring?) > > Note that there's a reason behind "dying with an error message" instead of > "not accepting the '--set-upstream'". ;git branch' would still *accept* > '--set-upstream' even after it's removal as a consequence of "unique > prefix can be abbrievated in option names" AND '--set-upstream' is a unique > prefix of '--set-upstream-to' when '--set-upstream' has been removed. In > order to smooth the transition for users due to the "prefix issue" it was > decided to make branch die when seeing the '--set-upstream' flag for a few > years and let the users know that it would be removed some time in the future. > > The before/after behaviour for a simple case follows, > > $ git remote > origin > > Before, > > $ git branch > * master > > $ git branch --set-upstream origin/master > The --set-upstream flag is deprecated and will be removed. Consider using > --track or --set-upstream-to > Branch origin/master set up to track local branch master. > > $ echo $? > 0 > > $ git branch > * master > origin/master > > After, > > $ git branch > * master > > $ git branch --set-upstream origin/master > fatal: the '--set-upstream' flag is no longer supported and will be > removed. Consider using '--track' or '--set-upstream-to' > > $ echo $? > 128 > > $ git branch > * master > > Signed-off-by: Kaartic Sivaraam > --- > Changes in v2: > > The previous patch removed the concerned option while the current patch > makes 'git branch' die on seeing the option. > > The possibility of '--set-upstream' becoming an alias of '--set-upstream-to' > was documented. > > Documentation/git-branch.txt | 8 +++ > builtin/branch.c | 21 +-- > t/t3200-branch.sh| 50 > > t/t6040-tracking-info.sh | 16 +++--- > 4 files changed, 17 insertions(+), 78 deletions(-) > > diff --git a/Documentation/git-branch.txt b/Documentation/git-branch.txt > index 81bd0a7b7..372107e0c 100644 > --- a/Documentation/git-branch.txt > +++ b/Documentation/git-branch.txt > @@ -195,10 +195,10 @@ start-point is either a local or remote-tracking branch. > branch.autoSetupMerge configuration variable is true. > > --set-upstream:: > - If specified branch does not exist yet or if `--force` has been > - given, acts exactly like `--track`. Otherwise sets up configuration > - like `--track` would when creating the branch, except that where > - branch points to is not changed. > + This option is no longer supported and will be removed in the future. > + Consider using --track or --set-upstream-to instead. > ++ > +Note: This could possibly become an alias of --set-upstream-to in the future. Maybe the final note could be removed? Someone who is looking up --set-upstream because Git just "crashed" on them will only want to know what they should do instead. Our thoughts about the future are perhaps not that interesting. (I sort of wonder if this option needs to be documented at all, especially if this doesn't say anything more than the die() just did.) Also, I'm wondering if it should be "has been removed" instead of "will be removed"? /Implementation-wise/, it has not been removed yet, but to the user, it has. So maybe just "This option has been removed. Consider using --track or --set-upstream-to instead." The same below. I don't know if it's worth trying to use PARSE_OPT_HIDDEN in the options-struct? Martin > -u :: > --set-upstream-to=:: > diff --git a/builtin/branch.c b/builtin/branch.c > index a3bd2262b..2fcb6f7e5 100644 > --- a/builtin/branch.c > +++ b/builtin/branch.c > @@ -755,8 +755,6 @@ int cmd_branch(int argc, const char **argv, const char > *prefix) > strbuf_release(); > } else if (argc > 0 && argc <= 2) { > struct branch *branch = branch_get(argv[0]); > - int branch_existed = 0, remote_tracking = 0; > - struct strbuf buf = STRBUF_INIT; > > if (!strcmp(argv[0], "HEAD")) > die(_("it does not make sense to create 'HEAD' > manually")); > @@ -768,28 +766,11 @@ int cmd_branch(int argc, const char **argv, const char > *prefix) > die(_("-a and -r options to 'git branch' do not make > sense with a branch name")); > > if (track ==
Re: [RFC] clang-format: outline the git project's coding style
Brandon Williamswrites: >> > +# Add a line break after the return type of top-level functions >> > +# int >> > +# foo(); >> > +AlwaysBreakAfterReturnType: TopLevel >> >> We do that? > > Haha So generally no we don't do this. Though there are definitely many > places in our code base where we do. Personally this makes it a bit > easier to read when you end up having long function names. I also > worked on a code base which did this and it made it incredible easy to > grep for the definition of a function. If you grep for 'foo()' then > you'd get all the uses of the function including the definition but if > you grep for '^foo()' you'd get only the definition. > > But that's my preference, if we end up using this tool it would probably > make sense to change this. Yeah, I even know people who did int foo(void) for greppability of "^foo". It took some effort to get used to that style. >> > +# Insert a space after a cast >> > +# x = (int32) y;notx = (int32)y; >> > +SpaceAfterCStyleCast: true >> >> Hmph, I thought we did the latter, i.e. cast sticks to the casted >> expression without SP. > > I've seen both and I wasn't sure which was the correct form to use. We do the latter because checkpatch.pl from the kernel project tells us to, I think.
Re: [RFC] clang-format: outline the git project's coding style
On 08/08, Stefan Beller wrote: > On Tue, Aug 8, 2017 at 5:05 AM, Johannes Schindelin >wrote: > > Hi Brandon, > > > > On Mon, 7 Aug 2017, Brandon Williams wrote: > > > >> Add a '.clang-format' file which outlines the git project's coding > >> style. This can be used with clang-format to auto-format .c and .h > >> files to conform with git's style. > >> > >> Signed-off-by: Brandon Williams > >> --- > >> > >> I'm sure this sort of thing comes up every so often on the list but back at > >> git-merge I mentioned how it would be nice to not have to worry about style > >> when reviewing patches as that is something mechanical and best left to a > >> machine (for the most part). > > > > Amen. > > > > If I never have to see a review mentioning an unwrapped line, I am quite > > certain I will be quite content. > > > > Ciao, > > Dscho > > As a thought experiment I'd like to propose to take it one step further: > > If the code was formatted perfectly in one style such that a formatter for > this style would not produce changes when rerun again on the code, then > each individual could have a clean/smudge filter to work in their preferred > style, and only the exchange and storage of code is in a mutual agreed > style. If the mutually agreed style is close to what I prefer, I don't have > to > use clean/smudge filters. > > Additionally to this patch, we'd want to either put a note into > SubmittingPatches or Documentation/gitworkflows.txt to hint at > how to use this formatting to just affect the patch that is currently > worked on or rather a pre-commit hook? I'm sure both of these things will probably happen if we decide to make use of a code-formatter. This RFC is really just trying to ask the question: "Is this something we want to entertain doing?" My response would be 'Yes' but there's many other opinions to consider first :) -- Brandon Williams
Dear Beloved Friend
Dear Beloved Friend, I am Mrs Nicole Benoite Marois and I have been suffering from ovarian cancer disease and the doctor says that i have just few weeks to leave. I am from (Paris) France but based in Benin republic since eleven years ago as a business woman dealing with gold exportation before the death of my husband many years ago. I have $4.5 Million US Dollars at Eco-Bank here in Benin republic and I instructed the bank to transfer the fund to you as foreigner that will apply to the bank after I have gone, that they should release the fund to him/her, but you will assure me that you will take 50 % of the fund and give 50% to the orphanages home in your country for my heart to rest. Yours fairly friend, Mrs. Nicole Benoite Marois
Re: [RFC] clang-format: outline the git project's coding style
On 08/08, Junio C Hamano wrote: > Brandon Williamswrites: > > > Add a '.clang-format' file which outlines the git project's coding > > style. This can be used with clang-format to auto-format .c and .h > > files to conform with git's style. > > > > Signed-off-by: Brandon Williams > > --- > > > > I'm sure this sort of thing comes up every so often on the list but back at > > git-merge I mentioned how it would be nice to not have to worry about style > > when reviewing patches as that is something mechanical and best left to a > > machine (for the most part). I saw that 'clang-format' was brought up on > > the > > list once before a couple years ago > > (https://public-inbox.org/git/20150121220903.ga10...@peff.net/) but nothing > > really came of it. I spent a little bit of time combing through the various > > options and came up with this config based on the general style of our code > > base. The big issue though is that our code base isn't consistent so try as > > you might you wont be able to come up with a config which matches > > everything we > > do (mostly due to the inconsistencies in our code base). > > > > Anyway, I thought I'd bring this topic back up and see how people feel > > about it. > > I gave just one pass over all the rules you have here. I didn't > think too deeply about implications of some of them, but most of > them looked sensible. Thanks for compiling this set of rules. > > Having said that, it is easier to refine individual rules you have > below than to make sure that among the develoepers there is a shared > notion of how these rules are to be used. If we get that part wrong, > we'd see unpleasant consequences, e.g. > > - We may see unwanted code churn on existing codebase, only for the >sake of satisfying the formatting rules specified here. This is an issue when you have an inconsistent code base such as ours as the tool, even when used to format a diff, will format the surrounding context. > > - We may see far more style-only critique to patches posted on the >list simply because there are more readers than writers, and it >is likely that running the tool to nitpick other people's patches >is far easier than writing these patches in the first place (and >forgetting to ask the formatter to nitpick before sending them >out). I would hope that use of such a tool would eventually completely eliminate style-only critiques. > > - Human aesthetics judgement often is necessary to overrule >mechanical rules (e.g. A human may have two pairs of >parameters on separate lines in a function declaration; >BinPackParameters may try to break after ptrA, lenA, ptrB before >lenB in such a case). I know that when you introduce a formatter there are bound to be some issues where a human would choose one way over the other. If we start using a formatter I would expect some of the penalties would need to be tweaked overtime before we rely too heavily on it. > > We need to set our expectation and a guideline to apply these rules > straight, before introducing something like this. > > > > .clang-format | 166 > > ++ > > 1 file changed, 166 insertions(+) > > create mode 100644 .clang-format > > > > diff --git a/.clang-format b/.clang-format > > new file mode 100644 > > index 0..7f28dc259 > > --- /dev/null > > +++ b/.clang-format > > @@ -0,0 +1,166 @@ > > +# Defaults > > + > > +# Use tabs whenever we need to fill whitespace that spans at least from > > one tab > > +# stop to the next one. > > +UseTab: Always > > +TabWidth: 8 > > +IndentWidth: 8 > > +ContinuationIndentWidth: 8 > > +ColumnLimit: 80 > > I often deliberately chomp my lines much shorter than this limit, > and also I deliberately write a line that is a tad longer than this > limit some other times, if doing so makes the result easier to read. > And I know other develoepers with good taste do the same. It is > pointless to have a discussion that begins with "who uses a physical > terminal these days that can only show 80-columns?" to tweak this > value, I think. It is more important to give a guideline on what to > do when lines in your code goes over this limit. > > A mechanical "formatter" would just find an appropriate point in a > line with least "penalty" and chomp it into two. But an appropriate > way to make such a code that is way too deeply indented readable may > instead be judicious use of goto's and one-time helper functions, > for example, which mechanical tools would not do. > > That is an example of what I meant above, i.e. a guideline to apply > these rules. We would not want to say "clang-format suggests this > rewrite, so we should accept the resulting code that is still too > deeply indented as-is"---using the tool to point out an issue is > good, though. > > > + > > +# C Language specifics > > +Language: Cpp > > Hmph ;-)
Re: [RFC] clang-format: outline the git project's coding style
Brandon Williamswrites: > Add a '.clang-format' file which outlines the git project's coding > style. This can be used with clang-format to auto-format .c and .h > files to conform with git's style. > > Signed-off-by: Brandon Williams > --- > > I'm sure this sort of thing comes up every so often on the list but back at > git-merge I mentioned how it would be nice to not have to worry about style > when reviewing patches as that is something mechanical and best left to a > machine (for the most part). I saw that 'clang-format' was brought up on the > list once before a couple years ago > (https://public-inbox.org/git/20150121220903.ga10...@peff.net/) but nothing > really came of it. I spent a little bit of time combing through the various > options and came up with this config based on the general style of our code > base. The big issue though is that our code base isn't consistent so try as > you might you wont be able to come up with a config which matches everything > we > do (mostly due to the inconsistencies in our code base). > > Anyway, I thought I'd bring this topic back up and see how people feel about > it. I gave just one pass over all the rules you have here. I didn't think too deeply about implications of some of them, but most of them looked sensible. Thanks for compiling this set of rules. Having said that, it is easier to refine individual rules you have below than to make sure that among the develoepers there is a shared notion of how these rules are to be used. If we get that part wrong, we'd see unpleasant consequences, e.g. - We may see unwanted code churn on existing codebase, only for the sake of satisfying the formatting rules specified here. - We may see far more style-only critique to patches posted on the list simply because there are more readers than writers, and it is likely that running the tool to nitpick other people's patches is far easier than writing these patches in the first place (and forgetting to ask the formatter to nitpick before sending them out). - Human aesthetics judgement often is necessary to overrule mechanical rules (e.g. A human may have two pairs of parameters on separate lines in a function declaration; BinPackParameters may try to break after ptrA, lenA, ptrB before lenB in such a case). We need to set our expectation and a guideline to apply these rules straight, before introducing something like this. > .clang-format | 166 > ++ > 1 file changed, 166 insertions(+) > create mode 100644 .clang-format > > diff --git a/.clang-format b/.clang-format > new file mode 100644 > index 0..7f28dc259 > --- /dev/null > +++ b/.clang-format > @@ -0,0 +1,166 @@ > +# Defaults > + > +# Use tabs whenever we need to fill whitespace that spans at least from one > tab > +# stop to the next one. > +UseTab: Always > +TabWidth: 8 > +IndentWidth: 8 > +ContinuationIndentWidth: 8 > +ColumnLimit: 80 I often deliberately chomp my lines much shorter than this limit, and also I deliberately write a line that is a tad longer than this limit some other times, if doing so makes the result easier to read. And I know other develoepers with good taste do the same. It is pointless to have a discussion that begins with "who uses a physical terminal these days that can only show 80-columns?" to tweak this value, I think. It is more important to give a guideline on what to do when lines in your code goes over this limit. A mechanical "formatter" would just find an appropriate point in a line with least "penalty" and chomp it into two. But an appropriate way to make such a code that is way too deeply indented readable may instead be judicious use of goto's and one-time helper functions, for example, which mechanical tools would not do. That is an example of what I meant above, i.e. a guideline to apply these rules. We would not want to say "clang-format suggests this rewrite, so we should accept the resulting code that is still too deeply indented as-is"---using the tool to point out an issue is good, though. > + > +# C Language specifics > +Language: Cpp Hmph ;-) > +# Add a line break after the return type of top-level functions > +# int > +# foo(); > +AlwaysBreakAfterReturnType: TopLevel We do that? > +# Pack as many parameters or arguments onto the same line as possible > +# int myFunction(int , int , > +#int ); > +BinPackArguments: true > +BinPackParameters: true I am OK with this but with the caveats I already mentioned. > +# Insert a space after a cast > +# x = (int32) y;notx = (int32)y; > +SpaceAfterCStyleCast: true Hmph, I thought we did the latter, i.e. cast sticks to the casted expression without SP. Thanks.
Re: [RFC] clang-format: outline the git project's coding style
On Tue, Aug 8, 2017 at 5:05 AM, Johannes Schindelinwrote: > Hi Brandon, > > On Mon, 7 Aug 2017, Brandon Williams wrote: > >> Add a '.clang-format' file which outlines the git project's coding >> style. This can be used with clang-format to auto-format .c and .h >> files to conform with git's style. >> >> Signed-off-by: Brandon Williams >> --- >> >> I'm sure this sort of thing comes up every so often on the list but back at >> git-merge I mentioned how it would be nice to not have to worry about style >> when reviewing patches as that is something mechanical and best left to a >> machine (for the most part). > > Amen. > > If I never have to see a review mentioning an unwrapped line, I am quite > certain I will be quite content. > > Ciao, > Dscho As a thought experiment I'd like to propose to take it one step further: If the code was formatted perfectly in one style such that a formatter for this style would not produce changes when rerun again on the code, then each individual could have a clean/smudge filter to work in their preferred style, and only the exchange and storage of code is in a mutual agreed style. If the mutually agreed style is close to what I prefer, I don't have to use clean/smudge filters. Additionally to this patch, we'd want to either put a note into SubmittingPatches or Documentation/gitworkflows.txt to hint at how to use this formatting to just affect the patch that is currently worked on or rather a pre-commit hook?
Re: t3700 broken on pu on Cygwin
On Tue, Aug 08, 2017 at 05:32:21PM +0200, René Scharfe wrote: > Am 08.08.2017 um 17:18 schrieb Adam Dinwoodie: > > The t3700-add.sh test is currently failing on the pu branch on Cygwin. > > To my surprise, the problem appears to have been introduced by a merge, > > 867fa1d6a. Both parents of that merge have the test succeeding, but > > it's failing on that merge commit. > > > > Failing test output below: > > > expecting success: > > git reset --hard && > > echo foo >foo3 && > > git add foo3 && > > git add --chmod=+x foo3 && > > test_mode_in_index 100755 foo3 && > > echo foo >xfoo3 && > > chmod 755 xfoo3 && > > git add xfoo3 && > > git add --chmod=-x xfoo3 && > > test_mode_in_index 100644 xfoo3 > > > > ++ git reset --hard > > HEAD is now at d12df1f commit all > > ++ echo foo > > ++ git add foo3 > > ++ git add --chmod=+x foo3 > > ++ test_mode_in_index 100755 foo3 > > ++ case "$(git ls-files -s "$2")" in > > +++ git ls-files -s foo3 > > ++ echo pass > > pass > > ++ echo foo > > ++ chmod 755 xfoo3 > > ++ git add xfoo3 > > ++ git add --chmod=-x xfoo3 > > cannot chmod 'xfoo3'++ test_mode_in_index 100644 xfoo3 > > ++ case "$(git ls-files -s "$2")" in > > +++ git ls-files -s xfoo3 > > ++ echo fail > > fail > > ++ git ls-files -s xfoo3 > > 12 c5c4ca97a3a080c32920941b665e94a997901491 0 xfoo3 > > ++ return 1 > > + test_eval_ret_=1 > > + want_trace > > + test t = t > > + test t = t > > + set +x > > error: last command exited with $?=1 > > not ok 41 - git add --chmod=[+-]x changes index with already added file > > # > > # git reset --hard && > > # echo foo >foo3 && > > # git add foo3 && > > # git add --chmod=+x foo3 && > > # test_mode_in_index 100755 foo3 && > > # echo foo >xfoo3 && > > # chmod 755 xfoo3 && > > # git add xfoo3 && > > # git add --chmod=-x xfoo3 && > > # test_mode_in_index 100644 xfoo3 > > # > > > > That's strange. The two changes don't seem to be related at all: > > diff --git a/t/t3700-add.sh b/t/t3700-add.sh > index f3a4b4a913..06e3835efb 100755 > --- a/t/t3700-add.sh > +++ b/t/t3700-add.sh > @@ -331,9 +331,8 @@ test_expect_success 'git add --dry-run --ignore-missing > of non-existing file out > test_i18ncmp expect.err actual.err >' > > -test_expect_success 'git add empty string should invoke warning' ' > - git add "" 2>output && > - test_i18ngrep "warning: empty strings" output > +test_expect_success 'git add empty string should fail' ' > + test_must_fail git add "" >' > >test_expect_success 'git add --chmod=[+-]x stages correctly' ' > @@ -355,7 +354,7 @@ test_expect_success POSIXPERM,SYMLINKS 'git add > --chmod=+x with symlinks' ' >' > >test_expect_success 'git add --chmod=[+-]x changes index with already > added file' ' > - rm -f foo3 xfoo3 && > + git reset --hard && > echo foo >foo3 && > git add foo3 && > git add --chmod=+x foo3 && > > The only difference I can see being introduced with the first change > is that the file "output" is gone now. Yep! I thought it was strange too. I spent some time undoing and redoing the changes to check the problem was reproducible. > Does it help to add the "rm -f foo3 xfoo3 &&" back, in addition to > the "git reset --hard"? Apparently so. Including only the output from that test: expecting success: rm -f foo3 xfoo3 && git reset --hard && echo foo >foo3 && git add foo3 && git add --chmod=+x foo3 && test_mode_in_index 100755 foo3 && echo foo >xfoo3 && chmod 755 xfoo3 && git add xfoo3 && git add --chmod=-x xfoo3 && test_mode_in_index 100644 xfoo3 ++ rm -f foo3 xfoo3 ++ git reset --hard HEAD is now at 02bfd98 commit all ++ echo foo ++ git add foo3 ++ git add --chmod=+x foo3 ++ test_mode_in_index 100755 foo3 ++ case "$(git ls-files -s "$2")" in +++ git ls-files -s foo3 ++ echo pass pass ++ echo foo ++ chmod 755 xfoo3 ++ git add xfoo3 ++ git add --chmod=-x xfoo3 ++ test_mode_in_index 100644 xfoo3 ++ case "$(git ls-files -s "$2")" in +++ git ls-files -s xfoo3 ++ echo pass pass + test_eval_ret_=0 + want_trace + test t = t + test t = t + set +x ok 41 - git add --chmod=[+-]x changes index with already added file I'm running a bisect