Re: [PATCH v2 7/7] pack-objects: use mru list when iterating over packs
On Fri, Jul 29, 2016 at 12:15:24AM -0400, Jeff King wrote: > Note that this reordering does have a potential impact on > the final pack, as we store only a single "found" pack for > each object, even if it is present in multiple packs. In > principle, any copy is acceptable, as they all refer to the > same content. But in practice, they may differ in whether > they are stored as deltas, against which base, etc. This may > have an impact on delta reuse, and even the delta search > (since we skip pairs that were already in the same pack). > > It's not clear whether this change of order would hurt or > even help average cases, though. The most likely reason to > have duplicate objects is from the completion of thin packs > (e.g., you have some objects, then receive several pushes; > the packs you receive may be thin on the wire, with deltas > that refer to bases outside the pack, but we complete them > with duplicate base objects when indexing them). > > In such a case the current code would always find the thin > duplicates (because we currently walk the packs in reverse > chronological order). With this patch, it's possible that > some of them would be found in older packs instead. But > again, it's unclear whether that is a net win or loss in > practice. Hmm, so the efficacy of packing aside, this does definitely have a negative effect. I happened to have a repository sitting around that has 15 million objects and 3600 packs (don't ask), so this seemed like a good test. With this patch series, it took 11 minutes to do the counting, delta compression, and write phases. Without it, after 11 minutes git had not even gotten 10% of the way through counting. So that's good news. The not-so-good news is that during the write phase, it hit the "recursive delta detected" warning in write_one(), many times. I think what is happening is that in the original code, we cannot ever see a delta cycle, because the pack ordering is fixed. So if `A` is a delta of `B`, then we know that they must both exist in the same pack (since we do not do cross-pack deltas on disk). And because we look at the packs in the same order for each object, we know that if we are going to find `A`, we must either find `B` in the same pack (or a prior one if there is another duplicate). But if we do so, then we cannot also find `B` as a delta of `A` in that pack (because we know that packs do not have delta cycles themselves) or an earlier pack (because if so, we would have found `A` in that pack, too). But because this series switches the order of pack-lookup between objects, it is possible for us to find a `B` which is a delta against `A` in one pack, and then another copy of `A` which is a delta against another copy of `B` from another pack. We add both of the deltas to our packing list, but at write time when we try to write out all of the bases for `A`, we realize that whoops, we are recursing infinitely. As it turns out, Git actually handles this pretty well! Upon noticing the recursion, it breaks the delta chain and writes out one of the objects as a full base. This is due to Junio's f63c79d (pack-object: tolerate broken packs that have duplicated objects, 2011-11-16), though I think we later decided that duplicated objects were simply insane. So one option is to simply silence the warning, because the resulting pack is perfectly fine. But we do notice this during the write phase, after the delta search is done. So it's possible that the resulting pack is not as small as it could be (i.e., we break the chain with a base object, but it's possible if we looked that we could have broken the chain by making a delta against an existing base object). So I wonder if it's possible to detect this case earlier, during the "can we reuse this delta" bits of check_object(). Suggestions welcome. I haven't really dug past what I've written here, and it's way too late here to go any further tonight. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 7/7] pack-objects: use mru list when iterating over packs
In the original implementation of want_object_in_pack(), we always looked for the object in every pack, so the order did not matter for performance. As of the last few patches, however, we can now often break out of the loop early after finding the first instance, and avoid looking in the other packs at all. In this case, pack order can make a big difference, because we'd like to find the objects by looking at as few packs as possible. This patch switches us to the same packed_git_mru list that is now used by normal object lookups. Here are timings for p5303 on linux.git: Test HEAD^HEAD 5303.3: rev-list (1) 31.31(31.07+0.23)31.28(31.00+0.27) -0.1% 5303.4: repack (1)40.35(38.84+2.60)40.53(39.31+2.32) +0.4% 5303.6: rev-list (50) 31.37(31.15+0.21)31.41(31.16+0.24) +0.1% 5303.7: repack (50) 58.25(68.54+2.03)47.28(57.66+1.89) -18.8% 5303.9: rev-list (1000) 31.91(31.57+0.33)31.93(31.64+0.28) +0.1% 5303.10: repack (1000)304.80(376.00+3.92) 87.21(159.54+2.84) -71.4% The rev-list numbers are unchanged, which makes sense (they are not exercising this code at all). The 50- and 1000-pack repack cases show considerable improvement. The single-pack repack case doesn't, of course; there's nothing to improve. In fact, it gives us a baseline for how fast we could possibly go. You can see that though rev-list can approach the single-pack case even with 1000 packs, repack doesn't. The reason is simple: the loop we are optimizing is only part of what the repack is doing. After the "counting" phase, we do delta compression, which is much more expensive when there are multiple packs, because we have fewer deltas we can reuse (you can also see that these numbers come from a multicore machine; the CPU times are much higher than the wall-clock times due to the delta phase). So the good news is that in cases with many packs, we used to be dominated by the "counting" phase, and now we are dominated by the delta compression (which is faster, and which we have already parallelized). Here are similar numbers for git.git: Test HEAD^ HEAD - 5303.3: rev-list (1) 1.55(1.51+0.02) 1.54(1.53+0.00) -0.6% 5303.4: repack (1)1.82(1.80+0.08) 1.82(1.78+0.09) +0.0% 5303.6: rev-list (50) 1.58(1.57+0.00) 1.58(1.56+0.01) +0.0% 5303.7: repack (50) 2.50(3.12+0.07) 2.31(2.95+0.06) -7.6% 5303.9: rev-list (1000) 2.22(2.20+0.02) 2.23(2.19+0.03) +0.5% 5303.10: repack (1000)10.47(16.78+0.22) 7.50(13.76+0.22) -28.4% Not as impressive in terms of percentage, but still measurable wins. If you look at the wall-clock time improvements in the 1000-pack case, you can see that linux improved by roughly 10x as many seconds as git. That's because it has roughly 10x as many objects, and we'd expect this improvement to scale linearly with the number of objects (since the number of packs is kept constant). It's just that the "counting" phase is a smaller percentage of the total time spent for a git.git repack, and hence the percentage win is smaller. The implementation itself is a straightforward use of the MRU code. We only bother marking a pack as used when we know that we are able to break early out of the loop, for two reasons: 1. If we can't break out early, it does no good; we have to visit each pack anyway, so we might as well avoid even the minor overhead of managing the cache order. 2. The mru_mark() function reorders the list, which would screw up our traversal. So it is only safe to mark when we are about to break out of the loop. We could record the found pack and mark it after the loop finishes, of course, but that's more complicated and it doesn't buy us anything due to (1). Note that this reordering does have a potential impact on the final pack, as we store only a single "found" pack for each object, even if it is present in multiple packs. In principle, any copy is acceptable, as they all refer to the same content. But in practice, they may differ in whether they are stored as deltas, against which base, etc. This may have an impact on delta reuse, and even the delta search (since we skip pairs that were already in the same pack). It's not clear whether this change of order would hurt or even help average cases, though. The most likely reason to have duplicate objects is from the completion of thin packs (e.g., you have some objects, then receive several pushes; the packs you receive may be thin on the wire, with deltas that refer to bases outside the pack, but we complete them with duplicate base objects when indexing them). In such a case the current code would always find the thin duplicates (because we currently walk the packs in reverse chronological order). With this patch, it's
[PATCH v2 6/7] pack-objects: compute local/ignore_pack_keep early
In want_object_in_pack(), we can exit early from our loop if neither "local" nor "ignore_pack_keep" are set. If they are, however, we must examine each pack to see if it has the object and is non-local or has a ".keep". It's quite common for there to be no non-local or .keep packs at all, in which case we know ahead of time that looking further will be pointless. We can pre-compute this by simply iterating over the list of packs ahead of time, and dropping the flags if there are no packs that could match. Another similar strategy would be to modify the loop in want_object_in_pack() to notice that we have already found the object once, and that we are looping only to check for "local" and "keep" attributes. If a pack has neither of those, we can skip the call to find_pack_entry_one(), which is the expensive part of the loop. This has two advantages: - it isn't all-or-nothing; we still get some improvement when there's a small number of kept or non-local packs, and a large number of non-kept local packs - it eliminates any possible race where we add new non-local or kept packs after our initial scan. In practice, I don't think this race matters; we already cache the packed_git information, so somebody who adds a new pack or .keep file after we've started will not be noticed at all, unless we happen to need to call reprepare_packed_git() because a lookup fails. In other words, we're already racy, and the race is not a big deal (losing the race means we might include an object in the pack that would not otherwise be, which is an acceptable outcome). However, it also has a disadvantage: we still loop over the rest of the packs for each object to check their flags. This is much less expensive than doing the object lookup, but still not free. So if we wanted to implement that strategy to cover the non-all-or-nothing cases, we could do so in addition to this one (so you get the most speedup in the all-or-nothing case, and the best we can do in the other cases). But given that the all-or-nothing case is likely the most common, it is probably not worth the trouble, and we can revisit this later if evidence points otherwise. Signed-off-by: Jeff King--- builtin/pack-objects.c | 26 +- 1 file changed, 25 insertions(+), 1 deletion(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 8ad11f2..c4c2a3c 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -46,6 +46,7 @@ static int keep_unreachable, unpack_unreachable, include_tag; static unsigned long unpack_unreachable_expiration; static int pack_loose_unreachable; static int local; +static int have_non_local_packs; static int incremental; static int ignore_packed_keep; static int allow_ofs_delta; @@ -990,7 +991,8 @@ static int want_object_in_pack(const unsigned char *sha1, * we just found is going to be packed, so break * out of the loop to return 1 now. */ - if (!ignore_packed_keep && !local) + if (!ignore_packed_keep && + (!local || !have_non_local_packs)) break; if (local && !p->pack_local) @@ -2799,6 +2801,28 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) progress = 2; prepare_packed_git(); + if (ignore_packed_keep) { + struct packed_git *p; + for (p = packed_git; p; p = p->next) + if (p->pack_local && p->pack_keep) + break; + if (!p) /* no keep-able packs found */ + ignore_packed_keep = 0; + } + if (local) { + /* +* unlike ignore_packed_keep above, we do not want to +* unset "local" based on looking at packs, as it +* also covers non-local objects +*/ + struct packed_git *p; + for (p = packed_git; p; p = p->next) { + if (!p->pack_local) { + have_non_local_packs = 1; + break; + } + } + } if (progress) progress_state = start_progress(_("Counting objects"), 0); -- 2.9.2.607.g98dce7b -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 5/7] pack-objects: break out of want_object loop early
When pack-objects collects the list of objects to pack (either from stdin, or via its internal rev-list), it filters each one through want_object_in_pack(). This function loops through each existing packfile, looking for the object. When we find it, we mark the pack/offset combo for later use. However, we can't just return "yes, we want it" at that point. If --honor-pack-keep is in effect, we must keep looking to find it in _all_ packs, to make sure none of them has a .keep. Likewise, if --local is in effect, we must make sure it is not present in any non-local pack. As a result, the sum effort of these calls is effectively O(nr_objects * nr_packs). In an ordinary repository, we have only a handful of packs, and this doesn't make a big difference. But in pathological cases, it can slow the counting phase to a crawl. This patch notices the case that we have neither "--local" nor "--honor-pack-keep" in effect and breaks out of the loop early, after finding the first instance. Note that our worst case is still "objects * packs" (i.e., we might find each object in the last pack we look in), but in practice we will often break out early. On an "average" repo, my git.git with 8 packs, this shows a modest 2% (a few dozen milliseconds) improvement in the counting-objects phase of "git pack-objects --all--- Same as earlier, though I took the re-ordering and comment from Junio that came out of the earlier discussion. builtin/pack-objects.c | 16 1 file changed, 16 insertions(+) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index a2f8cfd..8ad11f2 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -977,6 +977,22 @@ static int want_object_in_pack(const unsigned char *sha1, return 1; if (incremental) return 0; + + /* +* When asked to do --local (do not include an +* object that appears in a pack we borrow +* from elsewhere) or --honor-pack-keep (do not +* include an object that appears in a pack marked +* with .keep), we need to make sure no copy of this +* object come from in _any_ pack that causes us to +* omit it, and need to complete this loop. When +* neither option is in effect, we know the object +* we just found is going to be packed, so break +* out of the loop to return 1 now. +*/ + if (!ignore_packed_keep && !local) + break; + if (local && !p->pack_local) return 0; if (ignore_packed_keep && p->pack_local && p->pack_keep) -- 2.9.2.607.g98dce7b -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 4/7] find_pack_entry: replace last_found_pack with MRU cache
Each pack has an index for looking up entries in O(log n) time, but if we have multiple packs, we have to scan through them linearly. This can produce a measurable overhead for some operations. We dealt with this long ago in f7c22cc (always start looking up objects in the last used pack first, 2007-05-30), which keeps what is essentially a 1-element most-recently-used cache. In theory, we should be able to do better by keeping a similar but longer cache, that is the same length as the pack-list itself. Since we now have a convenient generic MRU structure, we can plug it in and measure. Here are the numbers for running p5303 against linux.git: Test HEAD^HEAD 5303.3: rev-list (1) 31.56(31.28+0.27)31.30(31.08+0.20) -0.8% 5303.4: repack (1)40.62(39.35+2.36)40.60(39.27+2.44) -0.0% 5303.6: rev-list (50) 31.31(31.06+0.23)31.23(31.00+0.22) -0.3% 5303.7: repack (50) 58.65(69.12+1.94)58.27(68.64+2.05) -0.6% 5303.9: rev-list (1000) 38.74(38.40+0.33)31.87(31.62+0.24) -17.7% 5303.10: repack (1000)367.20(441.80+4.62) 342.00(414.04+3.72) -6.9% The main numbers of interest here are the rev-list ones (since that is exercising the normal object lookup code path). The single-pack case shouldn't improve at all; the 260ms speedup there is just part of the run-to-run noise (but it's important to note that we didn't make anything worse with the overhead of maintaining our cache). In the 50-pack case, we see similar results. There may be a slight improvement, but it's mostly within the noise. The 1000-pack case does show a big improvement, though. That carries over to the repack case, as well. Even though we haven't touched its pack-search loop yet, it does still do a lot of normal object lookups (e.g., for the internal revision walk), and so improves. As a point of reference, I also ran the 1000-pack test against a version of HEAD^ with the last_found_pack optimization disabled. It takes ~60s, so that gives an indication of how much even the single-element cache is helping. For comparison, here's a smaller repository, git.git: Test HEAD^ HEAD - 5303.3: rev-list (1) 1.56(1.54+0.01)1.54(1.51+0.02) -1.3% 5303.4: repack (1)1.84(1.80+0.10)1.82(1.80+0.09) -1.1% 5303.6: rev-list (50) 1.58(1.55+0.02)1.59(1.57+0.01) +0.6% 5303.7: repack (50) 2.50(3.18+0.04)2.50(3.14+0.04) +0.0% 5303.9: rev-list (1000) 2.76(2.71+0.04)2.24(2.21+0.02) -18.8% 5303.10: repack (1000)13.21(19.56+0.25) 11.66(18.01+0.21) -11.7% You can see that the percentage improvement is similar. That's because the lookup we are optimizing is roughly O(nr_objects * nr_packs). Since the number of packs is constant in both tests, we'd expect the improvement to be linear in the number of objects. But the whole process is also linear in the number of objects, so the improvement is a constant factor. The exact improvement does also depend on the contents of the packs. In p5303, the extra packs all have 5 first-parent commits in them, which is a reasonable simulation of a pushed-to repository. But it also means that only 250 first-parent commits are in those packs (compared to almost 50,000 total in linux.git), and the rest are in the huge "base" pack. So once we start looking at history in taht big pack, that's where we'll find most everything, and even the 1-element cache gets close to 100% cache hits. You could almost certainly show better numbers with a more pathological case (e.g., distributing the objects more evenly across the packs). But that's simply not that realistic a scenario, so it makes more sense to focus on these numbers. The implementation itself is a straightforward application of the MRU code. We provide an MRU-ordered list of packs that shadows the packed_git list. This is easy to do because we only create and revise the pack list in one place. The "reprepare" code path actually drops the whole MRU and replaces it for simplicity. It would be more efficient to just add new entries, but there's not much point in optimizing here; repreparing happens rarely, and only after doing a lot of other expensive work. The key things to keep optimized are traversal (which is just a normal linked list, albeit with one extra level of indirection over the regular packed_git list), and marking (which is a constant number of pointer assignments, though slightly more than the old last_found_pack was; it doesn't seem to create a measurable slowdown, though). Signed-off-by: Jeff King--- I could see an argument against this, which is basically: - this is touching a really critical and core part of the code - in normal cases, we should never grow beyond 50 packs - it seems to be a wash at 50 packs So it's all risk and no benefit. But it
[PATCH v2 3/7] add generic most-recently-used list
There are a few places in Git that would benefit from a fast most-recently-used cache (e.g., the list of packs, which we search linearly but would like to order based on locality). This patch introduces a generic list that can be used to store arbitrary pointers in most-recently-used order. The implementation is just a doubly-linked list, where "marking" an item as used moves it to the front of the list. Insertion and marking are O(1), and iteration is O(n). There's no lookup support provided; if you need fast lookups, you are better off with a different data structure in the first place. There is also no deletion support. This would not be hard to do, but it's not necessary for handling pack structs, which are created and never removed. Signed-off-by: Jeff King--- Makefile | 1 + mru.c| 50 ++ mru.h| 45 + 3 files changed, 96 insertions(+) create mode 100644 mru.c create mode 100644 mru.h diff --git a/Makefile b/Makefile index 6a13386..ad3624d 100644 --- a/Makefile +++ b/Makefile @@ -755,6 +755,7 @@ LIB_OBJS += merge.o LIB_OBJS += merge-blobs.o LIB_OBJS += merge-recursive.o LIB_OBJS += mergesort.o +LIB_OBJS += mru.o LIB_OBJS += name-hash.o LIB_OBJS += notes.o LIB_OBJS += notes-cache.o diff --git a/mru.c b/mru.c new file mode 100644 index 000..9dedae0 --- /dev/null +++ b/mru.c @@ -0,0 +1,50 @@ +#include "cache.h" +#include "mru.h" + +void mru_append(struct mru *mru, void *item) +{ + struct mru_entry *cur = xmalloc(sizeof(*cur)); + cur->item = item; + cur->prev = mru->tail; + cur->next = NULL; + + if (mru->tail) + mru->tail->next = cur; + else + mru->head = cur; + mru->tail = cur; +} + +void mru_mark(struct mru *mru, struct mru_entry *entry) +{ + /* If we're already at the front of the list, nothing to do */ + if (mru->head == entry) + return; + + /* Otherwise, remove us from our current slot... */ + if (entry->prev) + entry->prev->next = entry->next; + if (entry->next) + entry->next->prev = entry->prev; + else + mru->tail = entry->prev; + + /* And insert us at the beginning. */ + entry->prev = NULL; + entry->next = mru->head; + if (mru->head) + mru->head->prev = entry; + mru->head = entry; +} + +void mru_clear(struct mru *mru) +{ + struct mru_entry *p = mru->head; + + while (p) { + struct mru_entry *to_free = p; + p = p->next; + free(to_free); + } + mru->head = mru->tail = NULL; +} diff --git a/mru.h b/mru.h new file mode 100644 index 000..42e4aea --- /dev/null +++ b/mru.h @@ -0,0 +1,45 @@ +#ifndef MRU_H +#define MRU_H + +/** + * A simple most-recently-used cache, backed by a doubly-linked list. + * + * Usage is roughly: + * + * // Create a list. Zero-initialization is required. + * static struct mru cache; + * mru_append(, item); + * ... + * + * // Iterate in MRU order. + * struct mru_entry *p; + * for (p = cache.head; p; p = p->next) { + * if (matches(p->item)) + * break; + * } + * + * // Mark an item as used, moving it to the front of the list. + * mru_mark(, p); + * + * // Reset the list to empty, cleaning up all resources. + * mru_clear(); + * + * Note that you SHOULD NOT call mru_mark() and then continue traversing the + * list; it reorders the marked item to the front of the list, and therefore + * you will begin traversing the whole list again. + */ + +struct mru_entry { + void *item; + struct mru_entry *prev, *next; +}; + +struct mru { + struct mru_entry *head, *tail; +}; + +void mru_append(struct mru *mru, void *item); +void mru_mark(struct mru *mru, struct mru_entry *entry); +void mru_clear(struct mru *mru); + +#endif /* MRU_H */ -- 2.9.2.607.g98dce7b -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 2/7] sha1_file: drop free_pack_by_name
The point of this function is to drop an entry from the "packed_git" cache that points to a file we might be overwriting, because our contents may not be the same (and hence the only caller was pack-objects as it moved a temporary packfile into place). In older versions of git, this could happen because the names of packfiles were derived from the set of objects they contained, not the actual bits on disk. But since 1190a1a (pack-objects: name pack files after trailer hash, 2013-12-05), the name reflects the actual bits on disk, and any two packfiles with the same name can be used interchangeably. Dropping this function not only saves a few lines of code, it makes the lifetime of "struct packed_git" much easier to reason about: namely, we now do not ever free these structs. Signed-off-by: Jeff King--- cache.h | 1 - pack-write.c | 1 - sha1_file.c | 30 -- 3 files changed, 32 deletions(-) diff --git a/cache.h b/cache.h index 3855ddf..57ef726 100644 --- a/cache.h +++ b/cache.h @@ -1416,7 +1416,6 @@ extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t extern void close_pack_windows(struct packed_git *); extern void close_all_packs(void); extern void unuse_pack(struct pack_window **); -extern void free_pack_by_name(const char *); extern void clear_delta_base_cache(void); extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local); diff --git a/pack-write.c b/pack-write.c index 33293ce..ea0b788 100644 --- a/pack-write.c +++ b/pack-write.c @@ -354,7 +354,6 @@ void finish_tmp_packfile(struct strbuf *name_buffer, die_errno("unable to make temporary index file readable"); strbuf_addf(name_buffer, "%s.pack", sha1_to_hex(sha1)); - free_pack_by_name(name_buffer->buf); if (rename(pack_tmp_name, name_buffer->buf)) die_errno("unable to rename temporary pack file"); diff --git a/sha1_file.c b/sha1_file.c index d5e1121..e045d2f 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -891,36 +891,6 @@ void close_pack_index(struct packed_git *p) } } -/* - * This is used by git-repack in case a newly created pack happens to - * contain the same set of objects as an existing one. In that case - * the resulting file might be different even if its name would be the - * same. It is best to close any reference to the old pack before it is - * replaced on disk. Of course no index pointers or windows for given pack - * must subsist at this point. If ever objects from this pack are requested - * again, the new version of the pack will be reinitialized through - * reprepare_packed_git(). - */ -void free_pack_by_name(const char *pack_name) -{ - struct packed_git *p, **pp = _git; - - while (*pp) { - p = *pp; - if (strcmp(pack_name, p->pack_name) == 0) { - clear_delta_base_cache(); - close_pack(p); - free(p->bad_object_sha1); - *pp = p->next; - if (last_found_pack == p) - last_found_pack = NULL; - free(p); - return; - } - pp = >next; - } -} - static unsigned int get_max_fd_limit(void) { #ifdef RLIMIT_NOFILE -- 2.9.2.607.g98dce7b -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 1/7] t/perf: add tests for many-pack scenarios
Git's pack storage does efficient (log n) lookups in a single packfile's index, but if we have multiple packfiles, we have to linearly search each for a given object. This patch introduces some timing tests for cases where we have a large number of packs, so that we can measure any improvements we make in the following patches. The main thing we want to time is object lookup. To do this, we measure "git rev-list --objects --all", which does a fairly large number of object lookups (essentially one per object in the repository). However, we also measure the time to do a full repack, which is interesting for two reasons. One is that in addition to the usual pack lookup, it has its own linear iteration over the list of packs. And two is that because it it is the tool one uses to go from an inefficient many-pack situation back to a single pack, we care about its performance not only at marginal numbers of packs, but at the extreme cases (e.g., if you somehow end up with 5,000 packs, it is the only way to get back to 1 pack, so we need to make sure it performs well). We measure the performance of each command in three scenarios: 1 pack, 50 packs, and 1,000 packs. The 1-pack case is a baseline; any optimizations we do to handle multiple packs cannot possibly perform better than this. The 50-pack case is as far as Git should generally allow your repository to go, if you have auto-gc enabled with the default settings. So this represents the maximum performance improvement we would expect under normal circumstances. The 1,000-pack case is hopefully rare, though I have seen it in the wild where automatic maintenance was broken for some time (and the repository continued to receive pushes). This represents cases where we care less about general performance, but want to make sure that a full repack command does not take excessively long. Signed-off-by: Jeff King--- By the way, a caution before you run this. It takes one the order of an hour to run on linux.git. So if you're comparing a few builds, be patient. :) t/perf/p5303-many-packs.sh | 87 ++ 1 file changed, 87 insertions(+) create mode 100755 t/perf/p5303-many-packs.sh diff --git a/t/perf/p5303-many-packs.sh b/t/perf/p5303-many-packs.sh new file mode 100755 index 000..3779851 --- /dev/null +++ b/t/perf/p5303-many-packs.sh @@ -0,0 +1,87 @@ +#!/bin/sh + +test_description='performance with large numbers of packs' +. ./perf-lib.sh + +test_perf_large_repo + +# A real many-pack situation would probably come from having a lot of pushes +# over time. We don't know how big each push would be, but we can fake it by +# just walking the first-parent chain and having every 5 commits be their own +# "push". This isn't _entirely_ accurate, as real pushes would have some +# duplicate objects due to thin-pack fixing, but it's a reasonable +# approximation. +# +# And then all of the rest of the objects can go in a single packfile that +# represents the state before any of those pushes (actually, we'll generate +# that first because in such a setup it would be the oldest pack, and we sort +# the packs by reverse mtime inside git). +repack_into_n () { + rm -rf staging && + mkdir staging && + + git rev-list --first-parent HEAD | + sed -n '1~5p' | + head -n "$1" | + perl -e 'print reverse <>' \ + >pushes + + # create base packfile + head -n 1 pushes | + git pack-objects --delta-base-offset --revs staging/pack + + # and then incrementals between each pair of commits + last= && + while read rev + do + if test -n "$last"; then + { + echo "$rev" && + echo "^$last" + } | + git pack-objects --delta-base-offset --revs \ + staging/pack || return 1 + fi + last=$rev + done /dev/null + ' + + # This simulates the interesting part of the repack, which is the + # actual pack generation, without smudging the on-disk setup + # between trials. + test_perf "repack ($nr_packs)" ' + git pack-objects --keep-true-parents \ + --honor-pack-keep --non-empty --all \ + --reflog --indexed-objects --delta-base-offset \ + --stdout /dev/null + ' +done + +test_done -- 2.9.2.607.g98dce7b -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 0/7] speed up pack-objects counting with many packs
This is a follow-up to the patches in http://public-inbox.org/git/20160725184938.ga12...@sigill.intra.peff.net/ that are currently queued in jk/pack-objects-optim-skimming. Roughly, they try to optimize a loop that is O(nr_objects * nr_packs) by breaking out early in some cases. I had written those patches a while ago and confirmed that they did speed up a particular nasty case I had. But when I tried to write a t/perf test to show off the improvement, I found that they didn't help! The reason is that the optimizations are heavily dependent on the order of the packs, and which objects go in which pack. The loop has the same worst-case complexity as it always did, but we rely on getting lucky to break out early. I think the perf test I've included here is more representative of a real-world workloads, and with an extra optimization, I was able to show good numbers with it. The general strategy is to order the pack lookups in most-recently-used order. This replaces an existing 1-element MRU cache in the normal pack lookup code, and replaces a straight reverse-chronological iteration in pack-objects. All credit for thinking of this scheme goes to Michael Haggerty, who suggested the idea to me about six months ago. It seemed like a lot of work at the time, so I didn't do it. :) But as I started to implement the same 1-element cache in pack-objects, I found that the code actually gets rather awkward. The MRU solution makes the callers easier to read, and of course it turns out to be faster, to boot. Anyway, enough chit-chat. The patches are: [1/7]: t/perf: add tests for many-pack scenarios [2/7]: sha1_file: drop free_pack_by_name [3/7]: add generic most-recently-used list [4/7]: find_pack_entry: replace last_found_pack with MRU cache [5/7]: pack-objects: break out of want_object loop early [6/7]: pack-objects: compute local/ignore_pack_keep early [7/7]: pack-objects: use mru list when iterating over packs The actual optimizations are in patches 4 and 7, which have their own numbers. But here are end-to-end numbers for the series against the tip of master (for the meanings, see the discussion in patch 1, and the analysis in 4 and 7): [p5303, linux.git] Test originHEAD - 5303.3: rev-list (1) 31.48(31.20+0.27) 31.18(30.95+0.22) -1.0% 5303.4: repack (1)40.74(39.27+2.56) 40.30(38.96+2.47) -1.1% 5303.6: rev-list (50) 31.65(31.38+0.26) 31.26(31.02+0.23) -1.2% 5303.7: repack (50) 60.90(71.03+2.13) 46.95(57.46+1.85) -22.9% 5303.9: rev-list (1000) 38.63(38.25+0.37) 31.91(31.61+0.28) -17.4% 5303.10: repack (1000)392.52(467.09+5.05) 87.38(159.98+2.92) -77.7% [p5303, git.git] Test origin HEAD - 5303.3: rev-list (1) 1.55(1.54+0.00) 1.56(1.54+0.01) +0.6% 5303.4: repack (1)1.83(1.82+0.06) 1.82(1.82+0.05) -0.5% 5303.6: rev-list (50) 1.58(1.56+0.02) 1.58(1.57+0.00) +0.0% 5303.7: repack (50) 2.50(3.16+0.04) 2.32(2.92+0.09) -7.2% 5303.9: rev-list (1000) 2.64(2.61+0.02) 2.23(2.21+0.01) -15.5% 5303.10: repack (1000)12.68(19.07+0.30) 7.51(13.86+0.20) -40.8% For curiosity, I also ran the git.git case with 10,000 packs. This is even more silly, but it shows that the problem does get worse and worse as the number grows, but that the patches do continue to help: Testorigin HEAD - 5303.12: rev-list (10,000) 26.00(25.86+0.13)15.76(15.62+0.13) -39.4% 5303.13: repack (10,000) 164.11(175.30+1.34) 51.48(62.96+1.18) -68.6% -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] git-format-patch: default to --from to avoid spoofed mails?
On Thu, Jul 28, 2016 at 08:16:19PM -0400, Jeff King wrote: > The question in my mind is whether people actually use format-patch for > things besides emailing, and if the final destination is something other > than "git am". It is a handy format because it is the least-lossy way > to move commits around external to git itself. That's why "rebase" used > it originally. If the final destination is "am" (as it is for rebase), > then in-body headers are OK, because we know it understands those. If > not, then it's a regression. > > I think on the whole that defaulting to "--from" would help more people > than hurt them, but if we do believe there are scripts that would be > regressed, it probably needs a deprecation period. I don't think it's likely that there are scripts that would be regressed (and I think it's likely that there are scripts that would be progressed), but I'd also have no objection to a deprecation period. I just confirmed that with the default changed, --no-from works to return to the current behavior, so we don't need a new option. And --no-from has worked for a long time, so scripts won't need to care if they're working with an old version of git. I can provide a patch implementing a new config option to set the format-patch --from default ("false" for --no-from, "true" for --from, or a string value for --from=value). Do you think this needs the kind of very noisy deprecation period that push.default had, where anyone without the git-config option set gets a warning to stderr? Or do you think it would suffice to provide a warning in the release notes for a while and then change the default? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv1] completion: add option '--recurse-submodules' to 'git clone'
On Thu, Jul 28, 2016 at 9:22 AM, Junio C Hamanowrote: > Stefan Beller writes: > >>> Anyway, I'll apply the "addition to the completion" patch. >>> >>> Thanks. >> >> Thanks for this patch! >> >> Note: if we ever decide to resurrect sb/submodule-default-path, >> we run into a merge conflict. The reasoning for using >> "--recurse-submodules" instead of a plain "--recurse" makes sense >> as well, so that merge conflict will be resolved in favor of this patch. > > Thanks for an advance warning. My rerere database has already been > taught about this conflict ;-) > > As to sb/submodule-default-path topic, which has been blocked on > still-in-flux attribute work, I am tempted to declare that the > attribute work is not yet thread-ready but it is in a good enough > shape to base other works on, and have them advance to 'next'. I am tempted to ask for delaying sb/submodule-default-path a bit more and see if there is an alternative solution. Inspired by Duys series on submodules and worktree I found a new way how to do the submodule management (separation of the submodule URL as a flag indicating the existence and the actual URL), and that will be very similar to what is implemented in sb/submodule-default-path just even more streamlined. (That said the state of that new series is still vapor and not yet solid, so ... it's hopes and dreams.) > > The traditional pattern of allowing the callers to randomly allocate > an array of "struct git_attr_check" and passing the pointer to its > first element to git_check_attr() function was impossible to extend > without having to update the callers, but we have migrated away from > the pattern and the attribute subsystem can be enhanced without > impacting the callers too much. > -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCHv3 6/7] submodule--helper: add remote-branch helper
In a later patch we want to enhance the logic for the branch selection. Rewrite the current logic to be in C, so we can directly use C when we enhance the logic. Signed-off-by: Stefan Beller--- builtin/submodule--helper.c | 28 +++- git-submodule.sh| 2 +- 2 files changed, 28 insertions(+), 2 deletions(-) diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c index fb90c64..710048f 100644 --- a/builtin/submodule--helper.c +++ b/builtin/submodule--helper.c @@ -899,6 +899,31 @@ static int resolve_relative_path(int argc, const char **argv, const char *prefix return 0; } +static const char *remote_submodule_branch(const char *path) +{ + const struct submodule *sub; + gitmodules_config(); + git_config(submodule_config, NULL); + + sub = submodule_from_path(null_sha1, path); + if (!sub->branch) + return "master"; + + return sub->branch; +} + +static int resolve_remote_submodule_branch(int argc, const char **argv, + const char *prefix) +{ + struct strbuf sb = STRBUF_INIT; + if (argc != 2) + die("submodule--helper remote-branch takes exactly one arguments, got %d", argc); + + printf("%s", remote_submodule_branch(argv[1])); + strbuf_release(); + return 0; +} + struct cmd_struct { const char *cmd; int (*fn)(int, const char **, const char *); @@ -912,7 +937,8 @@ static struct cmd_struct commands[] = { {"relative-path", resolve_relative_path}, {"resolve-relative-url", resolve_relative_url}, {"resolve-relative-url-test", resolve_relative_url_test}, - {"init", module_init} + {"init", module_init}, + {"remote-branch", resolve_remote_submodule_branch} }; int cmd_submodule__helper(int argc, const char **argv, const char *prefix) diff --git a/git-submodule.sh b/git-submodule.sh index 0d4021f..1e493a8 100755 --- a/git-submodule.sh +++ b/git-submodule.sh @@ -616,7 +616,7 @@ cmd_update() if test -n "$remote" then - branch=$(get_submodule_config "$name" branch master) + branch=$(git submodule--helper remote-branch "$sm_path") if test -z "$nofetch" then # Fetch remote before determining tracking $sha1 -- 2.9.2.472.g1ffb07c.dirty -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCHv3 3/7] submodule update: narrow scope of local variable
Signed-off-by: Stefan Beller--- git-submodule.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/git-submodule.sh b/git-submodule.sh index 174f4ea..0d4021f 100755 --- a/git-submodule.sh +++ b/git-submodule.sh @@ -591,7 +591,6 @@ cmd_update() name=$(git submodule--helper name "$sm_path") || exit url=$(git config submodule."$name".url) - branch=$(get_submodule_config "$name" branch master) if ! test -z "$update" then update_module=$update @@ -617,6 +616,7 @@ cmd_update() if test -n "$remote" then + branch=$(get_submodule_config "$name" branch master) if test -z "$nofetch" then # Fetch remote before determining tracking $sha1 -- 2.9.2.472.g1ffb07c.dirty -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCHv3 5/7] submodule-config: keep configured branch around
The branch field will be used in a later patch by `submodule update`. Signed-off-by: Stefan Beller--- submodule-config.c | 11 ++- submodule-config.h | 1 + 2 files changed, 11 insertions(+), 1 deletion(-) diff --git a/submodule-config.c b/submodule-config.c index 077db40..ebee1e4 100644 --- a/submodule-config.c +++ b/submodule-config.c @@ -59,6 +59,7 @@ static void free_one_config(struct submodule_entry *entry) { free((void *) entry->config->path); free((void *) entry->config->name); + free((void *) entry->config->branch); free((void *) entry->config->update_strategy.command); free(entry->config); } @@ -199,6 +200,7 @@ static struct submodule *lookup_or_create_by_name(struct submodule_cache *cache, submodule->update_strategy.command = NULL; submodule->fetch_recurse = RECURSE_SUBMODULES_NONE; submodule->ignore = NULL; + submodule->branch = NULL; submodule->recommend_shallow = -1; hashcpy(submodule->gitmodules_sha1, gitmodules_sha1); @@ -358,9 +360,16 @@ static int parse_config(const char *var, const char *value, void *data) if (!me->overwrite && submodule->recommend_shallow != -1) warn_multiple_config(me->commit_sha1, submodule->name, "shallow"); - else { + else submodule->recommend_shallow = git_config_bool(var, value); + } else if (!strcmp(item.buf, "branch")) { + if (!me->overwrite && submodule->branch) + warn_multiple_config(me->commit_sha1, submodule->name, +"branch"); + else { + free((void *)submodule->branch); + submodule->branch = xstrdup(value); } } diff --git a/submodule-config.h b/submodule-config.h index b1fdcc0..d05c542 100644 --- a/submodule-config.h +++ b/submodule-config.h @@ -15,6 +15,7 @@ struct submodule { const char *url; int fetch_recurse; const char *ignore; + const char *branch; struct submodule_update_strategy update_strategy; /* the sha1 blob id of the responsible .gitmodules file */ unsigned char gitmodules_sha1[20]; -- 2.9.2.472.g1ffb07c.dirty -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCHv3 7/7] submodule update: allow '.' for branch value
Gerrit has a "superproject subscription" feature[1], that triggers a commit in a superproject that is subscribed to its submodules. Conceptually this Gerrit feature can be done on the client side with Git via (except for raciness, error handling etc): while [ true ]; do git -C submodule update --remote --force git -C commit -a -m "Update submodules" git -C push done for each branch in the superproject. To ease the configuration in Gerrit a special value of "." has been introduced for the submodule..branch to mean the same branch as the superproject[2], such that you can create a new branch on both superproject and the submodule and this feature continues to work on that new branch. Now we find projects in the wild with such a .gitmodules file. The .gitmodules used in these Gerrit projects do not conform to Gits understanding of how .gitmodules should look like. This teaches Git to deal gracefully with this syntax as well. The redefinition of "." does no harm to existing projects unaware of this change, as "." is an invalid branch name in Git, so we do not expect such projects to exist. Signed-off-by: Stefan Beller--- builtin/submodule--helper.c | 18 ++ t/t7406-submodule-update.sh | 36 +++- 2 files changed, 53 insertions(+), 1 deletion(-) diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c index 710048f..ae88eff 100644 --- a/builtin/submodule--helper.c +++ b/builtin/submodule--helper.c @@ -909,6 +909,24 @@ static const char *remote_submodule_branch(const char *path) if (!sub->branch) return "master"; + if (!strcmp(sub->branch, ".")) { + unsigned char sha1[20]; + const char *refname = resolve_ref_unsafe("HEAD", 0, sha1, NULL); + + if (!refname) + die(_("No such ref: %s"), "HEAD"); + + /* detached HEAD */ + if (!strcmp(refname, "HEAD")) + die(_("Submodule (%s) branch configured to inherit " + "branch from superproject, but the superproject " + "is not on any branch"), sub->name); + + if (!skip_prefix(refname, "refs/heads/", )) + die(_("Expecting a full ref name, got %s"), refname); + return refname; + } + return sub->branch; } diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh index 1bb1f43..1c4c1f2 100755 --- a/t/t7406-submodule-update.sh +++ b/t/t7406-submodule-update.sh @@ -209,9 +209,43 @@ test_expect_success 'submodule update --remote should fetch upstream changes' ' ) ' +test_expect_success 'submodule update --remote should fetch upstream changes with .' ' + ( + cd super && + git config -f .gitmodules submodule."submodule".branch "." && + git add .gitmodules && + git commit -m "submodules: update from the respective superproject branch" + ) && + ( + cd submodule && + echo line4a >> file && + git add file && + test_tick && + git commit -m "upstream line4a" && + git checkout -b test-branch && + test_commit on-test-branch + ) && + ( + cd super && + git submodule update --remote --force submodule && + git -C submodule log -1 --oneline >actual + git -C ../submodule log -1 --oneline master >expect + test_cmp expect actual && + git checkout -b test-branch && + git submodule update --remote --force submodule && + git -C submodule log -1 --oneline >actual + git -C ../submodule log -1 --oneline test-branch >expect + test_cmp expect actual && + git checkout master && + git branch -d test-branch && + #~ git -C ../submodule branch -d test-branch && + git reset --hard HEAD^ + ) +' + test_expect_success 'local config should override .gitmodules branch' ' (cd submodule && -git checkout -b test-branch && +git checkout test-branch && echo line5 >> file && git add file && test_tick && -- 2.9.2.472.g1ffb07c.dirty -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCHv3 1/7] t7406: future proof tests with hard coded depth
The prior hard coded depth was chosen to be exactly the length from the recorded gitlink to the tip of the remote, so if you add more commits to the remote before, this test will not test its intention any more. Signed-off-by: Stefan Beller--- t/t7406-submodule-update.sh | 19 +++ 1 file changed, 11 insertions(+), 8 deletions(-) diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh index 88e9750..8fc3a25 100755 --- a/t/t7406-submodule-update.sh +++ b/t/t7406-submodule-update.sh @@ -841,16 +841,19 @@ test_expect_success SYMLINKS 'submodule update can handle symbolic links in pwd' ' test_expect_success 'submodule update clone shallow submodule' ' + test_when_finished "rm -rf super3" && + first=$(git -C cloned submodule status submodule |cut -c2-41) && + second=$(git -C submodule rev-parse HEAD) && + commit_count=$(git -C submodule rev-list $first^..$second | wc -l) && git clone cloned super3 && pwd=$(pwd) && - (cd super3 && -sed -e "s#url = ../#url = file://$pwd/#" <.gitmodules >.gitmodules.tmp && -mv -f .gitmodules.tmp .gitmodules && -git submodule update --init --depth=3 -(cd submodule && - test 1 = $(git log --oneline | wc -l) -) -) + ( + cd super3 && + sed -e "s#url = ../#url = file://$pwd/#" <.gitmodules >.gitmodules.tmp && + mv -f .gitmodules.tmp .gitmodules && + git submodule update --init --depth=$commit_count && + test 1 = $(git -C submodule log --oneline | wc -l) + ) ' test_expect_success 'submodule update --recursive drops module name before recursing' ' -- 2.9.2.472.g1ffb07c.dirty -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCHv3 2/7] submodule update: respect depth in subsequent fetches
When depth is given the user may have a reasonable expectation that any remote operation is using the given depth. Add a test to demonstrate we still get the desired sha1 even if the depth is too short to include the actual commit. Signed-off-by: Stefan Beller--- git-submodule.sh| 9 + t/t7406-submodule-update.sh | 16 2 files changed, 21 insertions(+), 4 deletions(-) diff --git a/git-submodule.sh b/git-submodule.sh index 4ec7546..174f4ea 100755 --- a/git-submodule.sh +++ b/git-submodule.sh @@ -481,7 +481,8 @@ fetch_in_submodule () ( '') git fetch ;; *) - git fetch $(get_default_remote) "$2" ;; + shift + git fetch $(get_default_remote) "$@" ;; esac ) @@ -619,7 +620,7 @@ cmd_update() if test -z "$nofetch" then # Fetch remote before determining tracking $sha1 - fetch_in_submodule "$sm_path" || + fetch_in_submodule "$sm_path" $depth || die "$(eval_gettext "Unable to fetch in submodule path '\$sm_path'")" fi remote_name=$(sanitize_submodule_env; cd "$sm_path" && get_default_remote) @@ -642,13 +643,13 @@ cmd_update() # Run fetch only if $sha1 isn't present or it # is not reachable from a ref. is_tip_reachable "$sm_path" "$sha1" || - fetch_in_submodule "$sm_path" || + fetch_in_submodule "$sm_path" $depth || die "$(eval_gettext "Unable to fetch in submodule path '\$displaypath'")" # Now we tried the usual fetch, but $sha1 may # not be reachable from any of the refs is_tip_reachable "$sm_path" "$sha1" || - fetch_in_submodule "$sm_path" "$sha1" || + fetch_in_submodule "$sm_path" $depth "$sha1" || die "$(eval_gettext "Fetched in submodule path '\$displaypath', but it did not contain \$sha1. Direct fetching of that commit failed.")" fi diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh index 8fc3a25..1bb1f43 100755 --- a/t/t7406-submodule-update.sh +++ b/t/t7406-submodule-update.sh @@ -856,6 +856,22 @@ test_expect_success 'submodule update clone shallow submodule' ' ) ' +test_expect_success 'submodule update clone shallow submodule outside of depth' ' + test_when_finished "rm -rf super3" && + git clone cloned super3 && + pwd=$(pwd) && + ( + cd super3 && + sed -e "s#url = ../#url = file://$pwd/#" <.gitmodules >.gitmodules.tmp && + mv -f .gitmodules.tmp .gitmodules && + test_must_fail git submodule update --init --depth=1 2>actual && + test_i18ngrep "Direct fetching of that commit failed." actual && + git -C ../submodule config uploadpack.allowReachableSHA1InWant true && + git submodule update --init --depth=1 >actual && + test 1 = $(git -C submodule log --oneline | wc -l) + ) +' + test_expect_success 'submodule update --recursive drops module name before recursing' ' (cd super2 && (cd deeper/submodule/subsubmodule && -- 2.9.2.472.g1ffb07c.dirty -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCHv3 4/7] submodule--helper: fix usage string for relative-path
Internally we call the underscore version of relative_path, but externally we present an API with no underscores. Signed-off-by: Stefan Beller--- builtin/submodule--helper.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c index b22352b..fb90c64 100644 --- a/builtin/submodule--helper.c +++ b/builtin/submodule--helper.c @@ -892,7 +892,7 @@ static int resolve_relative_path(int argc, const char **argv, const char *prefix { struct strbuf sb = STRBUF_INIT; if (argc != 3) - die("submodule--helper relative_path takes exactly 2 arguments, got %d", argc); + die("submodule--helper relative-path takes exactly 2 arguments, got %d", argc); printf("%s", relative_path(argv[1], argv[2], )); strbuf_release(); -- 2.9.2.472.g1ffb07c.dirty -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCHv3 0/7] submodule update: allow '.' for branch value
This got bigger than expected, but I am happier with the results. The meat is found in the last patch. (At least what I am interested in; others may be more interested in the second patch which could be argued to be a real bug fix to be merged down to maint.) Thanks Junio for the thourough review of the first patches, Stefan Stefan Beller (7): t7406: future proof tests with hard coded depth submodule update: respect depth in subsequent fetches submodule update: narrow scope of local variable submodule--helper: fix usage string for relative-path submodule-config: keep configured branch around submodule--helper: add remote-branch helper submodule update: allow '.' for branch value builtin/submodule--helper.c | 48 -- git-submodule.sh| 11 +++ submodule-config.c | 11 ++- submodule-config.h | 1 + t/t7406-submodule-update.sh | 71 +++-- 5 files changed, 125 insertions(+), 17 deletions(-) -- 2.9.2.472.g1ffb07c.dirty -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Small trivial annoyance with the nice new builtin "git am"
On Thu, Jul 28, 2016 at 5:29 PM, Jeff Kingwrote: > > I guess we want something like: > > +void reset_ident_date(void) > +{ > + strbuf_reset(_default_date); > +} Looks sane. > and then to sprinkle calls liberally through builtin-ified programs when > they move from one unit of work to the next. Maybe we can just add it to the end of commit_tree_extended(), and just say "the cache is reset between commits". That way there is no sprinking in random places. Linus -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Small trivial annoyance with the nice new builtin "git am"
On Thu, Jul 28, 2016 at 5:21 PM, Jeff Kingwrote: > > I do wonder if you would be happier giving each commit a "fake" > monotonically increasing time, so they are correctly ordered by commit > date. No, that would be nasty, partly for the corner case you mention, but partly because I just think it's wrong to try to lie about reality. The reason I noticed this in the first place was actually that I just wanted to take a look whether things had gotten slower or faster over time, and see how many patches per second I get from the patch-bombs Andrew sends me. So getting real time was what I was looking for. Also, before somebody asks: the reason git has always cached the "default time" string is because there's a reverse annoying thing, which is looking up time twice, and getting a difference of a second between author times and committer times just because of bad luck. That makes no sense either, and is actually why we have that "ident_default_date()" cache thing going on. So we do want to cache things for a single commit, it's just that for things like "git am" (or, like Junio wondered, "git rebase" - I didn't check) we probabyl just just flush the cache in between commits. Linus -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Small trivial annoyance with the nice new builtin "git am"
On Thu, Jul 28, 2016 at 04:47:17PM -0700, Junio C Hamano wrote: > Also makes me wonder if "git cherry-pick A..B" shares the same > breakage. Probably. I guess we want something like: diff --git a/ident.c b/ident.c index 139c528..e20a772 100644 --- a/ident.c +++ b/ident.c @@ -184,6 +184,11 @@ static const char *ident_default_date(void) return git_default_date.buf; } +void reset_ident_date(void) +{ + strbuf_reset(_default_date); +} + static int crud(unsigned char c) { return c <= 32 || and then to sprinkle calls liberally through builtin-ified programs when they move from one unit of work to the next. We could also support resetting the whole ident string, but I don't think there's any reason to (and unlike the datestamp, it has to do a bit more looking around to come up with its values). -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Small trivial annoyance with the nice new builtin "git am"
On Thu, Jul 28, 2016 at 04:35:58PM -0700, Linus Torvalds wrote: > Now, it would be lovely if the new builtin git-am really was *so* fast > that it applies a 100+-patch series in under a second, but no, that's > not it. It's just that it only looks up the current time once. > > That seems entirely accidental, I think that what happened is that > "ident_default_date()" just ends up initializing the default date > string once, and then the date is cached there, because it's now run > as a single process for the whole series. > > I think I'd rather get the "real" commit dates, even if they show that > git only does a handful of commits a second rather than hundreds of > commits.. > > Something that just clears git_default_date in between "git am" > iterations, perhaps? Yeah, your analysis makes sense and I think clearing the date between patches is a reasonable solution. I do wonder if you would be happier giving each commit a "fake" monotonically increasing time, so they are correctly ordered by commit date. I think that runs into some bizarre corner cases, though, like adding 100 patches in 5 seconds, and ending up with commit timestamps just slightly in the future (which is fine if you're then quiet, but skews if you then follow-up in the next 95 seconds with another commit). Compared to skew, having chunks of 20 commits with identical time stamps is probably slightly less bad. At least it reflects reality. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] git-format-patch: default to --from to avoid spoofed mails?
On Thu, Jul 28, 2016 at 04:53:16PM -0700, Josh Triplett wrote: > I can think of aesthetic reasons to want the non-"--from" format (for > instance, sticking patch files into a non-git-based tool like quilt or a > distribution packaging system, and not wanting your own email address > included), but I can't think of any tool that would produce incorrect > results if handed the --from format. That seems like an argument for > switching the default, and adding a --from-author option or similar to > get the current output. I covered most of my view elsewhere in the thread, but I want to be clear that I also do not know of any widespread tool that would have a problem. I am mostly worried about breaking people's private scripts. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] git-format-patch: default to --from to avoid spoofed mails?
On Thu, Jul 28, 2016 at 03:14:48PM -0700, Junio C Hamano wrote: > > I think the original reason I did not make "--from" the default is that > > I was worried about breaking consumers which do not know how to handle > > in-body headers. > > That's a fair concern. > > So going back to Josh's original problem description: > > While git-send-email knows how to change the patch mails to use your own > address as "From:" and add a "From:" line to the body for the author, > any other tool used to send emails doesn't do that. > > I wonder how these "any other tool" (that reads the format-patch > output, i.e. mbox file with one mail per file each, and sends each > as a piece of e-mail, without paying attention who you, the tool's > user, are and blindly send them with the original "From:" and other > headers intact in the header part of the message) are used in the > wild to send patch submissions. /usr/bin/mail or /usr/bin/Mail > would not be among them, as I suspect they would place everything in > the body part, and the would do so without stripping the "From " > line that exists before each e-mail message. I cannot speak for everybody, of course, but the reason I implemented "--from" is because my workflow is basically: git format-patch --from --stdout @{u}..HEAD >mbox mutt -f mbox and then I use mutt's "resend" command to send each one. Mutt uses the "From" header written by format-patch as the default (and so I would have to manually move the headers around if not for "--from"). The commands above are wrapped in a script, so I have no problem remembering to type "--from", but I can see how it would be irritating for general use. I would go so far as to say that any time the patches are going to be mailed, that "--from" is the right thing to do (because otherwise you are relying on your MUA to avoid impersonating the original author). The question in my mind is whether people actually use format-patch for things besides emailing, and if the final destination is something other than "git am". It is a handy format because it is the least-lossy way to move commits around external to git itself. That's why "rebase" used it originally. If the final destination is "am" (as it is for rebase), then in-body headers are OK, because we know it understands those. If not, then it's a regression. I think on the whole that defaulting to "--from" would help more people than hurt them, but if we do believe there are scripts that would be regressed, it probably needs a deprecation period. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] git-format-patch: default to --from to avoid spoofed mails?
On Thu, Jul 28, 2016 at 02:37:04PM -0700, Junio C Hamano wrote: > Josh Triplettwrites: > > > I'd like to propose changing the default behavior of git-format-patch to > > --from (and adding a --from-author option to override, and perhaps a > > config setting). This will not change the output *except* when > > formatting patches authored by someone else. git-am and git-send-email > > both handle the --from format without any issues. > > I see this in "format-patch --help": > >Note that this option is only useful if you are actually >sending the emails and want to identify yourself as the >sender, but retain the original author (and git am will >correctly pick up the in-body header). Note also that >git send-email already handles this transformation for >you, and this option should not be used if you are >feeding the result to git send-email. > > The first one says "only useful", but it seems what it really means > is "it becomes no-op if you are sending your own patch anyway". So > that one does not worry me. What is most worrysome is the latter > half of the last sentence. Is it really "should not be", or is it > merely "use of this option is just a waste of time, as you would get > exactly the same result anyway"? If it is the latter, that is fine. As far as I can tell, it's the latter. git send-email can do this same transformation, but handles mails that already have the transformation done to them without any issue. > One thing I absolutely do not want to see is people to start > repeating their own ident on in-body "From: " header when they send > their own patch. That would waste everybody's time pointing out > "You do not have to do that, it merely adds noise". As long as you > can guarantee that your change won't increase the rate of that, I am > fine with the proposal. git format-patch with --from *only* adds an in-body "From:" if the commit author differs from the local committer identity. So, as far as I can tell, the only scenario that would produce additional in-body "From:" headers here would be if someone had failed to configure their git identity, and manually set the author for their own commits. (In which case, they'd also have a broken "From:" in any cover letter they generated.) So, it seems exceedingly unlikely to me that this would result in unnecessary in-body "From:" headers. - Josh Triplett -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] git-format-patch: default to --from to avoid spoofed mails?
On Thu, Jul 28, 2016 at 05:56:03PM -0400, Jeff King wrote: > Another way to think about it is that "--from" is a no-brainer when you > really are going to email the patches (and that's why it is has always > been the default behavior in git-send-email). But if you _aren't_ going > to mail the patches, retaining the original headers is more convenient. > It's not clear to me how many non-mail users of format-patch there are > (certainly rebase is one of them, but because it uses "am" on the > receiving side, I think everything should Just Work). I've seen various people use git format-patch to produce a patch file to email around, or a patch file to commit to a patches directory in a distribution package. I don't think any such tools would work incorrectly with the --from output, though for purely aesthetic reasons, I can imagine not wanting to include your own email address when not sending the patch by email. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] git-format-patch: default to --from to avoid spoofed mails?
On Thu, Jul 28, 2016 at 03:14:48PM -0700, Junio C Hamano wrote: > Jeff Kingwrites: > > I think the original reason I did not make "--from" the default is that > > I was worried about breaking consumers which do not know how to handle > > in-body headers. > > That's a fair concern. > > So going back to Josh's original problem description: > > While git-send-email knows how to change the patch mails to use your own > address as "From:" and add a "From:" line to the body for the author, > any other tool used to send emails doesn't do that. > > I wonder how these "any other tool" (that reads the format-patch > output, i.e. mbox file with one mail per file each, and sends each > as a piece of e-mail, without paying attention who you, the tool's > user, are and blindly send them with the original "From:" and other > headers intact in the header part of the message) are used in the > wild to send patch submissions. /usr/bin/mail or /usr/bin/Mail > would not be among them, as I suspect they would place everything in > the body part, and the would do so without stripping the "From " > line that exists before each e-mail message. mutt -H would be one example; I regularly use that to send mails. (It'll override "From:" if it doesn't know the address in it, which loses the author information entirely; it'll work fine with the --from format.) git-imap-send would be another example; its behavior would vary by mail client. Both of those should always work fine with a mail produced via --from; they'll just ignore the in-body "From:" and send the mail. They'd tend to do the wrong thing with a mail produced without using --from though. I don't know what people who end up sending From-spoofed mails to LKML are using, but I've seen such mails regularly. I also get occasional blowback from someone who sent such mails including patches I authored with my address spoofed as "From:". And I've also seen someone flamed for sending patches to a mailing list for review with spoofed "From:" addresses. I can think of aesthetic reasons to want the non-"--from" format (for instance, sticking patch files into a non-git-based tool like quilt or a distribution packaging system, and not wanting your own email address included), but I can't think of any tool that would produce incorrect results if handed the --from format. That seems like an argument for switching the default, and adding a --from-author option or similar to get the current output. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Small trivial annoyance with the nice new builtin "git am"
On Thu, Jul 28, 2016 at 4:35 PM, Linus Torvaldswrote: > Ok, it's no longer *that* new, but I only now noticed.. > > So I noticed that when I applied the last patch-bomb series from > Andrew, all the commit date-stamps are idential. > ... > That seems entirely accidental, I think that what happened is that > "ident_default_date()" just ends up initializing the default date > string once, and then the date is cached there, because it's now run > as a single process for the whole series. Ouch, sorry, that certainly sounds utterly broken. I have a suspicion that we would see more and more breakages like this in the near future, as there are folks trying to redo multi-commit commands in a single process. We need to be very careful to avoid the same mistake. Also makes me wonder if "git cherry-pick A..B" shares the same breakage. Thanks. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Small trivial annoyance with the nice new builtin "git am"
Ok, it's no longer *that* new, but I only now noticed.. So I noticed that when I applied the last patch-bomb series from Andrew, all the commit date-stamps are idential. Now, it would be lovely if the new builtin git-am really was *so* fast that it applies a 100+-patch series in under a second, but no, that's not it. It's just that it only looks up the current time once. That seems entirely accidental, I think that what happened is that "ident_default_date()" just ends up initializing the default date string once, and then the date is cached there, because it's now run as a single process for the whole series. I think I'd rather get the "real" commit dates, even if they show that git only does a handful of commits a second rather than hundreds of commits.. Something that just clears git_default_date in between "git am" iterations, perhaps? Linus -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git-testadd: Execute a command with only the staged changes in Git applied
On 29 July 2016 at 00:38, Junio C Hamanowrote: > Øyvind A. Holm writes: > > Jakub Narębski wrote: > > > I wonder if using `git worktree` instead of `git clone` (well, > > > local clone uses hardlinks, so it is not that costly as it looks > > > like) would be a better solution. > > > > That's an interesting idea. Have to test it out. This is the result > > from the current master in linux.git: > > > > With clone: > > ... > > With worktree: > > ... > > > > Both tests were run with cold cache ("echo 3 > > >/proc/sys/vm/drop_caches" as root). It seems as there's no > > >difference, and that git clone is as fast as it can get without > > >breaking physical laws. And we probably shouldn't do that. :) > > I expect that writing the 55k+ files in the working tree would > dominate the cost. Local clone would make some hardlinks in .git > (including ones in .git/object/*) but the cost of that would not be > too high as long as the repository is well packed; "git worktree" > would reduce that part of the cost from "git clone", but both incur > the cost of "checkout", i.e. actually populating the new working tree. > > Does the test directory even need to look anything like Git? In other > words, would it suffice if it resembled the result of running "git > archive | tar xf -"? I suspect that it would not make much > difference, either, for the same reason, though ;-). Using git archive saved 1.6 seconds: $ mkdir testdir; git archive HEAD | (cd testdir && time tar x) real0m8.881s user0m0.440s sys 0m2.740s $ But when .git is missing in the subdir, git apply doesn't work. Can't think of any way to get that to work without involving patch(1) and friends, and then the binary diffs goes out of the window. But I'm quite satisfied with 10.4 seconds with cold diskcache and >55K files. Very impressive, actually. - Øyvind -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git-testadd: Execute a command with only the staged changes in Git applied
On 28 July 2016 at 21:31, Jakub Narębskiwrote: > W dniu 2016-07-28 o 18:56, Øyvind A. Holm pisze: > > Øyvind A. Holm writes: > > > This is a script I created some weeks ago, and I've found it to be > > > immensely useful. Here is a snippet from git-testadd --help: > > > [...] > > > This script clones the repository to the directory > > > ".testadd.tmp" in the current directory and applies the staged > > > chenges there (unless -u/--unmodified or -p/--pristine is > > > specified), chdirs to the same relative directory in the clone > > > and executes the command specified on the command line there. > > > > That's correct, the test clone is entirely separated from the > > working copy, and you can keep working while the tests are running > > in the clone. Combined with git-gui and/or "git add -p/git reset > > -p", it's easy to tweak the staged changes until things are ok. > > I wonder if using `git worktree` instead of `git clone` (well, local > clone uses hardlinks, so it is not that costly as it looks like) would > be a better solution. That's an interesting idea. Have to test it out. This is the result from the current master in linux.git: With clone: $ time git testadd pwd git-testadd: Using ".testadd.tmp" as destination directory Cloning into '.testadd.tmp'... done. Checking out files: 100% (55256/55256), done. git-testadd: Applying staged changes git-testadd: Executing "pwd" in /home/sunny/src/test-wt/.testadd.tmp /home/sunny/src/test-wt/.testadd.tmp real0m10.464s user0m5.983s sys 0m2.790s $ With worktree: $ time git worktree add testaddtmp Preparing testaddtmp (identifier testaddtmp) Checking out files: 100% (55256/55256), done. HEAD is now at 194dc87 Add braces to avoid "ambiguous ‘else’" compiler warnings real0m10.343s user0m6.010s sys 0m2.523s $ Both tests were run with cold cache ("echo 3 >/proc/sys/vm/drop_caches" as root). It seems as there's no difference, and that git clone is as fast as it can get without breaking physical laws. And we probably shouldn't do that. :) Z poważaniem, Øyvind -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[Git 2.9.2] diff inserted by commit.verbose not removed when rewording
I'm just a random Git user, not very familiar with the Git development process and so on, so please by gentle. Setup: - OS X 10.11.6 - Git 2.9.2 via Homebrew Curiosity triggered me to enable 'commit.verbose' globally with 'git config --global commit.verbose true'. I have not configured it on the repo. It works as expected when committing; the diff is appended to the commit message template, yet doesn't make it into the commit when untouched. Today I noticed that when I then rebase interactively ('git rebase -i HEAD~2'; whatever) and reword a specific commit, the diff that is appended then is _not_ removed and _does_ make it into the commit. The template looks OK (' >8 ' and so on is there) but it seems that the logic to actually cut along that line is not executed then. I'm not sure what more I can say about it.-- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] git-format-patch: default to --from to avoid spoofed mails?
Jeff Kingwrites: >> ... What is most worrysome is the latter >> half of the last sentence. Is it really "should not be", or is it >> merely "use of this option is just a waste of time, as you would get >> exactly the same result anyway"? If it is the latter, that is fine. > > It does what you want, and omits the in-body header when it would be > redundant. OK, then I would no longer be worried about that one. > I think the original reason I did not make "--from" the default is that > I was worried about breaking consumers which do not know how to handle > in-body headers. That's a fair concern. So going back to Josh's original problem description: While git-send-email knows how to change the patch mails to use your own address as "From:" and add a "From:" line to the body for the author, any other tool used to send emails doesn't do that. I wonder how these "any other tool" (that reads the format-patch output, i.e. mbox file with one mail per file each, and sends each as a piece of e-mail, without paying attention who you, the tool's user, are and blindly send them with the original "From:" and other headers intact in the header part of the message) are used in the wild to send patch submissions. /usr/bin/mail or /usr/bin/Mail would not be among them, as I suspect they would place everything in the body part, and the would do so without stripping the "From " line that exists before each e-mail message. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] git-svn: allow --version to work anywhere
Junio C Hamanowrote: > Subject: [PATCH] t9100: portability fix > > Do not say "export VAR=VAL"; "VAR=VAL && export VAR" is always more > portable. Oops, sorry I should've caught that :x -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Alternatives to mid.gmane.org?
Duy Nguyenwrote: > I read this and thought "temporarily" but apparently it's not [1]. A > lot of our links in the mail archive are gmane's :( > > [1] https://lars.ingebrigtsen.no/2016/07/28/the-end-of-gmane/ I may not have time to integrate this extensibly into the public-inbox search engine today, but at least here is an NNTP article number to Message-ID mapping for non-NNTP users: https://public-inbox.org/.temp/gmane.comp.version-control.git-300599.txt.gz (~5 MB) Script used to generate this: https://public-inbox.org/meta/20160728220145.13024-...@80x24.org/ -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] git-format-patch: default to --from to avoid spoofed mails?
On Thu, Jul 28, 2016 at 02:37:04PM -0700, Junio C Hamano wrote: > Josh Triplettwrites: > > > I'd like to propose changing the default behavior of git-format-patch to > > --from (and adding a --from-author option to override, and perhaps a > > config setting). This will not change the output *except* when > > formatting patches authored by someone else. git-am and git-send-email > > both handle the --from format without any issues. > > I see this in "format-patch --help": > >Note that this option is only useful if you are actually >sending the emails and want to identify yourself as the >sender, but retain the original author (and git am will >correctly pick up the in-body header). Note also that >git send-email already handles this transformation for >you, and this option should not be used if you are >feeding the result to git send-email. > > The first one says "only useful", but it seems what it really means > is "it becomes no-op if you are sending your own patch anyway". So > that one does not worry me. What is most worrysome is the latter > half of the last sentence. Is it really "should not be", or is it > merely "use of this option is just a waste of time, as you would get > exactly the same result anyway"? If it is the latter, that is fine. It does what you want, and omits the in-body header when it would be redundant. I think the original reason I did not make "--from" the default is that I was worried about breaking consumers which do not know how to handle in-body headers. "git am" knows how to handle them, but if you have a one-off script that parses only the mail headers, it will start claiming you as the author of every patch. E.g., if you do: git format-patch -o output ... grep -hm1 ^From: output/* right now that gets you a list of patch authors. With "--from", it would return your name N times. That's obviously a toy, but I wonder if people have scripts which behave similarly. Another way to think about it is that "--from" is a no-brainer when you really are going to email the patches (and that's why it is has always been the default behavior in git-send-email). But if you _aren't_ going to mail the patches, retaining the original headers is more convenient. It's not clear to me how many non-mail users of format-patch there are (certainly rebase is one of them, but because it uses "am" on the receiving side, I think everything should Just Work). -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] git-format-patch: default to --from to avoid spoofed mails?
Josh Triplettwrites: > I'd like to propose changing the default behavior of git-format-patch to > --from (and adding a --from-author option to override, and perhaps a > config setting). This will not change the output *except* when > formatting patches authored by someone else. git-am and git-send-email > both handle the --from format without any issues. I see this in "format-patch --help": Note that this option is only useful if you are actually sending the emails and want to identify yourself as the sender, but retain the original author (and git am will correctly pick up the in-body header). Note also that git send-email already handles this transformation for you, and this option should not be used if you are feeding the result to git send-email. The first one says "only useful", but it seems what it really means is "it becomes no-op if you are sending your own patch anyway". So that one does not worry me. What is most worrysome is the latter half of the last sentence. Is it really "should not be", or is it merely "use of this option is just a waste of time, as you would get exactly the same result anyway"? If it is the latter, that is fine. One thing I absolutely do not want to see is people to start repeating their own ident on in-body "From: " header when they send their own patch. That would waste everybody's time pointing out "You do not have to do that, it merely adds noise". As long as you can guarantee that your change won't increase the rate of that, I am fine with the proposal. Thanks. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] git-svn: allow --version to work anywhere
Junio C Hamanowrites: > Eric Wong writes: > >> Repushed my master in case it's a convenient time to pull. >> >> The following changes since commit 29493589e97a2de0c4c1c314f61ccafaee3b5caf: >> >> archive-tar: huge offset and future timestamps would not work on 32-bit >> (2016-07-15 10:51:55 -0700) >> >> are available in the git repository at: >> >> git://bogomips.org/git-svn.git master >> >> for you to fetch changes up to c0071ae5dc1c610ab3791ece7ccf7d4772fde151: >> >> git-svn: allow --version to work anywhere (2016-07-22 20:38:11 +) > > Thanks, pulled. I was too deep in today's integration cycle to waste half-day's work, so I added this hotfix after the merge, directly on 'master'. -- >8 -- Subject: [PATCH] t9100: portability fix Do not say "export VAR=VAL"; "VAR=VAL && export VAR" is always more portable. Signed-off-by: Junio C Hamano --- t/t9100-git-svn-basic.sh | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/t/t9100-git-svn-basic.sh b/t/t9100-git-svn-basic.sh index ab6593b..d29f601 100755 --- a/t/t9100-git-svn-basic.sh +++ b/t/t9100-git-svn-basic.sh @@ -24,7 +24,8 @@ ceiling=$PWD test_expect_success 'git svn --version works anywhere' ' mkdir -p "$deepdir" && ( - export GIT_CEILING_DIRECTORIES="$ceiling" && + GIT_CEILING_DIRECTORIES="$ceiling" && + export GIT_CEILING_DIRECTORIES && cd "$deepdir" && git svn --version ) @@ -32,7 +33,8 @@ test_expect_success 'git svn --version works anywhere' ' test_expect_success 'git svn help works anywhere' ' mkdir -p "$deepdir" && ( - export GIT_CEILING_DIRECTORIES="$ceiling" && + GIT_CEILING_DIRECTORIES="$ceiling" && + export GIT_CEILING_DIRECTORIES && cd "$deepdir" && git svn help ) -- 2.9.2-662-gbb82c05 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] pack-objects: Use reachability bitmap index when generating non-stdout pack too
Kirill Smelkovwrites: > I'm waiting so long for main patch to be at least queued to pu, that I'm > now a bit frustrated and ready to do something not related to main goal :) Perhaps the first step would be to stop putting multiple patches in a single e-mail buried after a few pages of discussion. I will not even find that there _are_ multiple patches in the message if I am not involved directly in the discussion, and the discussion is still ongoing, because it is likely that I'd skim just a few paragraphs at the top before going on to other messages. I won't touch the message I am responding to, as your -- 8< -- cut mark does not even seem to be a reliable marker between patches (i.e. I see something like this that is clearly not a message boundary: than `git pack-objects file.pack`. Extracting erp5.git pack from lab.nexedi.com backup repository: 8< $ time echo 0186ac99 | git pack-objects --stdout --revs >erp5pack-stdout.pack real0m22.309s ... ) -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[RFC] git-format-patch: default to --from to avoid spoofed mails?
When git-format-patch formats a patch authored by someone other than yourself, it defaults to filling in the "From:" field of the email from the commit author. If you explicitly pass the --from option, git-format-patch will instead use your own committer identity as the "From:", and then put a "From:" line at the top of the body if the commit author differs. (git-am know to use that as the commit author when applying.) While git-send-email knows how to change the patch mails to use your own address as "From:" and add a "From:" line to the body for the author, any other tool used to send emails doesn't do that. I've seen more than a few mails sent to various mailing lists and patch review tools with a spoofed "From:" field pointing to the commit author, typically without the knowledge of the author, which can lead to interesting surprises. I'd like to propose changing the default behavior of git-format-patch to --from (and adding a --from-author option to override, and perhaps a config setting). This will not change the output *except* when formatting patches authored by someone else. git-am and git-send-email both handle the --from format without any issues. Before I write such a patch: does anyone see a problem with such a change? - Josh Triplett -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [ANNOUNCE] more archives of this list
Eric Wongwrote: > Code is AGPL-3.0+: git clone https://public-inbox.org/ > > > Additional mirrors or forks (perhaps different UIs) are very welcome, Btw, it's possible to do quote highlighting with user-side CSS: https://public-inbox.org/meta/20160709-user-side-css-example@11/ Will probably add classes for diff colors, too, since a git repository browser with mailing list integration will happen. For the moment, cgit + examples/cgit-commit-filter.lua allows searching subjects (at least non-merge ones). So the commit subject below is a link: https://bogomips.org/mirrors/git.git/commit/?id=7b35efd734e501f to: https://public-inbox.org/git/?x=t="fsck_walk():+optionally+name+objects+on+the+go" -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC 0/7] Add possibility to clone specific subdirectories
Duy Nguyenwrites: >> 4. Fsck complains about missing blobs. Should be fairly easy to fix. > > Not really. You'll have to associate path information with blobs > before you decide that a blob should exist or not. Also the same blob or the tree can exist both inside and outside the narrowed area, as people reorganize their trees all the time. I am not quite convinced a path-based approach (either yours or Robin's) is workable in the longer term. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH va/i18n-even-more] rebase-interactive: trim leading whitespace from progress count
Jakub Narębskiwrites: > On the gripping hand, the number of currently processed commits > (instructions) in an interactive rebase is a number, and arithmetic > expansion can be understood as shell equivalent of casting to integer. I get that argument; it is a bit too cute a justification for my taste, but the resulting code is consistent with how $total loses the potential leading whitespace, so I'll queue it as-is. Thanks. > >>> Signed-off-by: Johannes Sixt >>> --- >>> diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh >>> @@ -121,7 +121,7 @@ mark_action_done () { >>> sed -e 1q < "$todo" >> "$done" >>> sed -e 1d < "$todo" >> "$todo".new >>> mv -f "$todo".new "$todo" >>> - new_count=$(git stripspace --strip-comments <"$done" | wc -l) >>> + new_count=$(( $(git stripspace --strip-comments <"$done" | wc -l) )) >>> echo $new_count >"$msgnum" >>> total=$(($new_count + $(git stripspace --strip-comments <"$todo" | >>> wc -l))) >>> echo $total >"$end" -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] pack-objects: Use reachability bitmap index when generating non-stdout pack too
Junio, first of all thanks for feedback, On Wed, Jul 27, 2016 at 01:40:36PM -0700, Junio C Hamano wrote: > Kirill Smelkovwrites: > > > > From: Kirill Smelkov > > Subject: [PATCH 1/2] pack-objects: Make sure use_bitmap_index is not active > > under > > --local or --honor-pack-keep > > > > Since 6b8fda2d (pack-objects: use bitmaps when packing objects) there > > are two codepaths in pack-objects: with & without using bitmap > > reachability index. > > > > However add_object_entry_from_bitmap(), despite its non-bitmapped > > counterpart add_object_entry(), in no way does check for whether --local > > or --honor-pack-keep should be respected. In non-bitmapped codepath this > > is handled in want_object_in_pack(), but bitmapped codepath has simply > > no such checking at all. > > > > The bitmapped codepath however was allowing to pass --local and > > --honor-pack-keep and bitmap indices were still used under such > > conditions - potentially giving wrong output (including objects from > > non-local or .keep'ed pack). > > > > Instead of fixing bitmapped codepath to respect those options, since > > currently no one actually need or use them in combination with bitmaps, > > let's just force use_bitmap_index=0 when any of --local or > > --honor-pack-keep are used and add appropriate comment about > > not-checking for those in add_object_entry_from_bitmap() > > > > Suggested-by: Jeff King > > --- > > builtin/pack-objects.c | 15 +++ > > 1 file changed, 15 insertions(+) > > > > diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c > > index 15866d7..d7cf782 100644 > > --- a/builtin/pack-objects.c > > +++ b/builtin/pack-objects.c > > @@ -1055,6 +1055,12 @@ static int add_object_entry_from_bitmap(const > > unsigned char *sha1, > > if (have_duplicate_entry(sha1, 0, _pos)) > > return 0; > > > > + /* > > +* for simplicity we always want object to be in pack, as > > +* use_bitmap_index codepath assumes neither --local nor > > --honor-pack-keep > > +* is active. > > +*/ > > I am not sure this comment is useful to readers. > > Unless the readers are comparing add_object_entry() and this > function and wondering why this side lacks a check here, iow, when > they are merely following from a caller of this function through > this function down to its callee to understand what goes on, this > comment would not help them and only confuse them. > > If we were to say something to help those who are comparing these > two functions, I think we should be more explicit, i.e. > > The caller disables use-bitmap-index when --local or > --honor-pack-keep options are in effect because bitmap code is > not prepared to handle them. Because the control does not reach > here if these options are in effect, the check with > want_object_in_pack() to skip objects is not done. > > or something like that. You are probably right. > Or is the rest of the bitmap codepath prepared to handle these > options and it is just the matter of adding the missing check with > want_object_in_pack() here to make it work correctly? I'm waiting so long for main patch to be at least queued to pu, that I'm now a bit frustrated and ready to do something not related to main goal :) (they say every joke contains part of a joke). Here is something from sleepy me: 8< From: Kirill Smelkov Date: Wed, 27 Jul 2016 22:18:04 +0300 Subject: [PATCH 1/2] pack-objects: Teach --use-bitmap-index codepath to respect --local, --honor-pack-keep and --incremental Since 6b8fda2d (pack-objects: use bitmaps when packing objects) there are two codepaths in pack-objects: with & without using bitmap reachability index. However add_object_entry_from_bitmap(), despite its non-bitmapped counterpart add_object_entry(), in no way does check for whether --local or --honor-pack-keep or --incremental should be respected. In non-bitmapped codepath this is handled in want_object_in_pack(), but bitmapped codepath has simply no such checking at all. The bitmapped codepath however was allowing to pass in all those options and with bitmap indices still being used under such conditions - potentially giving wrong output (e.g. including objects from non-local or .keep'ed pack). We can easily fix this by noting the following: when an object comes to add_object_entry_from_bitmap() it can come for two reasons: 1. entries coming from main pack covered by bitmap index, and 2. object coming from, possibly alternate, loose or other packs. For "2" we always have pack not yet found by bitmap traversal code, and thus we can simply reuse non-bitmapped want_object_in_pack() to find in which pack an object lives and also for taking omitting decision. For "1" we always have pack already found by bitmap traversal code and we only need to check that pack for same omission criteria used in want_object_in_pack() for found_pack. Suggested-by:
Re: [PATCH] git-svn: allow --version to work anywhere
Eric Wongwrites: > Repushed my master in case it's a convenient time to pull. > > The following changes since commit 29493589e97a2de0c4c1c314f61ccafaee3b5caf: > > archive-tar: huge offset and future timestamps would not work on 32-bit > (2016-07-15 10:51:55 -0700) > > are available in the git repository at: > > git://bogomips.org/git-svn.git master > > for you to fetch changes up to c0071ae5dc1c610ab3791ece7ccf7d4772fde151: > > git-svn: allow --version to work anywhere (2016-07-22 20:38:11 +) Thanks, pulled. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv2 2/2] submodule update: allow '.' for branch value
Stefan Bellerwrites: > Well I wanted to express: > > The .gitmodules used in these Gerrit projects do not conform > to Gits understanding of how .gitmodules should look like. > Let's make Git understand this Gerrit corner case (which you > would only need when using Gerrit). > > Adding special treatment of the "." value is safe as this is an > invalid branch name in Git. Yup, I got it after reading it twice. My point was that you shouldn't have to read it twice to get it. >> I wonder if the above 8-line block wants to be encapsulated to >> become a part of "git submodule--helper" interface, though. IOW, >> branch=$(git submodule--helper branch "$name") or something? > > There is already get_submodule_config that we implement in shell, > which is also used in cmd_summary for reading the .ignore > field. > > So having the "." check in that method (whether in shell or in C) > doesn't make sense to me. That's an excuse from the helper implementor's side, isn't it? I was coming from the opposite direction, i.e. potential caller of a helper. Whenever I want to know "is there a branch configured for this submodule, and if so what is it?", wouldn't I be entitled to a helper that consistently gets the real branch name with the magic "." resolved for me? >>> + test "$(git log -1 --oneline)" = "$(GIT_DIR=../../submodule/.git >>> git log -1 --oneline master)" >> >> A few issues: >> >> * A crash in "git log" would not be noticed with this. Perhaps >> >> git log -1 --oneline $one_way_to_invoke >expect && >> git log -1 --oneline $the_other_way >actual && >> test_cmp expect actual >> >> would be better? > > Of course. I tried to blend in with the code after looking at the surrounding > code. Maybe I need to modernize that whole test file as a preparatory step. >> >> * What exactly is this testing? The current branch (in submodule) >> pointing at the same commit as the tip of 'master'? Or the >> current branch _is_ 'master'? >> >> * What exactly is the reason why one has GIT_DIR=... and the other >> does not? I do not think this a place to test that "gitdir: " >> in .git points at the right place, so it must be testing >> something else, but I cannot guess. > > It is testing that the local state is at the same commit as the > master branch on the remote side. Ahh, OK, I totally misread that. "git -C ../../submodule log" would have been the more modern way to say that, I would guess, but now it makes sense. >>> + # update is not confused by branch="." even if the the superproject >>> + # is not on any branch currently >>> + git submodule update && >>> + git revert HEAD && >> >> "revert" is rather unusual thing to see in the test. > > The tests are so long that I tried to get back in a state that is as least > different from before to not break the following tests. I guessed that much; I just expected to see "git reset --hard $some_old_state" if you want to rewind to the previous state the next test expects and "revert" looked unusual. >> Also I am not >> sure why cmd_update that now has an explicit check to die when >> branch is set to "." and the head is detached is expected "not" to >> be confused. Perhaps I misread the main part of the patch? > > Well you *only* explicitly die(..) when you ask for --remote. OK, I _did_ misread the patch, then. It would help to have "when giving no --remote, git submodule" before the comment that begins with "update is not confused" to avoid the same confusion. Thanks. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] t7406: correct depth test in shallow test
Stefan Bellerwrites: > The depth of 2 I chose originally turns out to be a lucky > accident too, as the depth from "Commit 2" is 2, > so that we would observe the same depth no matter if > a --depth 2 was given and working or not. > > I'll redo this test (as 2 tests, one is testing the situation as > we have now, i.e. the desired tip is reachable from within > the depth argument, the second test will test if it is not > reachable.) Good that I was puzzled ;-) Looking forward to see an improved test. Thanks. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv2 2/2] submodule update: allow '.' for branch value
On Thu, Jul 28, 2016 at 12:10 PM, Junio C Hamanowrote: > Stefan Beller writes: > >> Gerrit has a "superproject subscription" feature[1], that triggers a >> commit in a superproject that is subscribed to its submodules. >> Conceptually this Gerrit feature can be done on the client side with >> Git via: >> >> git -C submodule update --remote --force >> git -C commit -a -m "Update submodules" >> git -C push >> >> for each branch in the superproject. > > Which I imagine would be useful if you only have one submodule. If > you work on submodules A and B and then want to give the updated > superproject that binds the latest in both A and B as an atomic > update, the three lines above would not quite work, no? When using Gerrit you can submit A,B together bound by a "topic" and that will trigger a superproject update that contains one atomic commit updating A and B at the same time. To fully emulate that Gerrit feature you would need to put the 3 lines above in an infinite loop that polls the remote submodules all the time. If you wanted to do that without that Gerrit feature, this becomes racy as "submodule update --remote" fetches the submodules racily (by design) and it may end up putting A and B in the same commit or not. > >> To ease the configuration in Gerrit >> a special value of "." has been introduced for the submodule..branch >> to mean the same branch as the superproject[2], such that you can create a >> new branch on both superproject and the submodule and this feature >> continues to work on that new branch. >> >> Now we have find projects in the wild with such a .gitmodules file. I'll fix the typo. (delete have or find) > > That's annoying. > >> To have Git working well with these, we imitate the behavior and >> look up the superprojects branch name if the submodules branch is >> configured to ".". In projects that do not use Gerrit, this value >> whould be never configured as "." is not a valid branch name. > > I find that the last sentence is somewhat misleading. I agree it is > justifiable that using "." as the name to trigger a new (to us) > feature is safe, because such a setting wouldn't have meant anything > useful without this change, but I initially misread it and thought > you added "are we using Gerrit? Error out if we are not" logic, > which is not the case here. Well I wanted to express: The .gitmodules used in these Gerrit projects do not conform to Gits understanding of how .gitmodules should look like. Let's make Git understand this Gerrit corner case (which you would only need when using Gerrit). Adding special treatment of the "." value is safe as this is an invalid branch name in Git. > >> diff --git a/git-submodule.sh b/git-submodule.sh >> index 4ec7546..1eb33ad 100755 >> --- a/git-submodule.sh >> +++ b/git-submodule.sh >> @@ -590,7 +590,6 @@ cmd_update() >> >> name=$(git submodule--helper name "$sm_path") || exit >> url=$(git config submodule."$name".url) >> - branch=$(get_submodule_config "$name" branch master) >> if ! test -z "$update" >> then >> update_module=$update >> @@ -616,6 +615,14 @@ cmd_update() >> >> if test -n "$remote" >> then >> + branch=$(get_submodule_config "$name" branch master) >> + if test "$branch" = "." >> + then >> + if ! branch=$(git symbolic-ref --short -q HEAD) >> + then >> + die "$(eval_gettext "submodule branch >> configured to inherit branch from superproject, but it's not on any branch")" >> + fi >> + fi > > I see that you narrowed the scope of "$branch" (which is only used > when $remote exists), but it is a bit annoying to see that change > mixed with "now a dot means something different" change. > > I wonder if the above 8-line block wants to be encapsulated to > become a part of "git submodule--helper" interface, though. IOW, > branch=$(git submodule--helper branch "$name") or something? There is already get_submodule_config that we implement in shell, which is also used in cmd_summary for reading the .ignore field. So having the "." check in that method (whether in shell or in C) doesn't make sense to me. We could of course have our own method specific for branches, that take the "." into account. I think we can go with that. > >> +test_expect_success 'submodule update --remote should fetch upstream >> changes with .' ' >> + (cd super && >> + git config -f .gitmodules submodule."submodule".branch "." && >> + git add .gitmodules && >> + git commit -m "submodules: update from the respective superproject >> branch" >> + ) && >> (cd submodule && >> + echo line4a >> file && >> + git add file && >> +
Re: git-testadd: Execute a command with only the staged changes in Git applied
W dniu 2016-07-28 o 18:56, Øyvind A. Holm pisze: > On 28 July 2016 at 18:37, Junio C Hamanowrote: >> Øyvind A. Holm writes: >>> This is a script I created some weeks ago, and I've found it to be >>> immensely useful. Here is a snippet from git-testadd --help: >>> >>> If you have lots of unrelated uncommitted changes in the current >>> repository and want to split up the commit, how can you easily >>> check if the changes passes the test suite? With all the other >>> unrelated changes it can be hard to make sure that only relevant >>> changes becomes part of the commit, and that they don't result in >>> regressions. This script clones the repository to the directory >>> ".testadd.tmp" in the current directory and applies the staged >>> chenges there (unless -u/--unmodified or -p/--pristine is >>> specified), chdirs to the same relative directory in the clone and >>> executes the command specified on the command line there. >> >> So in short, this solves the same problem as "git stash --keep" but in >> a more scalable way, in the sense that "git stash --keep" allows you >> to instantiate what you have in the index so that your working tree >> can be used for such a test, but you cannot do anything else while you >> are waiting for the test to finish, and "testadd" allows you to keep >> hacking in the working tree while a test runs in its own temporary >> checkout (and presumably you can have more than one running, which >> would allow you to scale more)? > > That's correct, the test clone is entirely separated from the working > copy, and you can keep working while the tests are running in the clone. > Combined with git-gui and/or "git add -p/git reset -p", it's easy to > tweak the staged changes until things are ok. I wonder if using `git worktree` instead of `git clone` (well, local clone uses hardlinks, so it is not that costly as it looks like) would be a better solution. > Also, there is a -l/--label option that creates a clone directory with > the name ".testadd-[LABEL].tmp", so you can have several test clones at > the same time, all with different staged changes. There is also a > -r/--ref option that tries to apply the staged changes onto another > commit, and the command will only run if the apply succeeds. Also, this > won't create dangling heads like "git stash --keep" does. Nice. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] t7406: correct depth test in shallow test
On Thu, Jul 28, 2016 at 11:47 AM, Stefan Bellerwrote: > On Thu, Jul 28, 2016 at 11:39 AM, Junio C Hamano wrote: >> Stefan Beller writes: >> >>> We used to ask for 3 changes and tested for having 1, so the test >>> seems broken. >> >> I am not sure what to think of "seems broken". > > When asking for depth 3, I would expect a result of 1,2, or 3 commits. > > But when testing the depth argument with a history less than 3, and then > implying: "I got 1, which is less than 3, so the depth works", seems > to be a logical shortcut to me. > > I would have expected a history of >3, then ask for 3 and confirm we did not > get 4 or 5 or 6, but 3 only. > >> >> Asking for 3 and having 1 is broken in what way? Should we be >> expecting for 3 because we asked for that many? Should we expect >> less than three even though we asked for three because the upstream >> side does not even have that many? If the current test that asks >> for 3 and gets only 1 is not failing, why should we expect that >> asking for 2 would get 2? In other words, why is it sane that >> asking for fewer number of commits gives us more? > > I think there is a subtle thing going on, that I did not examine properly but > it is hidden in the modernization from > > test 1 = $(something) > to test_line_count = 2 > > I'll investigate the actual reason. So when I place a test_pause just before that check for the lines we have the upstream submodule: $ git log --oneline 6355002 upstream line4 820877d upstream line3 4301fd3 Commit 2 0c90624 upstream which then allows fetching 6355002 upstream line4 820877d upstream line3 4301fd3 Commit 2 and "Commit 2" is recorded as the sha1, i.e. when checking out "Commit 2" and running (git log --oneline | wc -l) we get 1 as it cuts off after that. When adding a test (in the next patch) that adds more commits to the submodule upstream, we will fetch with depth 3 but will no longer see the sha1, so we try a different approach. Current approach: try fetching again with no depth, and then try again with sha1 given. So one could say there is a bug as the fetching should use the depth argument as well. The depth of 2 I chose originally turns out to be a lucky accident too, as the depth from "Commit 2" is 2, so that we would observe the same depth no matter if a --depth 2 was given and working or not. I'll redo this test (as 2 tests, one is testing the situation as we have now, i.e. the desired tip is reachable from within the depth argument, the second test will test if it is not reachable.) > >> >> Also most of the lines in this subshell seem to be breaking >> &&-chain. > > Thanks for pointing that out, will fix while at it. > >> >> >> >>> Correct the test by using test_line_count that exists in the test suite. >>> >>> Signed-off-by: Stefan Beller >>> --- >>> t/t7406-submodule-update.sh | 5 +++-- >>> 1 file changed, 3 insertions(+), 2 deletions(-) >>> >>> diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh >>> index 88e9750..bd261ac 100755 >>> --- a/t/t7406-submodule-update.sh >>> +++ b/t/t7406-submodule-update.sh >>> @@ -846,9 +846,10 @@ test_expect_success 'submodule update clone shallow >>> submodule' ' >>> (cd super3 && >>>sed -e "s#url = ../#url = file://$pwd/#" <.gitmodules >>> >.gitmodules.tmp && >>>mv -f .gitmodules.tmp .gitmodules && >>> - git submodule update --init --depth=3 >>> + git submodule update --init --depth=2 >>>(cd submodule && >>> - test 1 = $(git log --oneline | wc -l) >>> + git log --oneline >lines >>> + test_line_count = 2 lines >>>) >>> ) >>> ' -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv2 2/2] submodule update: allow '.' for branch value
Stefan Bellerwrites: > Gerrit has a "superproject subscription" feature[1], that triggers a > commit in a superproject that is subscribed to its submodules. > Conceptually this Gerrit feature can be done on the client side with > Git via: > > git -C submodule update --remote --force > git -C commit -a -m "Update submodules" > git -C push > > for each branch in the superproject. Which I imagine would be useful if you only have one submodule. If you work on submodules A and B and then want to give the updated superproject that binds the latest in both A and B as an atomic update, the three lines above would not quite work, no? > To ease the configuration in Gerrit > a special value of "." has been introduced for the submodule..branch > to mean the same branch as the superproject[2], such that you can create a > new branch on both superproject and the submodule and this feature > continues to work on that new branch. > > Now we have find projects in the wild with such a .gitmodules file. That's annoying. > To have Git working well with these, we imitate the behavior and > look up the superprojects branch name if the submodules branch is > configured to ".". In projects that do not use Gerrit, this value > whould be never configured as "." is not a valid branch name. I find that the last sentence is somewhat misleading. I agree it is justifiable that using "." as the name to trigger a new (to us) feature is safe, because such a setting wouldn't have meant anything useful without this change, but I initially misread it and thought you added "are we using Gerrit? Error out if we are not" logic, which is not the case here. > diff --git a/git-submodule.sh b/git-submodule.sh > index 4ec7546..1eb33ad 100755 > --- a/git-submodule.sh > +++ b/git-submodule.sh > @@ -590,7 +590,6 @@ cmd_update() > > name=$(git submodule--helper name "$sm_path") || exit > url=$(git config submodule."$name".url) > - branch=$(get_submodule_config "$name" branch master) > if ! test -z "$update" > then > update_module=$update > @@ -616,6 +615,14 @@ cmd_update() > > if test -n "$remote" > then > + branch=$(get_submodule_config "$name" branch master) > + if test "$branch" = "." > + then > + if ! branch=$(git symbolic-ref --short -q HEAD) > + then > + die "$(eval_gettext "submodule branch > configured to inherit branch from superproject, but it's not on any branch")" > + fi > + fi I see that you narrowed the scope of "$branch" (which is only used when $remote exists), but it is a bit annoying to see that change mixed with "now a dot means something different" change. I wonder if the above 8-line block wants to be encapsulated to become a part of "git submodule--helper" interface, though. IOW, branch=$(git submodule--helper branch "$name") or something? > +test_expect_success 'submodule update --remote should fetch upstream changes > with .' ' > + (cd super && > + git config -f .gitmodules submodule."submodule".branch "." && > + git add .gitmodules && > + git commit -m "submodules: update from the respective superproject > branch" > + ) && > (cd submodule && > + echo line4a >> file && > + git add file && > + test_tick && > + git commit -m "upstream line4a" && > + git checkout -b test-branch && > + test_commit on-test-branch > + ) && > + (cd super && > + git submodule update --remote --force submodule && > + (cd submodule && > + test "$(git log -1 --oneline)" = "$(GIT_DIR=../../submodule/.git git > log -1 --oneline master)" A few issues: * A crash in "git log" would not be noticed with this. Perhaps git log -1 --oneline $one_way_to_invoke >expect && git log -1 --oneline $the_other_way >actual && test_cmp expect actual would be better? * What exactly is this testing? The current branch (in submodule) pointing at the same commit as the tip of 'master'? Or the current branch _is_ 'master'? * What exactly is the reason why one has GIT_DIR=... and the other does not? I do not think this a place to test that "gitdir: " in .git points at the right place, so it must be testing something else, but I cannot guess. > + ) && >git checkout -b test-branch && > + git submodule update --remote --force submodule && > + (cd submodule && > + test "$(git log -1 --oneline)" = "$(GIT_DIR=../../submodule/.git git > log -1 --oneline test-branch)" > + ) && > + git checkout master && > + git branch -d test-branch > + ) > +' > + > +test_expect_success 'branch = . does not confuse the rest of update' ' > +
Re: Alternatives to mid.gmane.org?
Duy Nguyenwrote: > On Thu, Jul 28, 2016 at 11:11 AM, Lars Schneider > wrote: > > Hi, > > > > gmane is down At least the following should work in case public-inbox fails: https://mid.mail-archive.com/%s https://marc.info/?i=%s public-inbox falls back to them on missing MIDs, along with https://lists.debian.org/msgid-search/%s I trip over power cords all the time, I'm hoping other people mirror/replicate the setup Jeff also linked to: https://public-inbox.org/git/20160710004813.ga20...@dcvr.yhbt.net/ Right now my email subscription is still an SPOF for the mirror, along with vger.kernel.org... > I read this and thought "temporarily" but apparently it's not [1]. A > lot of our links in the mail archive are gmane's :( > > [1] https://lars.ingebrigtsen.no/2016/07/28/the-end-of-gmane/ :< I'll see what I can do about getting a gmane NUM -> MID mapping up while NNTP still works... -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] t7406: correct depth test in shallow test
On Thu, Jul 28, 2016 at 11:39 AM, Junio C Hamanowrote: > Stefan Beller writes: > >> We used to ask for 3 changes and tested for having 1, so the test >> seems broken. > > I am not sure what to think of "seems broken". When asking for depth 3, I would expect a result of 1,2, or 3 commits. But when testing the depth argument with a history less than 3, and then implying: "I got 1, which is less than 3, so the depth works", seems to be a logical shortcut to me. I would have expected a history of >3, then ask for 3 and confirm we did not get 4 or 5 or 6, but 3 only. > > Asking for 3 and having 1 is broken in what way? Should we be > expecting for 3 because we asked for that many? Should we expect > less than three even though we asked for three because the upstream > side does not even have that many? If the current test that asks > for 3 and gets only 1 is not failing, why should we expect that > asking for 2 would get 2? In other words, why is it sane that > asking for fewer number of commits gives us more? I think there is a subtle thing going on, that I did not examine properly but it is hidden in the modernization from test 1 = $(something) to test_line_count = 2 I'll investigate the actual reason. > > Also most of the lines in this subshell seem to be breaking > &&-chain. Thanks for pointing that out, will fix while at it. > > > >> Correct the test by using test_line_count that exists in the test suite. >> >> Signed-off-by: Stefan Beller >> --- >> t/t7406-submodule-update.sh | 5 +++-- >> 1 file changed, 3 insertions(+), 2 deletions(-) >> >> diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh >> index 88e9750..bd261ac 100755 >> --- a/t/t7406-submodule-update.sh >> +++ b/t/t7406-submodule-update.sh >> @@ -846,9 +846,10 @@ test_expect_success 'submodule update clone shallow >> submodule' ' >> (cd super3 && >>sed -e "s#url = ../#url = file://$pwd/#" <.gitmodules >> >.gitmodules.tmp && >>mv -f .gitmodules.tmp .gitmodules && >> - git submodule update --init --depth=3 >> + git submodule update --init --depth=2 >>(cd submodule && >> - test 1 = $(git log --oneline | wc -l) >> + git log --oneline >lines >> + test_line_count = 2 lines >>) >> ) >> ' -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH va/i18n-even-more] rebase-interactive: trim leading whitespace from progress count
W dniu 2016-07-28 o 20:22, Eric Sunshine pisze: > On Thu, Jul 28, 2016 at 1:47 PM, Johannes Sixtwrote: >> Interactive rebase uses 'wc -l' to write the current patch number >> in a progress report. Some implementations of 'wc -l' produce spaces >> before the number, leading to ugly output such as >> >> Rebasing ( 3/8) >> >> Remove the spaces using a trivial arithmetic evaluation. >> >> Before 9588c52 (i18n: rebase-interactive: mark strings for >> translation) this was not a problem because printf was used to >> generate the text. Since that commit, the count is interpolated >> directly from a shell variable into the text, where the spaces >> remain. The total number of patches does not have this problem >> even though it is interpolated from a shell variable in the same >> manner, because the variable is set by an arithmetic evaluation. >> >> Later in the script, there is a virtually identical case where >> leading spaces are trimmed, but it uses a pattern substitution: >> >> todocount=$(git stripspace --strip-comments <"$todo" | wc -l) >> todocount=${todocount##* } >> >> I did not choose this idiom because it adds a line of code, and >> there is already an arithmetic evaluation in the vicinity of the >> line that is changed here. > > On the other hand, to a newcomer (not familiar with this patch), > ${foo##* } is an obvious and intentional stripping of whitespace, > whereas taking advantage of a side-effect of arithmetic evaluation to > achieve the same is quite subtle and likely to be interpreted as > pointless, thus forces the reader to consult 'blame' to understand why > the code is the way it is. On the gripping hand, the number of currently processed commits (instructions) in an interactive rebase is a number, and arithmetic expansion can be understood as shell equivalent of casting to integer. >> Signed-off-by: Johannes Sixt >> --- >> diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh >> @@ -121,7 +121,7 @@ mark_action_done () { >> sed -e 1q < "$todo" >> "$done" >> sed -e 1d < "$todo" >> "$todo".new >> mv -f "$todo".new "$todo" >> - new_count=$(git stripspace --strip-comments <"$done" | wc -l) >> + new_count=$(( $(git stripspace --strip-comments <"$done" | wc -l) )) >> echo $new_count >"$msgnum" >> total=$(($new_count + $(git stripspace --strip-comments <"$todo" | >> wc -l))) >> echo $total >"$end" -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] t7406: correct depth test in shallow test
Stefan Bellerwrites: > We used to ask for 3 changes and tested for having 1, so the test > seems broken. I am not sure what to think of "seems broken". Asking for 3 and having 1 is broken in what way? Should we be expecting for 3 because we asked for that many? Should we expect less than three even though we asked for three because the upstream side does not even have that many? If the current test that asks for 3 and gets only 1 is not failing, why should we expect that asking for 2 would get 2? In other words, why is it sane that asking for fewer number of commits gives us more? Also most of the lines in this subshell seem to be breaking &&-chain. > Correct the test by using test_line_count that exists in the test suite. > > Signed-off-by: Stefan Beller > --- > t/t7406-submodule-update.sh | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh > index 88e9750..bd261ac 100755 > --- a/t/t7406-submodule-update.sh > +++ b/t/t7406-submodule-update.sh > @@ -846,9 +846,10 @@ test_expect_success 'submodule update clone shallow > submodule' ' > (cd super3 && >sed -e "s#url = ../#url = file://$pwd/#" <.gitmodules >.gitmodules.tmp > && >mv -f .gitmodules.tmp .gitmodules && > - git submodule update --init --depth=3 > + git submodule update --init --depth=2 >(cd submodule && > - test 1 = $(git log --oneline | wc -l) > + git log --oneline >lines > + test_line_count = 2 lines >) > ) > ' -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH va/i18n-even-more] rebase-interactive: trim leading whitespace from progress count
On Thu, Jul 28, 2016 at 1:47 PM, Johannes Sixtwrote: > Interactive rebase uses 'wc -l' to write the current patch number > in a progress report. Some implementations of 'wc -l' produce spaces > before the number, leading to ugly output such as > > Rebasing ( 3/8) > > Remove the spaces using a trivial arithmetic evaluation. > > Before 9588c52 (i18n: rebase-interactive: mark strings for > translation) this was not a problem because printf was used to > generate the text. Since that commit, the count is interpolated > directly from a shell variable into the text, where the spaces > remain. The total number of patches does not have this problem > even though it is interpolated from a shell variable in the same > manner, because the variable is set by an arithmetic evaluation. > > Later in the script, there is a virtually identical case where > leading spaces are trimmed, but it uses a pattern substitution: > > todocount=$(git stripspace --strip-comments <"$todo" | wc -l) > todocount=${todocount##* } > > I did not choose this idiom because it adds a line of code, and > there is already an arithmetic evaluation in the vicinity of the > line that is changed here. On the other hand, to a newcomer (not familiar with this patch), ${foo##* } is an obvious and intentional stripping of whitespace, whereas taking advantage of a side-effect of arithmetic evaluation to achieve the same is quite subtle and likely to be interpreted as pointless, thus forces the reader to consult 'blame' to understand why the code is the way it is. > Signed-off-by: Johannes Sixt > --- > diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh > @@ -121,7 +121,7 @@ mark_action_done () { > sed -e 1q < "$todo" >> "$done" > sed -e 1d < "$todo" >> "$todo".new > mv -f "$todo".new "$todo" > - new_count=$(git stripspace --strip-comments <"$done" | wc -l) > + new_count=$(( $(git stripspace --strip-comments <"$done" | wc -l) )) > echo $new_count >"$msgnum" > total=$(($new_count + $(git stripspace --strip-comments <"$todo" | wc > -l))) > echo $total >"$end" -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCHv2 2/2] submodule update: allow '.' for branch value
Gerrit has a "superproject subscription" feature[1], that triggers a commit in a superproject that is subscribed to its submodules. Conceptually this Gerrit feature can be done on the client side with Git via: git -C submodule update --remote --force git -C commit -a -m "Update submodules" git -C push for each branch in the superproject. To ease the configuration in Gerrit a special value of "." has been introduced for the submodule..branch to mean the same branch as the superproject[2], such that you can create a new branch on both superproject and the submodule and this feature continues to work on that new branch. Now we have find projects in the wild with such a .gitmodules file. To have Git working well with these, we imitate the behavior and look up the superprojects branch name if the submodules branch is configured to ".". In projects that do not use Gerrit, this value whould be never configured as "." is not a valid branch name. [1] introduced around 2012-01, e.g. https://gerrit-review.googlesource.com/#/c/30810/ [2] excerpt from [1]: > The branch value could be '.' if the submodule project branch > has the same name as the destination branch of the commit having > gitlinks/.gitmodules file. CC: Avery PennarunSigned-off-by: Stefan Beller --- This comes with another test that I run into while using this code. Please replace patch 2 with this v2. Thanks, Stefan git-submodule.sh| 9 - t/t7406-submodule-update.sh | 42 +- 2 files changed, 49 insertions(+), 2 deletions(-) diff --git a/git-submodule.sh b/git-submodule.sh index 4ec7546..1eb33ad 100755 --- a/git-submodule.sh +++ b/git-submodule.sh @@ -590,7 +590,6 @@ cmd_update() name=$(git submodule--helper name "$sm_path") || exit url=$(git config submodule."$name".url) - branch=$(get_submodule_config "$name" branch master) if ! test -z "$update" then update_module=$update @@ -616,6 +615,14 @@ cmd_update() if test -n "$remote" then + branch=$(get_submodule_config "$name" branch master) + if test "$branch" = "." + then + if ! branch=$(git symbolic-ref --short -q HEAD) + then + die "$(eval_gettext "submodule branch configured to inherit branch from superproject, but it's not on any branch")" + fi + fi if test -z "$nofetch" then # Fetch remote before determining tracking $sha1 diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh index bd261ac..953c486 100755 --- a/t/t7406-submodule-update.sh +++ b/t/t7406-submodule-update.sh @@ -209,9 +209,49 @@ test_expect_success 'submodule update --remote should fetch upstream changes' ' ) ' -test_expect_success 'local config should override .gitmodules branch' ' +test_expect_success 'submodule update --remote should fetch upstream changes with .' ' + (cd super && +git config -f .gitmodules submodule."submodule".branch "." && +git add .gitmodules && +git commit -m "submodules: update from the respective superproject branch" + ) && (cd submodule && +echo line4a >> file && +git add file && +test_tick && +git commit -m "upstream line4a" && +git checkout -b test-branch && +test_commit on-test-branch + ) && + (cd super && +git submodule update --remote --force submodule && +(cd submodule && + test "$(git log -1 --oneline)" = "$(GIT_DIR=../../submodule/.git git log -1 --oneline master)" +) && git checkout -b test-branch && +git submodule update --remote --force submodule && +(cd submodule && + test "$(git log -1 --oneline)" = "$(GIT_DIR=../../submodule/.git git log -1 --oneline test-branch)" +) && +git checkout master && +git branch -d test-branch + ) +' + +test_expect_success 'branch = . does not confuse the rest of update' ' + (cd super && +git checkout --detach && +# update is not confused by branch="." even if the the superproject +# is not on any branch currently +git submodule update && +git revert HEAD && +git checkout master + ) +' + +test_expect_success 'local config should override .gitmodules branch' ' + (cd submodule && +git checkout test-branch && echo line5 >> file && git add file && test_tick && -- 2.9.2.468.g3d9025b -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to
[PATCH va/i18n-even-more] rebase-interactive: trim leading whitespace from progress count
Interactive rebase uses 'wc -l' to write the current patch number in a progress report. Some implementations of 'wc -l' produce spaces before the number, leading to ugly output such as Rebasing ( 3/8) Remove the spaces using a trivial arithmetic evaluation. Before 9588c52 (i18n: rebase-interactive: mark strings for translation) this was not a problem because printf was used to generate the text. Since that commit, the count is interpolated directly from a shell variable into the text, where the spaces remain. The total number of patches does not have this problem even though it is interpolated from a shell variable in the same manner, because the variable is set by an arithmetic evaluation. Later in the script, there is a virtually identical case where leading spaces are trimmed, but it uses a pattern substitution: todocount=$(git stripspace --strip-comments <"$todo" | wc -l) todocount=${todocount##* } I did not choose this idiom because it adds a line of code, and there is already an arithmetic evaluation in the vicinity of the line that is changed here. Signed-off-by: Johannes Sixt--- git-rebase--interactive.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh index ded4595..e2da524 100644 --- a/git-rebase--interactive.sh +++ b/git-rebase--interactive.sh @@ -121,7 +121,7 @@ mark_action_done () { sed -e 1q < "$todo" >> "$done" sed -e 1d < "$todo" >> "$todo".new mv -f "$todo".new "$todo" - new_count=$(git stripspace --strip-comments <"$done" | wc -l) + new_count=$(( $(git stripspace --strip-comments <"$done" | wc -l) )) echo $new_count >"$msgnum" total=$(($new_count + $(git stripspace --strip-comments <"$todo" | wc -l))) echo $total >"$end" -- 2.9.0.443.ga8520ad -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Bug? git rebase -i --autostash fails to restore working files
On 28/07/16 14:02, Remi Galan Alfonso wrote: > Hi Phillip, > > Phillip Woodwrites: >> When running ‘git rebase -i --autostash’ if the editor fails then the >> rebase fails but stash is not applied so the working files are not >> restored. Running ‘git rebase --abort’ replies that there’s no rebase >> in progress. If you notice what’s happened it’s not a problem if you >> know to do ‘git stash apply ’ but it’s confusing as >> normally the stash would be automatically applied. This happened to me >> when I messed up my editor configuration but you can reproduce it with >> >> $ GIT_SEQUENCE_EDITOR=false git rebase -i --autostash HEAD^^ >> Created autostash: ff960d4 >> HEAD is now at f1b8af7 [git] Turn on rebase.missingCommitsCheck > > [I'm happy to see rebase.missingCommitsCheck being used.] > >> Could not execute editor >> >> $ git rebase --abort >> No rebase in progress? > > I remembered having read something about it recently and a quick > search confirmed it. It was corrected in commit 33ba9c6 (29/06/2016, > rebase -i: restore autostash on abort) that recently graduated to > master (on 19/07/2016 if I read the various "What's cooking" > correctly). So you will need to build git from source to have this > corrected. Hi Rémi Thanks for the reply, it's good to know it's been fixed. Best Wishes Phillip -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/2] t7406: correct depth test in shallow test
We used to ask for 3 changes and tested for having 1, so the test seems broken. Correct the test by using test_line_count that exists in the test suite. Signed-off-by: Stefan Beller--- t/t7406-submodule-update.sh | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh index 88e9750..bd261ac 100755 --- a/t/t7406-submodule-update.sh +++ b/t/t7406-submodule-update.sh @@ -846,9 +846,10 @@ test_expect_success 'submodule update clone shallow submodule' ' (cd super3 && sed -e "s#url = ../#url = file://$pwd/#" <.gitmodules >.gitmodules.tmp && mv -f .gitmodules.tmp .gitmodules && -git submodule update --init --depth=3 +git submodule update --init --depth=2 (cd submodule && - test 1 = $(git log --oneline | wc -l) + git log --oneline >lines + test_line_count = 2 lines ) ) ' -- 2.9.2.466.g8c6d1f9.dirty -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/2] submodule update: allow '.' for branch value
The meat is in patch 2 and helps Git and Gerrit work well together. patch 1 looks unrelated but it is not, as when left out the broken test, breaks with patch 2. This is because we add more commits in the submodule. Thanks, Stefan Stefan Beller (2): t7406: correct depth test in shallow test submodule update: allow '.' for branch value git-submodule.sh| 7 +++ t/t7406-submodule-update.sh | 37 ++--- 2 files changed, 41 insertions(+), 3 deletions(-) -- 2.9.2.466.g8c6d1f9.dirty -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] submodule update: allow '.' for branch value
Gerrit has a "superproject subscription" feature[1], that triggers a commit in a superproject that is subscribed to its submodules. Conceptually this Gerrit feature can be done on the client side with Git via: git -C submodule update --remote --force git -C commit -a -m "Update submodules" git -C push for each branch in the superproject. To ease the configuration in Gerrit a special value of "." has been introduced for the submodule..branch to mean the same branch as the superproject[2], such that you can create a new branch on both superproject and the submodule and this feature continues to work on that new branch. Now we have find projects in the wild with such a .gitmodules file. To have Git working well with these, we imitate the behavior and look up the superprojects branch name if the submodules branch is configured to ".". In projects that do not use Gerrit, this value whould be never configured as "." is not a valid branch name. [1] introduced around 2012-01, e.g. https://gerrit-review.googlesource.com/#/c/30810/ [2] excerpt from [1]: > The branch value could be '.' if the submodule project branch > has the same name as the destination branch of the commit having > gitlinks/.gitmodules file. CC: Avery PennarunSigned-off-by: Stefan Beller --- git-submodule.sh| 7 +++ t/t7406-submodule-update.sh | 32 +++- 2 files changed, 38 insertions(+), 1 deletion(-) diff --git a/git-submodule.sh b/git-submodule.sh index 4ec7546..12bb9e8 100755 --- a/git-submodule.sh +++ b/git-submodule.sh @@ -591,6 +591,13 @@ cmd_update() name=$(git submodule--helper name "$sm_path") || exit url=$(git config submodule."$name".url) branch=$(get_submodule_config "$name" branch master) + if test "$branch" = "." + then + if ! branch=$(git symbolic-ref --short -q HEAD) + then + die "$(eval_gettext "submodule branch configured to inherit branch from superproject, but it's not on any branch")" + fi + fi if ! test -z "$update" then update_module=$update diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh index bd261ac..41d69e4 100755 --- a/t/t7406-submodule-update.sh +++ b/t/t7406-submodule-update.sh @@ -209,9 +209,39 @@ test_expect_success 'submodule update --remote should fetch upstream changes' ' ) ' -test_expect_success 'local config should override .gitmodules branch' ' +test_expect_success 'submodule update --remote should fetch upstream changes with .' ' + (cd super && +git config -f .gitmodules submodule."submodule".branch "." && +git add .gitmodules && +git commit -m "submodules: update from the respective superproject branch" + ) && (cd submodule && +echo line4a >> file && +git add file && +test_tick && +git commit -m "upstream line4a" && git checkout -b test-branch && +test_commit on-test-branch + ) && + (cd super && +git submodule update --remote --force submodule && +(cd submodule && + test "$(git log -1 --oneline)" = "$(GIT_DIR=../../submodule/.git git log -1 --oneline master)" +) && +git checkout -b test-branch && +git submodule update --remote --force submodule && +(cd submodule && + test "$(git log -1 --oneline)" = "$(GIT_DIR=../../submodule/.git git log -1 --oneline test-branch)" +) && +git checkout master && +git branch -d test-branch && +git revert HEAD + ) +' + +test_expect_success 'local config should override .gitmodules branch' ' + (cd submodule && +git checkout test-branch && echo line5 >> file && git add file && test_tick && -- 2.9.2.466.g8c6d1f9.dirty -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Alternatives to mid.gmane.org?
On Thu, Jul 28, 2016 at 11:11 AM, Lars Schneiderwrote: > Hi, > > gmane is down I read this and thought "temporarily" but apparently it's not [1]. A lot of our links in the mail archive are gmane's :( [1] https://lars.ingebrigtsen.no/2016/07/28/the-end-of-gmane/ -- Duy -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC 0/7] Add possibility to clone specific subdirectories
Corrections.. On Thu, Jul 28, 2016 at 6:59 PM, Duy Nguyenwrote: > Ah.. this is what I call narrow checkout [1] (but gmane is down at the moment) s/checkout/clone/ > [2] https://github.com/pclouds/git/commits/lanh/narrow-checkout s,lanh/,, -- Duy -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC 0/7] Add possibility to clone specific subdirectories
On Thu, Jul 28, 2016 at 6:02 PM, Robin Ruedewrote: > This patch series adds a `--sparse-prefix=` option to multiple commands, > allowing fetching repository contents from only a subdirectory of a remote. > > This works along with sparse-checkout, and is especially useful for > repositories > where a subdirectory has meaning when standing alone. Ah.. this is what I call narrow checkout [1] (but gmane is down at the moment) [1] http://thread.gmane.org/gmane.comp.version-control.git/155427 > * Motivation (example use cases) > > ... nods nods.. all good stuff > * Open problems: > > 1. Currently all trees are still included. It would be possible to > include only the trees relevant to the sparse files, which would significantly > reduce the pack sizes for repositories containing a lot of small files > changing > often. For example package managers using git. Not sure in how many places all > trees are presumed present. You can limit some trees by passing a pathspec to "git rev-list" (in your "list-objects" patch). All trees completely outside sub/dir will be excluded. Trees leading to it (e.g. root tree and "sub") are still included. Not having all trees open up a new set of problems.. This is what I did in narrow clone: pass some directories (as pathspec) to rev-list on the server side, then deal with lack of trees on client side. > 2. This patch set implements it as a simple single prefix check command line > option. > Using the exclude_list format (same as in sparse-checkout) might be useful. > The server needs to check these patterns for all files in history, so I'm not > sure if allowing multiple/complex patterns is a good idea. I would go with something else than sparse-checkout, which I call narrow checkout: instead of flattening the entire tree in index and keep only files there, we keep trees that we don't have as trees. Those trees have the same "sparse checkout" attributes, e.g. ignore worktree and some of submodules e.g. don't bother checking the associated hash. This approach [2] eliminates changes in cache-tree.c (i.e. 3/7). And you would need something like that, when you don't have all the trees (from open problem 1), because you just can't flatten trees when you don't have them. [2] https://github.com/pclouds/git/commits/lanh/narrow-checkout (I think core functionality is in place, but narrow operation still needs more work) > 3. This patch set assumes the sparse-prefix and sparse-checkout does not > change. > running clone and fetch both need to have the --sparse-prefix= option, > otherwise > complete packs will be fetched. Not sure what the best way to store the > information is, possibly create a new file `.git/sparse` similar to > `.git/shallow` containing the path(s). Something like .git/shallow, yes. It's similar in nature anyway (shallow cuts depth, you cut the side) > 3. Bitmap indices cannot be used, because they do not contain the paths of the > objects. So for creating packs, the whole DAG has to be walked. And shallow clones have this same problem. Something to be sorted out :) > 4. Fsck complains about missing blobs. Should be fairly easy to fix. Not really. You'll have to associate path information with blobs before you decide that a blob should exist or not. Sparse patterns are just not designed for that (tree walking). If you narrow (heh) down to just path prefix not full blown sparse patterns, then it's feasible to walk tree and filter. A subset of pathspec would be good because we can already filter by pathspec, but I would not go full pathspec at the first step. > 5. Tests and documentation is missing. Personally I would go with my narrow clone approach, but the ability to selectively exclude some large blobs is still good, I think. However, another approach to excluding some blobs is the external object database [3]. It gives you what you need with a lot less code impact (but you will not be able to work offline 100% the time like what you can now with git) [3] https://public-inbox.org/git/20160613085546.11784-1-chriscool%40tuxfamily.org/ -- Duy -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git-testadd: Execute a command with only the staged changes in Git applied
On 28 July 2016 at 18:37, Junio C Hamanowrote: > Øyvind A. Holm writes: > > This is a script I created some weeks ago, and I've found it to be > > immensely useful. Here is a snippet from git-testadd --help: > > > > If you have lots of unrelated uncommitted changes in the current > > repository and want to split up the commit, how can you easily > > check if the changes passes the test suite? With all the other > > unrelated changes it can be hard to make sure that only relevant > > changes becomes part of the commit, and that they don't result in > > regressions. This script clones the repository to the directory > > ".testadd.tmp" in the current directory and applies the staged > > chenges there (unless -u/--unmodified or -p/--pristine is > > specified), chdirs to the same relative directory in the clone and > > executes the command specified on the command line there. > > So in short, this solves the same problem as "git stash --keep" but in > a more scalable way, in the sense that "git stash --keep" allows you > to instantiate what you have in the index so that your working tree > can be used for such a test, but you cannot do anything else while you > are waiting for the test to finish, and "testadd" allows you to keep > hacking in the working tree while a test runs in its own temporary > checkout (and presumably you can have more than one running, which > would allow you to scale more)? That's correct, the test clone is entirely separated from the working copy, and you can keep working while the tests are running in the clone. Combined with git-gui and/or "git add -p/git reset -p", it's easy to tweak the staged changes until things are ok. Also, there is a -l/--label option that creates a clone directory with the name ".testadd-[LABEL].tmp", so you can have several test clones at the same time, all with different staged changes. There is also a -r/--ref option that tries to apply the staged changes onto another commit, and the command will only run if the apply succeeds. Also, this won't create dangling heads like "git stash --keep" does. Øyvind -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv1] completion: add option '--recurse-submodules' to 'git clone'
Stefan Bellerwrites: >> Anyway, I'll apply the "addition to the completion" patch. >> >> Thanks. > > Thanks for this patch! > > Note: if we ever decide to resurrect sb/submodule-default-path, > we run into a merge conflict. The reasoning for using > "--recurse-submodules" instead of a plain "--recurse" makes sense > as well, so that merge conflict will be resolved in favor of this patch. Thanks for an advance warning. My rerere database has already been taught about this conflict ;-) As to sb/submodule-default-path topic, which has been blocked on still-in-flux attribute work, I am tempted to declare that the attribute work is not yet thread-ready but it is in a good enough shape to base other works on, and have them advance to 'next'. The traditional pattern of allowing the callers to randomly allocate an array of "struct git_attr_check" and passing the pointer to its first element to git_check_attr() function was impossible to extend without having to update the callers, but we have migrated away from the pattern and the attribute subsystem can be enhanced without impacting the callers too much. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/3] diff-highlight: add support for git log --graph output.
Hi, I've been working with Jeff King on this patch, but he encouraged me to email the list. I recently learned about the diff-highlight tool and find it very helpful, however, I frequently use the --graph option to git log which breaks the tool. This patch looks to fix this. I wanted to try and add some tests as well, but I was unsure what number to pick for the second digit. As there were limited tests in the contrib directory (only t93xx and one t7900) I just chose the next number in the 9xxx range. Please let me know if I need to change it. I'm also not super happy about my test case for the graph option. If anyone has any better ideas, please let me know! Brian Henderson (3): diff-highlight: add some tests. diff-highlight: add failing test for handling --graph output. diff-highlight: add support for --graph output. contrib/diff-highlight/Makefile | 5 + contrib/diff-highlight/diff-highlight| 13 +-- contrib/diff-highlight/t/Makefile| 19 contrib/diff-highlight/t/t9400-diff-highlight.sh | 76 contrib/diff-highlight/t/test-diff-highlight.sh | 111 +++ 5 files changed, 218 insertions(+), 6 deletions(-) create mode 100644 contrib/diff-highlight/Makefile create mode 100644 contrib/diff-highlight/t/Makefile create mode 100644 contrib/diff-highlight/t/t9400-diff-highlight.sh create mode 100644 contrib/diff-highlight/t/test-diff-highlight.sh -- 2.9.0 >From f2468205f5f43b17bfd0398959a1ae66588a8f0d Mon Sep 17 00:00:00 2001 From: Brian HendersonDate: Thu, 21 Jul 2016 09:43:54 -0700 Subject: [PATCH 1/3] diff-highlight: add some tests. --- contrib/diff-highlight/Makefile | 5 ++ contrib/diff-highlight/t/Makefile| 19 +++ contrib/diff-highlight/t/t9400-diff-highlight.sh | 63 ++ contrib/diff-highlight/t/test-diff-highlight.sh | 69 4 files changed, 156 insertions(+) create mode 100644 contrib/diff-highlight/Makefile create mode 100644 contrib/diff-highlight/t/Makefile create mode 100644 contrib/diff-highlight/t/t9400-diff-highlight.sh create mode 100644 contrib/diff-highlight/t/test-diff-highlight.sh diff --git a/contrib/diff-highlight/Makefile b/contrib/diff-highlight/Makefile new file mode 100644 index 000..b866259 --- /dev/null +++ b/contrib/diff-highlight/Makefile @@ -0,0 +1,5 @@ +# nothing to build +all:; + +test: + $(MAKE) -C t diff --git a/contrib/diff-highlight/t/Makefile b/contrib/diff-highlight/t/Makefile new file mode 100644 index 000..ac81fc0 --- /dev/null +++ b/contrib/diff-highlight/t/Makefile @@ -0,0 +1,19 @@ +-include ../../../config.mak.autogen +-include ../../../config.mak + +T = $(wildcard t[0-9][0-9][0-9][0-9]-*.sh) + +all: test +test: $(T) + +.PHONY: help clean all test $(T) + +help: + @echo 'Run "$(MAKE) test" to launch test scripts' + @echo 'Run "$(MAKE) clean" to remove trash folders' + +$(T): + @echo "*** $@ ***"; sh $@ $(GIT_TEST_OPTS) + +clean: + $(RM) -r 'trash directory'.* diff --git a/contrib/diff-highlight/t/t9400-diff-highlight.sh b/contrib/diff-highlight/t/t9400-diff-highlight.sh new file mode 100644 index 000..ca7605f --- /dev/null +++ b/contrib/diff-highlight/t/t9400-diff-highlight.sh @@ -0,0 +1,63 @@ +#!/bin/sh +# +# Copyright (C) 2016 + +test_description='Test diff-highlight' + +. ./test-diff-highlight.sh +. $TEST_DIRECTORY/test-lib.sh + +# PERL is required, but assumed to be present, although not necessarily modern +# some tests require 5.8 + +test_expect_success 'diff-highlight highlightes the beginning of a line' ' + dh_test \ +"aaa\nbbb\nccc\n" \ +"aaa\n0bb\nccc\n" \ +" + aaa +-${CW}b${CR}bb ++${CW}0${CR}bb + ccc +" +' + +test_expect_success 'diff-highlight highlightes the end of a line' ' + dh_test \ +"aaa\nbbb\nccc\n" \ +"aaa\nbb0\nccc\n" \ +" + aaa +-bb${CW}b${CR} ++bb${CW}0${CR} + ccc +" +' + +test_expect_success 'diff-highlight highlightes the middle of a line' ' + dh_test \ +"aaa\nbbb\nccc\n" \ +"aaa\nb0b\nccc\n" \ +" + aaa +-b${CW}b${CR}b ++b${CW}0${CR}b + ccc +" +' + +test_expect_success 'diff-highlight does not highlight whole line' ' + dh_test \ +"aaa\nbbb\nccc\n" \ +"aaa\n000\nccc\n" +' + +test_expect_success 'diff-highlight does not highlight mismatched hunk size' ' + dh_test \ +"aaa\nbbb\n" \ +"aaa\nb0b\nccc\n" +' + +# TODO add multi-byte test + +test_done diff --git a/contrib/diff-highlight/t/test-diff-highlight.sh b/contrib/diff-highlight/t/test-diff-highlight.sh new file mode 100644 index 000..9b26271 --- /dev/null +++ b/contrib/diff-highlight/t/test-diff-highlight.sh @@ -0,0 +1,69 @@ +#!/bin/sh +# +# Copyright (C) 2016 + +CURR_DIR=$(pwd) +TEST_OUTPUT_DIRECTORY=$(pwd) +TEST_DIRECTORY="$CURR_DIR"/../../../t +cmd=diff-highlight +CMD="$CURR_DIR"/../$cmd + +CW="\033[7m" +CR="\033[27m" + +export TEST_OUTPUT_DIRECTORY TEST_DIRECTORY CW CR + +dh_test() { + dh_diff_test "$@" +
Re: [PATCH 1/3] submodule-config: passing name reference for .gitmodule blobs
On Thu, Jul 28, 2016 at 5:49 AM, Heiko Voigtwrote: > Commit 959b5455 (submodule: implement a config API for lookup of > .gitmodules values, 2015-08-18) implemented the initial version of the > submodule config cache. During development of that initial version we > extracted the function gitmodule_sha1_from_commit(). During that process > we missed that the strbuf rev was still used in config_from() and now is > left empty. Lets fix this by also returning this string. > > This means that now when reading .gitmodules from revisions, the error > messages also contain a reference to the blob they are from. > > Signed-off-by: Heiko Voigt > --- > Here you go. Now including a test. All 3 patches look good to me, thanks! > > +test_expect_success 'error message contains blob reference' ' > + (cd super && > + sha1=$(git rev-parse HEAD) && > + test-submodule-config \ > + HEAD b \ > + HEAD submodule \ > + 2>actual_err && > + grep "submodule-blob $sha1:.gitmodules" actual_err >/dev/null Makes sense! -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
git-testadd: Execute a command with only the staged changes in Git applied
This is a script I created some weeks ago, and I've found it to be immensely useful. Here is a snippet from git-testadd --help: If you have lots of unrelated uncommitted changes in the current repository and want to split up the commit, how can you easily check if the changes passes the test suite? With all the other unrelated changes it can be hard to make sure that only relevant changes becomes part of the commit, and that they don't result in regressions. This script clones the repository to the directory ".testadd.tmp" in the current directory and applies the staged chenges there (unless -u/--unmodified or -p/--pristine is specified), chdirs to the same relative directory in the clone and executes the command specified on the command line there. The script is well-tested, and also have a test suite you can run to make sure it works on your *nix system. Place git-testadd.t in a subdirectory one level under the script location, chdir to that directory and execute "./git-testadd.t". It also works with binary files. Available from https://gitlab.com/sunny256/utils/raw/master/git-testadd https://gitlab.com/sunny256/utils/raw/master/tests/git-testadd.t It's also on GitHub, just replace "gitlab" with "github" in the URLs. And of course, ideas and patches for new functionality/fixes are always welcome. Øyvind -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] i18n: config: unfold error messages marked for translation
On Thu, Jul 28, 2016 at 01:14:03PM +, Vasco Almeida wrote: > static void die_bad_number(const char *name, const char *value) > { > - const char *reason = errno == ERANGE ? > - "out of range" : > - "invalid unit"; > if (!value) > value = ""; > > - if (cf && cf->origin_type && cf->name) > - die(_("bad numeric config value '%s' for '%s' in %s %s: %s"), > - value, name, cf->origin_type, cf->name, reason); > - die(_("bad numeric config value '%s' for '%s': %s"), value, name, > reason); > + if (!(cf && cf->name)) > + die(errno == ERANGE > + ? _("bad numeric config value '%s' for '%s': out of range") > + : _("bad numeric config value '%s' for '%s': invalid unit"), > + value, name); > + > + switch (cf->origin_type) { > + case CONFIG_ORIGIN_BLOB: > + die(errno == ERANGE > + ? _("bad numeric config value '%s' for '%s' in blob %s: out > of range") > + : _("bad numeric config value '%s' for '%s' in blob %s: > invalid unit"), > + value, name, cf->name); I get that the point of this patch is to make more complete sentences for translation, rather than the lego-brick construction. And that makes sense to me. But it seems like the ":" is a natural separator, and it would be OK to make these: die(_("bad numeric config value '%s' for '%s' in blob %s: %s"), value, name, cf->name, reason); instead of having two separate strings for the errno values. After all, that's what we do everywhere else where "reason" is supplied by strerror(). It's just in this case that there is no errno value matching "invalid unit", so we have to fill it in ourselves. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] i18n: config: unfold error messages marked for translation
Vasco Almeidawrites: > Unfold the message into several templates for each known origin_type. > That would result in better translation at the expense of code > verbosity. Looks good now, except one minor nit I'll tweak out while queuing. > if (cf->die_on_error) > - die(_("bad config line %d in %s %s"), cf->linenr, > cf->origin_type, cf->name); > + die("%s", error_msg); > else > - return error(_("bad config line %d in %s %s"), cf->linenr, > cf->origin_type, cf->name); > + error_return = error("%s", error_msg); s/ = / = /; Thanks. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4] i18n: notes: mark comment for translation
Vasco Almeidawrites: > Mark comment displayed when editing a note for translation. > > Signed-off-by: Vasco Almeida > --- > > This patch follows the original output and Ævar Arnfjörð Bjarmason > sugestion to remove \n from the source string in order to assure that the > ouput layout is not change by one translator forgetting to add \n, for > instance. Well, that cuts both ways. A translater adding an extra \n would also break the layout, so I am not convinced that is a very good justification. As a parameter to strbuf_add_commented_lines(), an extra or a missing \n does not really matter, though, because the whole thing is a line-oriented comment ;-) As to the patch text, it looks like it would produce more correct output than what I queued tentatively on 'pu', so I'd replace it with this one. > builtin/notes.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/builtin/notes.c b/builtin/notes.c > index 0572051..f848b89 100644 > --- a/builtin/notes.c > +++ b/builtin/notes.c > @@ -91,7 +91,7 @@ static const char * const git_notes_get_ref_usage[] = { > }; > > static const char note_template[] = > - "\nWrite/edit the notes for the following object:\n"; > + N_("Write/edit the notes for the following object:"); > > struct note_data { > int given; > @@ -179,7 +179,8 @@ static void prepare_note_data(const unsigned char > *object, struct note_data *d, > copy_obj_to_fd(fd, old_note); > > strbuf_addch(, '\n'); > - strbuf_add_commented_lines(, note_template, > strlen(note_template)); > + strbuf_add_commented_lines(, "\n", strlen("\n")); > + strbuf_add_commented_lines(, _(note_template), > strlen(_(note_template))); > strbuf_addch(, '\n'); > write_or_die(fd, buf.buf, buf.len); -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH/RFC 5/7] fetch: add sparse-prefix option
also pass sparse-prefix option from fetch to rev-list while checking connectivity Signed-off-by: Robin Ruede--- builtin/fetch.c | 19 ++- connected.c | 7 ++- transport.c | 4 transport.h | 4 4 files changed, 28 insertions(+), 6 deletions(-) diff --git a/builtin/fetch.c b/builtin/fetch.c index acd0cf1..b48537f 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -41,6 +41,7 @@ static int tags = TAGS_DEFAULT, unshallow, update_shallow; static int max_children = -1; static enum transport_family family; static const char *depth; +static const char *sparse_prefix; static const char *upload_pack; static struct strbuf default_rla = STRBUF_INIT; static struct transport *gtransport; @@ -117,6 +118,8 @@ static struct option builtin_fetch_options[] = { OPT_BOOL(0, "progress", , N_("force progress reporting")), OPT_STRING(0, "depth", , N_("depth"), N_("deepen history of shallow clone")), + OPT_STRING(0, "sparse-prefix", _prefix, N_("path-prefix"), + N_("only fetch blobs for the specified path-prefix")), { OPTION_SET_INT, 0, "unshallow", , NULL, N_("convert to a complete repository"), PARSE_OPT_NONEG | PARSE_OPT_NOARG, NULL, 1 }, @@ -706,9 +709,11 @@ static int iterate_ref_map(void *cb_data, unsigned char sha1[20]) return 0; } -static int store_updated_refs(const char *raw_url, const char *remote_name, +static int store_updated_refs(struct transport *transport, struct ref *ref_map) { + const char *raw_url = transport->url; + const char *remote_name = transport->remote->name; FILE *fp; struct commit *commit; int url_len, i, rc = 0; @@ -729,7 +734,8 @@ static int store_updated_refs(const char *raw_url, const char *remote_name, url = xstrdup("foreign"); rm = ref_map; - if (check_everything_connected(iterate_ref_map, 0, )) { + if (check_everything_connected_with_transport(iterate_ref_map, 0, , + transport)) { rc = error(_("%s did not send all necessary objects\n"), url); goto abort; } @@ -885,9 +891,7 @@ static int fetch_refs(struct transport *transport, struct ref *ref_map) if (ret) ret = transport_fetch_refs(transport, ref_map); if (!ret) - ret |= store_updated_refs(transport->url, - transport->remote->name, - ref_map); + ret |= store_updated_refs(transport, ref_map); transport_unlock_pack(transport); return ret; } @@ -993,6 +997,11 @@ static struct transport *prepare_transport(struct remote *remote) set_option(transport, TRANS_OPT_KEEP, "yes"); if (depth) set_option(transport, TRANS_OPT_DEPTH, depth); + if (sparse_prefix) { + if(sparse_prefix[0] != '/') + die(N_("sparse prefix must start with /")); + set_option(transport, TRANS_OPT_SPARSE_PREFIX, sparse_prefix); + } if (update_shallow) set_option(transport, TRANS_OPT_UPDATE_SHALLOW, "yes"); return transport; diff --git a/connected.c b/connected.c index bf1b12e..1534c5c 100644 --- a/connected.c +++ b/connected.c @@ -26,7 +26,7 @@ static int check_everything_connected_real(sha1_iterate_fn fn, const char *shallow_file) { struct child_process rev_list = CHILD_PROCESS_INIT; - const char *argv[9]; + const char *argv[11]; char commit[41]; unsigned char sha1[20]; int err = 0, ac = 0; @@ -56,6 +56,11 @@ static int check_everything_connected_real(sha1_iterate_fn fn, argv[ac++] = "--stdin"; argv[ac++] = "--not"; argv[ac++] = "--all"; + if(transport && transport->smart_options && + transport->smart_options->sparse_prefix) { + argv[ac++] = "--sparse-prefix"; + argv[ac++] = transport->smart_options->sparse_prefix; + } if (quiet) argv[ac++] = "--quiet"; argv[ac] = NULL; diff --git a/transport.c b/transport.c index b233e3e..ce7e2e1 100644 --- a/transport.c +++ b/transport.c @@ -141,6 +141,9 @@ static int set_git_option(struct git_transport_options *opts, } else if (!strcmp(name, TRANS_OPT_UPDATE_SHALLOW)) { opts->update_shallow = !!value; return 0; + } else if (!strcmp(name, TRANS_OPT_SPARSE_PREFIX)) { + opts->sparse_prefix = value; + return 0; } else if (!strcmp(name, TRANS_OPT_DEPTH)) { if (!value) opts->depth = 0; @@ -211,6 +214,7 @@ static int fetch_refs_via_pack(struct transport *transport, args.quiet = (transport->verbose < 0);
[PATCH/RFC 4/7] fetch-pack: add sparse prefix to smart protocol
For example git fetch-pack --sparse-prefix=/contrib/ origin HEAD Should fetch a pack that is generated on the remote by echo HEAD | git pack-objects --revs --stdout --sparse-prefix=/contrib/ Signed-off-by: Robin Ruede--- builtin/fetch-pack.c | 6 ++ fetch-pack.c | 4 fetch-pack.h | 1 + remote-curl.c| 2 ++ upload-pack.c| 15 ++- 5 files changed, 27 insertions(+), 1 deletion(-) diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c index bfd0be4..7f10001 100644 --- a/builtin/fetch-pack.c +++ b/builtin/fetch-pack.c @@ -64,6 +64,12 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix) args.uploadpack = arg + 14; continue; } + if (starts_with(arg, "--sparse-prefix=")) { + args.sparse_prefix = arg + 16; + if(args.sparse_prefix[0] != '/') + die(N_("sparse prefix must start with /")); + continue; + } if (starts_with(arg, "--exec=")) { args.uploadpack = arg + 7; continue; diff --git a/fetch-pack.c b/fetch-pack.c index b501d5c..8571b02 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -326,6 +326,8 @@ static int find_common(struct fetch_pack_args *args, return 1; } + if(args->sparse_prefix) + packet_buf_write(_buf, "sparse-prefix %s", args->sparse_prefix); if (is_repository_shallow()) write_shallow_commits(_buf, 1, NULL); if (args->depth > 0) @@ -811,6 +813,8 @@ static struct ref *do_fetch_pack(struct fetch_pack_args *args, if ((args->depth > 0 || is_repository_shallow()) && !server_supports("shallow")) die("Server does not support shallow clients"); + if (args->sparse_prefix && !server_supports("sparse-prefix")) + die("Server does not support sparse prefix"); if (server_supports("multi_ack_detailed")) { if (args->verbose) fprintf(stderr, "Server supports multi_ack_detailed\n"); diff --git a/fetch-pack.h b/fetch-pack.h index bb7fd76..8f36ef4 100644 --- a/fetch-pack.h +++ b/fetch-pack.h @@ -8,6 +8,7 @@ struct sha1_array; struct fetch_pack_args { const char *uploadpack; + const char *sparse_prefix; int unpacklimit; int depth; unsigned quiet:1; diff --git a/remote-curl.c b/remote-curl.c index 6b83b77..e181e62 100644 --- a/remote-curl.c +++ b/remote-curl.c @@ -727,6 +727,8 @@ static int fetch_dumb(int nr_heads, struct ref **to_fetch) ALLOC_ARRAY(targets, nr_heads); if (options.depth) die("dumb http transport does not support --depth"); + if (options.sparse_prefix) + die("dumb http transport does not support --sparse-prefix"); for (i = 0; i < nr_heads; i++) targets[i] = xstrdup(oid_to_hex(_fetch[i]->old_oid)); diff --git a/upload-pack.c b/upload-pack.c index d4cc414..56d8c1a 100644 --- a/upload-pack.c +++ b/upload-pack.c @@ -57,6 +57,7 @@ static int use_sideband; static int advertise_refs; static int stateless_rpc; static const char *pack_objects_hook; +static char *sparse_prefix; static void reset_timeout(void) { @@ -125,6 +126,12 @@ static void create_pack_file(void) argv_array_push(_objects.args, "--delta-base-offset"); if (use_include_tag) argv_array_push(_objects.args, "--include-tag"); + if (sparse_prefix) { + argv_array_push(_objects.args, "--sparse-prefix"); + argv_array_push(_objects.args, sparse_prefix); + free(sparse_prefix); + sparse_prefix = NULL; + } pack_objects.in = -1; pack_objects.out = -1; @@ -582,6 +589,12 @@ static void receive_needs(void) die("Invalid deepen: %s", line); continue; } + if (starts_with(line, "sparse-prefix ")) { + if(sparse_prefix) + die("Only single sparse-prefix is allowed"); + sparse_prefix = xstrdup(line + 14); + continue; + } if (!starts_with(line, "want ") || get_sha1_hex(line+5, sha1_buf)) die("git upload-pack: protocol error, " @@ -730,7 +743,7 @@ static int send_ref(const char *refname, const struct object_id *oid, { static const char *capabilities = "multi_ack thin-pack side-band" " side-band-64k ofs-delta shallow no-progress" - " include-tag multi_ack_detailed"; + " include-tag multi_ack_detailed sparse-prefix"; const char *refname_nons = strip_namespace(refname); struct
[PATCH/RFC 7/7] remote-curl: add sparse prefix
Signed-off-by: Robin Ruede--- remote-curl.c | 15 ++- 1 file changed, 14 insertions(+), 1 deletion(-) diff --git a/remote-curl.c b/remote-curl.c index e181e62..b9f7cf1 100644 --- a/remote-curl.c +++ b/remote-curl.c @@ -20,6 +20,7 @@ static struct strbuf url = STRBUF_INIT; struct options { int verbosity; unsigned long depth; + const char *sparse_prefix; unsigned progress : 1, check_self_contained_and_connected : 1, cloning : 1, @@ -60,6 +61,10 @@ static int set_option(const char *name, const char *value) options.depth = v; return 0; } + else if (!strcmp(name, "sparse-prefix")) { + options.sparse_prefix = xstrdup(value); + return 0; + } else if (!strcmp(name, "followtags")) { if (!strcmp(value, "true")) options.followtags = 1; @@ -754,8 +759,9 @@ static int fetch_git(struct discovery *heads, struct rpc_state rpc; struct strbuf preamble = STRBUF_INIT; char *depth_arg = NULL; + char *sparse_arg = NULL; int argc = 0, i, err; - const char *argv[17]; + const char *argv[19]; argv[argc++] = "fetch-pack"; argv[argc++] = "--stateless-rpc"; @@ -783,6 +789,12 @@ static int fetch_git(struct discovery *heads, depth_arg = strbuf_detach(, NULL); argv[argc++] = depth_arg; } + if (options.sparse_prefix) { + struct strbuf buf = STRBUF_INIT; + strbuf_addf(, "--sparse-prefix=%s", options.sparse_prefix); + sparse_arg = strbuf_detach(, NULL); + argv[argc++] = sparse_arg; + } argv[argc++] = url.buf; argv[argc++] = NULL; @@ -807,6 +819,7 @@ static int fetch_git(struct discovery *heads, strbuf_release(); strbuf_release(); free(depth_arg); + free(sparse_arg); return err; } -- 2.9.1.283.g3ca5b4c.dirty -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH/RFC 6/7] clone: add sparse-prefix option
For example git clone git@remote:repo --sparse-prefix=/contrib/ would create a repository, cloning only the relevant files for the contrib subdirectory, and also setting the sparse-checkout option, so only those files will be checked out. Signed-off-by: Robin Ruede--- builtin/clone.c | 27 --- 1 file changed, 24 insertions(+), 3 deletions(-) diff --git a/builtin/clone.c b/builtin/clone.c index 31ea247..dc0d364 100644 --- a/builtin/clone.c +++ b/builtin/clone.c @@ -41,7 +41,7 @@ static const char * const builtin_clone_usage[] = { static int option_no_checkout, option_bare, option_mirror, option_single_branch = -1; static int option_local = -1, option_no_hardlinks, option_shared, option_recursive; static int option_shallow_submodules; -static char *option_template, *option_depth; +static char *option_template, *option_depth, *option_sparse_prefix; static char *option_origin = NULL; static char *option_branch = NULL; static const char *real_git_dir; @@ -91,6 +91,8 @@ static struct option builtin_clone_options[] = { N_("path to git-upload-pack on the remote")), OPT_STRING(0, "depth", _depth, N_("depth"), N_("create a shallow clone of that depth")), + OPT_STRING(0, "sparse-prefix", _sparse_prefix, N_("path-prefix"), + N_("only fetch the blobs for the specified path-prefix")), OPT_BOOL(0, "single-branch", _single_branch, N_("clone only one branch, HEAD or --branch")), OPT_BOOL(0, "shallow-submodules", _shallow_submodules, @@ -959,9 +961,22 @@ int cmd_clone(int argc, const char **argv, const char *prefix) } init_db(option_template, INIT_DB_QUIET); write_config(_config); - + if (option_sparse_prefix) { + FILE *f; + const char *name; + if (option_sparse_prefix[0] != '/') + die(N_("sparse prefix must start with /")); + name = mkpath("%s/info/sparse-checkout", git_dir); + git_config_set("core.sparsecheckout", "true"); + safe_create_leading_directories_const(name); + f = fopen(name, "w"); + if(f == NULL) die("Could not open %s", name); + fprintf(f, "%s\n", option_sparse_prefix); + fclose(f); + } git_config(git_default_config, NULL); + if (option_bare) { if (option_mirror) src_ref_prefix = "refs/"; @@ -977,6 +992,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix) git_config_set(key.buf, repo); strbuf_reset(); + if (option_reference.nr) setup_reference(); @@ -995,6 +1011,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix) if (is_local) { if (option_depth) warning(_("--depth is ignored in local clones; use file:// instead.")); + if (option_sparse_prefix) + warning(_("--sparse-prefix is ignored in local clones; use file:// instead.")); if (!access(mkpath("%s/shallow", path), F_OK)) { if (option_local > 0) warning(_("source repository is shallow, ignoring --local")); @@ -1013,6 +1031,9 @@ int cmd_clone(int argc, const char **argv, const char *prefix) if (option_depth) transport_set_option(transport, TRANS_OPT_DEPTH, option_depth); + if (option_sparse_prefix) + transport_set_option(transport, TRANS_OPT_SPARSE_PREFIX, +option_sparse_prefix); if (option_single_branch) transport_set_option(transport, TRANS_OPT_FOLLOWTAGS, "1"); @@ -1020,7 +1041,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix) transport_set_option(transport, TRANS_OPT_UPLOADPACK, option_upload_pack); - if (transport->smart_options && !option_depth) + if (transport->smart_options && !option_depth && !option_sparse_prefix) transport->smart_options->check_self_contained_and_connected = 1; refs = transport_get_remote_refs(transport); -- 2.9.1.283.g3ca5b4c.dirty -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH/RFC 3/7] Skip checking integrity of files ignored by sparse
(this might be wrong, not sure if this is the right place) Signed-off-by: Robin Ruede--- cache-tree.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/cache-tree.c b/cache-tree.c index f28b1f4..ab01ae5 100644 --- a/cache-tree.c +++ b/cache-tree.c @@ -354,7 +354,8 @@ static int update_one(struct cache_tree *it, entlen = pathlen - baselen; i++; } - if (mode != S_IFGITLINK && !missing_ok && !has_sha1_file(sha1)) { + if (!ce_skip_worktree(ce) && mode != S_IFGITLINK + && !missing_ok && !has_sha1_file(sha1)) { strbuf_release(); if (expected_missing) return -1; -- 2.9.1.283.g3ca5b4c.dirty -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH/RFC 2/7] pack-objects: add sparse-prefix
this allows creating packs that contain only the blobs relevant to a specific subdirectory, e.g. echo HEAD | git pack-objects --revs --sparse-prefix=/contrib/ will create a pack containing the complete history of HEAD, including all commits, all trees, and the files in the contrib directory. Signed-off-by: Robin Ruede--- builtin/pack-objects.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index a2f8cfd..0674f57 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -2621,6 +2621,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) struct argv_array rp = ARGV_ARRAY_INIT; int rev_list_unpacked = 0, rev_list_all = 0, rev_list_reflog = 0; int rev_list_index = 0; + const char *sparse_prefix = NULL; + struct option pack_objects_options[] = { OPT_SET_INT('q', "quiet", , N_("do not show progress meter"), 0), @@ -2685,6 +2687,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) N_("create thin packs")), OPT_BOOL(0, "shallow", , N_("create packs suitable for shallow fetches")), + OPT_STRING(0, "sparse-prefix", _prefix, N_("path"), + N_("only include blobs relevant for sparse checkout (implies --revs)")), OPT_BOOL(0, "honor-pack-keep", _packed_keep, N_("ignore packs that have companion .keep file")), OPT_INTEGER(0, "compression", _compression_level, @@ -2725,6 +2729,13 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) } else argv_array_push(, "--objects"); + if (sparse_prefix) { + use_internal_rev_list = 1; + // with bitmaps the path of the blobs is not known + use_bitmap_index = 0; + argv_array_push(, "--sparse-prefix"); + argv_array_push(, sparse_prefix); + } if (rev_list_all) { use_internal_rev_list = 1; argv_array_push(, "--all"); -- 2.9.1.283.g3ca5b4c.dirty -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH/RFC 1/7] list-objects: add sparse-prefix option to rev_info
this skips all blob objects who's path does not begin with the specified prefix Signed-off-by: Robin Ruede--- list-objects.c | 4 +++- revision.c | 4 revision.h | 1 + 3 files changed, 8 insertions(+), 1 deletion(-) diff --git a/list-objects.c b/list-objects.c index f3ca6aa..91a6091 100644 --- a/list-objects.c +++ b/list-objects.c @@ -28,7 +28,9 @@ static void process_blob(struct rev_info *revs, pathlen = path->len; strbuf_addstr(path, name); - show(obj, path->buf, cb_data); + if (!revs->sparse_prefix || starts_with(path->buf, revs->sparse_prefix + 1)) { + show(obj, path->buf, cb_data); + } strbuf_setlen(path, pathlen); } diff --git a/revision.c b/revision.c index edba5b7..a36a796 100644 --- a/revision.c +++ b/revision.c @@ -1664,6 +1664,10 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg } else if ((argcount = parse_long_opt("skip", argv, ))) { revs->skip_count = atoi(optarg); return argcount; + } else if ((argcount = parse_long_opt("sparse-prefix", argv, ))) { + if(optarg[0] != '/') return error(N_("sparse prefix must start with /")); + revs->sparse_prefix = optarg; + return argcount; } else if ((*arg == '-') && isdigit(arg[1])) { /* accept -, like traditional "head" */ if (strtol_i(arg + 1, 10, >max_count) < 0 || diff --git a/revision.h b/revision.h index 9fac1a6..2c7c5f2 100644 --- a/revision.h +++ b/revision.h @@ -113,6 +113,7 @@ struct rev_info { ancestry_path:1, first_parent_only:1, line_level_traverse:1; + const char *sparse_prefix; /* Diff flags */ unsigned intdiff:1, -- 2.9.1.283.g3ca5b4c.dirty -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH/RFC 0/7] Add possibility to clone specific subdirectories
This patch series adds a `--sparse-prefix=` option to multiple commands, allowing fetching repository contents from only a subdirectory of a remote. This works along with sparse-checkout, and is especially useful for repositories where a subdirectory has meaning when standing alone. * Motivation (example use cases) 1. Git repositories used for managing large/binary files My university has a repository containing lecture slides etc. as pdfs, with a subdirectory for each lecture. The bandwith for getting the whole repository (even with --depth=1) is 4GiB with significant processing time, getting the complete history of a single lecture uses 25MiB and completes instantly. 2. package-manager-like repositories. Examples: a) Arch Linux package build files repository [1] b) Rust crates.io packages [2] c) TypeScript type definitions [3] 3. Excluding a specific directory containing e.g. large binary assets Not currently possible with this patch set, but could be added (see problem 2 below). 4. Getting the history of a single file 5. Other uses As a non kernel developer, I wanted to quickly search through the code of only the btrfs filesystem using the git tools, but I do not have a local clone of the complete repository. Using `--depth=100` in combination with `--sparse-prefix=/fs/btrfs` allows me to have little bandwidth usage while still retaining some history. 6. This is trivial in SVN, and searching on the internet, there are multiple questions about this feature [4-7] * Examples usage: Getting the source of the btrfs filesystem with a bit of history: $ git clone git@server:linux --depth=100 # shallow, not sparse Receiving objects: 100% (814945/814945), 438.55 MiB | 35.21 MiB/s, done. ... $ git clone git@server:linux --depth=100 --sparse-prefix=/fs/btrfs # sparse and shallow Receiving objects: 100% (503747/503747), 121.45 MiB | 59.75 MiB/s, done. ... $ cd linux && ls ./ fs $ ls fs/ btrfs $ git log --oneline (repo behaves the same as a full clone with sparse-checkout /fs/btrfs) * Open problems: 1. Currently all trees are still included. It would be possible to include only the trees relevant to the sparse files, which would significantly reduce the pack sizes for repositories containing a lot of small files changing often. For example package managers using git. Not sure in how many places all trees are presumed present. 2. This patch set implements it as a simple single prefix check command line option. Using the exclude_list format (same as in sparse-checkout) might be useful. The server needs to check these patterns for all files in history, so I'm not sure if allowing multiple/complex patterns is a good idea. 3. This patch set assumes the sparse-prefix and sparse-checkout does not change. running clone and fetch both need to have the --sparse-prefix= option, otherwise complete packs will be fetched. Not sure what the best way to store the information is, possibly create a new file `.git/sparse` similar to `.git/shallow` containing the path(s). 3. Bitmap indices cannot be used, because they do not contain the paths of the objects. So for creating packs, the whole DAG has to be walked. 4. Fsck complains about missing blobs. Should be fairly easy to fix. 5. Tests and documentation is missing. [1]: https://git.archlinux.org/svntogit/packages.git/ [2]: https://github.com/rust-lang/crates.io-index [3]: https://github.com/DefinitelyTyped/DefinitelyTyped [4]: https://stackoverflow.com/questions/600079/is-there-any-way-to-clone-a-git-repositorys-sub-directory-only [5]: https://stackoverflow.com/questions/11834386/cloning-only-a-subdirectory-with-git [6]: https://askubuntu.com/questions/460885/how-to-clone-git-repository-only-some-directories [7]: https://coderwall.com/p/o2fasg/how-to-download-a-project-subdirectory-from-github Robin Ruede (7): list-objects: add sparse-prefix option to rev_info pack-objects: add sparse-prefix Skip checking integrity of files ignored by sparse fetch-pack: add sparse prefix to smart protocol fetch: add sparse-prefix option clone: add sparse-prefix option remote-curl: add sparse prefix builtin/clone.c| 27 --- builtin/fetch-pack.c | 6 ++ builtin/fetch.c| 19 ++- builtin/pack-objects.c | 11 +++ cache-tree.c | 3 ++- connected.c| 7 ++- fetch-pack.c | 4 fetch-pack.h | 1 + list-objects.c | 4 +++- remote-curl.c | 17 - revision.c | 4 revision.h | 1 + transport.c| 4 transport.h| 4 upload-pack.c | 15 ++- 15 files changed, 114 insertions(+), 13 deletions(-) -- 2.9.1.283.g3ca5b4c.dirty -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info
Re: [PATCH] document how to reference previous commits
On Thu, Jul 28, 2016 at 8:38 AM, Junio C Hamanowrote: > Heiko Voigt writes: > >> @@ -121,6 +121,11 @@ its behaviour. Try to make sure your explanation can >> be understood >> without external resources. Instead of giving a URL to a mailing list >> archive, summarize the relevant points of the discussion. >> >> +If you want to reference a previous commit in the history of a stable >> +branch use the format "abbreviated sha1 (subject, date)". So for example >> +like this: "Commit f86a374 (pack-bitmap.c: fix a memleak, 2015-03-30) >> +noticed [...]". >> + > Thanks! -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] document how to reference previous commits
Heiko Voigtwrites: > @@ -121,6 +121,11 @@ its behaviour. Try to make sure your explanation can be > understood > without external resources. Instead of giving a URL to a mailing list > archive, summarize the relevant points of the discussion. > > +If you want to reference a previous commit in the history of a stable > +branch use the format "abbreviated sha1 (subject, date)". So for example > +like this: "Commit f86a374 (pack-bitmap.c: fix a memleak, 2015-03-30) > +noticed [...]". > + I earlier said that our goal should not be to spell out all the conventions, which would lead to unreadably long document, but this does not look too bad. Small additions however tend to accumulate over time, though ;-) I wondered if we need to also say "why", but we probably shouldn't. Thanks. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Find a topic branch containing a commit
Michael Haggertywrites: > On 07/27/2016 08:03 PM, Duy Nguyen wrote: >>> >>> https://github.com/mhagger/git-when-merged ? >> >> Beautiful. If it had an option to show a topic (i.e. git-log from >> merge base to merge point) I would be ecstatic. > > That's a good idea. I just created a pull request to add that feature: > > https://github.com/mhagger/git-when-merged/pull/13 > > Let me know what you think! It would be nice to have a mode to give the result in a more human readable way. I use when-merged to find where to queue a follow-on fix to a topic that has already graduated to 'master'. When the patch fixes an earlier mistake of the commit X, "git when-merged X master" gives me the merge's 40-hex object name, and the next step is almost always to pass it to "git describe [--contains]". This tells me down to which maintenace track the fix can go. Of course I can get that information by running a separate "git describe [--contains] X", but when I'll be running "when-merged" anyway, it is a wasted step. Then "git show" of the found merge tells me where the tip of the topic is and what the topic was called (older topic branches are culled sometime after they are merged to 'master' and relevant maintenance tracks), and allows me to queue the patch to the correct place (after re-creating the topic if necessary). -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] fix passing a name for config from submodules
Heiko Voigtwrites: >> Apparently we put the subject first and then the date. I always did it >> the other way >> round, to there is no strict coding guide line,... Please don't say "this is not spelled out in the guidelines, so I can do whatever I like, *EVEN* *THOUGH* I know that is different from what most others do". Without the "follow the best-current-practice and be consistent", which I think is in an early part of the guideline document (and if not, it should be spelled out as the most important rule), the guidelines document will become too long that nobody would read. >> understanding for a) how long are we in the "broken" state already as well as >> b) what was the rationale for introducing it. > > Ah ok did not know about this format. Will change that. In other words, In commit 959b5455 we implemented the initial version of ... would become something like: 959b5455 (submodule: implement a config API for lookup of .gitmodules values, 2015-08-17) implemented the initial version of ... Thanks. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Alternatives to mid.gmane.org?
On Thu, Jul 28, 2016 at 11:11:54AM +0200, Lars Schneider wrote: > gmane is down and I wonder if there are alternatives to > find a message based on the message ID in the Git > mailing list archive? Try: https://public-inbox.org/git/20160710004813.ga20...@dcvr.yhbt.net -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Find a topic branch containing a commit
On 07/27/2016 08:03 PM, Duy Nguyen wrote: > On Wed, Jul 27, 2016 at 7:50 PM, Stefan Bellerwrote: >> On Wed, Jul 27, 2016 at 10:42 AM, Duy Nguyen wrote: >>> Before I start doing anything silly because I don't know it can >>> already be done without waving my C wand like a mad man... >>> >>> I often do this: find a commit of interest, the commit itself is not >>> enough so I need a full patch series to figure out what's going, so I >>> fire up "git log --graph --oneline" and manually search that commit >>> and trace back to the merge point, then I can "git log --patch". Is >>> there an automatic way to accomplish that? Something like "git branch >>> --contains" (or "git merge --contains")? >> >> https://github.com/mhagger/git-when-merged ? > > Beautiful. If it had an option to show a topic (i.e. git-log from > merge base to merge point) I would be ecstatic. That's a good idea. I just created a pull request to add that feature: https://github.com/mhagger/git-when-merged/pull/13 Let me know what you think! > Michael, any plans on > bringing this in C Git? For many topic-based projects this would be > very helpful addition, and I think it's not so hard to do it in C > either. I had no such plans, and I don't have time to work on that now. But I certainly don't object if somebody else wants to work on it. It might make a nice GSoC-sized project (though I'm not volunteering to be a mentor at this time :-P ). Michael -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 5/5] convert: add filter..process option
On Thu, Jul 28, 2016 at 02:10:12PM +0200, Lars Schneider wrote: > > I think that's orthogonal. I just mean that using zero for success puts > > you in our usual style, and then accumulating errors can be done with > > "|=". > > Ah. I guess I was misguided by the way errors are currently handled > in `apply_filter` (success = 1; failure = 0): > https://github.com/git/git/blob/8c6d1f9807c67532e7fb545a944b064faff0f70b/convert.c#L437-L479 > > I wouldn't like if the different filter protocols would use different > error exit codes. Would it be OK to adjust the existing `apply_filter` > function in a cleanup patch? Ah, I see. I think those return codes are a little different. They are not "success or error", but "did convert or did not convert" (or "would convert" when no buffer is given). And unless the filter is required, we quietly turn errors into "did not convert" (and if it is, we die()). So I'm not sure if changing them is a good idea. I agree with you that it's probably inviting confusion to have the two sets of filter functions have opposite return codes. So I think I retract my suggestion. :) -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 0/5] Git filter protocol
On Thu, Jul 28, 2016 at 09:16:18AM +0200, Lars Schneider wrote: > But Peff ($gmane/299902), Duy, and Eric, seemed to prefer the pkt-line > solution (gmane is down - otherwise I would have given you the links). FWIW, I think there are arguments for transmitting size + content (namely, that it is simpler); the downside is that it doesn't allow streaming. So I think there are two viable alternatives: 1. Total size of data in ASCII decimal, newline, then that many bytes of content. 2. No size header, then a series of pkt-lines followed by a flush packet. And you should choose between the two based on whether it's more important to allow streaming, or more important to make the filter implementations simple[1]. Any solution that is in between those (like sending a size header and then using pktlines anyway) is sacrificing simplicity but not getting the streaming benefits. -Peff [1] I haven't thought hard enough about it to have a real opinion. My gut says to go with the streaming, just because we've had to retrofit streaming in other areas when dealing with blobs, so I think we'll end up there eventually. So choosing a simpler protocol like (1) would probably mean eventually implementing a next-version protocol that does (2), and having to support both. PS Jakub asked for links, but gmane is down. Here are the relevant threads: http://public-inbox.org/git/20160720134916.gb19...@sigill.intra.peff.net http://public-inbox.org/git/20160722154900.19477-1-larsxschneider%40gmail.com/t/#u -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2] i18n: config: unfold error messages marked for translation
Introduced in 473166b ("config: add 'origin_type' to config_source struct", 2016-02-19), Git can inform the user about the origin of a config error, but the implementation does not allow translators to translate the keywords 'file', 'blob, 'standard input', and 'submodule-blob'. Moreover, for the second message, a reason for the error is appended to the message, not allowing translators to translate that reason either. Unfold the message into several templates for each known origin_type. That would result in better translation at the expense of code verbosity. Add enum config_oringin_type to ease management of the various configuration origin types (blob, file, etc). Previously origin type was considered from command line if cf->origin_type == NULL, i.e., uninitialized. Now we set origin_type to CONFIG_ORIGIN_CMDLINE in git_config_from_parameters() and configset_add_value(). For error message in git_parse_source(), use xstrfmt() function to prepare the message string, instead of doing something like it's done for die_bad_number(), because intelligibility and code conciseness are improved for that instance. Signed-off-by: Vasco Almeida--- cache.h| 12 +- config.c | 117 - submodule-config.c | 2 +- 3 files changed, 109 insertions(+), 22 deletions(-) diff --git a/cache.h b/cache.h index 3855ddf..c802098 100644 --- a/cache.h +++ b/cache.h @@ -1566,10 +1566,18 @@ struct git_config_source { const char *blob; }; +enum config_origin_type { + CONFIG_ORIGIN_BLOB, + CONFIG_ORIGIN_FILE, + CONFIG_ORIGIN_STDIN, + CONFIG_ORIGIN_SUBMODULE_BLOB, + CONFIG_ORIGIN_CMDLINE +}; + typedef int (*config_fn_t)(const char *, const char *, void *); extern int git_default_config(const char *, const char *, void *); extern int git_config_from_file(config_fn_t fn, const char *, void *); -extern int git_config_from_mem(config_fn_t fn, const char *origin_type, +extern int git_config_from_mem(config_fn_t fn, const enum config_origin_type, const char *name, const char *buf, size_t len, void *data); extern void git_config_push_parameter(const char *text); extern int git_config_from_parameters(config_fn_t fn, void *data); @@ -1713,7 +1721,7 @@ extern int ignore_untracked_cache_config; struct key_value_info { const char *filename; int linenr; - const char *origin_type; + enum config_origin_type origin_type; enum config_scope scope; }; diff --git a/config.c b/config.c index bea937e..62b83d9 100644 --- a/config.c +++ b/config.c @@ -24,7 +24,7 @@ struct config_source { size_t pos; } buf; } u; - const char *origin_type; + enum config_origin_type origin_type; const char *name; const char *path; int die_on_error; @@ -245,6 +245,7 @@ int git_config_from_parameters(config_fn_t fn, void *data) memset(, 0, sizeof(source)); source.prev = cf; + source.origin_type = CONFIG_ORIGIN_CMDLINE; cf = /* sq_dequote will write over it */ @@ -453,6 +454,8 @@ static int git_parse_source(config_fn_t fn, void *data) int comment = 0; int baselen = 0; struct strbuf *var = >var; + int error_return = 0; + char *error_msg = NULL; /* U+FEFF Byte Order Mark in UTF8 */ const char *bomptr = utf8_bom; @@ -507,10 +510,40 @@ static int git_parse_source(config_fn_t fn, void *data) if (get_value(fn, data, var) < 0) break; } + + switch (cf->origin_type) { + case CONFIG_ORIGIN_BLOB: + error_msg = xstrfmt(_("bad config line %d in blob %s"), + cf->linenr, cf->name); + break; + case CONFIG_ORIGIN_FILE: + error_msg = xstrfmt(_("bad config line %d in file %s"), + cf->linenr, cf->name); + break; + case CONFIG_ORIGIN_STDIN: + error_msg = xstrfmt(_("bad config line %d in standard input"), + cf->linenr); + break; + case CONFIG_ORIGIN_SUBMODULE_BLOB: + error_msg = xstrfmt(_("bad config line %d in submodule-blob %s"), + cf->linenr, cf->name); + break; + case CONFIG_ORIGIN_CMDLINE: + error_msg = xstrfmt(_("bad config line %d in command line %s"), + cf->linenr, cf->name); + break; + default: + error_msg = xstrfmt(_("bad config line %d in %s"), + cf->linenr, cf->name); + } + if (cf->die_on_error) - die(_("bad config line %d in %s %s"), cf->linenr, cf->origin_type, cf->name); + die("%s",
[PATCH] document how to reference previous commits
To reference previous commits people used to put just the abbreviated SHA-1 into commit messages. This is what has evolved as a more stable format for referencing commits. So lets document it for everyone to lookup when needed. Signed-off-by: Heiko Voigt--- On Thu, Jul 28, 2016 at 01:16:36PM +0200, Heiko Voigt wrote: > On Tue, Jul 26, 2016 at 10:22:07AM -0700, Stefan Beller wrote: > > Usually we refer to the commit by a triple of "abbrev. sha1 (date, subject). > > See d201a1ecd (2015-05-21, test_bitmap_walk: free bitmap with bitmap_free) > > for an example. Or ce41720ca (2015-04-02, blame, log: format usage strings > > similarly to those in documentation). > > > > Apparently we put the subject first and then the date. I always did it > > the other way > > round, to there is no strict coding guide line, though it helps a lot to > > have an > > understanding for a) how long are we in the "broken" state already as well > > as > > b) what was the rationale for introducing it. > > Ah ok did not know about this format. Will change that. I also will > follow-up with a patch to document this in SubmittingPatches so we can > point others to that... Here we go. Made this a seperate patch, since it is not really connected to the submodule-config-fix series. Cheers Heiko Documentation/SubmittingPatches | 5 + 1 file changed, 5 insertions(+) diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches index e8ad978..500230c 100644 --- a/Documentation/SubmittingPatches +++ b/Documentation/SubmittingPatches @@ -121,6 +121,11 @@ its behaviour. Try to make sure your explanation can be understood without external resources. Instead of giving a URL to a mailing list archive, summarize the relevant points of the discussion. +If you want to reference a previous commit in the history of a stable +branch use the format "abbreviated sha1 (subject, date)". So for example +like this: "Commit f86a374 (pack-bitmap.c: fix a memleak, 2015-03-30) +noticed [...]". + (3) Generate your patch using Git tools out of your commits. -- 2.0.2.832.g083c931 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 3/3] submodule-config: fix test binary crashing when no arguments given
Since arg[0] will be NULL without any argument here and starts_with() does not like NULL-pointers. Signed-off-by: Heiko Voigt--- A small fix I found while developing the test. test-submodule-config.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/test-submodule-config.c b/test-submodule-config.c index dab8c27..a4e4098 100644 --- a/test-submodule-config.c +++ b/test-submodule-config.c @@ -23,7 +23,7 @@ int main(int argc, char **argv) arg++; my_argc--; - while (starts_with(arg[0], "--")) { + while (arg[0] && starts_with(arg[0], "--")) { if (!strcmp(arg[0], "--url")) output_url = 1; if (!strcmp(arg[0], "--name")) -- 2.0.2.832.g083c931 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/3] submodule-config: combine early return code into one goto
So we have simpler return handling code and all the cleanup code in almost one place. Signed-off-by: Heiko Voigt--- This is an updated cleanup patch. Now with goto so we have simpler code. submodule-config.c | 31 --- 1 file changed, 12 insertions(+), 19 deletions(-) diff --git a/submodule-config.c b/submodule-config.c index 853989e..a887574 100644 --- a/submodule-config.c +++ b/submodule-config.c @@ -376,7 +376,7 @@ static const struct submodule *config_from(struct submodule_cache *cache, { struct strbuf rev = STRBUF_INIT; unsigned long config_size; - char *config; + char *config = NULL; unsigned char sha1[20]; enum object_type type; const struct submodule *submodule = NULL; @@ -397,10 +397,8 @@ static const struct submodule *config_from(struct submodule_cache *cache, return entry->config; } - if (!gitmodule_sha1_from_commit(commit_sha1, sha1, )) { - strbuf_release(); - return NULL; - } + if (!gitmodule_sha1_from_commit(commit_sha1, sha1, )) + goto out; switch (lookup_type) { case lookup_name: @@ -410,22 +408,12 @@ static const struct submodule *config_from(struct submodule_cache *cache, submodule = cache_lookup_path(cache, sha1, key); break; } - if (submodule) { - strbuf_release(); - return submodule; - } + if (submodule) + goto out; config = read_sha1_file(sha1, , _size); - if (!config) { - strbuf_release(); - return NULL; - } - - if (type != OBJ_BLOB) { - strbuf_release(); - free(config); - return NULL; - } + if (!config || type != OBJ_BLOB) + goto out; /* fill the submodule config into the cache */ parameter.cache = cache; @@ -445,6 +433,11 @@ static const struct submodule *config_from(struct submodule_cache *cache, default: return NULL; } + +out: + strbuf_release(); + free(config); + return submodule; } static const struct submodule *config_from_path(struct submodule_cache *cache, -- 2.0.2.832.g083c931 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/3] submodule-config: passing name reference for .gitmodule blobs
Commit 959b5455 (submodule: implement a config API for lookup of .gitmodules values, 2015-08-18) implemented the initial version of the submodule config cache. During development of that initial version we extracted the function gitmodule_sha1_from_commit(). During that process we missed that the strbuf rev was still used in config_from() and now is left empty. Lets fix this by also returning this string. This means that now when reading .gitmodules from revisions, the error messages also contain a reference to the blob they are from. Signed-off-by: Heiko Voigt--- Here you go. Now including a test. submodule-config.c | 26 +- t/t7411-submodule-config.sh | 11 +++ 2 files changed, 28 insertions(+), 9 deletions(-) diff --git a/submodule-config.c b/submodule-config.c index 1210b26..853989e 100644 --- a/submodule-config.c +++ b/submodule-config.c @@ -349,9 +349,9 @@ static int parse_config(const char *var, const char *value, void *data) } static int gitmodule_sha1_from_commit(const unsigned char *commit_sha1, - unsigned char *gitmodules_sha1) + unsigned char *gitmodules_sha1, + struct strbuf *rev) { - struct strbuf rev = STRBUF_INIT; int ret = 0; if (is_null_sha1(commit_sha1)) { @@ -359,11 +359,10 @@ static int gitmodule_sha1_from_commit(const unsigned char *commit_sha1, return 1; } - strbuf_addf(, "%s:.gitmodules", sha1_to_hex(commit_sha1)); - if (get_sha1(rev.buf, gitmodules_sha1) >= 0) + strbuf_addf(rev, "%s:.gitmodules", sha1_to_hex(commit_sha1)); + if (get_sha1(rev->buf, gitmodules_sha1) >= 0) ret = 1; - strbuf_release(); return ret; } @@ -375,6 +374,7 @@ static const struct submodule *config_from(struct submodule_cache *cache, const unsigned char *commit_sha1, const char *key, enum lookup_type lookup_type) { + struct strbuf rev = STRBUF_INIT; unsigned long config_size; char *config; unsigned char sha1[20]; @@ -397,8 +397,10 @@ static const struct submodule *config_from(struct submodule_cache *cache, return entry->config; } - if (!gitmodule_sha1_from_commit(commit_sha1, sha1)) + if (!gitmodule_sha1_from_commit(commit_sha1, sha1, )) { + strbuf_release(); return NULL; + } switch (lookup_type) { case lookup_name: @@ -408,14 +410,19 @@ static const struct submodule *config_from(struct submodule_cache *cache, submodule = cache_lookup_path(cache, sha1, key); break; } - if (submodule) + if (submodule) { + strbuf_release(); return submodule; + } config = read_sha1_file(sha1, , _size); - if (!config) + if (!config) { + strbuf_release(); return NULL; + } if (type != OBJ_BLOB) { + strbuf_release(); free(config); return NULL; } @@ -425,8 +432,9 @@ static const struct submodule *config_from(struct submodule_cache *cache, parameter.commit_sha1 = commit_sha1; parameter.gitmodules_sha1 = sha1; parameter.overwrite = 0; - git_config_from_mem(parse_config, "submodule-blob", "", + git_config_from_mem(parse_config, "submodule-blob", rev.buf, config, config_size, ); + strbuf_release(); free(config); switch (lookup_type) { diff --git a/t/t7411-submodule-config.sh b/t/t7411-submodule-config.sh index fc97c33..400e2b1 100755 --- a/t/t7411-submodule-config.sh +++ b/t/t7411-submodule-config.sh @@ -82,6 +82,17 @@ test_expect_success 'error in one submodule config lets continue' ' ) ' +test_expect_success 'error message contains blob reference' ' + (cd super && + sha1=$(git rev-parse HEAD) && + test-submodule-config \ + HEAD b \ + HEAD submodule \ + 2>actual_err && + grep "submodule-blob $sha1:.gitmodules" actual_err >/dev/null + ) +' + cat >super/expect_url
Re: Bug? git rebase -i --autostash fails to restore working files
Hi Phillip, Phillip Woodwrites: > When running ‘git rebase -i --autostash’ if the editor fails then the > rebase fails but stash is not applied so the working files are not > restored. Running ‘git rebase --abort’ replies that there’s no rebase > in progress. If you notice what’s happened it’s not a problem if you > know to do ‘git stash apply ’ but it’s confusing as > normally the stash would be automatically applied. This happened to me > when I messed up my editor configuration but you can reproduce it with > > $ GIT_SEQUENCE_EDITOR=false git rebase -i --autostash HEAD^^ > Created autostash: ff960d4 > HEAD is now at f1b8af7 [git] Turn on rebase.missingCommitsCheck [I'm happy to see rebase.missingCommitsCheck being used.] > Could not execute editor > > $ git rebase --abort > No rebase in progress? I remembered having read something about it recently and a quick search confirmed it. It was corrected in commit 33ba9c6 (29/06/2016, rebase -i: restore autostash on abort) that recently graduated to master (on 19/07/2016 if I read the various "What's cooking" correctly). So you will need to build git from source to have this corrected. > Best Wishes > > Phillip Thanks, Rémi -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 5/5] convert: add filter..process option
> On 27 Jul 2016, at 20:11, Jeff Kingwrote: > > On Wed, Jul 27, 2016 at 07:31:26PM +0200, Lars Schneider wrote: > + strbuf_grow(sb, size + 1); // we need one extra byte for the packet flush >>> >>> What happens if size is the maximum for size_t here (i.e., 4GB-1 on a >>> 32-bit system)? >> >> Would that be an acceptable solution? >> >> if (size + 1 > SIZE_MAX) >> return die("unrepresentable length for filter buffer"); > > No, because by definition "size" will wrap to 0. :) > > You have to do: > > if (size > SIZE_MAX - 1) > die("whoops"); > >> Can you point me to an example in the Git source how this kind of thing >> should >> be handled? > > The strbuf code itself checks for overflows. So you could do: > > strbuf_grow(sb, size); > ... fill up with size bytes ... > strbuf_addch(sb, ...); /* extra byte for whatever */ > > That does mean _possibly_ making a second allocation just to add the > extra byte, but in practice it's not likely (unless the input exactly > matches the strbuf's growth pattern). > > If you want to do it yourself, I think: > > strbuf_grow(sb, st_add(size, 1)); I like that solution! Thanks! > would work. > + while ( + bytes_read > 0 && // the last packet was no flush + sb->len - total_bytes_read - 1 > 0 // we still have space left in the buffer + ); >>> >>> And I'm not sure if you need to distinguish between "0" and "-1" when >>> checking byte_read here. >> >> We want to finish reading in both cases, no? > > If we get "-1", that's from an unexpected EOF during the packet_read(), > because you set GENTLE_ON_EOF. So there's nothing left to read, and we > should break and return an error. Right. > I guess "0" would come from a flush packet? Why would the filter send > back a flush packet (unless you were using them to signal end-of-input, > but then why does the filter have to send back the number of bytes ahead > of time?). Sending the bytes ahead of time (if available) might be nice for efficient buffer allocation. I am changing the code so that both cases can be handled (size ahead of time and no size ahead of time). >>> Why 8K? The pkt-line format naturally restricts us to just under 64K, so >>> why not take advantage of that and minimize the framing overhead for >>> large data? >> >> I took inspiration from here for 8K MAX_IO_SIZE: >> https://github.com/git/git/blob/master/copy.c#L6 >> >> Is this read limit correct? Should I read 8 times to fill a pkt-line? > > MAX_IO_SIZE is generally 8 _megabytes_, not 8K. The loop in copy.c just > haad to pick an arbitrary size for doing its read/write proxying. I > think in practice you are not likely to get much benefit from going > beyond 8K or so there, just because operating systems tend to do things > in page-sizes anyway, which are usually 4K. > > But since you are formatting the result into a form that has framing > overhead, anything up to LARGE_PACKET_MAX will see benefits (though > admittedly even 4 bytes per 8K is not much). > > I don't think it's worth the complexity of reading 8 times, but just > using a buffer size of LARGE_PACKET_MAX-4 would be the most efficient. > > I doubt it matters _that much_ in practice, but any time I see a magic > number I have to wonder at the "why". At least basing it on > LARGE_PACKET_MAX has some basis, whereas 8K is largely just made-up. :) Sounds good. I will use LARGE_PACKET_MAX-4 ! > >>> We do sometimes do "ret |= something()" but that is in cases where >>> "ret" is zero for success, and non-zero (usually -1) otherwise. Perhaps >>> your function's error-reporting is inverted from our usual style? >> >> I thought it makes the code easier to read and the filter doesn't care >> at what point the error happens anyways. The filter either succeeds >> or fails. What style would you suggest? > > I think that's orthogonal. I just mean that using zero for success puts > you in our usual style, and then accumulating errors can be done with > "|=". Ah. I guess I was misguided by the way errors are currently handled in `apply_filter` (success = 1; failure = 0): https://github.com/git/git/blob/8c6d1f9807c67532e7fb545a944b064faff0f70b/convert.c#L437-L479 I wouldn't like if the different filter protocols would use different error exit codes. Would it be OK to adjust the existing `apply_filter` function in a cleanup patch? > I didn't look carefully at whether the accumulating style you're using > makes sense or not. But something like: > + ret &= write_in_full(out, , sizeof(header)) == sizeof(header); + ret &= write_in_full(out, src, bytes_to_write) == bytes_to_write; > > does mean that we call the second write() even if the first one failed. > That's a waste of time (albeit a minor one), but it also means you could > potentially cover up the value of "errno" from the first one (though in >
[PATCH v4] i18n: notes: mark comment for translation
Mark comment displayed when editing a note for translation. Signed-off-by: Vasco Almeida--- This patch follows the original output and Ævar Arnfjörð Bjarmason sugestion to remove \n from the source string in order to assure that the ouput layout is not change by one translator forgetting to add \n, for instance. builtin/notes.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/builtin/notes.c b/builtin/notes.c index 0572051..f848b89 100644 --- a/builtin/notes.c +++ b/builtin/notes.c @@ -91,7 +91,7 @@ static const char * const git_notes_get_ref_usage[] = { }; static const char note_template[] = - "\nWrite/edit the notes for the following object:\n"; + N_("Write/edit the notes for the following object:"); struct note_data { int given; @@ -179,7 +179,8 @@ static void prepare_note_data(const unsigned char *object, struct note_data *d, copy_obj_to_fd(fd, old_note); strbuf_addch(, '\n'); - strbuf_add_commented_lines(, note_template, strlen(note_template)); + strbuf_add_commented_lines(, "\n", strlen("\n")); + strbuf_add_commented_lines(, _(note_template), strlen(_(note_template))); strbuf_addch(, '\n'); write_or_die(fd, buf.buf, buf.len); -- 2.7.4 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] fix passing a name for config from submodules
On Tue, Jul 26, 2016 at 10:22:07AM -0700, Stefan Beller wrote: > On Tue, Jul 26, 2016 at 2:49 AM, Heiko Voigtwrote: > > Thanks for continuing on the submodule cache! No worries. Its my code so I am happy to fix any bugs in it. Although it was obviously perfect from the beginning ;) > > In commit 959b5455 we implemented the initial version of the submodule > > Usually we refer to the commit by a triple of "abbrev. sha1 (date, subject). > See d201a1ecd (2015-05-21, test_bitmap_walk: free bitmap with bitmap_free) > for an example. Or ce41720ca (2015-04-02, blame, log: format usage strings > similarly to those in documentation). > > Apparently we put the subject first and then the date. I always did it > the other way > round, to there is no strict coding guide line, though it helps a lot to have > an > understanding for a) how long are we in the "broken" state already as well as > b) what was the rationale for introducing it. Ah ok did not know about this format. Will change that. I also will follow-up with a patch to document this in SubmittingPatches so we can point others to that... > > @@ -397,8 +397,10 @@ static const struct submodule *config_from(struct > > submodule_cache *cache, > > return entry->config; > > } > > > > - if (!gitmodule_sha1_from_commit(commit_sha1, sha1)) > > + if (!gitmodule_sha1_from_commit(commit_sha1, sha1, )) { > > + strbuf_release(); > > return NULL; > > This is a reoccuring pattern below. Maybe it might make sense to > just do a s/return.../ goto out/ and at that label we cleanup `rev` and > `config` > and return a result value? > There are currently 6 early returns (not counting the 3 from the last switch), > 4 of them return NULL, so that would result in just a "goto out", whereas 2 > return an actual value, they would need to assign the result value first > before > jumping out of the logic. I dunno, just food for though. I also though about that but was not sure whether it would actually make things simple. Will look into that as the second patch. > > @@ -425,8 +432,9 @@ static const struct submodule *config_from(struct > > submodule_cache *cache, > > parameter.commit_sha1 = commit_sha1; > > parameter.gitmodules_sha1 = sha1; > > parameter.overwrite = 0; > > - git_config_from_mem(parse_config, "submodule-blob", "", > > + git_config_from_mem(parse_config, "submodule-blob", rev.buf, > > config, config_size, ); > > Ok, this is the actual fix. Do you want to demonstrate its impact by adding > one or two tests that failed before and now work? > (As I was using the submodule config API most of the time with null_sha1 > to indicate we'd be looking at the current .gitmodules file in the worktree, > the actual bug may have not manifested in the users of this API. > But still, it would be nice to see what was broken?) This popped up because of Rene's cleanup patch and I wanted to provide a patch of what was originally supposed to go in there. The name (originally representing the filename of the parsed config) is put into the structure that represents the source. I had a quick look and it seems to mostly be used in error messages. E.g.: * in error message git_parse_source() to reference the file * error message in git_die_config_linenr() (filename is derived from name) There is some more, but I will add a test which checks whether the error message actually contains a reference to the blob instead of nothing. E.g. looks like this: error: bad config line 7 in submodule-blob 39a7458d2b5b3e3d1938b01ff2645b14c94ac284:.gitmodules instead of this error: bad config line 7 in submodule-blob That might be quite helpful to find out where the error is when you have one in the history. So we are fixing a real bug here (not just a theoretical one). Cheers Heiko -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Bug? git rebase -i --autostash fails to restore working files
When running ‘git rebase -i --autostash’ if the editor fails then the rebase fails but stash is not applied so the working files are not restored. Running ‘git rebase --abort’ replies that there’s no rebase in progress. If you notice what’s happened it’s not a problem if you know to do ‘git stash apply ’ but it’s confusing as normally the stash would be automatically applied. This happened to me when I messed up my editor configuration but you can reproduce it with $ GIT_SEQUENCE_EDITOR=false git rebase -i --autostash HEAD^^ Created autostash: ff960d4 HEAD is now at f1b8af7 [git] Turn on rebase.missingCommitsCheck Could not execute editor $ git rebase --abort No rebase in progress? Best Wishes Phillip -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html