[PATCH 08/13] list-objects: add traverse_commit_list_filtered method

2017-10-24 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> Add traverse_commit_list_filtered() wrapper around the various filter methods using common data in object_filter_options. Signed-off-by: Jeff Hostetler <jeffh...@microsoft.com> --- list-ob

[PATCH 00/13] WIP Partial clone part 1: object filtering

2017-10-24 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> I've been working with Jonathan Tan to combine our partial clone proposals. This patch series represents a first step in that effort and introduces an object filtering mechanism to select unwanted objects. [1] traverse_commit_list and list-o

[PATCH 02/13] list-objects-filter-map: extend oidmap to collect omitted objects

2017-10-24 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> Create helper class to extend oidmap to collect a list of omitted or missing objects during traversal. This will be used in a later commit by the list-object filtering code. Signed-off-by: Jeff Hostetler <jeffh...@microsoft.com> -

Re: [PATCH 00/18] Partial clone (from clone to lazy fetch in 18 patches)

2017-10-04 Thread Jeff Hostetler
On 10/3/2017 7:42 PM, Jonathan Tan wrote: On Tue, Oct 3, 2017 at 7:39 AM, Jeff Hostetler <g...@jeffhostetler.com> wrote: As I see it there are the following major parts to partial clone: 1. How to let git-clone (and later git-fetch) specify the desired subset of objects that it

Re: [PATCH 00/18] Partial clone (from clone to lazy fetch in 18 patches)

2017-10-03 Thread Jeff Hostetler
On 10/3/2017 4:50 AM, Junio C Hamano wrote: Christian Couder writes: Could you give a bit more details about the use cases this is designed for? It seems that when people review my work they want a lot of details about the use cases, so I guess they would also be

Re: [idea] File history tracking hints

2017-10-02 Thread Jeff Hostetler
On 10/2/2017 3:18 PM, Stefan Beller wrote: On Mon, Oct 2, 2017 at 11:51 AM, Jeff Hostetler <g...@jeffhostetler.com> wrote: Sorry to re-re-...-re-stir up such an old topic. I wasn't really thinking about commit-to-commit hints. I think these have lots of problems. (If commit A->

Re: [idea] File history tracking hints

2017-10-02 Thread Jeff Hostetler
On 10/2/2017 1:41 PM, Stefan Beller wrote: It would be nice if every file (and tree) had a permanent GUID associated with it. Then the filename/pathname becomes a property of the GUIDs. Then you can exactly know about moves/renames with minimal effort (and no guessing). ...

Re: [idea] File history tracking hints

2017-09-30 Thread Jeff Hostetler
On 9/29/2017 7:12 PM, Johannes Schindelin wrote: Hi Philip, On Fri, 15 Sep 2017, Philip Oakley wrote: From: "Johannes Schindelin" In light of such experiences, I have to admit that the notion that the rename detection can always be improved in hindsight puts

Re: [PATCH 07/13] object-filter: common declarations for object filtering

2017-09-28 Thread Jeff Hostetler
On 9/27/2017 8:05 PM, Jonathan Tan wrote: On Wed, 27 Sep 2017 13:09:42 -0400 Jeff Hostetler <g...@jeffhostetler.com> wrote: On 9/26/2017 6:39 PM, Jonathan Tan wrote: On Fri, 22 Sep 2017 20:30:11 +0000 Jeff Hostetler <g...@jeffhostetler.com> wrote: Makefile| 1

Re: [PATCH 03/13] list-objects: filter objects in traverse_commit_list

2017-09-27 Thread Jeff Hostetler
On 9/27/2017 2:00 PM, Jonathan Tan wrote: On Wed, 27 Sep 2017 13:04:42 -0400 Jeff Hostetler <g...@jeffhostetler.com> wrote: The sparse filter is looking at pathnames and using the same rules as sparse-checkout to decide which to *include* in the result. This is essentially backward

Re: [PATCH 09/13] rev-list: add object filtering support

2017-09-27 Thread Jeff Hostetler
On 9/26/2017 6:44 PM, Jonathan Tan wrote: On Fri, 22 Sep 2017 20:30:13 +0000 Jeff Hostetler <g...@jeffhostetler.com> wrote: + if (filter_options.relax) { Add some documentation about how this differs from ignore_missing_links in struct rev_info. It's unclea

Re: [PATCH 07/13] object-filter: common declarations for object filtering

2017-09-27 Thread Jeff Hostetler
On 9/26/2017 6:39 PM, Jonathan Tan wrote: On Fri, 22 Sep 2017 20:30:11 +0000 Jeff Hostetler <g...@jeffhostetler.com> wrote: Makefile| 1 + object-filter.c | 269 object-filter.h

Re: [PATCH 03/13] list-objects: filter objects in traverse_commit_list

2017-09-27 Thread Jeff Hostetler
On 9/26/2017 6:31 PM, Jonathan Tan wrote: On Fri, 22 Sep 2017 20:26:22 +0000 Jeff Hostetler <g...@jeffhostetler.com> wrote: From: Jeff Hostetler <jeffh...@microsoft.com> Create traverse_commit_list_filtered() and add filtering You mention _filtered() here, but this patch cont

Re: [PATCH 02/13] oidset2: create oidset subclass with object length and pathname

2017-09-27 Thread Jeff Hostetler
On 9/26/2017 6:20 PM, Jonathan Tan wrote: On Fri, 22 Sep 2017 20:26:21 +0000 Jeff Hostetler <g...@jeffhostetler.com> wrote: From: Jeff Hostetler <jeffh...@microsoft.com> Create subclass of oidset where each entry has a field to store the length of the object's content and

Re: [PATCH 00/13] RFC object filtering for parital clone

2017-09-26 Thread Jeff Hostetler
On 9/26/2017 10:55 AM, Jeff Hostetler wrote: On 9/22/2017 8:39 PM, Jonathan Tan wrote: On Fri, 22 Sep 2017 20:26:19 +0000 Jeff Hostetler <g...@jeffhostetler.com> wrote: ... I tried applying your patches and it doesn't apply cleanly on master. Could you try rebasing? In particular, th

Re: [PATCH 00/13] RFC object filtering for parital clone

2017-09-26 Thread Jeff Hostetler
On 9/22/2017 8:39 PM, Jonathan Tan wrote: On Fri, 22 Sep 2017 20:26:19 +0000 Jeff Hostetler <g...@jeffhostetler.com> wrote: This draft contains filters to: () omit all blobs () omit blobs larger than some size () omit blobs using a sparse-checkout specification In addition to spec

Re: [PATCH] git: add --no-optional-locks option

2017-09-26 Thread Jeff Hostetler
On 9/25/2017 12:17 PM, Johannes Schindelin wrote: Hi Kaartic, On Sun, 24 Sep 2017, Kaartic Sivaraam wrote: On Thursday 21 September 2017 10:02 AM, Jeff King wrote: Some tools like IDEs or fancy editors may periodically run commands like "git status" in the background to keep track of the

Re: RFC: Design and code of partial clones (now, missing commits and trees OK) (part 3)

2017-09-26 Thread Jeff Hostetler
On 9/22/2017 6:58 PM, Jonathan Tan wrote: On Fri, 22 Sep 2017 17:32:00 -0400 Jeff Hostetler <g...@jeffhostetler.com> wrote: I guess I'm afraid that the first call to is_promised() is going to cause a very long pause as it loads up a very large hash of objects. Yes, the first call will

Re: RFC: Design and code of partial clones (now, missing commits and trees OK) (part 2/3)

2017-09-26 Thread Jeff Hostetler
On 9/22/2017 6:52 PM, Jonathan Tan wrote: On Fri, 22 Sep 2017 17:19:50 -0400 Jeff Hostetler <g...@jeffhostetler.com> wrote: In your specific example, how would rev-list know, on the client, to include (or exclude) a large blob in its output if it does not have it, and thus does not kn

Re: RFC: Design and code of partial clones (now, missing commits and trees OK) (part 3)

2017-09-22 Thread Jeff Hostetler
On 9/21/2017 7:04 PM, Jonathan Tan wrote: On Thu, 21 Sep 2017 14:00:40 -0400 Jeff Hostetler <g...@jeffhostetler.com> wrote: (part 3) Additional overall comments on: https://github.com/jonathantanmy/git/commits/partialclone2 {} WRT the code in is_promised() [1] [1] https://gith

Re: RFC: Design and code of partial clones (now, missing commits and trees OK) (part 2/3)

2017-09-22 Thread Jeff Hostetler
On 9/21/2017 6:51 PM, Jonathan Tan wrote: On Thu, 21 Sep 2017 13:59:43 -0400 Jeff Hostetler <g...@jeffhostetler.com> wrote: (part 2) Additional overall comments on: https://github.com/jonathantanmy/git/commits/partialclone2 {} I think it would help to split the blob-max-bytes fil

Re: RFC: Design and code of partial clones (now, missing commits and trees OK)

2017-09-22 Thread Jeff Hostetler
On 9/21/2017 6:42 PM, Jonathan Tan wrote: On Thu, 21 Sep 2017 13:57:30 -0400 Jeff Hostetler <g...@jeffhostetler.com> wrote: There's a lot in this patch series. I'm still studying it, but here are some notes and questions. I'll start with direct responses to the RFC here and fol

[PATCH 11/13] t6112: rev-list object filtering test

2017-09-22 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> Signed-off-by: Jeff Hostetler <jeffh...@microsoft.com> --- t/t6112-rev-list-filters-objects.sh | 237 1 file changed, 237 insertions(+) create mode 100755 t/t6112-rev-list-filters-objects.sh diff --g

[PATCH 13/13] pack-objects: add filtering help text

2017-09-22 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> Add help text for new object filtering options to pack-objects documentation. Signed-off-by: Jeff Hostetler <jeffh...@microsoft.com> --- Documentation/git-pack-objects.txt | 17 + 1 file changed, 17 insertions(+)

[PATCH 12/13] pack-objects: add object filtering support

2017-09-22 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> Teach pack-objects to use the filtering provided by the traverse_commit_list_filtered() interface to omit unwanted objects from the resulting packfile. This feature is intended for partial clone/fetch. Filtering requires the use of the "

[PATCH 09/13] rev-list: add object filtering support

2017-09-22 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> Teach rev-list to use the filtering provided by the traverse_commit_list_filtered() interface to omit unwanted objects from the result. This feature is only enabled when one of the "--objects*" options is used. When the "--

[PATCH 10/13] rev-list: add filtering help text

2017-09-22 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> Signed-off-by: Jeff Hostetler <jeffh...@microsoft.com> --- Documentation/git-rev-list.txt | 9 - Documentation/rev-list-options.txt | 32 2 files changed, 40 insertions(+), 1 deletion(-)

[PATCH 08/13] list-objects: add traverse_commit_list_filtered method

2017-09-22 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> Add traverse_commit_list_filtered() wrapper around the various filter methods using common data in object_filter_options. Signed-off-by: Jeff Hostetler <jeffh...@microsoft.com> --- list-objects.c | 34

[PATCH 06/13] list-objects-filter-sparse: add sparse-checkout based filter

2017-09-22 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> Create a filter for traverse_commit_list_worker() to only include the blobs that would be referenced by a sparse-checkout using the given specification. A future enhancement should be able to also omit unneeded tree objects, but that is not cur

[PATCH 05/13] list-objects-filter-large: add large blob filter to list-objects

2017-09-22 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> Create a filter for traverse_commit_list_worker() to omit blobs larger than a requested size from the result, but always include ".git*" special files. Signed-off-by: Jeff Hostetler <jeffh...@microsoft.com> --- Makefile

[PATCH 07/13] object-filter: common declarations for object filtering

2017-09-22 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> Create common routines and defines for parsing object-filter-related command line arguments and pack-protocol fields. Signed-off-by: Jeff Hostetler <jeffh...@microsoft.com> --- Makefile| 1 + object-fi

[PATCH 00/13] RFC object filtering for parital clone

2017-09-22 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> This patch series contains WIP code demonstrating object (blob) filtering in rev-list and pack-objects using a common filtering API in list-objects and traverse-commit-list that allows both commands to perform the same type of filter oper

[PATCH 01/13] dir: refactor add_excludes()

2017-09-22 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> Refactor add_excludes() to separate the reading of the exclude file into a buffer and the parsing of the buffer into exclude_list items. Add add_excludes_from_blob_to_list() to allow an exclude file to be specified with an OID. Signed-off-by

[PATCH 02/13] oidset2: create oidset subclass with object length and pathname

2017-09-22 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> Create subclass of oidset where each entry has a field to store the length of the object's content and an optional pathname. This will be used in a future commit to build a manifest of omitted objects in a partial/narrow clone/fetch. Sign

[PATCH 03/13] list-objects: filter objects in traverse_commit_list

2017-09-22 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> Create traverse_commit_list_filtered() and add filtering interface to allow certain objects to be omitted (not shown) during a traversal. Update traverse_commit_list() to be a wrapper for the above. Filtering will be used in a future commit

[PATCH 04/13] list-objects-filter-all: add filter to omit all blobs

2017-09-22 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> Create a simple filter for traverse_commit_list_worker() to omit all blobs from the result. This filter will be used in a future commit by rev-list and pack-objects to create a "commits and trees" result. This is intended for partial

Re: [PATCH] git: add --no-optional-locks option

2017-09-22 Thread Jeff Hostetler
On 9/22/2017 12:25 AM, Jeff King wrote: On Thu, Sep 21, 2017 at 08:25:50PM +0200, Johannes Sixt wrote: +`GIT_OPTIONAL_LOCKS`:: + If set to `0`, Git will avoid performing any operations which + require taking a lock and which are not required to complete the + requested

Re: RFC: Design and code of partial clones (now, missing commits and trees OK) (part 3)

2017-09-21 Thread Jeff Hostetler
(part 3) Additional overall comments on: https://github.com/jonathantanmy/git/commits/partialclone2 {} WRT the code in is_promised() [1] [1] https://github.com/jonathantanmy/git/commit/7a9c2d9b6e2fce293817b595dee29a7eede0#diff-5d5d5dc185ef37dc30bb7d9a7ae0c4e8R1960 {} it looked like it

Re: RFC: Design and code of partial clones (now, missing commits and trees OK) (part 2/3)

2017-09-21 Thread Jeff Hostetler
(part 2) Additional overall comments on: https://github.com/jonathantanmy/git/commits/partialclone2 {} I think it would help to split the blob-max-bytes filtering and the promisor/promised concepts and discuss them independently. {} Then we can talk about the promisor/promised

Re: RFC: Design and code of partial clones (now, missing commits and trees OK)

2017-09-21 Thread Jeff Hostetler
There's a lot in this patch series. I'm still studying it, but here are some notes and questions. I'll start with direct responses to the RFC here and follow up in a second email with specific questions and comments to keep this from being too long. On 9/15/2017 4:43 PM, Jonathan Tan wrote:

[PATCH v2] hashmap: address ThreadSanitizer concerns

2017-09-06 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> Version 2 addresses the comments and suggestions on version 1. It removes the explicit disable/enable rehash and just relies on the state of hashmap counting. It changes the declaration of the hashmap_get_size() to be static to avoid issue

[PATCH v2] hashmap: add API to disable item counting when threaded

2017-09-06 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> This is to address concerns raised by ThreadSanitizer on the mailing list about threaded unprotected R/W access to map.size with my previous "disallow rehash" change (0607e10009ee4e37cb49b4cec8d28a9dda1656a4). See: https://publ

Re: [PATCH] hashmap: add API to disable item counting when threaded

2017-09-06 Thread Jeff Hostetler
On 9/5/2017 9:24 PM, Junio C Hamano wrote: Jeff Hostetler <g...@jeffhostetler.com> writes: From: Jeff Hostetler <jeffh...@microsoft.com> I feel somewhat stupid to say this, especially after seeing many people applaud this patch, but I do not seem to be able to ev

Re: [PATCH] hashmap: add API to disable item counting when threaded

2017-09-05 Thread Jeff Hostetler
On 9/2/2017 4:05 AM, Jeff King wrote: On Wed, Aug 30, 2017 at 06:59:22PM +0000, Jeff Hostetler wrote: From: Jeff Hostetler <jeffh...@microsoft.com> This is to address concerns raised by ThreadSanitizer on the mailing list about threaded unprotected R/W access to map.size with my pr

Re: [PATCH] hashmap: add API to disable item counting when threaded

2017-09-05 Thread Jeff Hostetler
On 9/2/2017 4:17 AM, Jeff King wrote: On Sat, Sep 02, 2017 at 01:31:19AM +0200, Johannes Schindelin wrote: Before anybody can ask for this message to be wrapped in _(...) to be translateable, let me suggest instead to add the prefix "BUG: ". Agreed on both (and Jonathan's suggestion to just

Re: [PATCH] hashmap: add API to disable item counting when threaded

2017-09-05 Thread Jeff Hostetler
On 9/1/2017 7:50 PM, Jonathan Nieder wrote: Hi, Johannes Schindelin wrote: On Wed, 30 Aug 2017, Jeff Hostetler wrote: This is to address concerns raised by ThreadSanitizer on the mailing list about threaded unprotected R/W access to map.size with my previous "disallow rehash"

Re: [PATCH] hashmap: add API to disable item counting when threaded

2017-09-05 Thread Jeff Hostetler
On 9/1/2017 7:31 PM, Johannes Schindelin wrote: Hi Jeff, On Wed, 30 Aug 2017, Jeff Hostetler wrote: From: Jeff Hostetler <jeffh...@microsoft.com> This is to address concerns raised by ThreadSanitizer on the mailing list about threaded unprotected R/W access to map.size with my pr

[PATCH] hashmap: add API to disable item counting when threaded

2017-08-30 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> This is to address concerns raised by ThreadSanitizer on the mailing list about threaded unprotected R/W access to map.size with my previous "disallow rehash" change (0607e10009ee4e37cb49b4cec8d28a9dda1656a4). See: https://publ

[PATCH] hashmap: address ThreadSanitizer concerns

2017-08-30 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> This is to address concerns raised by ThreadSanitizer on the mailing list about threaded unprotected R/W access to map.size with my previous "disallow rehash" change (0607e10009ee4e37cb49b4cec8d28a9dda1656a4). See: https://publ

Re: [RFC 0/7] transitioning to protocol v2

2017-08-30 Thread Jeff Hostetler
On 8/29/2017 11:06 PM, Jeff King wrote: On Tue, Aug 29, 2017 at 04:08:25PM -0400, Jeff Hostetler wrote: I just wanted to jump in here and say I've done some initial testing of this against VSTS and so far it seems fine. And yes, we have a custom git server. Great, thank you for checking

Re: [RFC 0/7] transitioning to protocol v2

2017-08-29 Thread Jeff Hostetler
On 8/25/2017 1:35 PM, Jonathan Nieder wrote: Hi, Jeff King wrote: On Thu, Aug 24, 2017 at 03:53:21PM -0700, Brandon Williams wrote: Another version of Git's wire protocol is a topic that has been discussed and attempted by many in the community over the years. The biggest challenge, as

Re: [PATCH v2 0/4] Some ThreadSanitizer-results

2017-08-28 Thread Jeff Hostetler
On 8/21/2017 1:43 PM, Martin Ågren wrote: This is the second version of my series to try to address some issues ... 2) hashmap_add, which I could try my hands on if Jeff doesn't beat me to it -- his proposed change should fix it and I doubt I could come up with anything "better", considering

Re: tsan: t3008: hashmap_add touches size from multiple threads

2017-08-15 Thread Jeff Hostetler
On 8/15/2017 3:21 PM, Martin Ågren wrote: On 15 August 2017 at 20:48, Stefan Beller wrote: /* total number of entries (0 means the hashmap is empty) */ - unsigned int size; + /* -1 means size is unknown for threading reasons */ + int size;

Re: tsan: t3008: hashmap_add touches size from multiple threads

2017-08-15 Thread Jeff Hostetler
On 8/15/2017 8:53 AM, Martin Ågren wrote: Using SANITIZE=thread made t3008-ls-files-lazy-init-name-hash.sh hit the potential race below. What seems to happen is, threaded_lazy_init_name_hash ends up using hashmap_add on the_index.dir_hash from two threads in a way that tsan considers racy.

Re: [PATCH] convert any hard coded .gitmodules file string to the MACRO

2017-08-01 Thread Jeff Hostetler
On 7/31/2017 7:11 PM, Stefan Beller wrote: I used these commands: $ cat sem.cocci @@ @@ - ".gitmodules" + GITMODULES_FILE $ spatch --in-place --sp-file sem.cocci builtin/*.c *.c *.h Feel free to regenerate or squash it in or have it as a separate commit. Signed-off-by:

Re: [RFC PATCH 1/3] promised-blob, fsck: introduce promised blobs

2017-07-14 Thread Jeff Hostetler
On 7/13/2017 3:39 PM, Jonathan Tan wrote: On Wed, 12 Jul 2017 13:29:11 -0400 Jeff Hostetler <g...@jeffhostetler.com> wrote: My primary concern is scale and managing the list of objects over time. My fear is that this list will be quite large. If we only want to omit the very large

[PATCH v2 02/19] oidset2: create oidset subclass with object length and pathname

2017-07-13 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> Create subclass of oidset where each entry has a field to store the length of the object's content and an optional pathname. This will be used in a future commit to build a manifest of omitted objects in a partial/narrow clone/fetch. Sign

[PATCH v2 04/19] list-objects-filters: add omit-all-blobs filter

2017-07-13 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> Create a simple filter for traverse_commit_list_filtered() to omit all blobs from the result. This filter will be used in a future commit by rev-list and pack-objects to create a "commits and trees" result. This is intended for

[PATCH v2 06/19] list-objects-filters: add use-sparse-checkout filter

2017-07-13 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> Create a filter for traverse_commit_list_filtered() to omit the blobs that would not be needed by a sparse checkout using the given sparse-checkout spec. This filter will be used in a future commit by rev-list and pack-objects for partial/

[PATCH v2 05/19] list-objects-filters: add omit-large-blobs filter

2017-07-13 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> Create a filter for traverse_commit_list_filtered() to omit blobs larger than a requested size from the result, but always include ".git*" special files. This filter will be used in a future commit by rev-list and pack-objects fo

[PATCH v2 10/19] t6112: rev-list object filtering test

2017-07-13 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> Signed-off-by: Jeff Hostetler <jeffh...@microsoft.com> --- t/t6112-rev-list-filters-objects.sh | 37 + 1 file changed, 37 insertions(+) create mode 100644 t/t6112-rev-list-filters-objects.sh diff --g

[PATCH v2 03/19] list-objects: filter objects in traverse_commit_list

2017-07-13 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> Create traverse_commit_list_filtered() and add filtering interface to allow certain objects to be omitted (not shown) during a traversal. Update traverse_commit_list() to be a wrapper for the above. Filtering will be used in a future commit

[PATCH v2 09/19] rev-list: add filtering help text

2017-07-13 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> Signed-off-by: Jeff Hostetler <jeffh...@microsoft.com> --- Documentation/git-rev-list.txt | 7 ++- Documentation/rev-list-options.txt | 26 ++ 2 files changed, 32 insertions(+), 1 deletion(-)

[PATCH v2 19/19] fetch: add object filtering to fetch

2017-07-13 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> Signed-off-by: Jeff Hostetler <jeffh...@microsoft.com> --- builtin/fetch.c | 27 ++- 1 file changed, 26 insertions(+), 1 deletion(-) diff --git a/builtin/fetch.c b/builtin/fetch.c index 5f2c2ab..306c165 100644 -

[PATCH v2 13/19] upload-pack: add filter-objects to protocol documentation

2017-07-13 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> Signed-off-by: Jeff Hostetler <jeffh...@microsoft.com> --- Documentation/technical/pack-protocol.txt | 16 Documentation/technical/protocol-capabilities.txt | 7 +++ 2 files changed, 23 insertions(+)

[PATCH v2 18/19] index-pack: relax consistency checks for omitted objects

2017-07-13 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> Signed-off-by: Jeff Hostetler <jeffh...@microsoft.com> --- builtin/index-pack.c | 15 +++ 1 file changed, 15 insertions(+) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 4ff567d..30ff409 100644 --- a/builtin/

[PATCH v2 00/19] WIP object filtering for partial clone

2017-07-13 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> This WIP is a follow up to my earlier patch series to teach pack-objects to omit large blobs from packfiles. [1] Like the previous version, this version builds upon a suggestion from Peff [2] to use the traverse_commit_list() machinery to

[PATCH v2 12/19] pack-objects: add filtering help text

2017-07-13 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> Update pack-objects help text to describe object filtering. Signed-off-by: Jeff Hostetler <jeffh...@microsoft.com> --- Documentation/git-pack-objects.txt | 14 ++ 1 file changed, 14 insertions(+) diff --git a/Documentat

[PATCH v2 11/19] pack-objects: add object filtering support

2017-07-13 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> Teach pack-objects to use filtering provided by the traverse_commit_list_filtered() interface to omit unwanted objects from the result. This feature is intended for narrow/partial clone/fetch. Filtering requires use of "--stdout" opt

[PATCH v2 14/19] upload-pack: add object filtering

2017-07-13 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> Signed-off-by: Jeff Hostetler <jeffh...@microsoft.com> --- upload-pack.c | 39 ++- 1 file changed, 38 insertions(+), 1 deletion(-) diff --git a/upload-pack.c b/upload-pack.c index ffb028d..c7

[PATCH v2 08/19] rev-list: add object filtering support

2017-07-13 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> Teach rev-list to use the filtering provided by the traverse_commit_list_filtered() interface to omit unwanted objects from the result. This feature is only enabled when one of the "--objects*" options is used. When the "--

[PATCH v2 17/19] clone: add filter arguments

2017-07-13 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> Signed-off-by: Jeff Hostetler <jeffh...@microsoft.com> --- builtin/clone.c | 28 1 file changed, 28 insertions(+) diff --git a/builtin/clone.c b/builtin/clone.c index a6ae7d6..1408396 100644 --- a/builtin/c

[PATCH v2 16/19] connected: add filter_allow_omitted option to API

2017-07-13 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> Signed-off-by: Jeff Hostetler <jeffh...@microsoft.com> --- connected.c | 3 +++ connected.h | 6 ++ 2 files changed, 9 insertions(+) diff --git a/connected.c b/connected.c index 136c2ac..c25b816 100644 --- a/connected.c +++ b

[PATCH v2 15/19] fetch-pack: add object filtering support

2017-07-13 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> Signed-off-by: Jeff Hostetler <jeffh...@microsoft.com> --- builtin/fetch-pack.c | 3 +++ fetch-pack.c | 28 fetch-pack.h | 2 ++ transport.c | 27 +++

[PATCH v2 01/19] dir: refactor add_excludes()

2017-07-13 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> Refactor add_excludes() to separate the reading of the exclude file into a buffer and the parsing of the buffer into exclude_list items. Add add_excludes_from_blob_to_list() to allow an exclude file to be specified with an OID. Signed-off-by

[PATCH v2 07/19] object-filter: common declarations for object filtering

2017-07-13 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> Create common routines and defines for parsing object-filter-related command line arguments and pack-protocol fields. Signed-off-by: Jeff Hostetler <jeffh...@microsoft.com> --- Makefile| 1 + object-fi

Re: [RFC PATCH 1/3] promised-blob, fsck: introduce promised blobs

2017-07-13 Thread Jeff Hostetler
On 7/13/2017 10:48 AM, Jeff Hostetler wrote: On 7/12/2017 3:28 PM, Jonathan Nieder wrote: Hi, Jeff Hostetler wrote: My primary concern is scale and managing the list of objects over time. [...] For example

Re: [RFC PATCH 1/3] promised-blob, fsck: introduce promised blobs

2017-07-13 Thread Jeff Hostetler
On 7/12/2017 3:28 PM, Jonathan Nieder wrote: Hi, Jeff Hostetler wrote: My primary concern is scale and managing the list of objects over time. [...] For example, on the Windows repo we have (conservatively) 100M+ blobs

Re: [RFC PATCH 1/3] promised-blob, fsck: introduce promised blobs

2017-07-12 Thread Jeff Hostetler
On 7/11/2017 3:48 PM, Jonathan Tan wrote: Currently, Git does not support repos with very large numbers of blobs or repos that wish to minimize manipulation of certain blobs (for example, because they are very large) very well, even if the user operates mostly on part of the repo, because Git

Re: RFC: Missing blob hook might be invoked infinitely recursively

2017-06-30 Thread Jeff Hostetler
On 6/29/2017 2:48 PM, Jonathan Tan wrote: As some of you may know, I'm currently working on support for partial clones/fetches in Git (where blobs above a user-specified size threshold are not downloaded - only their names and sizes are downloaded). To do this, the client repository needs to

Re: [PATCH 2/2] hashmap: migrate documentation from Documentation/technical into header

2017-06-29 Thread Jeff Hostetler
On 6/28/2017 9:13 PM, Stefan Beller wrote: While at it, clarify the use of `key`, `keydata`, `entry_or_key` as well as documenting the new data pointer for the compare function. Signed-off-by: Stefan Beller --- Documentation/technical/api-hashmap.txt | 309

Re: [PATCH 1/3] list-objects: add filter_blob to traverse_commit_list

2017-06-28 Thread Jeff Hostetler
On 6/28/2017 12:23 PM, Junio C Hamano wrote: Jeff Hostetler <g...@jeffhostetler.com> writes: diff --git a/list-objects.c b/list-objects.c index f3ca6aa..c9ca81c 100644 --- a/list-objects.c +++ b/list-objects.c @@ -24,11 +25,28 @@ static void process_blob(struct rev_info

Re: [PATCH 1/3] list-objects: add filter_blob to traverse_commit_list

2017-06-23 Thread Jeff Hostetler
On 6/22/2017 6:10 PM, Jonathan Tan wrote: On Thu, 22 Jun 2017 14:45:26 -0700 Jonathan Tan <jonathanta...@google.com> wrote: On Thu, 22 Jun 2017 20:36:13 +0000 Jeff Hostetler <g...@jeffhostetler.com> wrote: From: Jeff Hostetler <jeffh...@microsoft.com> In preparation

Re: [PATCH v4 00/20] repository object

2017-06-23 Thread Jeff Hostetler
On 6/22/2017 2:43 PM, Brandon Williams wrote: As before you can find this series at: https://github.com/bmwill/git/tree/repository-object Changes in v4: * Patch 11 is slightly different and turns off all path relocation when a worktree is provided instead of just for the index file

[PATCH 1/3] list-objects: add filter_blob to traverse_commit_list

2017-06-22 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> In preparation for partial/sparse clone/fetch where the server is allowed to omit large/all blobs from the packfile, teach traverse_commit_list() to take a blob filter-proc that controls when blobs are shown and marked as SEEN. No

[PATCH 2/3] pack-objects: WIP add max-blob-size filtering

2017-06-22 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> Teach pack-objects command to accept --max-blob-size= argument and use a traverse_commit_list filter-proc to omit unwanted blobs from the resulting packfile. This filter-proc always includes special files matching ".git*" (suc

[PATCH 0/3] WIP list-objects and pack-objects for partial clone

2017-06-22 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> This WIP is a follow up to earlier patches to teach pack-objects to omit large blobs from packfiles. This doesn't attempt to solve the whole end-to-end problem of partial/sparse clone/fetch or that of the client operating with missing

[PATCH 3/3] pack-objects: add t5317 to test max-blob-size

2017-06-22 Thread Jeff Hostetler
From: Jeff Hostetler <jeffh...@microsoft.com> Signed-off-by: Jeff Hostetler <jeffh...@microsoft.com> --- t/t5317-pack-objects-blob-filtering.sh | 68 ++ 1 file changed, 68 insertions(+) create mode 100644 t/t5317-pack-objects-blob-filtering.sh di

Re: [WIP v2 2/2] pack-objects: support --blob-max-bytes

2017-06-15 Thread Jeff Hostetler
On 6/2/2017 6:26 PM, Jeff King wrote: On Fri, Jun 02, 2017 at 12:38:45PM -0700, Jonathan Tan wrote: ... We have a name-hash cache extension in the bitmap file, but it doesn't carry enough information to deduce the .git-ness of a file. I don't think it would be too hard to add a "flags"

Re: [WIP/RFC 00/23] repository object

2017-05-19 Thread Jeff Hostetler
On 5/18/2017 7:21 PM, Brandon Williams wrote: When I first started working on the git project I found it very difficult to understand parts of the code base because of the inherently global nature of our code. It also made working on submodules very difficult. Since we can only open up a

Re: [PATCH 0/5] p0004: support being called by t/perf/run

2017-05-15 Thread Jeff Hostetler
On 5/13/2017 11:55 AM, René Scharfe wrote: p0004-lazy-init-name-hash.sh errors out if the test repo is too small, and doesn't generate any perf test results even if it finishes successfully. That prevents t/perf/run from running the whole test suite. This series tries to address these

Re: [PATCH v7] read-cache: force_verify_index_checksum

2017-05-08 Thread Jeff Hostetler
On 5/8/2017 4:03 PM, Christian Couder wrote: On Mon, May 8, 2017 at 6:50 PM, Jeff Hostetler <g...@jeffhostetler.com> wrote: On 5/8/2017 5:45 AM, Christian Couder wrote: This test does not pass when the GIT_TEST_SPLIT_INDEX env variable is set on my Linux machine. Also it looks li

Re: [RFC 00/14] convert dir.c to take an index parameter

2017-05-08 Thread Jeff Hostetler
On 5/8/2017 1:12 PM, Brandon Williams wrote: On 05/06, Junio C Hamano wrote: Brandon Williams writes: One of the things brought up on the list in the past few days has been migrating away from using the index compatibility macros. One of the issues brought up in that

Re: [PATCH v7] read-cache: force_verify_index_checksum

2017-05-08 Thread Jeff Hostetler
On 5/8/2017 5:45 AM, Christian Couder wrote: On Fri, Apr 14, 2017 at 10:32 PM, wrote: diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh index 33a51c9..677e15a 100755 --- a/t/t1450-fsck.sh +++ b/t/t1450-fsck.sh @@ -689,4 +689,17 @@ test_expect_success 'bogus head does

Re: [PATCH 00/10] RFC Partial Clone and Fetch

2017-05-04 Thread Jeff Hostetler
On 5/3/2017 2:27 PM, Jonathan Nieder wrote: Hi, Jeff Hostetler wrote: Missing-Blob Support Let me offer up an alternative idea for representing missing blobs. This differs from both of our previous proposals. (I don't have any code for this new proposal, I just

Re: [PATCH 00/10] RFC Partial Clone and Fetch

2017-05-03 Thread Jeff Hostetler
On 3/8/2017 1:50 PM, g...@jeffhostetler.com wrote: From: Jeff Hostetler <jeffh...@microsoft.com> [RFC] Partial Clone and Fetch = [...] E. Unresolved Thoughts == *TODO* The server should optionally return (in a side-band?) a list of the

Re: [PATCH 0/5] Start of a journey: drop NO_THE_INDEX_COMPATIBILITY_MACROS

2017-05-02 Thread Jeff Hostetler
On 5/1/2017 3:07 PM, Stefan Beller wrote: This applies to origin/master. For better readability and understandability for newcomers it is a good idea to not offer 2 APIs doing the same thing with one being the #define of the other. In the long run we may want to drop the macros guarded by

Re: [PATCH 0/5] Start of a journey: drop NO_THE_INDEX_COMPATIBILITY_MACROS

2017-05-02 Thread Jeff Hostetler
On 5/2/2017 12:17 AM, Stefan Beller wrote: On Mon, May 1, 2017 at 6:36 PM, Junio C Hamano wrote: Stefan Beller writes: This applies to origin/master. For better readability and understandability for newcomers it is a good idea to not offer 2 APIs

Re: [PATCH] read-cache: close index.lock in do_write_index

2017-04-27 Thread Jeff Hostetler
On 4/26/2017 11:13 PM, Jeff King wrote: On Wed, Apr 26, 2017 at 10:05:23PM +0200, Johannes Schindelin wrote: From: Jeff Hostetler <jeffh...@microsoft.com> Teach do_write_index() to close the index.lock file before getting the mtime and updating the istate.timestamp fields. On W

Re: [PATCH] read-cache: close index.lock in do_write_index

2017-04-27 Thread Jeff Hostetler
On 4/26/2017 11:21 PM, Junio C Hamano wrote: Johannes Schindelin <johannes.schinde...@gmx.de> writes: From: Jeff Hostetler <jeffh...@microsoft.com> Teach do_write_index() to close the index.lock file before getting the mtime and updating the istate.timestamp fields. On Windo
