Re: [PATCH 1/9] extension.partialclone: introduce partial clone extension

2017-11-16 Thread Jeff Hostetler



On 11/8/2017 4:51 PM, Jonathan Tan wrote:

On Wed, 8 Nov 2017 15:32:21 -0500
Jeff Hostetler  wrote:


Thanks Jonathan.

I moved my version of part 2 on top of yesterday's part 1.
There are a few changes between my version and yours. Could
you take a quick look at them and see if they make sense?
(I'll spare the mailing list another patch series until after
I attend to the feed back on part 1.)

https://github.com/jeffhostetler/git/commits/core/pc3_p2


Thanks - the differences are quite minor, and they generally make sense.
The main one is that finish_object() in builtin/rev-list.c now returns
int instead of void, but that makes sense.

Other than that:

  - I think you accidentally squashed the rev-list commit into
"sha1_file: support lazily fetching missing objects".


fixed. thanks.


  - The documentation for --exclude-promisor-objects in
git-pack-objects.txt should be "Omit objects that are known to be in
the promisor remote". (This option has the purpose of operating only
on locally created objects, so that when we repack, we still maintain
a distinction between locally created objects [without .promisor] and
objects from the promisor remote [with .promisor].)



  - The transport options in gitremote-helpers.txt could have the same
documentation as in transport.h.


fixed. thanks.
 


Re: [PATCH 1/9] extension.partialclone: introduce partial clone extension

2017-11-08 Thread Jeff Hostetler



On 11/8/2017 4:51 PM, Jonathan Tan wrote:

On Wed, 8 Nov 2017 15:32:21 -0500
Jeff Hostetler  wrote:


Thanks Jonathan.

I moved my version of part 2 on top of yesterday's part 1.
There are a few changes between my version and yours. Could
you take a quick look at them and see if they make sense?
(I'll spare the mailing list another patch series until after
I attend to the feed back on part 1.)

https://github.com/jeffhostetler/git/commits/core/pc3_p2


Thanks - the differences are quite minor, and they generally make sense.
The main one is that finish_object() in builtin/rev-list.c now returns
int instead of void, but that makes sense.

Other than that:

  - I think you accidentally squashed the rev-list commit into
"sha1_file: support lazily fetching missing objects".
  - The documentation for --exclude-promisor-objects in
git-pack-objects.txt should be "Omit objects that are known to be in
the promisor remote". (This option has the purpose of operating only
on locally created objects, so that when we repack, we still maintain
a distinction between locally created objects [without .promisor] and
objects from the promisor remote [with .promisor].)
  - The transport options in gitremote-helpers.txt could have the same
documentation as in transport.h.



thanks for the quick turn around.  i'll get these into my next
version next week.

Jeff


Re: [PATCH 1/9] extension.partialclone: introduce partial clone extension

2017-11-08 Thread Jonathan Tan
On Wed, 8 Nov 2017 15:32:21 -0500
Jeff Hostetler  wrote:

> Thanks Jonathan.
> 
> I moved my version of part 2 on top of yesterday's part 1.
> There are a few changes between my version and yours. Could
> you take a quick look at them and see if they make sense?
> (I'll spare the mailing list another patch series until after
> I attend to the feed back on part 1.)
> 
> https://github.com/jeffhostetler/git/commits/core/pc3_p2

Thanks - the differences are quite minor, and they generally make sense.
The main one is that finish_object() in builtin/rev-list.c now returns
int instead of void, but that makes sense.

Other than that:

 - I think you accidentally squashed the rev-list commit into
   "sha1_file: support lazily fetching missing objects".
 - The documentation for --exclude-promisor-objects in
   git-pack-objects.txt should be "Omit objects that are known to be in
   the promisor remote". (This option has the purpose of operating only
   on locally created objects, so that when we repack, we still maintain
   a distinction between locally created objects [without .promisor] and
   objects from the promisor remote [with .promisor].)
 - The transport options in gitremote-helpers.txt could have the same
   documentation as in transport.h.


Re: [PATCH 1/9] extension.partialclone: introduce partial clone extension

2017-11-08 Thread Jeff Hostetler



On 11/6/2017 2:16 PM, Jonathan Tan wrote:

On Mon, 6 Nov 2017 12:32:45 -0500
Jeff Hostetler  wrote:


Yes, that is a point I wanted to ask about.  I renamed the
extensions.partialclone that you created and then I moved your
remote..blob-max-bytes setting to be in extensions too.
Moving it to core.partialclonefilter is fine.


OK - in that case, it might be easier to just reuse my first patch in
its entirety. "core.partialclonefilter" is not used until the
fetching/cloning part anyway.



Good point.  I'll take a look at refactoring that.
If it looks like the result will be mostly/effectively
your original patches, I'll let you know and hand part 2
back to you.


Sounds good. I uploaded the result of rebasing my part 2 patches on top
of your part 1 patches here, if you would like it as a reference:

https://github.com/jonathantanmy/git/commits/pc20171106



Thanks Jonathan.

I moved my version of part 2 on top of yesterday's part 1.
There are a few changes between my version and yours. Could
you take a quick look at them and see if they make sense?
(I'll spare the mailing list another patch series until after
I attend to the feed back on part 1.)

https://github.com/jeffhostetler/git/commits/core/pc3_p2

Thanks
Jeff



Re: [PATCH 1/9] extension.partialclone: introduce partial clone extension

2017-11-06 Thread Jonathan Tan
On Mon, 6 Nov 2017 12:32:45 -0500
Jeff Hostetler  wrote:

> >> Yes, that is a point I wanted to ask about.  I renamed the
> >> extensions.partialclone that you created and then I moved your
> >> remote..blob-max-bytes setting to be in extensions too.
> >> Moving it to core.partialclonefilter is fine.
> > 
> > OK - in that case, it might be easier to just reuse my first patch in
> > its entirety. "core.partialclonefilter" is not used until the
> > fetching/cloning part anyway.
> > 
> 
> Good point.  I'll take a look at refactoring that.
> If it looks like the result will be mostly/effectively
> your original patches, I'll let you know and hand part 2
> back to you.

Sounds good. I uploaded the result of rebasing my part 2 patches on top
of your part 1 patches here, if you would like it as a reference:

https://github.com/jonathantanmy/git/commits/pc20171106


Re: [PATCH 1/9] extension.partialclone: introduce partial clone extension

2017-11-06 Thread Jeff Hostetler



On 11/3/2017 2:39 PM, Jonathan Tan wrote:

On Fri, 3 Nov 2017 09:57:18 -0400
Jeff Hostetler  wrote:


On 11/2/2017 6:24 PM, Jonathan Tan wrote:

On Thu,  2 Nov 2017 20:20:44 +
Jeff Hostetler  wrote:


From: Jeff Hostetler 

Introduce the ability to have missing objects in a repo.  This
functionality is guarded by new repository extension options:
  `extensions.partialcloneremote` and
  `extensions.partialclonefilter`.


With this, it is unclear what happens if extensions.partialcloneremote
is not set but extensions.partialclonefilter is set. For something as
significant as a repository extension (which Git uses to determine if it
will even attempt to interact with a repo), I think - I would prefer
just extensions.partialclone (or extensions.partialcloneremote, though I
prefer the former) which determines the remote (the important part,
which controls the dynamic object fetching), and have another option
"core.partialclonefilter" which is only useful if
"extensions.partialclone" is set.


Yes, that is a point I wanted to ask about.  I renamed the
extensions.partialclone that you created and then I moved your
remote..blob-max-bytes setting to be in extensions too.
Moving it to core.partialclonefilter is fine.


OK - in that case, it might be easier to just reuse my first patch in
its entirety. "core.partialclonefilter" is not used until the
fetching/cloning part anyway.



Good point.  I'll take a look at refactoring that.
If it looks like the result will be mostly/effectively
your original patches, I'll let you know and hand part 2
back to you.


I agree that "core.partialclonefilter" (or another place not in
"remote") instead of "remote..blob-max-bytes" is a good idea - in
the future, we might want to reuse the same filter setting for
non-fetching functionality.



Jeff


Re: [PATCH 1/9] extension.partialclone: introduce partial clone extension

2017-11-03 Thread Jonathan Tan
On Fri, 3 Nov 2017 09:57:18 -0400
Jeff Hostetler  wrote:

> On 11/2/2017 6:24 PM, Jonathan Tan wrote:
> > On Thu,  2 Nov 2017 20:20:44 +
> > Jeff Hostetler  wrote:
> > 
> >> From: Jeff Hostetler 
> >>
> >> Introduce the ability to have missing objects in a repo.  This
> >> functionality is guarded by new repository extension options:
> >>  `extensions.partialcloneremote` and
> >>  `extensions.partialclonefilter`.
> > 
> > With this, it is unclear what happens if extensions.partialcloneremote
> > is not set but extensions.partialclonefilter is set. For something as
> > significant as a repository extension (which Git uses to determine if it
> > will even attempt to interact with a repo), I think - I would prefer
> > just extensions.partialclone (or extensions.partialcloneremote, though I
> > prefer the former) which determines the remote (the important part,
> > which controls the dynamic object fetching), and have another option
> > "core.partialclonefilter" which is only useful if
> > "extensions.partialclone" is set.
> 
> Yes, that is a point I wanted to ask about.  I renamed the
> extensions.partialclone that you created and then I moved your
> remote..blob-max-bytes setting to be in extensions too.
> Moving it to core.partialclonefilter is fine.

OK - in that case, it might be easier to just reuse my first patch in
its entirety. "core.partialclonefilter" is not used until the
fetching/cloning part anyway.

I agree that "core.partialclonefilter" (or another place not in
"remote") instead of "remote..blob-max-bytes" is a good idea - in
the future, we might want to reuse the same filter setting for
non-fetching functionality.


Re: [PATCH 1/9] extension.partialclone: introduce partial clone extension

2017-11-03 Thread Jeff Hostetler



On 11/2/2017 6:24 PM, Jonathan Tan wrote:

On Thu,  2 Nov 2017 20:20:44 +
Jeff Hostetler  wrote:


From: Jeff Hostetler 

Introduce the ability to have missing objects in a repo.  This
functionality is guarded by new repository extension options:
 `extensions.partialcloneremote` and
 `extensions.partialclonefilter`.


With this, it is unclear what happens if extensions.partialcloneremote
is not set but extensions.partialclonefilter is set. For something as
significant as a repository extension (which Git uses to determine if it
will even attempt to interact with a repo), I think - I would prefer
just extensions.partialclone (or extensions.partialcloneremote, though I
prefer the former) which determines the remote (the important part,
which controls the dynamic object fetching), and have another option
"core.partialclonefilter" which is only useful if
"extensions.partialclone" is set.


Yes, that is a point I wanted to ask about.  I renamed the
extensions.partialclone that you created and then I moved your
remote..blob-max-bytes setting to be in extensions too.
Moving it to core.partialclonefilter is fine.



I also don't think extensions.partialclonefilter (or
core.partialclonefilter, if we use my suggestion) needs to be introduced
so early in the patch set when it will only be used once we start
fetching/cloning.


+void partial_clone_utils_register(
+   const struct list_objects_filter_options *filter_options,
+   const char *remote,
+   const char *cmd_name)
+{


This function is useful once we have fetch/clone, but probably not
before that. Since the fetch/clone patches are several patches ahead,
could this be moved there?


Sure.




@@ -420,6 +420,19 @@ static int check_repo_format(const char *var, const char 
*value, void *vdata)
;
else if (!strcmp(ext, "preciousobjects"))
data->precious_objects = git_config_bool(var, value);
+
+   else if (!strcmp(ext, KEY_PARTIALCLONEREMOTE))
+   if (!value)
+   return config_error_nonbool(var);
+   else
+   data->partial_clone_remote = xstrdup(value);
+
+   else if (!strcmp(ext, KEY_PARTIALCLONEFILTER))
+   if (!value)
+   return config_error_nonbool(var);
+   else
+   data->partial_clone_filter = xstrdup(value);
+
else
string_list_append(>unknown_extensions, ext);
} else if (strcmp(var, "core.bare") == 0) {


With a complicated block, probably better to use braces in these
clauses.



Good point.

Thanks,
Jeff



Re: [PATCH 1/9] extension.partialclone: introduce partial clone extension

2017-11-02 Thread Jonathan Tan
On Thu,  2 Nov 2017 20:20:44 +
Jeff Hostetler  wrote:

> From: Jeff Hostetler 
> 
> Introduce the ability to have missing objects in a repo.  This
> functionality is guarded by new repository extension options:
> `extensions.partialcloneremote` and
> `extensions.partialclonefilter`.

With this, it is unclear what happens if extensions.partialcloneremote
is not set but extensions.partialclonefilter is set. For something as
significant as a repository extension (which Git uses to determine if it
will even attempt to interact with a repo), I think - I would prefer
just extensions.partialclone (or extensions.partialcloneremote, though I
prefer the former) which determines the remote (the important part,
which controls the dynamic object fetching), and have another option
"core.partialclonefilter" which is only useful if
"extensions.partialclone" is set.

I also don't think extensions.partialclonefilter (or
core.partialclonefilter, if we use my suggestion) needs to be introduced
so early in the patch set when it will only be used once we start
fetching/cloning.

> +void partial_clone_utils_register(
> + const struct list_objects_filter_options *filter_options,
> + const char *remote,
> + const char *cmd_name)
> +{

This function is useful once we have fetch/clone, but probably not
before that. Since the fetch/clone patches are several patches ahead,
could this be moved there?

> @@ -420,6 +420,19 @@ static int check_repo_format(const char *var, const char 
> *value, void *vdata)
>   ;
>   else if (!strcmp(ext, "preciousobjects"))
>   data->precious_objects = git_config_bool(var, value);
> +
> + else if (!strcmp(ext, KEY_PARTIALCLONEREMOTE))
> + if (!value)
> + return config_error_nonbool(var);
> + else
> + data->partial_clone_remote = xstrdup(value);
> +
> + else if (!strcmp(ext, KEY_PARTIALCLONEFILTER))
> + if (!value)
> + return config_error_nonbool(var);
> + else
> + data->partial_clone_filter = xstrdup(value);
> +
>   else
>   string_list_append(>unknown_extensions, ext);
>   } else if (strcmp(var, "core.bare") == 0) {

With a complicated block, probably better to use braces in these
clauses.


[PATCH 1/9] extension.partialclone: introduce partial clone extension

2017-11-02 Thread Jeff Hostetler
From: Jeff Hostetler 

Introduce the ability to have missing objects in a repo.  This
functionality is guarded by new repository extension options:
`extensions.partialcloneremote` and
`extensions.partialclonefilter`.

See the update to Documentation/technical/repository-version.txt
in this patch for more information.

Signed-off-by: Jonathan Tan 
Signed-off-by: Jeff Hostetler 
---
 Documentation/technical/repository-version.txt | 22 
 Makefile   |  1 +
 cache.h|  4 ++
 config.h   |  3 +
 environment.c  |  2 +
 partial-clone-utils.c  | 78 ++
 partial-clone-utils.h  | 34 +++
 setup.c| 15 +
 8 files changed, 159 insertions(+)
 create mode 100644 partial-clone-utils.c
 create mode 100644 partial-clone-utils.h

diff --git a/Documentation/technical/repository-version.txt 
b/Documentation/technical/repository-version.txt
index 00ad379..9d488db 100644
--- a/Documentation/technical/repository-version.txt
+++ b/Documentation/technical/repository-version.txt
@@ -86,3 +86,25 @@ for testing format-1 compatibility.
 When the config key `extensions.preciousObjects` is set to `true`,
 objects in the repository MUST NOT be deleted (e.g., by `git-prune` or
 `git repack -d`).
+
+`partialcloneremote`
+
+
+When the config key `extensions.partialcloneremote` is set, it indicates
+that the repo was created with a partial clone (or later performed
+a partial fetch) and that the remote may have omitted sending
+certain unwanted objects.  Such a remote is called a "promisor remote"
+and it promises that all such omitted objects can be fetched from it
+in the future.
+
+The value of this key is the name of the promisor remote.
+
+`partialclonefilter`
+
+
+When the config key `extensions.partialclonefilter` is set, it gives
+the initial filter expression used to create the partial clone.
+This value becomed the default filter expression for subsequent
+fetches (called "partial fetches") from the promisor remote.  This
+value may also be set by the first explicit partial fetch following a
+normal clone.
diff --git a/Makefile b/Makefile
index ca378a4..12d141a 100644
--- a/Makefile
+++ b/Makefile
@@ -838,6 +838,7 @@ LIB_OBJS += pack-write.o
 LIB_OBJS += pager.o
 LIB_OBJS += parse-options.o
 LIB_OBJS += parse-options-cb.o
+LIB_OBJS += partial-clone-utils.o
 LIB_OBJS += patch-delta.o
 LIB_OBJS += patch-ids.o
 LIB_OBJS += path.o
diff --git a/cache.h b/cache.h
index 6440e2b..4b785c0 100644
--- a/cache.h
+++ b/cache.h
@@ -860,12 +860,16 @@ extern int grafts_replace_parents;
 #define GIT_REPO_VERSION 0
 #define GIT_REPO_VERSION_READ 1
 extern int repository_format_precious_objects;
+extern char *repository_format_partial_clone_remote;
+extern char *repository_format_partial_clone_filter;
 
 struct repository_format {
int version;
int precious_objects;
int is_bare;
char *work_tree;
+   char *partial_clone_remote; /* value of extensions.partialcloneremote */
+   char *partial_clone_filter; /* value of extensions.partialclonefilter */
struct string_list unknown_extensions;
 };
 
diff --git a/config.h b/config.h
index a49d264..90544ef 100644
--- a/config.h
+++ b/config.h
@@ -34,6 +34,9 @@ struct config_options {
const char *git_dir;
 };
 
+#define KEY_PARTIALCLONEREMOTE "partialcloneremote"
+#define KEY_PARTIALCLONEFILTER "partialclonefilter"
+
 typedef int (*config_fn_t)(const char *, const char *, void *);
 extern int git_default_config(const char *, const char *, void *);
 extern int git_config_from_file(config_fn_t fn, const char *, void *);
diff --git a/environment.c b/environment.c
index 8289c25..2fcf9bb 100644
--- a/environment.c
+++ b/environment.c
@@ -27,6 +27,8 @@ int warn_ambiguous_refs = 1;
 int warn_on_object_refname_ambiguity = 1;
 int ref_paranoia = -1;
 int repository_format_precious_objects;
+char *repository_format_partial_clone_remote;
+char *repository_format_partial_clone_filter;
 const char *git_commit_encoding;
 const char *git_log_output_encoding;
 const char *apply_default_whitespace;
diff --git a/partial-clone-utils.c b/partial-clone-utils.c
new file mode 100644
index 000..32cc20d
--- /dev/null
+++ b/partial-clone-utils.c
@@ -0,0 +1,78 @@
+#include "cache.h"
+#include "config.h"
+#include "partial-clone-utils.h"
+
+int is_partial_clone_registered(void)
+{
+   if (repository_format_partial_clone_remote ||
+   repository_format_partial_clone_filter)
+   return 1;
+
+   return 0;
+}
+
+void partial_clone_utils_register(
+   const struct list_objects_filter_options *filter_options,
+   const char *remote,
+   const char *cmd_name)
+{
+