RE: [PATCH 3/6] tmp-objdir: introduce API for temporary object directories

2016-09-30 Thread David Turner
Thanks.  The rest all look good too.

> -Original Message-
> From: Jeff King [mailto:p...@peff.net]
> Sent: Friday, September 30, 2016 6:45 PM
> To: David Turner
> Cc: git@vger.kernel.org
> Subject: Re: [PATCH 3/6] tmp-objdir: introduce API for temporary object
> directories
> 
> On Fri, Sep 30, 2016 at 09:32:04PM +, David Turner wrote:
> 
> > > +static void env_append(struct argv_array *env, const char *key,
> > > +const char *val) {
> > > + const char *old = getenv(key);
> > > +
> > > + if (!old)
> > > + argv_array_pushf(env, "%s=%s", key, val);
> > > + else
> > > + argv_array_pushf(env, "%s=%s%c%s", key, old, PATH_SEP,
> > > val);
> > >+}
> >
> > I would like a comment explaining this function.
> 
> I'll squash in:
> 
> diff --git a/tmp-objdir.c b/tmp-objdir.c index c92e6cc..a98c246 100644
> --- a/tmp-objdir.c
> +++ b/tmp-objdir.c
> @@ -70,6 +70,13 @@ static void remove_tmp_objdir_on_signal(int signo)
>   raise(signo);
>  }
> 
> +/*
> + * These env_* functions are for setting up the child environment; the
> + * "replace" variant overrides the value of any existing variable with
> +that
> + * "key". The "append" variant puts our new value at the end of a list,
> + * separated by PATH_SEP (which is what separate values in
> + * GIT_ALTERNATE_OBJECT_DIRECTORIES).
> + */
>  static void env_append(struct argv_array *env, const char *key, const char
> *val)  {
>   const char *old = getenv(key);
> 
> > > + * Finalize a temporary object directory by migrating its objects
> > > +into the main
> > > + * object database.
> > > + */
> >
> > This should mention that it frees its argument.
> 
> And:
> 
> diff --git a/tmp-objdir.h b/tmp-objdir.h index aa47aa9..b1e45b4 100644
> --- a/tmp-objdir.h
> +++ b/tmp-objdir.h
> @@ -35,7 +35,8 @@ const char **tmp_objdir_env(const struct tmp_objdir
> *);
> 
>  /*
>   * Finalize a temporary object directory by migrating its objects into the 
> main
> - * object database.
> + * object database, removing the temporary directory, and freeing any
> + * associated resources.
>   */
>  int tmp_objdir_migrate(struct tmp_objdir *);
> 
> 
> -Peff


Re: [PATCH 3/6] tmp-objdir: introduce API for temporary object directories

2016-09-30 Thread Jeff King
On Fri, Sep 30, 2016 at 09:32:04PM +, David Turner wrote:

> > +static void env_append(struct argv_array *env, const char *key, const
> > +char *val) {
> > +   const char *old = getenv(key);
> > +
> > +   if (!old)
> > +   argv_array_pushf(env, "%s=%s", key, val);
> > +   else
> > +   argv_array_pushf(env, "%s=%s%c%s", key, old, PATH_SEP,
> > val); 
> >+}
> 
> I would like a comment explaining this function. 

I'll squash in:

diff --git a/tmp-objdir.c b/tmp-objdir.c
index c92e6cc..a98c246 100644
--- a/tmp-objdir.c
+++ b/tmp-objdir.c
@@ -70,6 +70,13 @@ static void remove_tmp_objdir_on_signal(int signo)
raise(signo);
 }
 
+/*
+ * These env_* functions are for setting up the child environment; the
+ * "replace" variant overrides the value of any existing variable with that
+ * "key". The "append" variant puts our new value at the end of a list,
+ * separated by PATH_SEP (which is what separate values in
+ * GIT_ALTERNATE_OBJECT_DIRECTORIES).
+ */
 static void env_append(struct argv_array *env, const char *key, const char 
*val)
 {
const char *old = getenv(key);

> > + * Finalize a temporary object directory by migrating its objects into
> > +the main
> > + * object database.
> > + */
> 
> This should mention that it frees its argument.

And:

diff --git a/tmp-objdir.h b/tmp-objdir.h
index aa47aa9..b1e45b4 100644
--- a/tmp-objdir.h
+++ b/tmp-objdir.h
@@ -35,7 +35,8 @@ const char **tmp_objdir_env(const struct tmp_objdir *);
 
 /*
  * Finalize a temporary object directory by migrating its objects into the main
- * object database.
+ * object database, removing the temporary directory, and freeing any
+ * associated resources.
  */
 int tmp_objdir_migrate(struct tmp_objdir *);
 

-Peff


Re: [PATCH 3/6] tmp-objdir: introduce API for temporary object directories

2016-09-30 Thread Jeff King
On Fri, Sep 30, 2016 at 02:25:43PM -0700, Junio C Hamano wrote:

> > +void add_to_alternates_internal(const char *reference)
> > +{
> > +   prepare_alt_odb();
> > +   link_alt_odb_entries(reference, strlen(reference), '\n', NULL, 0);
> > +}
> > +
> 
> A function _internal being extern felt a bit funny.  We are only
> appending so the first one does not have to be reprepare.

It's a match for add_to_alternates_file(). Suggestions for a better word
are welcome.

We do need to prepare_alt_odb(), as that is what sets up the
alt_odb_tail pointer. And also, a later prepare() call would overwrite
our entry.  We could refactor the alt_odb code, but it seemed simplest
to just make sure we don't add to an unprepared list.

> > +   t = xmalloc(sizeof(*t));
> > +   strbuf_init(>path, 0);
> > +   argv_array_init(>env);
> > +
> > +   strbuf_addf(>path, "%s/incoming-XX", get_object_directory());
> 
> I was wondering where you would put this in.  Inside .git/objects/
> sounds good.

The name "incoming" is kind of arbitrary and related to the fact that
this is used for receive-pack (though if we were to use it on the
fetching side, I think it would be equally correct). I don't think it
really matters in practice.

> > +static int pack_copy_priority(const char *name)
> > +{
> > +   if (!starts_with(name, "pack"))
> > +   return 0;
> > +   if (ends_with(name, ".keep"))
> > +   return 1;
> > +   if (ends_with(name, ".pack"))
> > +   return 2;
> > +   if (ends_with(name, ".idx"))
> > +   return 3;
> > +   return 4;
> > +}
> 
> Thanks for being careful.  A blind "cp -r" would have ruined the
> day.
> 
> We do not do bitmaps upon receiving, I guess.

But we don't, but they (and anything else) would just sort at the end,
which is OK.

> > + * struct tmp_objdir *t = tmp_objdir_create();
> > + * if (!run_command_v_opt_cd_env(cmd, 0, NULL, tmp_objdir_env(t)) &&
> > + * !tmp_objdir_migrate(t))
> > + * printf("success!\n");
> > + * else
> > + * die("failed...tmp_objdir will clean up for us");
> 
> Made me briefly wonder if a caller might want to use appropriate
> environment to use the tmp-objdir given by the API in addition to
> its own, but then such a caller just needs to prepare its own argv-array
> and concatenate tmp_objdir_env() before making the opt_cd_env call,
> so this is perfectly fine.

Yep, and that's exactly what happens in one spot of the next patch.
My original had just open-coded, but I was happy to see we have
argv_array_pushv() these days, so it's a one-liner.

In the very original version, the receive-pack process did not need to
access the new objects at all (not until ref update time anyway, at
which point they've been migrated). And that's why the environment is
intentionally kept separate, and the caller can feed it to whichever
sub-programs it chooses. But a later version of git that handled shallow
pushes required receive-pack to actually look at the objects, and I
added the add_to_alternates_internal() call you see here.

At that point, it does make me wonder if a better interface would be for
tmp_objdir to just set up the environment variables in the parent
process in the first place, and then restore them upon
tmp_objdir_destroy(). It makes things a bit more automatic, which makes
me hesitate, but I think it would be fine for receive-pack.

I dunno. I mostly left it alone because I did it this way long ago, and
it wasn't broke. Polishing for upstream is an opportunity to fix old
oddities, but I think there is some value in applying a more
battle-tested patch.

-Peff



RE: [PATCH 3/6] tmp-objdir: introduce API for temporary object directories

2016-09-30 Thread David Turner
> +static void env_append(struct argv_array *env, const char *key, const
> +char *val) {
> + const char *old = getenv(key);
> +
> + if (!old)
> + argv_array_pushf(env, "%s=%s", key, val);
> + else
> + argv_array_pushf(env, "%s=%s%c%s", key, old, PATH_SEP,
> val); 
>+}

I would like a comment explaining this function. 

> + * Finalize a temporary object directory by migrating its objects into
> +the main
> + * object database.
> + */

This should mention that it frees its argument.



Re: [PATCH 3/6] tmp-objdir: introduce API for temporary object directories

2016-09-30 Thread Junio C Hamano
Jeff King  writes:

> diff --git a/sha1_file.c b/sha1_file.c
> index 9a79c19..65deaf9 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -421,6 +421,12 @@ void add_to_alternates_file(const char *reference)
>   free(alts);
>  }
>  
> +void add_to_alternates_internal(const char *reference)
> +{
> + prepare_alt_odb();
> + link_alt_odb_entries(reference, strlen(reference), '\n', NULL, 0);
> +}
> +

A function _internal being extern felt a bit funny.  We are only
appending so the first one does not have to be reprepare.

> +static int tmp_objdir_destroy_1(struct tmp_objdir *t, int on_signal)
> +{
> + int err;
> +
> + if (!t)
> + return 0;
> +
> + if (t == the_tmp_objdir)
> + the_tmp_objdir = NULL;
> +
> + /*
> +  * This may use malloc via strbuf_grow(), but we should
> +  * have pre-grown t->path sufficiently so that this
> +  * doesn't happen in practice.
> +  */
> + err = remove_dir_recursively(>path, 0);
> +
> + /*
> +  * When we are cleaning up due to a signal, we won't bother
> +  * freeing memory; it may cause a deadlock if the signal
> +  * arrived while libc's allocator lock is held.
> +  */
> + if (!on_signal)
> + tmp_objdir_free(t);
> + return err;
> +}
> +
> +int tmp_objdir_destroy(struct tmp_objdir *t)
> +{
> + return tmp_objdir_destroy_1(t, 0);
> +}

Looks sensible.

> + t = xmalloc(sizeof(*t));
> + strbuf_init(>path, 0);
> + argv_array_init(>env);
> +
> + strbuf_addf(>path, "%s/incoming-XX", get_object_directory());

I was wondering where you would put this in.  Inside .git/objects/
sounds good.

> +/*
> + * Make sure we copy packfiles and their associated metafiles in the correct
> + * order. All of these ends_with checks are slightly expensive to do in
> + * the midst of a sorting routine, but in practice it shouldn't matter.
> + * We will have a relatively small number of packfiles to order, and loose
> + * objects exit early in the first line.
> + */
> +static int pack_copy_priority(const char *name)
> +{
> + if (!starts_with(name, "pack"))
> + return 0;
> + if (ends_with(name, ".keep"))
> + return 1;
> + if (ends_with(name, ".pack"))
> + return 2;
> + if (ends_with(name, ".idx"))
> + return 3;
> + return 4;
> +}

Thanks for being careful.  A blind "cp -r" would have ruined the
day.

We do not do bitmaps upon receiving, I guess.

> + *   struct tmp_objdir *t = tmp_objdir_create();
> + *   if (!run_command_v_opt_cd_env(cmd, 0, NULL, tmp_objdir_env(t)) &&
> + *   !tmp_objdir_migrate(t))
> + *   printf("success!\n");
> + *   else
> + *   die("failed...tmp_objdir will clean up for us");

Made me briefly wonder if a caller might want to use appropriate
environment to use the tmp-objdir given by the API in addition to
its own, but then such a caller just needs to prepare its own argv-array
and concatenate tmp_objdir_env() before making the opt_cd_env call,
so this is perfectly fine.