[PATCH 08/15] submodules: introduce check to see whether to touch a submodule

2017-02-23 Thread Stefan Beller
In later patches we introduce the --recurse-submodule flag for commands
that modify the working directory, e.g. git-checkout.

It is potentially expensive to check if a submodule needs an update,
because a common theme to interact with submodules is to spawn a child
process for each interaction.

So let's introduce a function that checks if a submodule needs
to be checked for an update before attempting the update.

Signed-off-by: Stefan Beller 
---
 submodule.c | 16 
 submodule.h |  7 +++
 2 files changed, 23 insertions(+)

diff --git a/submodule.c b/submodule.c
index 591f4a694e..8b2c0212be 100644
--- a/submodule.c
+++ b/submodule.c
@@ -548,6 +548,22 @@ void set_config_update_recurse_submodules(int value)
config_update_recurse_submodules = value;
 }
 
+int should_update_submodules(void)
+{
+   return config_update_recurse_submodules == RECURSE_SUBMODULES_ON;
+}
+
+const struct submodule *submodule_from_ce(const struct cache_entry *ce)
+{
+   if (!S_ISGITLINK(ce->ce_mode))
+   return NULL;
+
+   if (!should_update_submodules())
+   return NULL;
+
+   return submodule_from_path(null_sha1, ce->name);
+}
+
 static int has_remote(const char *refname, const struct object_id *oid,
  int flags, void *cb_data)
 {
diff --git a/submodule.h b/submodule.h
index b4e60c08d2..6f3fe85c7c 100644
--- a/submodule.h
+++ b/submodule.h
@@ -65,6 +65,13 @@ extern void show_submodule_inline_diff(FILE *f, const char 
*path,
const struct diff_options *opt);
 extern void set_config_fetch_recurse_submodules(int value);
 extern void set_config_update_recurse_submodules(int value);
+/* Check if we want to update any submodule.*/
+extern int should_update_submodules(void);
+/*
+ * Returns the submodule struct if the given ce entry is a submodule
+ * and it should be updated. Returns NULL otherwise.
+ */
+extern const struct submodule *submodule_from_ce(const struct cache_entry *ce);
 extern void check_for_new_submodule_commits(unsigned char new_sha1[20]);
 extern int fetch_populated_submodules(const struct argv_array *options,
   const char *prefix, int command_line_option,
-- 
2.12.0.rc1.16.ge4278d41a0.dirty



Re: [PATCH 08/15] submodules: introduce check to see whether to touch a submodule

2017-02-21 Thread Stefan Beller
On Fri, Feb 17, 2017 at 10:36 AM, Jacob Keller  wrote:
> On Wed, Feb 15, 2017 at 4:38 PM, Stefan Beller  wrote:
>> In later patches we introduce the --recurse-submodule flag for commands
>> that modify the working directory, e.g. git-checkout.
>>
>> It is potentially expensive to check if a submodule needs an update,
>> because a common theme to interact with submodules is to spawn a child
>> process for each interaction.
>>
>> So let's introduce a function that checks if a submodule needs
>> to be checked for an update before attempting the update.
>>
>> Signed-off-by: Stefan Beller 
>> ---
>>  submodule.c | 27 +++
>>  submodule.h | 13 +
>>  2 files changed, 40 insertions(+)
>>
>> diff --git a/submodule.c b/submodule.c
>> index 591f4a694e..2a37e03420 100644
>> --- a/submodule.c
>> +++ b/submodule.c
>> @@ -548,6 +548,33 @@ void set_config_update_recurse_submodules(int value)
>> config_update_recurse_submodules = value;
>>  }
>>
>> +int touch_submodules_in_worktree(void)
>> +{
>> +   /*
>> +* Update can't be "none", "merge" or "rebase",
>> +* treat any value as OFF, except an explicit ON.
>> +*/
>> +   return config_update_recurse_submodules == RECURSE_SUBMODULES_ON;
>> +}
>
> Ok, so here, we're just checking whether the value is
> RECURSE_SUBMODULES_ON. The comment doesn't make sense to me at all.

Yes the comment was not updated in the last round of patches and is
now out of context.

> What is "update" and why "can't" it be those values? How is this code
> treating thise values as OFF exept for an explicit ON?

Please disregard the comment; I'll remove it in the next reroll.
The submodule API is in such a way that
config_update_recurse_submodules

>
> At first I thought this comment was related to check in the previous
> patch. I think I see the patch is "recurse submodules = true" or
> "recurse submodules = checkout" as a specific strategy? Ie:
> recurse-submodules=checkout" means "recurse into submodules and update
> them using checkout strategy?

Yes that is what I had in mind. See previous comment, in a later series
we could extend that to other strategies such as "revert-in-submodules"
for git-revert or "rebase", "merge" as we curreently have for
"git submodule update".

> Maybe this should be called something like
> recurse_into_submodules_in_worktree() though that is pretty verbose.

I like that. (It's less than double the number of characters, so it's
fine, isn't it?)
Maybe we can abbreviate worktree by "wt" ans "submodules by subs:

/* recurse into submodules in the worktree? */
int rec_subs_wt;

That looks short enough to qualify as non-Java.

> I'm not sure I have a good name really. But I wonder why we need this
> in the first place. Basically, we set RECURSE_SUBMODULES_* value which
> could be OFF, ON, or even future extensions of CHECKOUT, REBASE,
> MERGE, etc?
>
> But shouldn't we just return the mode, and let the later code decide
> "oh. actually I don't support this mode". For now, obviously we
> wouldn't set any of the new modes yet.

Mh, makes sense. Maybe I tricked myself into premature optimization,
because I'd expect most of the users not caring about submodules, such
that we want to have a *really* cheap way of saying "no, not interesting in
submodules", which is what this method mainly offers.

Junio also remarked this and the following
"is_active_submodule_with_strategy" to be bad design.

I'll redo those, such that the caller decides what to do with each strategy.

>
>> +
>> +int is_active_submodule_with_strategy(const struct cache_entry *ce,
>> + enum submodule_update_type strategy)
>> +{
>> +   const struct submodule *sub;
>> +
>> +   if (!S_ISGITLINK(ce->ce_mode))
>> +   return 0;
>> +
>> +   if (!touch_submodules_in_worktree())
>> +   return 0;
>> +
>> +   sub = submodule_from_path(null_sha1, ce->name);
>> +   if (!sub)
>> +   return 0;
>> +
>> +   return sub->update_strategy.type == strategy;
>> +}
>> +
>
> I liked Junio's suggestion where this just returns the strategy that
> it can use, or 0 if it shouldn't be updated. Then, other code just
> decides: Yes, I can use this strategy or no I cannot.
>
> Should this be tied in with the strategy used by the
> recurse_submodules above? ie: when/if we support recursing using other
> strategies, shouldn't we make these match here, so that if the recurse
> mode is "checkout" we don't checkout a submodule that was configured
> to be rebased? Or do you want to blindly apply checkout to all
> submodules even if they don't have strategy?
>
> I assume you do not, since you check here with passing a strategy
> value, and then see if it matches.
>
> this could be named something like:
>
> get_active_submodule_strategy() and return the strategy directly. It
> would then return 0 in any case where it shouldn't be updated. Later
> code can then check the strategy.

Re: [PATCH 08/15] submodules: introduce check to see whether to touch a submodule

2017-02-17 Thread Jacob Keller
On Wed, Feb 15, 2017 at 4:38 PM, Stefan Beller  wrote:
> In later patches we introduce the --recurse-submodule flag for commands
> that modify the working directory, e.g. git-checkout.
>
> It is potentially expensive to check if a submodule needs an update,
> because a common theme to interact with submodules is to spawn a child
> process for each interaction.
>
> So let's introduce a function that checks if a submodule needs
> to be checked for an update before attempting the update.
>
> Signed-off-by: Stefan Beller 
> ---
>  submodule.c | 27 +++
>  submodule.h | 13 +
>  2 files changed, 40 insertions(+)
>
> diff --git a/submodule.c b/submodule.c
> index 591f4a694e..2a37e03420 100644
> --- a/submodule.c
> +++ b/submodule.c
> @@ -548,6 +548,33 @@ void set_config_update_recurse_submodules(int value)
> config_update_recurse_submodules = value;
>  }
>
> +int touch_submodules_in_worktree(void)
> +{
> +   /*
> +* Update can't be "none", "merge" or "rebase",
> +* treat any value as OFF, except an explicit ON.
> +*/
> +   return config_update_recurse_submodules == RECURSE_SUBMODULES_ON;
> +}

Ok, so here, we're just checking whether the value is
RECURSE_SUBMODULES_ON. The comment doesn't make sense to me at all.
What is "update" and why "can't" it be those values? How is this code
treating thise values as OFF exept for an explicit ON?

At first I thought this comment was related to check in the previous
patch. I think I see the patch is "recurse submodules = true" or
"recurse submodules = checkout" as a specific strategy? Ie:
recurse-submodules=checkout" means "recurse into submodules and update
them using checkout strategy?

Ok this starts to make a bit more sense. However, it's still somewhat
confusing to me.

Maybe this should be called something like
recurse_into_submodules_in_worktree() though that is pretty verbose.

I'm not sure I have a good name really. But I wonder why we need this
in the first place. Basically, we set RECURSE_SUBMODULES_* value which
could be OFF, ON, or even future extensions of CHECKOUT, REBASE,
MERGE, etc?

But shouldn't we just return the mode, and let the later code decide
"oh. actually I don't support this mode". For now, obviously we
wouldn't set any of the new modes yet.

> +
> +int is_active_submodule_with_strategy(const struct cache_entry *ce,
> + enum submodule_update_type strategy)
> +{
> +   const struct submodule *sub;
> +
> +   if (!S_ISGITLINK(ce->ce_mode))
> +   return 0;
> +
> +   if (!touch_submodules_in_worktree())
> +   return 0;
> +
> +   sub = submodule_from_path(null_sha1, ce->name);
> +   if (!sub)
> +   return 0;
> +
> +   return sub->update_strategy.type == strategy;
> +}
> +

I liked Junio's suggestion where this just returns the strategy that
it can use, or 0 if it shouldn't be updated. Then, other code just
decides: Yes, I can use this strategy or no I cannot.

Should this be tied in with the strategy used by the
recurse_submodules above? ie: when/if we support recursing using other
strategies, shouldn't we make these match here, so that if the recurse
mode is "checkout" we don't checkout a submodule that was configured
to be rebased? Or do you want to blindly apply checkout to all
submodules even if they don't have strategy?

I assume you do not, since you check here with passing a strategy
value, and then see if it matches.

this could be named something like:

get_active_submodule_strategy() and return the strategy directly. It
would then return 0 in any case where it shouldn't be updated. Later
code can then check the strategy.

>  static int has_remote(const char *refname, const struct object_id *oid,
>   int flags, void *cb_data)
>  {
> diff --git a/submodule.h b/submodule.h
> index b4e60c08d2..46d9f0f293 100644
> --- a/submodule.h
> +++ b/submodule.h
> @@ -65,6 +65,19 @@ extern void show_submodule_inline_diff(FILE *f, const char 
> *path,
> const struct diff_options *opt);
>  extern void set_config_fetch_recurse_submodules(int value);
>  extern void set_config_update_recurse_submodules(int value);
> +
> +/*
> + * Traditionally Git ignored changes made for submodules.
> + * This function checks if we are interested in the given submodule
> + * for any kind of operation.

This comment seems a bit weird.

> + */
> +extern int touch_submodules_in_worktree(void);
> +/*
> + * Check if the given ce entry is a submodule with the given update
> + * strategy configured.

I like Junio's suggestion of this "getting the current configured
strategy for a submodule. When we aren't set to recurse into
submodules we (obviously) return that we have no strategy since we're
not going to update it at all.

> + */
> +extern int is_active_submodule_with_strategy(const struct cache_entry *ce,
> +enum submodule_update_type 
> 

Re: [PATCH 08/15] submodules: introduce check to see whether to touch a submodule

2017-02-16 Thread Jacob Keller
On Wed, Feb 15, 2017 at 4:38 PM, Stefan Beller  wrote:
> +int touch_submodules_in_worktree(void)
> +{
> +   /*
> +* Update can't be "none", "merge" or "rebase",
> +* treat any value as OFF, except an explicit ON.
> +*/
> +   return config_update_recurse_submodules == RECURSE_SUBMODULES_ON;
> +}
> +

This function doesn't and the comment don't make sense to me. What do
you mean update can't be "none", "merge", or "rebase"? I'm thinking
this means that the update_recurse_submodules checks whether it's ok
for doing recursive update on submodules but only when the update type
is checkout? This appears to be connected directly to the previous
patch that reads the config value somehow. This is pretty convoluted
to me, and took me quite a while to understand. Is it possible to make
this more clear in the comments or in the name?


Re: [PATCH 08/15] submodules: introduce check to see whether to touch a submodule

2017-02-16 Thread Junio C Hamano
Stefan Beller  writes:

> In later patches we introduce the --recurse-submodule flag for commands
> that modify the working directory, e.g. git-checkout.
>
> It is potentially expensive to check if a submodule needs an update,
> because a common theme to interact with submodules is to spawn a child
> process for each interaction.
>
> So let's introduce a function that checks if a submodule needs
> to be checked for an update before attempting the update.
>
> Signed-off-by: Stefan Beller 
> ---
>  submodule.c | 27 +++
>  submodule.h | 13 +
>  2 files changed, 40 insertions(+)
>
> diff --git a/submodule.c b/submodule.c
> index 591f4a694e..2a37e03420 100644
> --- a/submodule.c
> +++ b/submodule.c
> @@ -548,6 +548,33 @@ void set_config_update_recurse_submodules(int value)
>   config_update_recurse_submodules = value;
>  }
>  
> +int touch_submodules_in_worktree(void)
> +{
> + /*
> +  * Update can't be "none", "merge" or "rebase",
> +  * treat any value as OFF, except an explicit ON.
> +  */
> + return config_update_recurse_submodules == RECURSE_SUBMODULES_ON;
> +}

I somehow sense a somewhat misnamed function.

> +int is_active_submodule_with_strategy(const struct cache_entry *ce,
> +   enum submodule_update_type strategy)
> +{
> + const struct submodule *sub;
> +
> + if (!S_ISGITLINK(ce->ce_mode))
> + return 0;
> +
> + if (!touch_submodules_in_worktree())
> + return 0;

Reading this caller alone, it is totally unclear what this !touch is
about.  "We try to touch it by calling this function, and if the
function successfullly touches it, we return without doing anything
else?"

Would it help to avoid confusion, if the helper function is named to
be clearly a boolean?  should_update_submodules_in_worktree() or
something along those lines?

> + sub = submodule_from_path(null_sha1, ce->name);
> + if (!sub)
> + return 0;
> +
> + return sub->update_strategy.type == strategy;
> +}

I am not sure if this is a good API design; if it were "static int"
contained inside the module I wouldn't care, but wouldn't it be more
natural for the caller of this function to say

if (get_submodule_update_strategy(ce) == STRATEGY_I_WANT)
do something;
else
do something else;

rather than forced to say:

if (is_active_submodule_with_strategy(ce, STRATEGY_I_WANT))
do something;
else
do something else;

no?  The caller can easily be extended to

switch (get_submodule_update_strategy(ce)) {
case STRATEGY_I_WANT:
case STRATEGY_I_TOLERATE:
do something; 
break;
default:
do something else;
break;
}

if the function does not insist taking a single allowed strategy and
return just yes/no as its answer.


[PATCH 08/15] submodules: introduce check to see whether to touch a submodule

2017-02-15 Thread Stefan Beller
In later patches we introduce the --recurse-submodule flag for commands
that modify the working directory, e.g. git-checkout.

It is potentially expensive to check if a submodule needs an update,
because a common theme to interact with submodules is to spawn a child
process for each interaction.

So let's introduce a function that checks if a submodule needs
to be checked for an update before attempting the update.

Signed-off-by: Stefan Beller 
---
 submodule.c | 27 +++
 submodule.h | 13 +
 2 files changed, 40 insertions(+)

diff --git a/submodule.c b/submodule.c
index 591f4a694e..2a37e03420 100644
--- a/submodule.c
+++ b/submodule.c
@@ -548,6 +548,33 @@ void set_config_update_recurse_submodules(int value)
config_update_recurse_submodules = value;
 }
 
+int touch_submodules_in_worktree(void)
+{
+   /*
+* Update can't be "none", "merge" or "rebase",
+* treat any value as OFF, except an explicit ON.
+*/
+   return config_update_recurse_submodules == RECURSE_SUBMODULES_ON;
+}
+
+int is_active_submodule_with_strategy(const struct cache_entry *ce,
+ enum submodule_update_type strategy)
+{
+   const struct submodule *sub;
+
+   if (!S_ISGITLINK(ce->ce_mode))
+   return 0;
+
+   if (!touch_submodules_in_worktree())
+   return 0;
+
+   sub = submodule_from_path(null_sha1, ce->name);
+   if (!sub)
+   return 0;
+
+   return sub->update_strategy.type == strategy;
+}
+
 static int has_remote(const char *refname, const struct object_id *oid,
  int flags, void *cb_data)
 {
diff --git a/submodule.h b/submodule.h
index b4e60c08d2..46d9f0f293 100644
--- a/submodule.h
+++ b/submodule.h
@@ -65,6 +65,19 @@ extern void show_submodule_inline_diff(FILE *f, const char 
*path,
const struct diff_options *opt);
 extern void set_config_fetch_recurse_submodules(int value);
 extern void set_config_update_recurse_submodules(int value);
+
+/*
+ * Traditionally Git ignored changes made for submodules.
+ * This function checks if we are interested in the given submodule
+ * for any kind of operation.
+ */
+extern int touch_submodules_in_worktree(void);
+/*
+ * Check if the given ce entry is a submodule with the given update
+ * strategy configured.
+ */
+extern int is_active_submodule_with_strategy(const struct cache_entry *ce,
+enum submodule_update_type 
strategy);
 extern void check_for_new_submodule_commits(unsigned char new_sha1[20]);
 extern int fetch_populated_submodules(const struct argv_array *options,
   const char *prefix, int command_line_option,
-- 
2.12.0.rc1.16.ge4278d41a0.dirty