On Thu, 15 Jun 2017 10:50:46 -0700
Junio C Hamano <gits...@pobox.com> wrote:

> Jonathan Tan <jonathanta...@google.com> writes:
> 
> > Looking at the 3 primary functions (sha1_object_info_extended,
> > read_object, has_sha1_file_with_flags), they independently implement
> > mechanisms such as object replacement, retrying the packed store after
> > failing to find the object in the packed store then the loose store, and
> > being able to mark a packed object as bad and then retrying the whole
> > process. Consolidating these mechanisms would be a great help to
> > maintainability.
> >
> > Therefore, consolidate all 3 functions by extending
> > sha1_object_info_extended() to support the functionality needed by all 3
> > functions, and then modifying the other 2 to use
> > sha1_object_info_extended().
> 
> This is a rather "ugly" looking patch ;-) but I followed what
> has_sha1_file_with_flags() and read_object() do before and after
> this change, and I think this patch is a no-op wrt their behaviour
> (which is a good thing).
> 
> But I have a very mixed feelings on one aspect of the resulting
> sha1_object_info_extended().
> 
> >  int sha1_object_info_extended(const unsigned char *sha1, struct 
> > object_info *oi, unsigned flags)
> >  {
> > ...
> >     if (!find_pack_entry(real, &e)) {
> >             /* Most likely it's a loose object. */
> > -           if (!sha1_loose_object_info(real, oi, flags)) {
> > +           if (oi && !sha1_loose_object_info(real, oi, flags)) {
> >                     oi->whence = OI_LOOSE;
> >                     return 0;
> >             }
> > +           if (!oi && has_loose_object(real))
> > +                   return 0;
> 
> This conversion is not incorrect per-se.  
> 
> We can see that has_sha1_file_with_flags() after this patch still
> calls has_loose_object().  But it bothers me that there is no hint
> to future developers to warn that a rewrite of the above like this
> is incorrect:
> 
>         if (!find_pack_entry(read, &e)) {
>                 /* Most likely it's a loose object. */
>        +        struct object_info dummy_oi;
>        +        if (!oi) {
>        +                memset(&dummy_oi, 0, sizeof(dummy_oi);
>        +                oi = &dummy_oi;
>        +        }
>        -        if (oi && !sha1_loose_object_info(real, oi, flags)) {
>        +        if (!sha1_loose_object_info(real, oi, flags)) {
>                         oi->whence = OI_LOOSE;
>                         return 0;
>                 }
>        -        if (!oi && has_loose_object(real))
>        -                return 0;
> 
> It used to be very easy to see that has_sha1_file_with_flags() will
> call has_loose_object() when it does not find the object in a pack,
> which will result in the loose object file freshened.  In the new
> code, it is very subtle to see that---it will happen when the caller
> passes oi == NULL, and has_sha1_file_with_flags() is such a caller,
> but it is unclear if there are other callers of this "consolidated"
> sha1_object_info_extended() that passes oi == NULL, and if they do
> also want to freshen the loose object file when they do so.

Good point - sorry, I didn't pay much attention to the freshening
behavior. After some thought, I now think it might be better to avoid
modifying has_sha1_file_with_flags(). As it is,
sha1_object_info_extended() already needs special handling (special
flags and handling the possibility of "oi" being NULL) to handle the
functionality required by has_sha1_file_with_flags(); adding yet another
thing to handle (freshen or not) would make it much too complicated.

This means that subsequent patches that modify the handling of
storage-agnostic objects would still need to modify 2 functions, but at
least that is fewer than the current 3.

I'll reroll with these changes so that you (and others) can see what it
looks like.

> > @@ -3480,18 +3491,12 @@ int has_sha1_pack(const unsigned char *sha1)
> >  
> >  int has_sha1_file_with_flags(const unsigned char *sha1, int flags)
> >  {
> > -   struct pack_entry e;
> > +   int f = OBJECT_INFO_SKIP_CACHED |
> > +           ((flags & HAS_SHA1_QUICK) ? OBJECT_INFO_QUICK : 0);
> >  
> >     if (!startup_info->have_repository)
> >             return 0;
> > -   if (find_pack_entry(sha1, &e))
> > -           return 1;
> > -   if (has_loose_object(sha1))
> > -           return 1;
> > -   if (flags & HAS_SHA1_QUICK)
> > -           return 0;
> > -   reprepare_packed_git();
> > -   return find_pack_entry(sha1, &e);
> > +   return !sha1_object_info_extended(sha1, NULL, f);
> >  }
> 
> I would have preferred to see the new variable not to be called 'f',
> as that makes it unclear what it is (is that a callback function
> pointer?).  In this case, uyou are forcing the flag bits passed
> down, so perhaps you can reuse the same variable?  
> 
> If you allocated a separate variable because
> has_sha1_file_with_flags() and sha1_object_info_extended() take flag
> bits from two separate vocabularies, that is a valid reasoning, but
> if that is the case, then I would have named 'f' to reflect that
> fact that this is different from parameter 'flag' that is defined in
> the has_sha1_file_with_flags() world, but a different thing that is
> defined in sha1_object_info_extended() world, e.g. "soie_flag" or
> something like that.
> 
> Thanks.

This makes sense. If I don't end up reverting
has_sha1_file_with_flags(), I'll change the name to "soie_flag".

Reply via email to