Re: [PATCH 1/2] prepare_packed_git(): refactor garbage reporting in pack directory

2015-12-29 Thread Jeff King
On Wed, Nov 04, 2015 at 02:08:21PM -0600, Doug Kelly wrote:

> On Wed, Nov 4, 2015 at 2:02 PM, Jeff King  wrote:
> > Definitely cleaning up the .bitmap is sane and not racy (it's in the
> > same boat as the .idx, I think).
> >
> > .keep files are more tricky. I'd have to go over the receive-pack code
> > to confirm, but I think they _are_ racy. That is, receive-pack will
> > create them as a lockfile before moving the pack into place. That's OK,
> > though, if we use mtimes to give ourselves a grace period (I haven't
> > looked at your series yet).
> >
> > But moreover, .keep files can be created manually by the user. If the
> > pack they referenced goes away, they are not really serving any purpose.
> > But it's possible that the user would want to salvage the content of the
> > file, or know that it was there.
> >
> > So I'd argue we should leave them. Or at least leave ones that do not
> > have the generic "{receive,fetch}-pack $pid on $host comment in them,
> > which were clearly created as lockfiles.
> 
> Currently there's no mtime-guarding logic (I dug up that conversation
> earlier, though, but after I'd done the respin on this series)... OK,
> in that case, I'll create a separate patch that tests/cleans up
> .bitmap, but doesn't touch .keep.  This might be a small series since
> I think the logic for finding pack garbage doesn't know anything about
> .bitmap per-se, so it's looking like I'll extend that relevant code,
> before adding the handling in gc and appropriate tests.

I happened to be looking over your series again, and I noticed that we
didn't end up with any mtime logic at all in what got merged.

I _think_ that is probably OK, because we always write the pack,
followed by the .idx, followed by the .bitmap (if any). And we don't
drop .keep files (though I think we would perhaps note them as possible
cruft?).

So I don't think there are any races introduced here, but I wonder if we
want to be a bit more conservative. Sorry to bring this up so much after
the fact; I completely forgot about it when reviewing the patches.

These changes are slated for the v2.7 release. Like I said, I don't
think it's buggy, so we don't necessarily need to address it before the
release. We could add an mtime check in the next cycle as a
belt-and-suspenders safety, rather than a fix.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] prepare_packed_git(): refactor garbage reporting in pack directory

2015-11-04 Thread Doug Kelly
On Wed, Nov 4, 2015 at 1:35 PM, Junio C Hamano  wrote:
> Doug Kelly  writes:
>
>> I think the patches I sent (a bit prematurely) address the
>> remaining comments... I did find there was a relevant test in
>> t5304 already, so I added a new test in the same section (and
>> cleaned up some of the garbage it wasn't removing before).  I'm
>> not sure if it's poor form to move tests around like this, but I
>> figured it might be best to keep them logically grouped.
>
> OK, will queue as I didn't spot anything glaringly wrong ;-)
>
> I did wonder if we want to say anything about .bitmap files, though.
> If there is one without matching .idx and .pack, shouldn't we report
> just like we report .idx without .pack (or vice versa)?
>
> Thanks.

I think you're right -- this would be something worth following up on.
At least, t5304 doesn't cover this case explicitly, but when I tried
adding an empty bitmap with a bogus name, I did see a "no
corresponding .idx or .pack" error, similar to the stale .keep file.

I'd trust your (and Jeff's) knowledge on this far more than my own,
but would it be a bad idea to clean up .keep and .bitmap files if the
.idx/.pack pair are missing?  I think we may have had a discussion
previously on how things along these lines might be racey -- but I
don't know what order the .keep file is created in relation to the
.idx/.pack.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] prepare_packed_git(): refactor garbage reporting in pack directory

2015-11-04 Thread Jeff King
On Wed, Nov 04, 2015 at 11:35:52AM -0800, Junio C Hamano wrote:

> Doug Kelly  writes:
> 
> > I think the patches I sent (a bit prematurely) address the
> > remaining comments... I did find there was a relevant test in
> > t5304 already, so I added a new test in the same section (and
> > cleaned up some of the garbage it wasn't removing before).  I'm
> > not sure if it's poor form to move tests around like this, but I
> > figured it might be best to keep them logically grouped.
> 
> OK, will queue as I didn't spot anything glaringly wrong ;-)
> 
> I did wonder if we want to say anything about .bitmap files, though.
> If there is one without matching .idx and .pack, shouldn't we report
> just like we report .idx without .pack (or vice versa)?

Yeah, I think so. The logic should really extend to anything without a
matching .pack. And I think the sane rule is probably:

  If we have pack-$sha.$ext, but not pack-$sha.pack, then:

1. if $ext is known to us as a cache that can be regenerated from the
   .pack (i.e., .idx, .bitmap), then delete it

2. if $ext is known to us as precious, do nothing (there is nothing
   in this category right now, though)

3. if $ext is not known to us, warn but do not delete (in case a
   future version adds something precious)

The conservatism in (3) is the right thing to do, I think, but I doubt
it will ever matter, because we probably cannot ever add non-cache
auxiliary files to the pack. Old versions of git would not delete such
precious files, but nor would they carry them forward during a repack.
So short of a repo-version bump, I think we are effectively limited to
adding only caches which can be re-generated from an original .pack.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] prepare_packed_git(): refactor garbage reporting in pack directory

2015-11-04 Thread Jeff King
On Wed, Nov 04, 2015 at 01:56:38PM -0600, Doug Kelly wrote:

> > I did wonder if we want to say anything about .bitmap files, though.
> > If there is one without matching .idx and .pack, shouldn't we report
> > just like we report .idx without .pack (or vice versa)?
> 
> I think you're right -- this would be something worth following up on.
> At least, t5304 doesn't cover this case explicitly, but when I tried
> adding an empty bitmap with a bogus name, I did see a "no
> corresponding .idx or .pack" error, similar to the stale .keep file.

Yeah, that should be harmless warning (although note because the bitmap
code only really handles a single bitmap, it can prevent loading of the
"real" bitmap; so you'd want to clean it up, for sure).

> I'd trust your (and Jeff's) knowledge on this far more than my own,
> but would it be a bad idea to clean up .keep and .bitmap files if the
> .idx/.pack pair are missing?  I think we may have had a discussion
> previously on how things along these lines might be racey -- but I
> don't know what order the .keep file is created in relation to the
> .idx/.pack.

Definitely cleaning up the .bitmap is sane and not racy (it's in the
same boat as the .idx, I think).

.keep files are more tricky. I'd have to go over the receive-pack code
to confirm, but I think they _are_ racy. That is, receive-pack will
create them as a lockfile before moving the pack into place. That's OK,
though, if we use mtimes to give ourselves a grace period (I haven't
looked at your series yet).

But moreover, .keep files can be created manually by the user. If the
pack they referenced goes away, they are not really serving any purpose.
But it's possible that the user would want to salvage the content of the
file, or know that it was there.

So I'd argue we should leave them. Or at least leave ones that do not
have the generic "{receive,fetch}-pack $pid on $host comment in them,
which were clearly created as lockfiles.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] prepare_packed_git(): refactor garbage reporting in pack directory

2015-11-04 Thread Doug Kelly
On Wed, Nov 4, 2015 at 2:02 PM, Jeff King  wrote:
> On Wed, Nov 04, 2015 at 01:56:38PM -0600, Doug Kelly wrote:
>
>> > I did wonder if we want to say anything about .bitmap files, though.
>> > If there is one without matching .idx and .pack, shouldn't we report
>> > just like we report .idx without .pack (or vice versa)?
>>
>> I think you're right -- this would be something worth following up on.
>> At least, t5304 doesn't cover this case explicitly, but when I tried
>> adding an empty bitmap with a bogus name, I did see a "no
>> corresponding .idx or .pack" error, similar to the stale .keep file.
>
> Yeah, that should be harmless warning (although note because the bitmap
> code only really handles a single bitmap, it can prevent loading of the
> "real" bitmap; so you'd want to clean it up, for sure).
>
>> I'd trust your (and Jeff's) knowledge on this far more than my own,
>> but would it be a bad idea to clean up .keep and .bitmap files if the
>> .idx/.pack pair are missing?  I think we may have had a discussion
>> previously on how things along these lines might be racey -- but I
>> don't know what order the .keep file is created in relation to the
>> .idx/.pack.
>
> Definitely cleaning up the .bitmap is sane and not racy (it's in the
> same boat as the .idx, I think).
>
> .keep files are more tricky. I'd have to go over the receive-pack code
> to confirm, but I think they _are_ racy. That is, receive-pack will
> create them as a lockfile before moving the pack into place. That's OK,
> though, if we use mtimes to give ourselves a grace period (I haven't
> looked at your series yet).
>
> But moreover, .keep files can be created manually by the user. If the
> pack they referenced goes away, they are not really serving any purpose.
> But it's possible that the user would want to salvage the content of the
> file, or know that it was there.
>
> So I'd argue we should leave them. Or at least leave ones that do not
> have the generic "{receive,fetch}-pack $pid on $host comment in them,
> which were clearly created as lockfiles.
>
> -Peff

Currently there's no mtime-guarding logic (I dug up that conversation
earlier, though, but after I'd done the respin on this series)... OK,
in that case, I'll create a separate patch that tests/cleans up
.bitmap, but doesn't touch .keep.  This might be a small series since
I think the logic for finding pack garbage doesn't know anything about
.bitmap per-se, so it's looking like I'll extend that relevant code,
before adding the handling in gc and appropriate tests.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] prepare_packed_git(): refactor garbage reporting in pack directory

2015-11-04 Thread Jeff King
On Wed, Nov 04, 2015 at 02:08:21PM -0600, Doug Kelly wrote:

> Currently there's no mtime-guarding logic (I dug up that conversation
> earlier, though, but after I'd done the respin on this series)... OK,
> in that case, I'll create a separate patch that tests/cleans up
> .bitmap, but doesn't touch .keep.  This might be a small series since
> I think the logic for finding pack garbage doesn't know anything about
> .bitmap per-se, so it's looking like I'll extend that relevant code,
> before adding the handling in gc and appropriate tests.

I'd hoped you could reuse the list of extensions found in
builtin/repack.c (e.g., see remove_redundant_pack). But I guess that is
not connected with the garbage-reporting code. And anyway, the simple
list probably does not carry sufficient information (it does not know
that ".keep" is potentially more precious than ".idx", for example).

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] prepare_packed_git(): refactor garbage reporting in pack directory

2015-11-04 Thread Junio C Hamano
Doug Kelly  writes:

> I think the patches I sent (a bit prematurely) address the
> remaining comments... I did find there was a relevant test in
> t5304 already, so I added a new test in the same section (and
> cleaned up some of the garbage it wasn't removing before).  I'm
> not sure if it's poor form to move tests around like this, but I
> figured it might be best to keep them logically grouped.

OK, will queue as I didn't spot anything glaringly wrong ;-)

I did wonder if we want to say anything about .bitmap files, though.
If there is one without matching .idx and .pack, shouldn't we report
just like we report .idx without .pack (or vice versa)?

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] prepare_packed_git(): refactor garbage reporting in pack directory

2015-11-03 Thread Doug Kelly
On Wed, Oct 28, 2015 at 5:43 PM, Doug Kelly  wrote:
> On Wed, Oct 28, 2015 at 12:48 PM, Junio C Hamano  wrote:
>> Junio C Hamano  writes:
>>
>>> Eric Sunshine  writes:
>>>
> -static void real_report_garbage(const char *desc, const char *path)
> +const char *bits_to_msg(unsigned seen_bits)

 If you don't expect other callers outside this file, then this should
 be declared 'static'. If you do expect future external callers, then
 this should be declared in a public header file (but renamed to be
 more meaningful).
>>>
>>> I think this can be private to this file.  The sole point of moving
>>> this logic to this file is to make it private, after all ;-)  Thanks
>>> for sharp eyes.
>>>
>>> Together with the need for a description on "why", this probably
>>> deserves a test or two, probably at the end of t5304.
>>>
>>> Thanks.
>>
>> Does somebody want to help tying the final loose ends on this topic?
>> It has been listed in the [Stalled] section for too long, I _think_
>> what it attempts to do is a worthy thing, and it is shame to see the
>> initial implementation and review cycles we have spent so far go to
>> waste.
>>
>> If I find nothing else to do before any taker appears, I could
>> volunteer myself, but thought I should ask first.
>>
>> Thanks.
>
> I agree; I've been wanting to get back to it, but had some
> higher-priority things at work for a while, so I've not had time.  I'd
> be happy to get back into it, but if you get to it first, believe me,
> I'm not going to be offended. :)
>
> I'll see if I can't devote a little extra time to it this upcoming
> week, though.  Hopefully it doesn't need too much additional polishing
> to be ready.
>
> P.S. Does a Googler want to tell the Inbox team that the inability to
> send plain-text email is really annoying? :P

I think the patches I sent (a bit prematurely) address the remaining
comments... I did find there was a relevant test in t5304 already, so
I added a new test in the same section (and cleaned up some of the
garbage it wasn't removing before).  I'm not sure if it's poor form to
move tests around like this, but I figured it might be best to keep
them logically grouped.

Let me know if there's anything I can do, and once again, sorry for the delay!
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] prepare_packed_git(): refactor garbage reporting in pack directory

2015-10-28 Thread Junio C Hamano
Junio C Hamano  writes:

> Eric Sunshine  writes:
>
>>> -static void real_report_garbage(const char *desc, const char *path)
>>> +const char *bits_to_msg(unsigned seen_bits)
>>
>> If you don't expect other callers outside this file, then this should
>> be declared 'static'. If you do expect future external callers, then
>> this should be declared in a public header file (but renamed to be
>> more meaningful).
>
> I think this can be private to this file.  The sole point of moving
> this logic to this file is to make it private, after all ;-)  Thanks
> for sharp eyes.
>
> Together with the need for a description on "why", this probably
> deserves a test or two, probably at the end of t5304.
>
> Thanks.

Does somebody want to help tying the final loose ends on this topic?
It has been listed in the [Stalled] section for too long, I _think_
what it attempts to do is a worthy thing, and it is shame to see the
initial implementation and review cycles we have spent so far go to
waste.

If I find nothing else to do before any taker appears, I could
volunteer myself, but thought I should ask first.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] prepare_packed_git(): refactor garbage reporting in pack directory

2015-10-28 Thread Doug Kelly
On Wed, Oct 28, 2015 at 12:48 PM, Junio C Hamano  wrote:
> Junio C Hamano  writes:
>
>> Eric Sunshine  writes:
>>
 -static void real_report_garbage(const char *desc, const char *path)
 +const char *bits_to_msg(unsigned seen_bits)
>>>
>>> If you don't expect other callers outside this file, then this should
>>> be declared 'static'. If you do expect future external callers, then
>>> this should be declared in a public header file (but renamed to be
>>> more meaningful).
>>
>> I think this can be private to this file.  The sole point of moving
>> this logic to this file is to make it private, after all ;-)  Thanks
>> for sharp eyes.
>>
>> Together with the need for a description on "why", this probably
>> deserves a test or two, probably at the end of t5304.
>>
>> Thanks.
>
> Does somebody want to help tying the final loose ends on this topic?
> It has been listed in the [Stalled] section for too long, I _think_
> what it attempts to do is a worthy thing, and it is shame to see the
> initial implementation and review cycles we have spent so far go to
> waste.
>
> If I find nothing else to do before any taker appears, I could
> volunteer myself, but thought I should ask first.
>
> Thanks.

I agree; I've been wanting to get back to it, but had some
higher-priority things at work for a while, so I've not had time.  I'd
be happy to get back into it, but if you get to it first, believe me,
I'm not going to be offended. :)

I'll see if I can't devote a little extra time to it this upcoming
week, though.  Hopefully it doesn't need too much additional polishing
to be ready.

P.S. Does a Googler want to tell the Inbox team that the inability to
send plain-text email is really annoying? :P
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] prepare_packed_git(): refactor garbage reporting in pack directory

2015-08-17 Thread Junio C Hamano
Eric Sunshine sunsh...@sunshineco.com writes:

 -static void real_report_garbage(const char *desc, const char *path)
 +const char *bits_to_msg(unsigned seen_bits)

 If you don't expect other callers outside this file, then this should
 be declared 'static'. If you do expect future external callers, then
 this should be declared in a public header file (but renamed to be
 more meaningful).

I think this can be private to this file.  The sole point of moving
this logic to this file is to make it private, after all ;-)  Thanks
for sharp eyes.

Together with the need for a description on why, this probably
deserves a test or two, probably at the end of t5304.

Thanks.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] prepare_packed_git(): refactor garbage reporting in pack directory

2015-08-13 Thread Eric Sunshine
On Thu, Aug 13, 2015 at 2:02 PM, Doug Kelly dougk@gmail.com wrote:
 From: Junio C Hamano gits...@pobox.com

 The hook to report garbage files in $GIT_OBJECT_DIRECTORY/pack/
 could be generic but is too specific to count-object's needs.

 Move the part to produce human-readable messages to count-objects,
 and refine the interface to callback with the bits with values
 defined in the cache.h header file, so that other callers (e.g.
 prune) can later use the same mechanism to enumerate different
 kinds of garbage files and do something intelligent about them,
 other than reporting in textual messages.

 Signed-off-by: Junio C Hamano gits...@pobox.com

Since you're forwarding Junio's patch, you'd also want to sign-off
(following his).

 ---
 diff --git a/builtin/count-objects.c b/builtin/count-objects.c
 index ad0c799..4c3198e 100644
 --- a/builtin/count-objects.c
 +++ b/builtin/count-objects.c
 @@ -15,9 +15,31 @@ static int verbose;
  static unsigned long loose, packed, packed_loose;
  static off_t loose_size;

 -static void real_report_garbage(const char *desc, const char *path)
 +const char *bits_to_msg(unsigned seen_bits)

If you don't expect other callers outside this file, then this should
be declared 'static'. If you do expect future external callers, then
this should be declared in a public header file (but renamed to be
more meaningful).

 +{
 +   switch (seen_bits) {
 +   case 0:
 +   return no corresponding .idx or .pack;
 +   case PACKDIR_FILE_GARBAGE:
 +   return garbage found;
 +   case PACKDIR_FILE_PACK:
 +   return no corresponding .idx;
 +   case PACKDIR_FILE_IDX:
 +   return no corresponding .pack;
 +   case PACKDIR_FILE_PACK|PACKDIR_FILE_IDX:
 +   default:
 +   return NULL;
 +   }
 +}
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html