On Sun, Nov 12, 2017 at 9:54 AM, Jeff King <[email protected]> wrote:
> On Sat, Nov 11, 2017 at 01:06:46PM -0500, Gargi Sharma wrote:
>
>> Replace custom allocation in mru.[ch] with generic calls
>> to list.h API.
>>
>> Signed-off-by: Gargi Sharma <[email protected]>
>
> Thanks, and welcome to the git list. :)
>
> This looks like a good start on the topic, but I have a few comments.
>
> It's a good idea to explain in the commit message not just what we're
> doing, but why we want to do it, to help later readers of "git log". I
> know that you picked this up from the discussion in the thread at:
>
> https://public-inbox.org/git/[email protected]/
>
> so it might be a good idea to summarize the ideas there (and add your
> own thoughts, of course).
>
>> ---
>> builtin/pack-objects.c | 14 ++++++++------
>> cache.h | 9 +++++----
>> mru.c | 27 ---------------------------
>> mru.h | 40 ----------------------------------------
>> packfile.c | 28 +++++++++++++++++++---------
>> 5 files changed, 32 insertions(+), 86 deletions(-)
>> delete mode 100644 mru.c
>> delete mode 100644 mru.h
>
> After the "---" line, you can put any information that people on the
> list might want to know but that doesn't need to go into the commit
> message. The big thing the maintainer would want to know here is that
> your patch is prepared on top of the ot/mru-on-list topic, so he knows
> where to apply it.
>
> The diffstat is certainly encouraging so far. :)
>
>> @@ -1012,9 +1012,9 @@ static int want_object_in_pack(const unsigned char *sha1,
>> return want;
>> }
>>
>> - list_for_each(pos, &packed_git_mru.list) {
>> - struct mru *entry = list_entry(pos, struct mru, list);
>> - struct packed_git *p = entry->item;
>> + list_for_each(pos, &packed_git_mru) {
>> + struct packed_git *p = list_entry(pos, struct packed_git, mru);
>> + struct list_head *entry = &(p->mru);
>> off_t offset;
>>
>> if (p == *found_pack)
>
> I think "entry" here is going to be the same as "pos". That said, I
> think its only use is in bumping us to the front of the mru list later:
>
>> @@ -1030,8 +1030,10 @@ static int want_object_in_pack(const unsigned char *sha1,
>> *found_pack = p;
>> }
>> want = want_found_object(exclude, p);
>> - if (!exclude && want > 0)
>> - mru_mark(&packed_git_mru, entry);
>> + if (!exclude && want > 0) {
>> + list_del(entry);
>> + list_add(entry, &packed_git_mru);
>> + }
>
> And I think this might be more obvious if we drop "entry" entirely and
> just do:
>
> list_del(&p->mru);
> list_add(&p->mru, &packed_git_mru);
>
> It might merit a comment like "/* bump to the front of the mru list */"
> or similar to make it clear what's going on (or even adding a
> list_move_to_front() helper).
I will add a helper to list.h for doing this :)
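
For illustration, a minimal sketch of what such a helper could look like.
The list primitives below are simplified stand-ins written for this
example, not git's actual list.h, and both the name list_move_to_front()
and the toy "struct pack" are assumptions made here:

```c
#include <stddef.h>

/* Simplified stand-ins for list.h primitives (assumption, not git's header). */
struct list_head {
	struct list_head *next, *prev;
};

#define LIST_HEAD_INIT(name) { &(name), &(name) }
#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Insert newp right after head, i.e. at the front of the list. */
static inline void list_add(struct list_head *newp, struct list_head *head)
{
	head->next->prev = newp;
	newp->next = head->next;
	newp->prev = head;
	head->next = newp;
}

/* Unlink elem from whatever list it is currently on. */
static inline void list_del(struct list_head *elem)
{
	elem->next->prev = elem->prev;
	elem->prev->next = elem->next;
}

/* Hypothetical helper: bump elem to the front of the list at head. */
static inline void list_move_to_front(struct list_head *elem,
				      struct list_head *head)
{
	list_del(elem);
	list_add(elem, head);
}

/* Toy stand-in for struct packed_git with an embedded list node. */
struct pack {
	int id;
	struct list_head mru;
};
```

With the node embedded in the struct, list_entry(pos, struct pack, mru)
followed by &p->mru gives back "pos" itself, which is why a separate
"entry" variable is not needed.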
>
>> @@ -1566,6 +1566,7 @@ struct pack_window {
>>
>> extern struct packed_git {
>> struct packed_git *next;
>> + struct list_head mru;
>> struct pack_window *windows;
>> off_t pack_size;
>> const void *index_data;
>
> Sort of a side note, but seeing these two list pointers together makes
> me wonder what we should do with the list created by the "next" pointer.
> It seems like there are three options:
>
> 1. Convert it to "struct list_head", too, for consistency.
>
> 2. Leave it as-is. We never delete from the list nor do any fancy
> manipulation, so it doesn't benefit from the reusable code.
>
> 3. I wonder if we could drop it entirely, and just keep a single list
> of packs, ordered by mru. I'm not sure if anybody actually cares
> about accessing them in the "original" order. That order is
> reverse-chronological (by prepare_packed_git()), but I think that
> was mostly out of a sense that recent packs would be accessed more
> than older ones (but having a real mru strategy replaces that
> anyway).
>
> What do you think?
I think in the long run, it'll be easier if there is only a single list
of packs, given that no one needs to access the list in its original
order. If we go with option 1 or 3, would it be better if I sent an
entirely separate patch, or a patch series with patch 1 removing
mru.[ch] and patch 2 removing the next pointer from the struct?
>
>> diff --git a/mru.c b/mru.c
>> deleted file mode 100644
>> index 8f3f34c..0000000
>
> Yay, this hunk (and the one for mru.h) is satisfying.
>
>> @@ -40,7 +40,7 @@ static unsigned int pack_max_fds;
>> static size_t peak_pack_mapped;
>> static size_t pack_mapped;
>> struct packed_git *packed_git;
>> -struct mru packed_git_mru = {{&packed_git_mru.list, &packed_git_mru.list}};
>> +LIST_HEAD(packed_git_mru);
>
> Much nicer.
>
>> @@ -859,9 +859,18 @@ static void prepare_packed_git_mru(void)
>> {
>> struct packed_git *p;
>>
>> - mru_clear(&packed_git_mru);
>> - for (p = packed_git; p; p = p->next)
>> - mru_append(&packed_git_mru, p);
>> + struct list_head *pos;
>> + struct list_head *tmp;
>> + list_for_each_safe(pos, tmp, &packed_git_mru)
>> + list_del_init(pos);
>
> This matches the original code, which did the clear/re-create, resetting
> the mru to the "original" pack order. But I do wonder if that's actually
> necessary. Could we skip that and just add any new packs to the list?
But if we do not clear the older entries from the list, wouldn't there be a
problem when you access packed_git_mru.next, since that will be populated
instead of being empty? Or am I misunderstanding something here?
>
> That goes hand-in-hand with the idea of dropping the "next" pointer that
> I mentioned above.
>
>> + INIT_LIST_HEAD(&packed_git_mru);
>
> I think this INIT_LIST_HEAD() isn't necessary anymore. In the original
> code, we just freed each of the mru_entry structs, which meant we had to
> forcibly reset the list head to be empty. But here you've used
> list_del_init(), so after deleting everything, packed_git_mru should
> already be empty.
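
A small sketch of why the extra reset is redundant, assuming the usual
circular doubly-linked list semantics. The primitives are simplified
stand-ins written for this example (not git's list.h), and clear_list()
is a hypothetical name:

```c
#include <stddef.h>

/* Simplified stand-ins for list.h primitives (assumption, not git's header). */
struct list_head {
	struct list_head *next, *prev;
};

#define LIST_HEAD(name) struct list_head name = { &(name), &(name) }

#define list_for_each_safe(pos, tmp, head) \
	for (pos = (head)->next, tmp = pos->next; \
	     pos != (head); \
	     pos = tmp, tmp = pos->next)

static inline void INIT_LIST_HEAD(struct list_head *ptr)
{
	ptr->next = ptr->prev = ptr;
}

/* Append newp just before head, i.e. at the tail of the list. */
static inline void list_add_tail(struct list_head *newp,
				 struct list_head *head)
{
	head->prev->next = newp;
	newp->next = head;
	newp->prev = head->prev;
	head->prev = newp;
}

/* Unlink elem and leave it pointing at itself. */
static inline void list_del_init(struct list_head *elem)
{
	elem->next->prev = elem->prev;
	elem->prev->next = elem->next;
	INIT_LIST_HEAD(elem);
}

static inline int list_empty(const struct list_head *head)
{
	return head->next == head;
}

/* Hypothetical helper: unlink every node; head ends up empty on its own. */
static void clear_list(struct list_head *head)
{
	struct list_head *pos, *tmp;

	list_for_each_safe(pos, tmp, head)
		list_del_init(pos);
}
```

Each list_del_init() patches the neighbours' pointers, so once the last
node is unlinked the head points back at itself without any further
INIT_LIST_HEAD() call.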
>
>> + for (p = packed_git; p; p = p->next) {
>> + struct packed_git *cur = xmalloc(sizeof(*packed_git));
>> + cur = p;
>> + list_add_tail(&cur->mru, &packed_git_mru);
>> + }
>
> This malloc can go away. The original mru code kept a separate entry,
> but now we don't need that. So here you're just leaking it when you
> assign "cur = p" (in fact, I think you can get rid of cur entirely).
Ah yes, I'll fix this.
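
A rough sketch of the loop without the allocation, under the assumption
that the mru list is already empty when the loop runs. The list
primitives and "struct pack" are simplified stand-ins for this example
(not git's list.h or struct packed_git), and rebuild_mru() is a
hypothetical name:

```c
#include <stddef.h>

/* Simplified stand-ins for list.h primitives (assumption, not git's header). */
struct list_head {
	struct list_head *next, *prev;
};

#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Append newp just before head, i.e. at the tail of the list. */
static inline void list_add_tail(struct list_head *newp,
				 struct list_head *head)
{
	head->prev->next = newp;
	newp->next = head;
	newp->prev = head->prev;
	head->prev = newp;
}

/* Toy stand-in for struct packed_git: a "next" chain plus an mru node. */
struct pack {
	struct pack *next;
	struct list_head mru;
	int id;
};

/*
 * Hypothetical helper: link each pack's embedded node onto the mru list
 * in "next" order. No separate entry is allocated, so nothing can leak.
 * Assumes mru is empty on entry.
 */
static void rebuild_mru(struct pack *packs, struct list_head *mru)
{
	struct pack *p;

	for (p = packs; p; p = p->next)
		list_add_tail(&p->mru, mru);
}
```

Because the list node lives inside the pack itself, the loop variable can
be linked directly and "cur" is not needed.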
>
>> @@ -1830,10 +1839,11 @@ int find_pack_entry(const unsigned char *sha1, struct pack_entry *e)
>> if (!packed_git)
>> return 0;
>>
>> - list_for_each(pos, &packed_git_mru.list) {
>> - struct mru *p = list_entry(pos, struct mru, list);
>> - if (fill_pack_entry(sha1, e, p->item)) {
>> - mru_mark(&packed_git_mru, p);
>> + list_for_each(pos, &packed_git_mru) {
>> + struct packed_git *p = list_entry(pos, struct packed_git, mru);
>> + if (fill_pack_entry(sha1, e, p)) {
>> + list_del(&p->mru);
>> + list_add(&p->mru, &packed_git_mru);
>> return 1;
>> }
>> }
>
> And this hunk looks pretty good (though if we added a move-to-front
> helper, it could be used here, too).
Thanks!
gargi
>
> -Peff