16] Introduce index file format version 5

Junio C Hamano Mon, 06 Aug 2012 10:46:25 -0700

Nguyễn Thái Ngọc Duy <[email protected]> writes:

> These mails are about cosmetics only. But I think it helps maintenance
> in long term. I notice in your series we have many functions with _v2
> and _v5 mixed together. Worse, some functions that are _v2 only are
> not suffixed with _v2. I still think separating v2/v5 changes is a
> good idea. So I played a bit, see how it might become.
>
> The next two emails demonstrate how we take v2-specific code out to
> read-cache-v2.c, then add v5 code in the next patch. Notice there's very
> little change in read-cache.c in the second patch. I wanted to see how
> v5 changes affects v2 users and the second patch shows it.
>
> I'm not happy with the first patch either. Ideally it should consist
> of code move only, no other changes. All updates in read_index_from
> and the introduction of struct index_ops should happen in patches
> before that.


Right.

> Then of course you need to split the second patch into several logical
> patches again. We can drop _v5 suffix in read-cache-v5.c (I haven't
> done that). When we add partial read/write for v5, we can add more
> func pointers to index_ops and implement them in v2 (probably as no-op
> or assertion)

The index_ops abstraction is a right way to go, and I like it, but I
think the split illustrated in this patch might turn out to be at
wrong levels (and it is OK, as I understand this is a illustration
of concept patch).

For example, add_to_index() interface may be a good candidate to
have in index_ops.  Because your in-core index may not be holding
everything in a flat array, "find the location in the flat array the
entry would sit, replace the existing one if there is any, otherwise
insert" cannot be a generic way to add a new entry.  If you make the
whole thing an abstract API entry point, a sparse implementation of
the in-core index could still implement it without bringing the
untouched and irrelevant parts of the index to core.

        Side note: with a tree-like implementation of the in-core
        index, "find the location the entry would sit", "get the
        entry at the location", "insert the entry at the location",
        could still be a set of good abstract API, though.  The
        definition of _location_ may be quite different from "the
        offset of the entry counting from the beginning of a flat
        array", which is what index_name_pos() returns.

The story is the same on the removal front.  The current
remove_index_entry_at() interface is tied to the flat array
implementation, so "remove the nth entry from the beginning" is an
inappropriate interface for anything but such an implementation
(unless we come up with an abstract notion of the "location" that is
usable efficiently in a tree-like implementation, that is).

I wish that the development of this topic was done more in a
top-down direction, instead of bottom-up, so that it identified the
necessary access patterns to the in-core index early and come up
with a good set of abstract API first, and then only after that is
done, came up with in-core and on-disk format to support the
necessary operations.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH/RFC v2 0/16] Introduce index file format version 5

Reply via email to